Course syllabus for Structured machine learning

Course syllabus adopted 2025-02-03 by Head of Programme (or corresponding).

Overview

  • Swedish name: Strukturerad maskininlärning
  • Code: DAT625
  • Credits: 7.5 credits
  • Owner: MPDSC
  • Education cycle: Second-cycle
  • Main field of study: Computer Science and Engineering, Software Engineering
  • Department: Computer Science and Engineering
  • Grading: TH - Pass with distinction (5), Pass with credit (4), Pass (3), Fail

Course round 1

  • Teaching language: English
  • Application code: 87132
  • Maximum participants: 25 (at least 10% of the seats are reserved for exchange students)
  • Minimum participants: 5
  • Open for exchange students: Yes

Credit distribution

0124 Written and oral assignments, 7.5 credits (grading: TH)

In programmes

Examiner


Eligibility

General entry requirements for Master's level (second cycle)
Applicants enrolled in a programme at Chalmers where the course is included in the study programme are exempted from fulfilling the requirements above.

Specific entry requirements

English 6 (or by other approved means with the equivalent proficiency level)
Applicants enrolled in a programme at Chalmers where the course is included in the study programme are exempted from fulfilling the requirements above.

Course specific prerequisites

Experience of Python programming is essential; prior experience with a modern machine learning library is highly recommended.

Knowledge equivalent to at least three of the following courses:
Linear algebra (7.5 credits), numerical mathematics or scientific computing (7.5 credits), calculus (7.5 credits), or statistical mechanics/thermodynamics (7.5 credits).

One course in computational or mathematical statistics (7.5 credits) and one in machine learning (7.5 credits).

Students are recommended to take the course in the second year of their master's programme.

Aim

The aim of the course is to familiarize students with the structure of data and of data-generating processes, and with how this information can be used to inform machine learning architecture design and training.
The course focuses on building a strong understanding of the underlying concepts and applying them in a practical setting. There will be a particular focus on applications in natural sciences.

Learning outcomes (after completion of the course the student should be able to)


Knowledge and understanding

  • Define core concepts such as geometric priors, equivariance, and probability flow generative models.
  • Identify key structural properties (e.g., symmetries, invariances) in data and their domains.
  • Explain how geometric deep learning (GDL) architectures (e.g., graph neural networks [GNN]) encode domain-specific structure.
  • Summarize the mathematical principles underlying probability flow generative models, such as measure transport and change of variables.
  • Interpret the role of inductive biases (e.g., locality, symmetry) in model design for scientific applications.

Skills and abilities

  • Implement geometric deep learning architectures (e.g., GNNs) using frameworks like PyTorch Geometric.
  • Implement and simulate probability flow generative models (e.g., diffusion models or continuous normalizing flows) subject to certain domain symmetries.
  • Compare the trade-offs between equivariant architectures (e.g., invariant vs. equivariant layers) in terms of expressivity, computational cost, and sample complexity.
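To give a flavor of the implementation work these outcomes refer to, here is a minimal, purely illustrative sketch of one message-passing step of a graph neural network, written in plain NumPy for self-containedness (actual course work would use a framework such as PyTorch Geometric; all names here are hypothetical):

```python
import numpy as np

def message_passing_step(X, A, W_self, W_neigh):
    """One sum-aggregation message-passing step:
    h_i = X_i W_self + (sum over neighbors j of X_j) W_neigh.
    Sum aggregation makes the layer equivariant to node permutations."""
    return X @ W_self + (A @ X) @ W_neigh

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))            # 4 nodes, 3 features each
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], float)    # adjacency of a 4-cycle graph
W_self = rng.normal(size=(3, 3))
W_neigh = rng.normal(size=(3, 3))

H = message_passing_step(X, A, W_self, W_neigh)

# Permutation-equivariance check: relabeling the nodes before the layer
# gives the same result as relabeling its output.
P = np.eye(4)[[2, 0, 3, 1]]            # a node permutation matrix
H_perm = message_passing_step(P @ X, P @ A @ P.T, W_self, W_neigh)
assert np.allclose(H_perm, P @ H)
```

The final assertion is exactly the equivariance property discussed in the course: the layer commutes with node relabelings because the sum over neighbors does not depend on node ordering.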

Judgment and approach

  • Judge recent scientific reports on machine learning research projects that use structure in data or in the data-generating process.
  • Propose a small-scale research project that builds on or applies themes covered in the course.
  • Appraise small-scale research projects within structured machine learning that use structure in data or in the data-generating process.

Content

This course will equip students with the theoretical foundations and practical tools to design machine learning systems that exploit structural patterns inherent in data domains, datasets, and generative processes. The course introduces geometric deep learning and probability flow generative models (e.g., continuous normalizing flows and diffusion models), and through practical exercises students will learn to build models that respect domain-specific constraints, such as invariance and equivariance to transformations or conservation laws.

The course bridges rigorous mathematical principles—such as geometric priors, stochastic differential equations, and symmetry-aware architectures—with hands-on implementation for real-world challenges. A key emphasis is placed on applications in natural sciences (e.g., molecular modeling), where structured data and generative processes are central to solving open problems.
The course integrates theoretical lectures with assignments. The course will cover the following broad themes:

Geometric Deep Learning (GDL):
We will show how many modern deep learning architectures, including Transformers, convolutional neural networks, and graph neural networks, emerge naturally from the GDL framework.
Key concepts:
- Learning in high-dimensional spaces, sample complexity, geometric priors, scale separation, symmetry, invariance/equivariance, deformation stability.
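As a concrete, self-contained illustration of the symmetry/equivariance concept (an informal sketch, not course material): circular convolution, the linear map underlying convolutional layers, commutes with translations of its input.

```python
import numpy as np

def circular_conv(x, k):
    """Circular convolution y[i] = sum_j k[j] * x[(i - j) mod n]:
    the canonical translation-equivariant linear map on periodic signals."""
    n = len(x)
    return np.array([sum(k[j] * x[(i - j) % n] for j in range(len(k)))
                     for i in range(n)])

rng = np.random.default_rng(1)
x = rng.normal(size=8)                 # a periodic 1-D signal
k = np.array([0.25, 0.5, 0.25])        # a smoothing filter

y = circular_conv(x, k)

# Translation equivariance: shifting the input shifts the output identically.
shift = 3
y_from_shifted = circular_conv(np.roll(x, shift), k)
assert np.allclose(y_from_shifted, np.roll(y, shift))
```

This is the one-dimensional analogue of why convolutional neural networks respect translation symmetry, one of the geometric priors listed above.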

 

Data-generating processes and probability flow generative models:
We discuss specific considerations regarding data generation and acquisition/collection. We will give a brief introduction to generative modeling broadly and focus more specifically on diffusion models and continuous normalizing flows.

Key concepts:
- Direct/indirect observation, independent/correlated data, averaging/aliasing, density estimation, latent space, measure transport, push-forward, diffeomorphisms
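To make measure transport and push-forward concrete (an illustrative sketch, not course material): pushing a standard normal through the diffeomorphism f(z) = exp(z) yields a density given by the change-of-variables formula, which can be checked against Monte Carlo sampling.

```python
import numpy as np

# Push-forward of a standard normal Z through f(z) = exp(z).
# Change of variables: p_X(x) = p_Z(f^{-1}(x)) * |d f^{-1}/dx| = p_Z(log x) / x.
def p_z(z):
    return np.exp(-z**2 / 2) / np.sqrt(2 * np.pi)

def p_x(x):
    return p_z(np.log(x)) / x

rng = np.random.default_rng(2)
x_samples = np.exp(rng.normal(size=200_000))   # samples from the push-forward

# Compare the Monte Carlo probability of an interval with the integral
# of the change-of-variables density (midpoint rule).
a, b = 0.5, 2.0
mc_estimate = np.mean((x_samples > a) & (x_samples < b))
dx = (b - a) / 4000
grid = a + dx * (np.arange(4000) + 0.5)
analytic = p_x(grid).sum() * dx
assert abs(mc_estimate - analytic) < 0.01
```

The same change-of-variables identity, applied through a learned diffeomorphism, is the core of the continuous normalizing flows covered in the course.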

 

Applications:

The course will focus on molecular applications, as they provide a natural platform to illustrate all the concepts covered in the course. Three projects will cover supervised learning (classification, regression) and unsupervised learning (density estimation). The course ends with an essay assignment in which a small-scale research project is proposed based on themes covered in the course.

This is not a model-zoo course: the course focuses on theoretical and conceptual understanding and how it relates to practical implications. The course makes extensive use of high-dimensional calculus and linear algebra, and it also includes a crash course in abstract algebra (group theory and group representation theory).

Organisation


Literature

Lecture notes/hand-outs. Primary literature.

Examination including compulsory elements

Hand-ins, take-home projects, peer-assessment and essay assignment.

The course examiner may assess individual students in other ways than what is stated above if there are special reasons for doing so, for example if a student has a decision from Chalmers about disability study support.