Course syllabus for Structured machine learning

Course syllabus adopted 2025-02-03 by Head of Programme (or corresponding).

Overview

  • Swedish name: Strukturerad maskininlärning
  • Code: DAT625
  • Credits: 7.5 credits
  • Owner: MPDSC
  • Education cycle: Second-cycle
  • Main field of study: Computer Science and Engineering, Software Engineering
  • Department: Computer Science and Engineering
  • Grading: TH - Pass with distinction (5), Pass with credit (4), Pass (3), Fail

Course round 1

  • Teaching language: English
  • Application code: 87132
  • Maximum participants: 25 (at least 10% of the seats are reserved for exchange students)
  • Minimum participants: 5
  • Open for exchange students: Yes

Credit distribution

0124 Written and oral assignments, 7.5 credits (grading: TH)

In programmes

Examiner


Eligibility

General entry requirements for Master's level (second cycle)
Applicants enrolled in a programme at Chalmers where the course is included in the study programme are exempted from fulfilling the requirements above.

Specific entry requirements

English 6 (or by other approved means with the equivalent proficiency level)
Applicants enrolled in a programme at Chalmers where the course is included in the study programme are exempted from fulfilling the requirements above.

Course specific prerequisites

Experience of Python programming is essential; prior experience with a modern machine learning library is highly recommended.

Knowledge equivalent to at least three of the following courses:
Linear algebra (7.5 credits), numerical mathematics or scientific computing (7.5 credits), calculus (7.5 credits), or statistical mechanics/thermodynamics (7.5 credits).

One course in computational or mathematical statistics (7.5 credits) and one in machine learning (7.5 credits).

Students are recommended to take the course in the second year of their master's programme.

Aim

The aim of the course is to familiarize students with the structure of data and of data-generating processes, and with how this information can be used to inform machine learning architecture design and training.
The course focuses on building a strong understanding of the underlying concepts and applying them in a practical setting. There will be a particular focus on applications in natural sciences.

Learning outcomes (after completion of the course the student should be able to)


Knowledge and understanding

  • Define core concepts such as geometric priors, equivariance, and probability flow generative models.
  • Identify key structural properties (e.g., symmetries, invariances) in data and their domains.
  • Explain how geometric deep learning (GDL) architectures (e.g., graph neural networks [GNN]) encode domain-specific structure.
  • Summarize the mathematical principles underlying probability flow generative models, such as measure transport and change of variables.
  • Interpret the role of inductive biases (e.g., locality, symmetry) in model design for scientific applications.

Skills and abilities

  • Implement geometric deep learning architectures (e.g., GNNs) using frameworks like PyTorch Geometric.
  • Implement and simulate probability flow generative models (e.g., diffusion models or continuous normalizing flows) subject to certain domain symmetries.
  • Compare the trade-offs between equivariant architectures (e.g., invariant vs. equivariant layers) in terms of expressivity, computational cost, and sample complexity.
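To give a flavor of the implementation work these outcomes refer to, here is a minimal, purely illustrative sketch of one message-passing step of a graph neural network, written in plain NumPy for self-containedness (actual course work would use a framework such as PyTorch Geometric; all names here are hypothetical):

```python
import numpy as np

def message_passing_step(X, A, W_self, W_neigh):
    """One sum-aggregation message-passing step:
    h_i = X_i W_self + (sum over neighbors j of X_j) W_neigh.
    Sum aggregation makes the layer equivariant to node permutations."""
    return X @ W_self + (A @ X) @ W_neigh

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))            # 4 nodes, 3 features each
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], float)    # adjacency of a 4-cycle graph
W_self = rng.normal(size=(3, 3))
W_neigh = rng.normal(size=(3, 3))

H = message_passing_step(X, A, W_self, W_neigh)

# Permutation-equivariance check: relabeling the nodes before the layer
# gives the same result as relabeling its output.
P = np.eye(4)[[2, 0, 3, 1]]            # a node permutation matrix
H_perm = message_passing_step(P @ X, P @ A @ P.T, W_self, W_neigh)
assert np.allclose(H_perm, P @ H)
```

The final assertion is exactly the equivariance property discussed in the course: the layer commutes with node relabelings because the sum over neighbors does not depend on node ordering.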

Judgment and approach

  • Judge recent scientific reports on machine learning research projects that use structure in data or in the data-generating process.
  • Propose a small-scale research project that builds on or applies themes covered in the course.
  • Appraise small-scale research projects within structured machine learning that use structure in data or in the data-generating process.

Content

This course will equip students with the theoretical foundations and practical tools to design machine learning systems that exploit structural patterns inherent in data domains, datasets, and generative processes. The course introduces geometric deep learning and probability flow generative models (e.g., continuous normalizing flows and diffusion models), and through practical exercises students will learn to build models that respect domain-specific constraints, such as invariance and equivariance to transformations or conservation laws.

The course bridges rigorous mathematical principles—such as geometric priors, stochastic differential equations, and symmetry-aware architectures—with hands-on implementation for real-world challenges. A key emphasis is placed on applications in natural sciences (e.g., molecular modeling), where structured data and generative processes are central to solving open problems.
The course integrates theoretical lectures with assignments. The course will cover the following broad themes:

Geometric Deep Learning (GDL):
We will show how many modern deep learning architectures, including Transformers, convolutional neural networks, and graph neural networks, emerge naturally from the GDL framework.
Key concepts:
- Learning in high-dimensional spaces, sample complexity, geometric priors, scale separation, symmetry, invariance/equivariance, deformation stability.
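As a concrete, self-contained illustration of the symmetry/equivariance concept (an informal sketch, not course material): circular convolution, the linear map underlying convolutional layers, commutes with translations of its input.

```python
import numpy as np

def circular_conv(x, k):
    """Circular convolution y[i] = sum_j k[j] * x[(i - j) mod n]:
    the canonical translation-equivariant linear map on periodic signals."""
    n = len(x)
    return np.array([sum(k[j] * x[(i - j) % n] for j in range(len(k)))
                     for i in range(n)])

rng = np.random.default_rng(1)
x = rng.normal(size=8)                 # a periodic 1-D signal
k = np.array([0.25, 0.5, 0.25])        # a smoothing filter

y = circular_conv(x, k)

# Translation equivariance: shifting the input shifts the output identically.
shift = 3
y_from_shifted = circular_conv(np.roll(x, shift), k)
assert np.allclose(y_from_shifted, np.roll(y, shift))
```

This is the one-dimensional analogue of why convolutional neural networks respect translation symmetry, one of the geometric priors listed above.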

 

Data-generating processes and probability flow generative models:
We discuss specific considerations regarding data generation and acquisition/collection. We will give a brief introduction to generative modeling broadly and focus more specifically on diffusion models and continuous normalizing flows.

Key concepts:
- Direct/indirect observation, independent/correlated data, averaging/aliasing, density estimation, latent space, measure transport, push-forward, diffeomorphisms
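To make measure transport and push-forward concrete (an illustrative sketch, not course material): pushing a standard normal through the diffeomorphism f(z) = exp(z) yields a density given by the change-of-variables formula, which can be checked against Monte Carlo sampling.

```python
import numpy as np

# Push-forward of a standard normal Z through f(z) = exp(z).
# Change of variables: p_X(x) = p_Z(f^{-1}(x)) * |d f^{-1}/dx| = p_Z(log x) / x.
def p_z(z):
    return np.exp(-z**2 / 2) / np.sqrt(2 * np.pi)

def p_x(x):
    return p_z(np.log(x)) / x

rng = np.random.default_rng(2)
x_samples = np.exp(rng.normal(size=200_000))   # samples from the push-forward

# Compare the Monte Carlo probability of an interval with the integral
# of the change-of-variables density (midpoint rule).
a, b = 0.5, 2.0
mc_estimate = np.mean((x_samples > a) & (x_samples < b))
dx = (b - a) / 4000
grid = a + dx * (np.arange(4000) + 0.5)
analytic = p_x(grid).sum() * dx
assert abs(mc_estimate - analytic) < 0.01
```

The same change-of-variables identity, applied through a learned diffeomorphism, is the core of the continuous normalizing flows covered in the course.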

 

Applications:

The course will focus on molecular applications, as they provide a natural platform to illustrate all the concepts covered in the course. Three projects will cover supervised learning (classification, regression) and unsupervised learning (density estimation). The course ends with an essay assignment in which a small-scale research project is proposed based on themes covered in the course.

This is not a model-zoo course: the course focuses on theoretical and conceptual understanding and how it relates to practical implications. The course makes extensive use of high-dimensional calculus and linear algebra, and it also includes a crash course in abstract algebra (group theory and group representation theory).

Organisation


Literature

Lecture notes/hand-outs. Primary literature.

Examination including compulsory elements

Hand-ins, take-home projects, peer-assessment and essay assignment.

The course examiner may assess individual students in other ways than what is stated above if there are special reasons for doing so, for example if a student has a decision from Chalmers about disability study support.