Course syllabus for Learning from data

Course syllabus adopted 2021-02-26 by Head of Programme (or corresponding).

Overview

  • Swedish name: Bayesiansk dataanalys och maskininlärning
  • Code: TIF285
  • Credits: 7.5 credits
  • Owner: MPPHS
  • Education cycle: Second-cycle
  • Main field of study: Engineering Physics
  • Department: PHYSICS
  • Grading: TH - Pass with distinction (5), Pass with credit (4), Pass (3), Fail

Course round 1

  • Teaching language: English
  • Application code: 85116
  • Maximum participants: 60
  • Block schedule
  • Open for exchange students: Yes

Credit distribution

0119 Project, 7.5 c
Grading: TH

In programmes

Examiner

Eligibility

General entry requirements for Master's level (second cycle)
Applicants enrolled in a programme at Chalmers where the course is included in the study programme are exempted from fulfilling the requirements above.

Specific entry requirements

English 6 (or by other approved means with the equivalent proficiency level)
Applicants enrolled in a programme at Chalmers where the course is included in the study programme are exempted from fulfilling the requirements above.

Course specific prerequisites

- Solid background in undergraduate mathematics (multivariable analysis, linear algebra, mathematical statistics)
- Basic programming skills (the Python programming language will be introduced and used throughout the course; experience with Matlab or a similar interpreted language is sufficient)
- General physics knowledge (undergraduate-level introductory physics helps to understand the context of the scientific examples that will be used)

Aim

The course introduces a variety of central algorithms and methods essential for performing scientific data analysis using statistical inference and machine learning. Much emphasis is placed on practical applications of Bayesian inference in the natural and engineering sciences, i.e., the ability to quantify the strength of inductive inference from facts (such as experimental data) to propositions such as scientific hypotheses and models.
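The quantification referred to here rests on Bayes' theorem, which relates the posterior probability of a hypothesis H given data D to the likelihood, the prior and the evidence; in generic notation (the symbols are not tied to any specific course project):
p(H \mid D) = \frac{p(D \mid H)\, p(H)}{p(D)}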
The course is project-based, and the students will be exposed to fundamental research problems through the various projects, with the aim of reproducing state-of-the-art scientific results. The students will use the Python programming language, with relevant open-source libraries, and will learn to develop and structure computer codes for scientific data analysis projects.

Learning outcomes (after completion of the course the student should be able to)

- integrate knowledge of common statistical distributions and central concepts in Bayesian statistics into the analysis of scientific data;
- explain central aspects of Monte Carlo methods and Markov chains, and numerically apply these methods to sample multivariate probability densities (see the sampler sketch after this list);
- quantify and critically assess uncertainties of model parameters that are statistically inferred from scientific data, and perform model comparison from a Bayesian viewpoint;
- understand and numerically implement several basic algorithms used in data analysis and machine learning, such as linear methods for regression and classification, simple neural networks and Gaussian processes;
- use Python to perform scientific data analysis using statistical inference and machine learning and to visualize numerical results;
- write well-structured technical reports in which results and conclusions from a scientific data analysis are communicated clearly;
- maintain scientific and ethical conduct in the process of analyzing data and writing computer programs.
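
The sampler sketch below is a minimal random-walk Metropolis implementation in Python, assuming only numpy; the bivariate Gaussian target, the step size and the variable names are illustrative choices rather than prescribed course material.

import numpy as np

def log_target(theta):
    # Log of an (unnormalized) bivariate Gaussian density used as an
    # illustrative target; any log-density of interest could be used instead.
    mean = np.array([1.0, -0.5])
    cov = np.array([[1.0, 0.6], [0.6, 2.0]])
    diff = theta - mean
    return -0.5 * diff @ np.linalg.solve(cov, diff)

def metropolis(log_density, theta0, n_samples=5000, step=0.5, seed=1):
    # Random-walk Metropolis with a symmetric Gaussian proposal.
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    logp = log_density(theta)
    chain = np.empty((n_samples, theta.size))
    accepted = 0
    for i in range(n_samples):
        proposal = theta + step * rng.standard_normal(theta.size)
        logp_prop = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        if np.log(rng.uniform()) < logp_prop - logp:
            theta, logp = proposal, logp_prop
            accepted += 1
        chain[i] = theta
    return chain, accepted / n_samples

chain, acceptance = metropolis(log_target, theta0=[0.0, 0.0])
print("acceptance rate:", acceptance)
print("sample mean (after burn-in):", chain[1000:].mean(axis=0))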

Content

The course has two central parts:
1. Bayesian inference and data analysis. 
2. Machine learning methods for data analysis. 

The following subtopics will be covered:
- Basic concepts from statistics: expectation values, variance, covariance, correlation functions and errors; discrete versus continuous probability distributions;
- Review of simple statistical models: the binomial distribution, the Poisson distribution, and simple and multivariate normal distributions;
- Central elements of Bayesian statistics and modeling;
- Monte Carlo methods, Markov chains, and the Metropolis-Hastings algorithm;
- Linear methods for regression and classification (see the regression sketch after this list);
- Gaussian and Dirichlet processes;
- Neural networks.
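
As an illustration of the regression subtopic above, the sketch below fits a polynomial model to synthetic data by ordinary least squares in Python; the quadratic data-generating function, the noise level and the basis choice are assumptions made only for this example.

import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: noisy observations of an assumed quadratic trend.
x = np.linspace(0.0, 1.0, 30)
y = 1.0 + 2.0 * x - 3.0 * x**2 + 0.1 * rng.standard_normal(x.size)

# Design matrix with polynomial basis functions [1, x, x^2].
X = np.vander(x, N=3, increasing=True)

# Ordinary least-squares fit (the maximum-likelihood solution under Gaussian noise).
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
print("fitted coefficients:", coeffs)  # should be close to [1, 2, -3]
print("residual RMS:", np.sqrt(np.mean((X @ coeffs - y) ** 2)))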

Organisation

Lectures
Supervised computational exercises (group work on numerical projects)
A selection of small analytical and numerical homework exercises
Two computational projects with written reports

Literature

Lecture notes will be made available. 

Recommended textbook:
Phil Gregory, Bayesian Logical Data Analysis for the Physical Sciences, Cambridge University Press, 2010

Additional reading: 
David J.C. MacKay, Information Theory, Inference, and Learning Algorithms, Cambridge University Press, 4th printing, 2005
Trevor Hastie, Robert Tibshirani, Jerome H. Friedman, The Elements of Statistical Learning, Springer, 2nd edition, 2009
Andrew Gelman et al., Bayesian Data Analysis, CRC Press, 3rd edition, 2014
Aurélien Géron, Hands-On Machine Learning with Scikit-Learn and TensorFlow, O'Reilly, 1st edition, 2017

Examination including compulsory elements

The final grade is based on the performance on the homework assignments and on the graded numerical projects.

The course examiner may assess individual students in other ways than what is stated above if there are special reasons for doing so, for example if a student has a decision from Chalmers on educational support due to disability.