Probability and statistical learning using Python

Course syllabus adopted 2022-02-02 by Head of Programme (or corresponding).

Overview

Swedish name: Sannolikhetsteori och statistisk inlärning med Python
Code: MVE137
Credits: 7.5
Owner: MPICT
Education cycle: Second-cycle
Main field of study: Electrical Engineering, Mathematics
Department: Electrical Engineering
Grading: TH - Pass with distinction (5), Pass with credit (4), Pass (3), Fail

Course round 1

Teaching language: English
Application code: 13114
Block schedule
Open for exchange students: Yes

Credit distribution

Module	Sp1	Sp2	Sp3	Sp4	Summer	Not Sp	Examination dates
0121 Examination 7.5 c Grading: TH	7.5 c						01 Nov 2024 am J, 07 Jan 2025 pm J, 28 Aug 2025 pm J

In programmes

Examiner

Giuseppe Durisi
Full Professor, Communication, Antennas and Optical Networks, Electrical Engineering

Eligibility

General entry requirements for Master's level (second cycle)
Applicants enrolled in a programme at Chalmers where the course is included in the study programme are exempted from fulfilling the requirements above.

Specific entry requirements

English 6 (or by other approved means with the equivalent proficiency level)
Applicants enrolled in a programme at Chalmers where the course is included in the study programme are exempted from fulfilling the requirements above.

Course specific prerequisites

Bachelor-level knowledge of probability and Python. Students with no prior knowledge of Python will be given pointers to tutorials.

Aim

The course provides participants with a solid foundation in probability theory and statistical learning. Participants will become familiar with key probabilistic and statistical concepts in data science and will learn how to apply them to analyze data sets and draw meaningful conclusions from data. The course covers both theoretical and practical aspects, with the objective of preparing participants to apply the acquired knowledge to real-world problems. Participants will be able to experiment with and practice the concepts taught in the course using Python programs and the Jupyter Notebook platform.

Learning outcomes (after completion of the course the student should be able to)

Explain probability concepts such as tail probability bounds, moment-generating functions and their applications, Markov chains, and central limit theorems.
Explain statistical models and methods that are used for prediction in science and technology, such as regression- and classification-type statistical models.
Select suitable statistical models to analyze existing data sets, apply sound statistical methods, and perform analyses using Python.
Discuss the use of common Python tools such as NumPy, Matplotlib, pandas, and Jupyter Notebook to perform data analysis.
Design Python programs that apply the probability and statistical learning concepts presented in class to draw meaningful conclusions from data.

Content

Fundamentals of probability theory

Discrete random variables and expectation

Axioms of probability
Random variables and expectation
Bernoulli and Binomial random variables
Conditional expectation
The geometric distribution

Moments and deviations

Markov's inequality
Variance and moments of a random variable
Chebyshev's inequality
Chernoff and Hoeffding bounds
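
As an illustration of how the tail bounds in this unit compare, the following minimal sketch evaluates the Markov, Chebyshev, and Hoeffding bounds on P(X >= np + t) for a Binomial(n, p) variable and checks them against a simulated estimate. The function names `tail_bounds` and `empirical_tail` and the chosen parameters are assumptions made for this example and are not taken from the course material.

```python
import math
import random


def tail_bounds(n, p, t):
    """Upper bounds on P(X >= n*p + t) for X ~ Binomial(n, p), with t > 0."""
    mean = n * p
    var = n * p * (1 - p)
    # Markov: P(X >= a) <= E[X] / a for nonnegative X, with a = mean + t.
    markov = min(1.0, mean / (mean + t))
    # Chebyshev: P(|X - mean| >= t) <= Var(X) / t^2 (also bounds the upper tail).
    chebyshev = min(1.0, var / t**2)
    # Hoeffding for a sum of n independent [0, 1]-valued variables.
    hoeffding = math.exp(-2 * t**2 / n)
    return {"markov": markov, "chebyshev": chebyshev, "hoeffding": hoeffding}


def empirical_tail(n, p, t, trials=2000, seed=0):
    """Monte Carlo estimate of P(X >= n*p + t) from simulated coin flips."""
    rng = random.Random(seed)
    threshold = n * p + t
    hits = 0
    for _ in range(trials):
        x = sum(rng.random() < p for _ in range(n))
        if x >= threshold:
            hits += 1
    return hits / trials


bounds = tail_bounds(n=100, p=0.5, t=15)
print(bounds, empirical_tail(n=100, p=0.5, t=15))
```

For these parameters the bounds tighten from Markov to Chebyshev to Hoeffding, which reflects how much information about the distribution each bound uses.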

Markov chains and random walks

Markov chains: definitions and representations
Classification of states
Stationary distribution
Random walks
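
As an illustration of the stationary distribution topic, here is a minimal sketch that approximates the stationary distribution of a finite Markov chain by repeatedly applying the transition matrix (power iteration). It assumes the chain is irreducible and aperiodic, so that the iteration converges. The function name `stationary_distribution` and the two-state example matrix are hypothetical choices for this example.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Approximate pi with pi = pi P for a row-stochastic matrix P.

    Assumes an irreducible, aperiodic chain; otherwise the iteration
    may fail to converge.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # One step of the chain: new[j] = sum_i pi[i] * P[i][j].
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    raise RuntimeError("power iteration did not converge")


# Hypothetical two-state chain, used only for illustration.
P = [[0.9, 0.1],
     [0.5, 0.5]]
print(stationary_distribution(P))
```

For this two-state chain the balance equation 0.1 * pi_0 = 0.5 * pi_1 gives pi = (5/6, 1/6), which the iteration should approach.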

Continuous distributions and the Poisson process

Continuous random variables
The uniform distribution
The exponential distribution
The Poisson process
Continuous-time Markov processes
Markovian queues

Fundamentals of statistical learning

Overview of supervised learning

Least squares and nearest neighbors
Statistical decision theory
Local methods in high dimensions
Statistical models, supervised learning, and function approximation
Structured regression models
Classes of restricted estimators

Linear methods for regression

Linear regression models and least squares
Subset selection
Shrinkage methods
Methods using derived input directions
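
As an illustration of least squares and shrinkage, the sketch below fits a line with a single input using the closed-form solution, with an optional ridge penalty `lam` on the slope (the intercept is left unpenalized). With `lam=0` it reduces to ordinary least squares. The function name `fit_line` and the toy data are assumptions made for this example.

```python
def fit_line(x, y, lam=0.0):
    """Fit y ~ b0 + b1 * x by minimizing the squared error + lam * b1**2.

    Returns (b0, b1). lam = 0 gives ordinary least squares; lam > 0
    shrinks the slope toward zero (ridge regression with one input).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Centered sums of squares and cross-products.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / (sxx + lam)
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Toy data lying exactly on y = 1 + 2x, used only for illustration.
x = [0, 1, 2, 3]
y = [1, 3, 5, 7]
print(fit_line(x, y))           # ordinary least squares
print(fit_line(x, y, lam=5.0))  # ridge: slope shrunk toward zero
```

Increasing `lam` trades a larger bias in the slope for lower variance, which is the trade-off behind the shrinkage methods listed above.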

Linear methods for classification

Linear discriminant analysis
Logistic regression
Separating hyperplanes

Model assessment and selection

Bias, variance and model complexity
The bias-variance decomposition
Effective number of parameters
Bayesian approach and BIC
Cross-validation
Bootstrap methods

Organisation

Lectures and problem-solving exercise sessions.

Literature

G. R. Grimmett and D. R. Stirzaker, Probability and Random Processes, 3rd ed. Oxford, U.K.: Oxford Univ. Press, 2001.
M. Mitzenmacher and E. Upfal, Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis. Cambridge, U.K.: Cambridge Univ. Press, 2017.
T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. Springer, 2008.

Examination including compulsory elements

The final grade will be based on the points collected in the weekly homework assignments and Python labs as well as in the written exam.

The course examiner may assess individual students in other ways than what is stated above if there are special reasons for doing so, for example if a student has a decision from Chalmers about disability study support.
