Course syllabus for Probability and statistical learning using Python

Course syllabus adopted 2022-02-02 by Head of Programme (or corresponding).

Overview

  • Swedish name: Sannolikhetsteori och statistisk inlärning med Python
  • Code: MVE137
  • Credits: 7.5 credits
  • Owner: MPICT
  • Education cycle: Second-cycle
  • Main field of study: Electrical Engineering, Mathematics
  • Department: ELECTRICAL ENGINEERING
  • Grading: TH - Pass with distinction (5), Pass with credit (4), Pass (3), Fail

Course round 1

  • Teaching language: English
  • Application code: 13112
  • Block schedule
  • Open for exchange students: Yes

Credit distribution

0121 Examination 7.5 c
Grading: TH
7.5 c
  • 28 Oct 2022 am J
  • 03 Jan 2023 pm J
  • 24 Aug 2023 pm J

In programmes

Examiner

Eligibility

General entry requirements for Master's level (second cycle)
Applicants enrolled in a programme at Chalmers where the course is included in the study programme are exempted from fulfilling the requirements above.

Specific entry requirements

English 6 (or by other approved means with the equivalent proficiency level)
Applicants enrolled in a programme at Chalmers where the course is included in the study programme are exempted from fulfilling the requirements above.

Course specific prerequisites

Bachelor-level knowledge of probability and Python. Students with no prior knowledge of Python will be given pointers to tutorials.

Aim

The course provides participants with a solid foundation in probability theory and statistical learning. In particular, participants will become familiar with key probabilistic and statistical concepts in data science and will learn how to apply them to analyze data sets and draw meaningful conclusions from data. The course covers both theoretical and practical aspects, with the objective of preparing participants to apply the acquired knowledge to real-world problems. Participants will be able to experiment with and practice the concepts taught in the course using Python programs and Jupyter Notebook.

Learning outcomes (after completion of the course the student should be able to)

  • Explain probability concepts such as tail probability bounds, moment-generating functions and their applications, Markov chains, and central limit theorems.
  • Explain statistical models and methods that are used for prediction in science and technology, such as regression- and classification-type statistical models.
  • Select suitable statistical models to analyze existing data sets, apply sound statistical methods, and perform analyses using Python.
  • Discuss the use of common Python libraries and tools, such as NumPy, Matplotlib, pandas, and Jupyter Notebook, to perform data analysis.
  • Design Python programs that apply the probability and statistical learning concepts presented in class to draw meaningful conclusions from data.

Content

Fundamentals of probability theory

Discrete random variables and expectation

  • Axioms of probability
  • Random variables and expectation
  • Bernoulli and Binomial random variables
  • Conditional expectation
  • The geometric distribution

Moments and deviations

  • Markov's inequality
  • Variance and moments of a random variable
  • Chebyshev's inequality
  • Chernoff and Hoeffding bounds
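
As a small illustration of the tail bounds listed above, the following sketch compares the Markov, Chebyshev, and Hoeffding bounds with the exact upper-tail probability of a Binomial(n, 1/2) random variable (the number of heads in n fair coin flips). The example and the function names `binomial_upper_tail` and `tail_bounds` are assumptions chosen for illustration; they are not taken from the course material.

```python
import math


def binomial_upper_tail(n: int, a: int) -> float:
    """Exact P(X >= a) for X ~ Binomial(n, 1/2)."""
    return sum(math.comb(n, k) for k in range(a, n + 1)) / 2**n


def tail_bounds(n: int, a: int) -> dict:
    """Upper bounds on P(X >= a) for X ~ Binomial(n, 1/2), with a > n/2."""
    mean = n / 2
    variance = n / 4
    t = a - mean  # deviation above the mean
    if t <= 0:
        raise ValueError("a must exceed the mean n/2")
    return {
        "exact": binomial_upper_tail(n, a),
        # Markov: P(X >= a) <= E[X] / a
        "markov": min(1.0, mean / a),
        # Chebyshev: P(|X - E[X]| >= t) <= Var(X) / t^2
        "chebyshev": min(1.0, variance / t**2),
        # Hoeffding for a sum of n independent variables in [0, 1]:
        # P(X - E[X] >= t) <= exp(-2 t^2 / n)
        "hoeffding": math.exp(-2 * t**2 / n),
    }


if __name__ == "__main__":
    for name, value in tail_bounds(100, 75).items():
        print(f"{name:10s} {value:.3e}")
```

For n = 100 and a = 75, the Markov bound is 2/3 and the Chebyshev bound is 0.04, while the Hoeffding bound is several orders of magnitude smaller, which is why exponential bounds are preferred for sums of independent variables.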

Markov chains and random walks

  • Markov chains: definitions and representations
  • Classification of states
  • Stationary distribution
  • Random walks
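
A minimal sketch of how a stationary distribution might be approximated numerically: start from the uniform distribution and repeatedly apply the transition matrix until the distribution stops changing. This assumes a finite chain that is irreducible and aperiodic, so that the iteration converges; the function names and the two-state transition matrix are hypothetical examples.

```python
def step(dist, P):
    """One step of the chain: returns the row vector dist * P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Approximate pi with pi = pi * P by repeated multiplication.

    Assumes P is a row-stochastic matrix (list of lists) describing an
    irreducible, aperiodic chain.
    """
    n = len(P)
    dist = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        new = step(dist, P)
        if max(abs(a - b) for a, b in zip(new, dist)) < tol:
            return new
        dist = new
    raise RuntimeError("iteration did not converge")


if __name__ == "__main__":
    # Hypothetical two-state chain: leave state 0 w.p. 0.1, leave state 1 w.p. 0.5.
    P = [[0.9, 0.1], [0.5, 0.5]]
    print(stationary_distribution(P))
```

For a two-state chain with leave probabilities p and q, the stationary distribution is (q/(p+q), p/(p+q)), so the example above should approach (5/6, 1/6).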

Continuous distributions and the Poisson process

  • Continuous random variables
  • The uniform distribution
  • The exponential distribution
  • The Poisson process
  • Continuous time Markov processes
  • Markovian queues
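
The link between the exponential distribution and the Poisson process can be sketched in a few lines: a Poisson process with rate λ has independent exponential inter-arrival times with mean 1/λ. The function name `poisson_process_arrivals` and the parameter values below are assumptions made for illustration.

```python
import random


def poisson_process_arrivals(rate, horizon, rng):
    """Simulate arrival times of a Poisson process on [0, horizon].

    Inter-arrival times are drawn as independent Exponential(rate) variables.
    """
    arrivals = []
    t = rng.expovariate(rate)
    while t <= horizon:
        arrivals.append(t)
        t += rng.expovariate(rate)
    return arrivals


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducibility
    counts = [len(poisson_process_arrivals(2.0, 10.0, rng)) for _ in range(1000)]
    # The number of arrivals in [0, T] is Poisson with mean rate * T.
    print("average number of arrivals:", sum(counts) / len(counts))
```

With rate 2 and horizon 10, the average count over many runs should be close to 20, the mean of the corresponding Poisson distribution.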

Fundamentals of statistical learning

Overview of supervised learning

  • Least squares and nearest neighbors
  • Statistical decision theory
  • Local methods in high dimensions
  • Statistical models, supervised learning, and function approximation
  • Structured regression models
  • Classes of restricted estimators

Linear methods for regression

  • Linear regression models and least squares
  • Subset selection
  • Shrinkage methods
  • Methods using derived input directions

Linear methods for classification

  • Linear discriminant analysis
  • Logistic regression
  • Separating hyperplanes

Model assessment and selection

  • Bias, variance and model complexity
  • The bias-variance decomposition
  • Effective number of parameters
  • Bayesian approach and BIC
  • Cross-validation
  • Bootstrap methods
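
As one illustration of the resampling ideas listed above, the following sketch estimates the standard error of a statistic with the nonparametric bootstrap: resample the data with replacement, recompute the statistic on each resample, and take the standard deviation of the results. The function name `bootstrap_standard_error`, its default parameters, and the toy data are assumptions chosen for illustration.

```python
import random
import statistics


def bootstrap_standard_error(data, statistic=statistics.mean,
                             n_resamples=2000, seed=0):
    """Bootstrap estimate of the standard error of `statistic` on `data`."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    n = len(data)
    estimates = [
        statistic(rng.choices(data, k=n))  # resample with replacement
        for _ in range(n_resamples)
    ]
    return statistics.stdev(estimates)


if __name__ == "__main__":
    data = list(range(1, 51))  # toy data set
    print("bootstrap SE of the mean:  ", bootstrap_standard_error(data))
    print("bootstrap SE of the median:",
          bootstrap_standard_error(data, statistic=statistics.median))
```

For the sample mean, the bootstrap estimate should be close to the sample standard deviation divided by the square root of the sample size; the advantage of the bootstrap is that the same code applies to statistics, such as the median, for which no simple formula is available.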

Organisation

Lectures and problem-solving exercise sessions.

Literature

  • G. R. Grimmett and D. R. Stirzaker, Probability and Random Processes, 3rd ed. Oxford, U.K.: Oxford Univ. Press, 2001.
  • M. Mitzenmacher and E. Upfal, Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis. Cambridge, U.K.: Cambridge Univ. Press, 2017.
  • T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. Springer, 2008.

Examination including compulsory elements

The final grade will be based on the points collected in the weekly homework assignments and Python labs as well as in the written exam.

The course examiner may assess individual students in other ways than what is stated above if there are special reasons for doing so, for example if a student has a decision from Chalmers on educational support due to disability.