ST451      Half Unit
Bayesian Machine Learning

This information is for the 2019/20 session.

Teacher responsible

Dr Konstantinos Kalogeropoulos


Availability

This course is available on the MSc in Applied Social Data Science, MSc in Data Science, MSc in Quantitative Methods for Risk Management, MSc in Statistics, MSc in Statistics (Financial Statistics), MSc in Statistics (Financial Statistics) (LSE and Fudan), MSc in Statistics (Financial Statistics) (Research), MSc in Statistics (Research), MSc in Statistics (Social Statistics) and MSc in Statistics (Social Statistics) (Research). This course is available as an outside option to students on other programmes where regulations permit.


Pre-requisites

Basic knowledge of probability and a first course in statistics, such as ST202 Probability, Distribution Theory and Inference or equivalent. Basic knowledge of the principles of computer programming is sufficient (e.g. in any of Python, R, Matlab, C, Java); this is desired rather than essential.

Course content

The course sets up the foundations and covers the basic algorithms of probabilistic machine learning. Several techniques that are probabilistic in nature are introduced and standard topics are revisited from a Bayesian viewpoint. The module provides training in state-of-the-art methods that have been applied successfully to several tasks such as natural language processing, image recognition and fraud detection.

The first part of the module covers the basic concepts of Bayesian inference such as prior and posterior distributions, Bayesian estimation, model choice and forecasting. These concepts are also illustrated in real-world applications modelled via linear models for regression and classification, and are compared with alternative approaches.
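
As a rough illustration of the prior-to-posterior step mentioned above, the following minimal sketch updates a Beta prior with binomial data. It is not taken from the course materials: the function names and the data (7 successes, 3 failures, uniform Beta(1, 1) prior) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from fractions import Fraction


def beta_binomial_update(prior_a, prior_b, successes, failures):
    """Return the Beta posterior parameters after observing binomial data."""
    # Conjugacy: a Beta(a, b) prior with a binomial likelihood gives
    # a Beta(a + successes, b + failures) posterior.
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution, kept exact with Fraction."""
    return Fraction(a, a + b)


# Hypothetical data: 7 successes and 3 failures, uniform Beta(1, 1) prior.
post_a, post_b = beta_binomial_update(1, 1, successes=7, failures=3)

# The posterior mean is the Bayes estimator under squared-error loss.
posterior_mean = beta_mean(post_a, post_b)

# For a single future Bernoulli trial, the posterior predictive
# probability of success equals the posterior mean.
predictive_next_success = posterior_mean

print(post_a, post_b, float(posterior_mean))
```

With a uniform prior the posterior mean (2/3) is pulled slightly towards 1/2 compared with the sample proportion (7/10), which is the usual shrinkage effect of a prior.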

The second part of the module introduces and provides training in further topics of probabilistic machine learning such as graphical models, mixtures and cluster analysis, variational approximation, advanced Monte Carlo sampling methods, sequential data and Gaussian processes. All topics are illustrated via real-world examples and are contrasted against non-Bayesian approaches.


Teaching

20 hours of lectures and 15 hours of computer workshops in the LT.


  • Bayesian inference concepts: Prior and posterior distributions, Bayes estimators, credible intervals, Bayes factors, Bayesian forecasting, Posterior Predictive distribution.
  • Linear models for regression: Linear basis function models, Bayesian linear regression, Bayesian model comparison.
  • Linear models for classification: Probabilistic generative models, Probabilistic discriminative models, The Laplace approximation, Bayesian logistic regression.
  • Variational inference, Variational linear and logistic regression.
  • Graphical models: Bayesian networks, Conditional independence, Markov random fields.
  • Mixture models and Clustering: Clustering, Mixtures, The EM algorithm.
  • Sampling methods: Basic sampling algorithms, Markov chain Monte Carlo, Gibbs sampling.
  • Sequential data: Markov models, Hidden Markov models, Linear dynamical systems.
  • Gaussian processes: Bayesian Non-Parametrics, Gaussian processes for regression and classification.
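
To give a flavour of the sampling-methods topic in the list above, here is a minimal random-walk Metropolis sampler, one simple instance of Markov chain Monte Carlo. It is a sketch under stated assumptions rather than course material: the function name `metropolis`, the standard normal target, the step size and the seed are all hypothetical choices made for illustration.

```python
import math
import random


def metropolis(log_target, start, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target.

    log_target: function returning the log density up to an additive constant.
    """
    rng = random.Random(seed)
    x = start
    log_p = log_target(x)
    samples = []
    for _ in range(n_samples):
        # Propose a move from a symmetric Gaussian centred on the current state.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_target(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        # On rejection the current state is recorded again.
        samples.append(x)
    return samples


def log_std_normal(x):
    """Log density of N(0, 1), dropping the normalising constant."""
    return -0.5 * x * x


# Illustrative target: a standard normal distribution.
samples = metropolis(log_std_normal, start=0.0, n_samples=20000)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
print(round(mean, 2), round(variance, 2))
```

Only the ratio of target densities enters the acceptance step, which is why the normalising constant of the posterior can be ignored; this is the main practical appeal of such samplers in Bayesian inference.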

Formative coursework

Students will be expected to produce 10 problem sets in the LT.

The problem sets are designed to prepare students for both summative assessment components. They will include theoretical exercises, targeting learning outcomes a and b, as well as computer-based assignments (for learning outcome c) that will need to be presented in a suitable form for the purposes of learning outcome d. Additionally, mostly in relation to learning outcome b, students will be encouraged to share and compare their responses to some challenging parts of the problem sets through dedicated Moodle forums.

Indicative reading

  • C. M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006
  • K. Murphy, Machine Learning: A Probabilistic Perspective, MIT Press, 2012
  • S. Rogers and M. Girolami, A First Course in Machine Learning, Second Edition, Chapman and Hall/CRC, 2016
  • D. J. C. MacKay, Information Theory, Inference and Learning Algorithms, Cambridge University Press, 2003
  • D. Barber, Bayesian Reasoning and Machine Learning, Cambridge University Press, 2012


Assessment

Exam (50%, duration: 2 hours) in the summer exam period.
Project (50%) in the ST.

Key facts

Department: Statistics

Total students 2018/19: 22

Average class size 2018/19: 25

Controlled access 2018/19: No

Value: Half Unit


Personal development skills

  • Self-management
  • Team working
  • Problem solving
  • Application of information skills
  • Communication
  • Application of numeracy skills