ST451      Half Unit
Bayesian Machine Learning

This information is for the 2020/21 session.

Teacher responsible

Dr Konstantinos Kalogeropoulos


Availability

This course is available on the MSc in Applied Social Data Science, MSc in Data Science, MSc in Quantitative Methods for Risk Management, MSc in Statistics, MSc in Statistics (Financial Statistics), MSc in Statistics (Financial Statistics) (LSE and Fudan), MSc in Statistics (Financial Statistics) (Research), MSc in Statistics (Research), MSc in Statistics (Social Statistics) and MSc in Statistics (Social Statistics) (Research). This course is available as an outside option to students on other programmes where regulations permit.

Priority is given to Department of Statistics students and those with the course listed in their programme regulations.


Pre-requisites

Basic knowledge of probability and a first course in statistics, such as ST202 (Probability, Distribution Theory and Inference) or equivalent. Basic knowledge of the principles of computer programming (e.g. in any of Python, R, Matlab, C, Java) is sufficient; this is desired rather than essential.

Course content

The course sets up the foundations and covers the basic algorithms of probabilistic machine learning. Several techniques that are probabilistic in nature are introduced, and standard topics are revisited from a Bayesian viewpoint. The module provides training in state-of-the-art methods that have been applied successfully to tasks such as natural language processing, image recognition and fraud detection.

The first part of the module covers the basic concepts of Bayesian inference, such as prior and posterior distributions, Bayesian estimation, model choice and forecasting. These concepts are illustrated in real-world applications modelled via linear models for regression and classification, and are compared with alternative approaches.

The second part of the module introduces and provides training in further topics of probabilistic machine learning such as Graphical models, mixtures and cluster analysis, Variational approximation, advanced Monte Carlo sampling methods, sequential data and Gaussian processes. All topics are illustrated via real-world examples and are contrasted against non-Bayesian approaches.


Teaching

This course will be delivered through a combination of classes and lectures totalling a minimum of 35 hours across the Lent Term. This year, some or all of this teaching may be delivered through a combination of virtual classes and flipped-lectures delivered as short online videos. This course does not include a reading week and will be concluded by the end of week 10 of Lent Term.


  • Bayesian inference concepts: Prior and posterior distributions, Bayes estimators, credible intervals, Bayes factors, Bayesian forecasting, Posterior Predictive distribution.
  • Linear models for regression: Linear basis function models, Bayesian linear regression, Bayesian model comparison.
  • Linear models for classification: Probabilistic generative models, Probabilistic discriminative models, The Laplace approximation, Bayesian logistic regression.
  • Variational inference, Variational linear and logistic regression.
  • Graphical models: Bayesian networks, Conditional independence, Markov random fields.
  • Mixture models and Clustering: Clustering, Mixtures, The EM algorithm.
  • Sampling methods: Basic sampling algorithms, Markov chain Monte Carlo, Gibbs sampling.
  • Sequential data: Markov models, Hidden Markov models, Linear dynamical systems.
  • Gaussian processes: Bayesian Non-Parametrics, Gaussian processes for regression and classification.

Formative coursework

Students will be expected to produce 10 problem sets in the LT.

The 10 problem sets in LT prepare students for both summative assessment components. They will include theoretical exercises, targeting learning outcomes a and b, as well as computer-based assignments (for learning outcome c) that will need to be presented in a suitable form for the purposes of learning outcome d. Additionally, mostly in relation to learning outcome b, students will be encouraged to share and compare their responses to some challenging parts of the problem sets through dedicated Moodle forums.

Indicative reading

  • C. M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006
  • K. Murphy, Machine Learning: A Probabilistic Perspective, MIT Press, 2012
  • S. Rogers and M. Girolami, A First Course in Machine Learning, Second Edition, Chapman and Hall/CRC, 2016
  • D. J. C. MacKay, Information Theory, Inference and Learning Algorithms, Cambridge University Press, 2003
  • D. Barber, Bayesian Reasoning and Machine Learning, Cambridge University Press, 2012


Assessment

Exam (50%, duration: 2 hours) in the summer exam period.
Project (50%) in the ST.

Important information in response to COVID-19

Please note that during the 2020/21 academic year some variation to teaching and learning activities may be required to respond to changes in public health advice and/or to account for the situation of students in attendance on campus and those studying online during the early part of the academic year. For assessment, this may involve changes to the mode of delivery and/or the format or weighting of assessments. Changes will only be made if required, and students will be notified about any changes to teaching or assessment plans at the earliest opportunity.

Key facts

Department: Statistics

Total students 2019/20: 57

Average class size 2019/20: 19

Controlled access 2019/20: Yes

Value: Half Unit

Personal development skills

  • Self-management
  • Team working
  • Problem solving
  • Application of information skills
  • Communication
  • Application of numeracy skills