ST510 Foundations of Machine Learning

ST510 Half Unit
Foundations of Machine Learning

This information is for the 2023/24 session.

Teacher responsible

Dr Chengchun Shi, COL.8.08

Availability

This course is available on the MPhil/PhD in Statistics. This course is available with permission as an outside option to students on other programmes where regulations permit.

The availability as an outside option requires a demonstration of sufficient background in mathematics and statistics and is at the discretion of the instructor.

Pre-requisites

A knowledge of probability and statistical theory to the level of ST102 and ST206 and some parts of ST505 (e.g. linear models and generalized linear models). Some experience with computer programming will be assumed (e.g., Python, R).

Course content

The goal of this course is to provide students with a training in foundations of machine learning with a focus on statistical and algorithmic aspects. Students will learn fundamental statistical principles, algorithms, and how to implement and apply machine learning algorithms using the state-of-the-art Python packages such as scikit-learn, TensorFlow, and OpenAI Gym.

The course will cover the following topics:

Foundations of supervised learning: empirical risk minimisation, empirical minimisation with inductive bias, PAC learning, learning via uniform convergence
Convex optimisation: convexity, Newton-Raphson, gradient descent, stochastic gradient descent (SGD), acceleration by momentum, smoothness, strong convexity, convergence rates, alternating direction method of multipliers
Non-convex optimisation: EM algorithm, MCMC, variational Bayesian inference, optimisation landscape, local minima and saddle points
Support vector machines: margin and hard-SVM, soft-SVM and norm regularization, optimality conditions and support vectors, implementing soft-SVM using SGD
Decision trees and random forests: sample complexity, decision tree algorithms, random forests
Neural networks: feedforward neural networks, expressive power of neural networks, stochastic gradient descent and backpropagation
Unsupervised learning - clustering: linkage-based clustering algorithms, k-means and other cost minimisation clustering, spectral clustering, information bottleneck
Unsupervised learning - dimension reduction: PCA, matrix completion, autoencoder
Online learning and optimisation: online learnability, online classification, weighted majority, online convex optimization, regret minimisation
Reinforcement learning: multi-armed bandit processes, reinforcement learning problem, Markov Decision Problem, reinforcement learning solution methods

Teaching

This course will be delivered through a combination of classes, lectures and Q&A sessions totalling a minimum of 35 hours in Winter Term. This course includes a reading week in Week 6 of Winter Term.

Formative coursework

Students will be expected to produce 9 problem sets in the WT.

Weekly problem sets that are discussed in subsequent seminars. The coursework that will be used for summative assessment will be chosen from a subset of these problems.

Indicative reading

Avrim Blum, John Hopcroft and Ravindran Kannan, Foundations of Data Science, Cambridge University Press, 2020; text here https://www.cs.cornell.edu/jeh/book.pdf
Stephen Boyd and Lieven Vandenberghe, Convex Optimization, Cambridge University Press, 2004; text here http://web.stanford.edu/~boyd/cvxbook/
Sebastien Bubeck, Convex optimization: algorithms and complexity, Now Publishers Inc. 2016; text here http://sbubeck.com/Bubeck15.pdf
Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, The MIT Press, 2016
Aston Zhang, Zack C. Lipton, Mu Li, and Alex J. Smola, Deep Dive into Deep Learning, 2020; text here https://d2l.ai/
Trevor Hastie, Robert Tibshirani and Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition, Springer, 2017
Shai Shalev-Shwartz and Shai Ben-David, Understanding Machine Learning: from Theory to Algorithms, Cambridge University Press, 2014; text here https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/understanding-machine-learning-theory-algorithms.pdf
Richard S. Sutton and Andrew G. Barto, Reinforcement Learning: An Introduction, Second Edition, MIT Press, Cambridge, MA, 2018; text here http://www.incompleteideas.net/book/the-book-2nd.html

Assessment

Project (40%, 3000 words) and take-home assessment (20%) in the WT.
Oral examination (40%) in the ST.

The summative assessment will be based on four pieces of take-home assesment assignments (20% in total; 5% each), one project assignment (40%), and one oral exam (40%).

For the take-home assesments, students will be given homework problem sets and computer programming exercises in weeks 2, 4, 7, and 9.

The project assesment will be in April. Students will be asked to submit ther project reports within one month.

Key facts

Department: Statistics

Total students 2022/23: 9

Average class size 2022/23: 9

Value: Half Unit

Guidelines for interpreting course guide information

Course selection videos

Some departments have produced short videos to introduce their courses. Please refer to the course selection videos index page for further information.

Personal development skills

Self-management
Problem solving
Application of information skills
Communication
Application of numeracy skills
Commercial awareness
Specialist skills

ST510 Half UnitFoundations of Machine Learning

ST510 Half Unit
Foundations of Machine Learning