ST456      Half Unit
Deep Learning

This information is for the 2021/22 session.

Teacher responsible

Prof Milan Vojnovic COL5.05


Availability

This course is available on the MSc in Applicable Mathematics, MSc in Applied Social Data Science, MSc in Data Science, MSc in Geographic Data Science, MSc in Health Data Science, MSc in Management of Information Systems and Digital Innovation, MSc in Operations Research & Analytics, MSc in Quantitative Methods for Risk Management, MSc in Statistics, MSc in Statistics (Financial Statistics), MSc in Statistics (Financial Statistics) (LSE and Fudan), MSc in Statistics (Financial Statistics) (Research), MSc in Statistics (Research), MSc in Statistics (Social Statistics) and MSc in Statistics (Social Statistics) (Research). This course is available with permission as an outside option to students on other programmes where regulations permit.

MSc in Data Science students will be given priority for enrolment in this course.


Pre-requisites

The course requires some mathematics, in particular working with vectors and some calculus. Basic knowledge of computer programming, mainly in Python, is expected.

Course content

This course covers the fundamental concepts of deep learning and neural networks, the design of neural network architectures, optimisation methods for training neural networks, and neural network design for particular purposes such as image recognition, sequence modelling, natural language processing and generative models. The course will cover the following topics:

  1. Introduction – course overview
  2. Neural networks – single-layer networks, linear discriminant functions, perceptron, XOR problem, multi-layer perceptron, perceptron learning criteria, perceptron learning algorithm, feedforward neural network architecture
  3. Training neural networks I – empirical risk minimisation, regularisation, gradient descent, stochastic gradient descent, mini-batch (stochastic) gradient descent, local and global extrema, overfitting, early stopping, convergence rates for smooth convex optimisation problems
  4. Training neural networks II – backpropagation algorithm, momentum acceleration methods, dropout, batch normalisation, modern optimisation solvers, AdaGrad, RMSProp and Adam
  5. Convolutional neural networks (CNNs) – convolutional operations, convolutional neural network, pooling, parameter sharing, basic convolutional neural network architectures
  6. CNN architectures – convolutional neural network architectures, LeNet, AlexNet, VGG networks, Inception networks and ResNets
  7. Sequence modelling: recurrent neural networks – recurrent neural network architectures, exploding/vanishing gradients, gated recurrent units, long short-term memory units, bi-directional recurrent neural networks
  8. Sequence modelling: transformers – encoder-decoder architecture, attention mechanism, transformer architecture, neural machine translation
  9. Natural language processing – word embedding methods, training methods, negative sampling, word2vec, GloVe, text classification
  10. Generative models – Boltzmann machines, restricted Boltzmann machines, deep belief networks, deep Boltzmann machines, generative adversarial networks (GANs)


Teaching

20 hours of lectures and 15 hours of classes in the LT.

This course will be delivered through a combination of classes, lectures and Q&A sessions totalling a minimum of 35 hours across Lent Term. This year, some of this teaching may be delivered through a combination of virtual classes and flipped lectures delivered as short online videos.

Formative coursework

Students will be expected to produce 8 problem sets in the LT.

Indicative reading

  • Ian Goodfellow, Yoshua Bengio and Aaron Courville, Deep Learning, MIT Press, 2016
  • Aston Zhang, Zachary C. Lipton, Mu Li and Alexander J. Smola, Dive into Deep Learning
  • TensorFlow – An end-to-end open source machine learning platform


Assessment

Project (80%), continuous assessment (10%) and continuous assessment (10%) in the LT.

Two of the weekly problem sets submitted by students will be assessed (20% in total). Each assessed problem set carries a weight of 10%, and submission will be required in LT Weeks 4 and 7. In addition, there will be a take-home exam (80%) in the form of a group project in which students will demonstrate their ability to develop and evaluate neural network algorithms for solving a prediction or classification task of their choice.

Course selection videos

Some departments have produced short videos to introduce their courses. Please refer to the course selection videos index page for further information.

Important information in response to COVID-19

Please note that during the 2021/22 academic year some variation to teaching and learning activities may be required to respond to changes in public health advice and/or to account for the differing needs of students in attendance on campus and those who might be studying online. For example, this may involve changes to the mode of teaching delivery and/or the format or weighting of assessments. Changes will only be made if required, and students will be notified about any changes to teaching or assessment plans at the earliest opportunity.

Key facts

Department: Statistics

Total students 2020/21: Unavailable

Average class size 2020/21: Unavailable

Controlled access 2020/21: No

Value: Half Unit

Guidelines for interpreting course guide information

Personal development skills

  • Self-management
  • Problem solving
  • Application of information skills
  • Communication
  • Application of numeracy skills
  • Commercial awareness
  • Specialist skills