ST310 Half Unit
Machine Learning
This information is for the 2025/26 session.
Course Convenor
Prof Zoltan Szabo
Availability
This course is compulsory on the BSc in Data Science. This course is available on the BSc in Actuarial Science, BSc in Actuarial Science (with a Placement Year), BSc in Finance, BSc in Mathematics with Data Science, BSc in Mathematics with Economics, BSc in Mathematics, Statistics and Business, Erasmus Reciprocal Programme of Study and Exchange Programme for Students from University of California, Berkeley. This course is freely available as an outside option to students on other programmes where regulations permit. It does not require permission. This course is available with permission to General Course students.
This course is capped. Places will be assigned on a first-come, first-served basis.
Requisites
Mutually exclusive courses:
This course cannot be taken with ST309 at any time on the same degree programme.
Pre-requisites:
Before taking this course, students must have completed: ST102 or (EC1C1 and ST109)
Additional requisites:
Students must have satisfied the pre-requisite requirements above, as well as a second-year course covering regression analysis.
Previous programming experience is not required but students who have no previous experience in R must complete an online pre-sessional R course from the Digital Skills Lab before the start of the course (https://moodle.lse.ac.uk/course/view.php?id=8714).
Course content
The primary focus of this course is on core machine learning techniques in the context of high-dimensional or large datasets (i.e. big data). The first part of the course covers elementary and important statistical methods including nearest neighbours, linear regression, logistic regression, regularisation, cross-validation, and variable selection. The second part of the course deals with more advanced machine learning methods including regression and classification trees, random forests, bagging, boosting, and deep neural networks. The course will also introduce causal inference, motivated by the analogy between double machine learning and two-stage least squares. All topics will be delivered using a combination of illustrative real data examples and simulations. Students will also gain hands-on experience using R or Python (programming languages and software environments for data analysis, computing and visualisation).
Teaching
15 hours of seminars and 15 hours of lectures in the Autumn Term.
This course has a reading week in Week 6 of Autumn Term.
This course will be delivered through a combination of classes, lectures and Q&A sessions totalling a minimum of 30 hours in Autumn Term.
Students are required to install R/RStudio on their own laptops.
Students who do not have a laptop of their own will be able to use the personal computers available in seminar rooms.
Formative assessment
Students will be expected to produce three problem sets in the Autumn Term. The first problem set is formative and allows students to practise for the second and third problem sets.
Indicative reading
- James, G., Witten, D., Hastie, T. and Tibshirani, R. An Introduction to Statistical Learning with Applications in R. Springer, 2021.
- Hardt, M. and Recht, B. Patterns, Predictions, and Actions: Foundations of Machine Learning. Princeton University Press, 2022.
- Chernozhukov, V., Hansen, C., Kallus, N., Spindler, M., and Syrgkanis, V. Applied Causal Inference Powered by ML and AI. Online, 2024.
- Wickham, H., Çetinkaya-Rundel, M., and Grolemund, G. R for Data Science. O'Reilly, 2023.
Assessment
Project (60%)
Continuous assessment (20%)
Continuous assessment (20%)
Students are required to submit a group project that applies machine learning methods covered in this course to real data using R (this accounts for 60% of the final assessment). In addition to real data examples, the course introduces theoretical and methodological concepts in machine learning. These components will be tested by coursework in the form of problem sets (which account for 40% of the final assessment).
Key facts
Department: Statistics
Course Study Period: Autumn Term
Unit value: Half unit
FHEQ Level: Level 6
CEFR Level: Null
Total students 2024/25: 127
Average class size 2024/25: 21
Capped 2024/25: No
Course selection videos
Some departments have produced short videos to introduce their courses. Please refer to the course selection videos index page for further information.
Personal development skills
- Self-management
- Team working
- Problem solving
- Application of information skills
- Communication
- Application of numeracy skills
- Specialist skills