ST447      Half Unit
Data Analysis and Statistical Methods

This information is for the 2021/22 session.

Teacher responsible

Prof Qiwei Yao


Availability

This course is compulsory on the MSc in Data Science, MSc in Health Data Science and MSc in Operations Research & Analytics. This course is available with permission as an outside option to students on other programmes where regulations permit.

This course is NOT available on the following programmes: MSc in Statistics, MSc in Statistics (Research), MSc in Statistics (Financial Statistics), MSc in Statistics (Financial Statistics) (Research), MSc in Statistics (Social Statistics), MSc in Statistics (Social Statistics) (Research) or LSE-Fudan Double Master's in Financial Statistics and Chinese Economy.

This course has a limited number of places (it is controlled access) and demand is typically high. This may mean that you’re not able to get a place on this course.


Pre-requisites

Basic knowledge of calculus and linear algebra, as well as a course in probability and statistics equivalent to ST102.

Students who have no previous experience in R are required to take an online pre-sessional R course from the Digital Skill Lab.

Course content

This course covers the most frequently used statistical methods for data analysis. In addition to standard inference methods such as parameter estimation, hypothesis testing, linear models and logistic regression, it also covers Monte Carlo methods, the bootstrap, the EM algorithm, permutation tests, regression based on local fitting, causal inference and false discovery rates. The software R is an integral part of the course, providing hands-on experience of data analysis.


Teaching

This course will be delivered through a combination of classes, lectures and Q&A sessions totalling a minimum of 30 hours across Michaelmas Term. This year, some of this teaching may be delivered through a combination of virtual classes and flipped lectures delivered as short online videos. This course includes a reading week in Week 6 of Michaelmas Term.

Formative coursework

Students will be expected to produce 5 exercises in the MT.

The bi-weekly exercises enable students to learn about the different methods of statistics and data analysis. They also give students the opportunity to implement statistical methods in R.

Indicative reading

All of Statistics, by Larry Wasserman, Springer.

Data Analysis and Graphics Using R: an Example-Based Approach, by John Maindonald and John Braun, Cambridge University Press.


Assessment

Exam (85%, duration: 2 hours) in the January exam period.
Project (15%) in the MT.

Course selection videos

Some departments have produced short videos to introduce their courses. Please refer to the course selection videos index page for further information.

Student performance results

(2017/18 - 2019/20 combined)

Classification % of students
Distinction 36.5
Merit 34.1
Pass 25.4
Fail 4

Important information in response to COVID-19

Please note that during the 2021/22 academic year some variation to teaching and learning activities may be required to respond to changes in public health advice and/or to account for the differing needs of students in attendance on campus and those who might be studying online. For example, this may involve changes to the mode of teaching delivery and/or the format or weighting of assessments. Changes will only be made if required, and students will be notified about any changes to teaching or assessment plans at the earliest opportunity.

Key facts

Department: Statistics

Total students 2020/21: 68

Average class size 2020/21: 23

Controlled access 2020/21: Yes

Value: Half Unit

Guidelines for interpreting course guide information

Personal development skills

  • Self-management
  • Problem solving
  • Application of information skills
  • Communication
  • Application of numeracy skills
  • Specialist skills