ST201 Half Unit
Statistical Models and Data Analysis

This information is for the 2022/23 session.

Teacher responsible

Dr Yunxiao Chen COL 5.16


Availability

This course is available on the BSc in Accounting and Finance. This course is available as an outside option to students on other programmes where regulations permit and to General Course students.

Also available to students who have studied statistics and mathematics to the level of MA107/ST107 Quantitative Methods or equivalent.

This course cannot be taken with ST211 Applied Regression or DS202 Data Science for Social Scientists. 

This course is not controlled access. If you request a place and meet the criteria you are likely to be given a place.


Pre-requisites

Quantitative Methods (MA107/ST107) or equivalent.

Previous programming experience is not required, but students who have no previous experience in R must complete an online pre-sessional R course from the Digital Skills Lab before the start of the course.

Course content

A second course in statistics with an emphasis on data analysis with applications in the social sciences. Students will gain hands-on experience using R, a programming language and software environment for data analysis and visualisation. The course covers five topics: (1) principles of statistical analysis, including data preparation, statistical models, regression and classification, inference, prediction, and the bias-variance tradeoff; (2) multiple linear regression, including its assumptions, inference, data transformations, diagnostics, and model selection; (3) the regression tree method; (4) logistic regression, including odds ratios, likelihood, classification, and the ROC curve; and (5) Bayes' rule for classification and linear discriminant analysis.


Teaching

This course will be delivered through a combination of classes, lectures and Q&A sessions, totalling a minimum of 36 hours across Lent Term and 2 hours of lectures in the Summer Term. Students will be given their assessed project in Week 9, and it is due in Week 1 of the Summer Term.

Formative coursework

Exercise questions in computer workshops and a quantitative research project.

Indicative reading

James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning. New York, NY: Springer. 

Fox, J. (2015). Applied regression analysis and generalized linear models. Thousand Oaks, CA: Sage Publications.


Assessment

Exam (80%, duration: 2 hours) in the summer exam period.
Coursework (20%) in the LT.

Student performance results

(2019/20 - 2021/22 combined)

Classification   % of students
First            63.9
2:1              25.9
2:2              7.5
Third            2
Fail             0.7

Key facts

Department: Statistics

Total students 2021/22: 28

Average class size 2021/22: 9

Capped 2021/22: No

Value: Half Unit

Personal development skills

  • Leadership
  • Self-management
  • Team working
  • Problem solving
  • Application of information skills
  • Application of numeracy skills
  • Specialist skills