ST201 Half Unit
Statistical Models and Data Analysis

This information is for the 2021/22 session.

Teacher responsible

Dr Yunxiao Chen, COL 5.16


Availability

This course is available on the BSc in Accounting and Finance. It is also available as an outside option to students on other programmes where regulations permit, and to General Course students.

It is also available to students who have studied statistics and mathematics to the level of MA107/ST107 Quantitative Methods or ST108 Statistical Methods for the Social Sciences, or equivalent.

This course cannot be taken with ST211 Applied Regression or DS202 Data Science for Social Scientists.


Pre-requisites

MA107/ST107 Quantitative Methods or ST108 Statistical Methods for the Social Sciences, or equivalent.

Course content

A second course in statistics with an emphasis on data analysis and applications in the social sciences. Students will gain hands-on experience using R, a programming language and software environment for data analysis and visualisation. The course covers five topics: (1) principles of statistical analysis, including data preparation, statistical models, regression and classification, inference, prediction, and the bias-variance tradeoff; (2) multiple linear regression, including its assumptions, inference, data transformations, diagnostics, and model selection; (3) the regression tree method; (4) logistic regression, including odds ratios, likelihood, classification, and the ROC curve; and (5) Bayes' rule for classification and linear discriminant analysis.


Teaching

This course will be delivered through a combination of classes, lectures and Q&A sessions, totalling a minimum of 36 hours across Lent Term and 4 hours of lectures in the Summer Term. This year, some of this teaching may be delivered through a combination of classes and flipped lectures delivered as short online videos. This course includes a reading week in Week 6 of Lent Term. Students will be given their assessed project in Week 6; it is due at the end of Lent Term.

Formative coursework

Moodle quizzes and a quantitative research project.

Indicative reading

James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning. New York, NY: Springer. 

Fox, J. (2015). Applied regression analysis and generalized linear models. Thousand Oaks, CA: Sage Publications.


Assessment

Exam (80%, duration: 2 hours) in the summer exam period.
Coursework (20%) in the Lent Term.

Course selection videos

Some departments have produced short videos to introduce their courses. Please refer to the course selection videos index page for further information.

Student performance results

(2018/19 - 2020/21 combined)

Classification   % of students
First            64.3
2:1              27.7
2:2              6.8
Third            1.3
Fail             0

Important information in response to COVID-19

Please note that during the 2021/22 academic year some variation to teaching and learning activities may be required to respond to changes in public health advice and/or to account for the differing needs of students in attendance on campus and those who might be studying online. For example, this may involve changes to the mode of teaching delivery and/or the format or weighting of assessments. Changes will only be made if required, and students will be notified about any changes to teaching or assessment plans at the earliest opportunity.

Key facts

Department: Statistics

Total students 2020/21: 49

Average class size 2020/21: 8

Capped 2020/21: No

Value: Half Unit

Personal development skills

  • Leadership
  • Self-management
  • Team working
  • Problem solving
  • Application of information skills
  • Application of numeracy skills
  • Specialist skills