ST313      Half Unit
Ethics for Data Science

This information is for the 2022/23 session.

Teacher responsible

Dr Joshua Loftus


Availability

This course is available on the BSc in Actuarial Science, BSc in Data Science, BSc in Mathematics and Economics, BSc in Mathematics with Data Science, BSc in Mathematics with Economics, BSc in Mathematics, Statistics and Business and BSc in Politics and Data Science. This course is available with permission as an outside option to students on other programmes where regulations permit and to General Course students.


Pre-requisites

Students must have completed Elementary Statistical Theory (ST102) or equivalent, Mathematical Methods (MA100) or equivalent, and at least one of MA212, EC220, EC221, ST206, ST202, or equivalent.

Familiarity with basic computer programming in R or Python is expected. Students who have no previous experience in R are strongly encouraged to take an online pre-sessional R course from the Digital Skill Lab.

Course content

This course covers a selection of topics central to the ethical practice of data science. Students will learn key concepts and methods to analyze a variety of case studies, from the historical and philosophical background of data technologies and ethics to the frontiers of research in machine learning, artificial intelligence, and socio-technical systems. These concepts will include some basic philosophical and legal ideas related to data ethics, frameworks for ethical practice developed by professional societies, formal statistical definitions and quantitative methods for objectives such as fairness and privacy, and an emphasis on the use of causal reasoning to evaluate data-driven systems and policies. Topics may include:

  • Replication crisis, unfair algorithms, basics of normative ethics and causality
  • Historical examples, professional ethical guidelines
  • Transparency, reproducibility, open science
  • Discrimination, statistical fairness, impossibility results
  • Causal reasoning for fairness, pathway analysis, intersectionality
  • Interventions, policy optimization, distributive justice
  • Data provenance, privacy, differential privacy
  • Strategic behavior, surveillance, democratic data
  • Automation and AI, responsibility, complicity

Causal statistical models will be used as a formal framework throughout to understand and stress test these ideas.


Teaching

20 hours of lectures and 10 hours of classes in the MT.

Formative coursework

Students will be expected to produce 4 problem sets in the MT.

Note that two of the problem sets will be graded and summative.

Indicative reading

Lecture notes will be provided. These will be supplemented with a variety of short readings, some of which will be taken from background references.



Assessment

Group project (50%) in the LT Week 2.
Group presentation (20%) in the MT Week 9.
Problem sets (15%) in the MT Week 7.
Problem sets (15%) in the MT Week 10.

Two problem sets during the MT will be summative, each worth 15% of the final mark. The group work consists of a presentation during the MT describing a project proposal; the project itself is then due in the LT.

Key facts

Department: Statistics

Total students 2021/22: Unavailable

Average class size 2021/22: Unavailable

Capped 2021/22: No

Value: Half Unit

Personal development skills

  • Self-management
  • Team working
  • Problem solving
  • Application of information skills
  • Communication
  • Application of numeracy skills
  • Commercial awareness
  • Specialist skills