ST313 Half Unit
Ethics for Data Science

This information is for the 2025/26 session.

Course Convenor

Christine Yuen

Kaifang Zhou

Availability

This course is available on the BSc in Actuarial Science, BSc in Actuarial Science (with a Placement Year), BSc in Data Science, BSc in Mathematics and Economics, BSc in Mathematics with Data Science, BSc in Mathematics with Economics, BSc in Mathematics, Statistics and Business, Erasmus Reciprocal Programme of Study and Exchange Programme for Students from University of California, Berkeley. This course is available with permission as an outside option to students on other programmes where regulations permit. This course is available with permission to General Course students.

Requisites

Pre-requisites:

Before taking this course, students must have completed ST102, MA100, and one of MA222, ST206 or ST202.

Course content

This course covers a selection of topics central to the ethical practice of data science. Students will learn key concepts and methods to analyze a variety of case studies, ranging from the historical and philosophical background of data technologies and ethics to the frontiers of research in machine learning, artificial intelligence, and socio-technical systems. These concepts will include basic philosophical and legal ideas related to data ethics, frameworks for ethical practice developed by professional societies, and formal statistical definitions and quantitative methods for objectives such as fairness and privacy. The course emphasizes the use of causal reasoning to evaluate data-driven systems and policies. Topics may include:

  • Replication crisis, unfair algorithms, basics of normative ethics and causality
  • Historical examples, professional ethical guidelines
  • Transparency, reproducibility, open science
  • Discrimination, statistical fairness, impossibility results
  • Causal reasoning for fairness, pathway analysis, intersectionality
  • Interventions, policy optimization, distributive justice
  • Data provenance, privacy, differential privacy
  • Strategic behavior, surveillance, democratic data
  • Automation and AI, responsibility, complicity

Causal statistical models will be used throughout as a formal framework to understand and stress-test these ideas.

Teaching

10 hours of lectures and 20 hours of classes in the Autumn Term.

Indicative reading

Lecture notes will be provided. These will be supplemented with a variety of short readings, some of which will be taken from the following background references:

  • https://www.bitbybitbook.com/en/1st-ed/ethics/
  • https://fairmlbook.org/
  • https://data-feminism.mitpress.mit.edu/
  • https://aiethics.princeton.edu/case-studies/
  • https://www.acm.org/code-of-ethics
  • https://rss.org.uk/RSS/media/News-and-publications/Publications/Reports%20and%20guides/A-Guide-for-Ethical-Data-Science-Final-Oct-2019.pdf
  • https://www.amstat.org/ASA/Your-Career/Ethical-Guidelines-for-Statistical-Practice.aspx
  • https://hastie.su.domains/CASI/
  • https://www.statlearning.com/

Assessment

Quiz (30%) in Autumn Term Week 11

Course participation (20%)

Project (50%)

Group work consists of a project proposal in the Autumn Term (AT); the project itself is due in the Winter Term (WT).


Key facts

Department: Statistics

Course Study Period: Autumn Term

Unit value: Half unit

FHEQ Level: Level 6

CEFR Level: Null

Total students 2024/25: 51

Average class size 2024/25: 26

Capped 2024/25: No

Personal development skills

  • Self-management
  • Team working
  • Problem solving
  • Application of information skills
  • Communication
  • Application of numeracy skills
  • Commercial awareness
  • Specialist skills