ST115      Half Unit
Managing and Visualising Data

This information is for the 2025/26 session.

Course Convenor

Christine Yuen

Availability

This course is compulsory on the BSc in Data Science. This course is available on the BSc in Accounting and Finance, BSc in Actuarial Science, BSc in Actuarial Science (with a Placement Year), BSc in Finance, Erasmus Reciprocal Programme of Study and Exchange Programme for Students from University of California, Berkeley. This course is available with permission as an outside option to students on other programmes where regulations permit. This course is available with permission to General Course students.

This course has a limited number of places (it is capped). Students who have this course as a compulsory course are guaranteed a place. Places for all other students are allocated on a first come first served basis.

Requisites

Additional requisites:

Students who have no previous experience in Python are required to take an online pre-sessional Python course from the Digital Skills Lab.

Students should have taken, or be taking concurrently, a first course in statistics such as Elementary Statistical Theory (ST102), Elementary Statistical Theory I (ST109) or Quantitative Methods (Statistics) (ST107).

Course content

The course focuses on the fundamental principles of effective manipulation and visualisation of data. This will cover the key steps of a data analytics pipeline, starting with the formulation of a data science problem, going through collection, manipulation and visualisation of data, and, finally, creating actionable insights. 
The topics covered include gathering data using APIs and web scraping; methods for data cleaning and transformation; manipulation of data using tabular data structures; relational database models and structured query languages (e.g. SQL); processing of various human-readable data formats (e.g. JSON and XML); and data visualisation methods for exploratory and explanatory data analysis, using statistical plots such as histograms and boxplots as well as plots for time series and multivariate data.
The course will cover basic concepts and principles and will enable students to gain hands-on experience in using Python programming for the manipulation and visualisation of data. This will include the use of standard modules and libraries such as NumPy, Pandas, Matplotlib and Seaborn, and programming environments such as Jupyter Notebook.
The course will use examples drawn from a wide range of applications such as those that arise in online services, social media, social networks, finance, and machine learning. The principles and methods learned will enable students to effectively derive insights from data and communicate results to end users.

Teaching

15 hours of seminars and 20 hours of lectures in the Winter Term.

This course will be delivered through a combination of classes and lectures totalling a minimum of 35 hours in Winter Term.

Students are required to install Python and any other necessary software on their own laptops and to bring these laptops to the lectures and classes.

Formative assessment

Students will be expected to complete 4 exercises in Winter Term.

Indicative reading

Essential Reading:

  1. W. McKinney, Python for Data Analysis, 3rd Edition, O’Reilly, 2022
  2. A. C. Müller and S. Guido, Introduction to Machine Learning with Python, O’Reilly, 2016
  3. R. Ramakrishnan and J. Gehrke, Database Management Systems, 3rd Edition, McGraw Hill, 2002
  4. A. Cairo, The Truthful Art: Data, Charts, and Maps for Communication, New Riders, 2016

Additional Reading: 

  1. NumPy, https://numpy.org
  2. Pandas, https://pandas.pydata.org
  3. Matplotlib, https://matplotlib.org
  4. Seaborn: statistical data visualization, https://seaborn.pydata.org

Assessment

Exam (40%), duration: 90 minutes, reading time: 10 minutes, in the Spring exam period

Project (50%) in Spring Term Week 1

Continuous assessment (10%)


Key facts

Department: Statistics

Course Study Period: Winter Term

Unit value: Half unit

FHEQ Level: Level 4

CEFR Level: Null

Total students 2024/25: 85

Average class size 2024/25: 28

Capped 2024/25: No

Personal development skills

  • Self-management
  • Team working
  • Problem solving
  • Application of information skills
  • Communication
  • Application of numeracy skills
  • Specialist skills