ST207      Half Unit
Databases

This information is for the 2025/26 session.

Course Convenor

Dr Marcos Barreto

Availability

This course is compulsory on the BSc in Data Science. This course is available on the BSc in Actuarial Science; BSc in Actuarial Science (with a Placement Year); BSc in Mathematics with Data Science; BSc in Mathematics, Statistics and Business; Erasmus Reciprocal Programme of Study; and Exchange Programme for Students from University of California, Berkeley. This course is freely available as an outside option to students on other programmes where regulations permit. It does not require permission. This course is freely available to General Course students. It does not require permission.

This course has a limited number of places (it is capped). Students who have this course as a compulsory course are guaranteed a place. Places for all other students are allocated on a first come first served basis.

Requisites

Additional requisites:

A computer programming course using Python, e.g. a pre-sessional course or Programming for Data Science (ST101).

Course content

The goal of this course is to cover concepts of database management systems, including relational and non-relational databases.

The topics covered will include: relational database design; Structured Query Language (SQL) for database implementation and manipulation; integrity constraints, triggers and database views; concurrency control and recovery mechanisms; multimedia and spatiotemporal databases; NoSQL databases (key-value stores, document databases and graph databases); vector databases and large language models (LLMs) applied to databases.

The course will demonstrate how various theoretical principles are implemented in practice in a database management system, such as MySQL or SQLite, and also in NoSQL, multimedia, spatiotemporal and vector database software.
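
As an illustration of how some of the listed principles (integrity constraints, triggers and views) can appear in practice, the following is a minimal sketch using SQLite through Python's standard sqlite3 module. It is not taken from the course materials: the table names (student, enrolment, audit_log), the view course_average, the trigger log_mark_change and the helper functions build_demo_db and is_rejected are all hypothetical names chosen for this example.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory SQLite database with constraints, a trigger and a view.

    All table, trigger and view names here are hypothetical examples.
    """
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE student (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );

        -- Integrity constraints: foreign key, composite primary key, CHECK.
        CREATE TABLE enrolment (
            student_id  INTEGER NOT NULL REFERENCES student(id),
            course_code TEXT    NOT NULL,
            mark        INTEGER CHECK (mark BETWEEN 0 AND 100),
            PRIMARY KEY (student_id, course_code)
        );

        CREATE TABLE audit_log (
            student_id  INTEGER,
            course_code TEXT,
            old_mark    INTEGER,
            new_mark    INTEGER
        );

        -- Trigger: record every change of a mark.
        CREATE TRIGGER log_mark_change
        AFTER UPDATE OF mark ON enrolment
        FOR EACH ROW
        BEGIN
            INSERT INTO audit_log (student_id, course_code, old_mark, new_mark)
            VALUES (OLD.student_id, OLD.course_code, OLD.mark, NEW.mark);
        END;

        -- View: average mark per course, computed when queried.
        CREATE VIEW course_average AS
        SELECT course_code, AVG(mark) AS avg_mark
        FROM enrolment
        GROUP BY course_code;
        """
    )
    return conn


def is_rejected(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> bool:
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False


if __name__ == "__main__":
    conn = build_demo_db()
    conn.executemany("INSERT INTO student (id, name) VALUES (?, ?)",
                     [(1, "Ada"), (2, "Alan")])
    conn.executemany("INSERT INTO enrolment VALUES (?, ?, ?)",
                     [(1, "ST207", 70), (2, "ST207", 60)])
    conn.execute("UPDATE enrolment SET mark = 80 WHERE student_id = 2")
    print(conn.execute("SELECT * FROM course_average").fetchall())
    print(conn.execute("SELECT * FROM audit_log").fetchall())
    # A mark outside 0..100 and an unknown student are both rejected.
    print(is_rejected(conn, "INSERT INTO enrolment VALUES (?, ?, ?)", (1, "ST101", 150)))
    print(is_rejected(conn, "INSERT INTO enrolment VALUES (?, ?, ?)", (3, "ST207", 50)))
```

In this sketch the database itself rejects invalid rows, so the application code does not need to repeat those checks. The view is an ordinary query stored under a name, and the trigger runs automatically on each update of a mark.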

Teaching

15 hours of seminars and 20 hours of lectures in the Autumn Term.

This course has a reading week in Week 6 of the Autumn Term.

Students are required to bring their own laptops and to install Python and other tools on them, under the guidance of the teaching staff.

Formative assessment

Students will be expected to produce 10 exercises in the Autumn Term (AT).

A set of exercises will be given in each teaching week.

Indicative reading

Essential Reading: 

  • R. Elmasri and S. B. Navathe. Fundamentals of Database Systems, 7th edition (Global Edition). Pearson, 2016.
  • G. Powell. Database Modeling Step-by-Step. CRC Press, Taylor & Francis, 2019.
  • A. Beaulieu. Learning SQL: generate, manipulate, and retrieve data, 3rd edition. O'Reilly, 2020.
  • E. Foster and S. Godbole. Database Systems: a pragmatic approach, 3rd edition. CRC Press, 2023.
  • E. Sciore. Database Design and Implementation, 2nd edition. Springer, 2020.
  • S. Bradshaw, E. Brazil, and K. Chodorow. MongoDB: the definitive guide, 3rd edition. O'Reilly, 2019.
  • I. Robinson, J. Webber, and E. Eifrem. Graph Databases, 2nd edition. O'Reilly, 2015.

Additional Reading:

  • P. Zhang. Practical Guide to Oracle SQL, T-SQL and MySQL. CRC Press, Taylor & Francis, 2018.
  • A. Meier and M. Kaufmann. SQL & NoSQL Databases: models, languages, consistency options and architectures for big data management. Springer Vieweg, 2019.
  • S. Bagui and R. Earp. Database Design using Entity-Relationship Diagrams, 3rd edition. CRC Press, 2023.
  • C. Garrard. Geoprocessing with Python. Manning Publications, 2016.
  • B. McClain. Python for Geospatial Data Analysis. O'Reilly, 2022.
  • B. Prabhakaran. Multimedia Database Management Systems. Springer, 2012.
  • J. Pan, J. Wang, and G. Li. Survey of Vector Database Management Systems. 2023. https://doi.org/10.48550/arXiv.2310.14021

Assessment

Project (60%) in January

This component of assessment includes an element of group work.

Problem sets (40%) in November

Students are required to hand in solutions to two individual sets of exercises, each accounting for 20% of the final assessment. The group project requires solving a practical task involving database modelling and creation, data loading, and querying. Students are encouraged to work with relational and non-relational database tools and with real and/or synthetic data, and to design a database application that approximates a real scenario.


Key facts

Department: Statistics

Course Study Period: Autumn Term

Unit value: Half unit

FHEQ Level: Level 5

CEFR Level: Null

Total students 2024/25: 81

Average class size 2024/25: 27

Capped 2024/25: No

Personal development skills

  • Self-management
  • Team working
  • Problem solving
  • Application of information skills
  • Communication
  • Application of numeracy skills
  • Commercial awareness
  • Specialist skills