MA429      Half Unit
Algorithmic Techniques for Data Mining

This information is for the 2017/18 session.

Teacher responsible

Professor Gregory Sorkin


Availability

This course is available on the MSc in Applicable Mathematics, MSc in Marketing and MSc in Operations Research & Analytics. This course is available as an outside option to students on other programmes where regulations permit.

The course will be capped at 36 students.


Students are not permitted to take this course alongside ST443, Machine Learning and Data Mining.

Students must have knowledge of Statistics and the programming language R to the level of ST447, Data Analysis and Statistical Methods.

Course content

Data Mining is an interdisciplinary field that has developed over the last three decades. Vast quantities of data are available today in all areas of business, science, and technology. The main goal of data mining is to extract previously unknown, useful information from such massive-scale data. The aim of the course is to equip students with theoretically founded and practically applicable knowledge of data mining. The theoretical foundations of the field come from mathematics, statistics, computer science and artificial intelligence.

The course introduces fundamental machine learning methods and algorithms for basic data analytics problems. These methods include algorithms for classification and regression problems, such as tree construction, support vector machines, nearest-neighbour methods and Bayesian networks. The course will also cover unsupervised learning methods such as association rule mining and clustering.

The methods are illustrated on practical problems arising from various fields. The course will use data mining packages in R.


Teaching

20 hours of lectures and 15 hours of seminars in the LT. 2 hours of lectures in the ST.

Formative coursework

There will be weekly homework assignments, some of which will be submitted for formative feedback, and some for summative assessment (10% of the course mark). A mock project will be given, as preparation for the summative group project.

Indicative reading

James, Witten, Hastie, Tibshirani, An Introduction to Statistical Learning: with Applications in R (2016)

Torgo, Data Mining with R: Learning with Case Studies (2010)

Hastie, Tibshirani, Friedman, The Elements of Statistical Learning: Data Mining, Inference and Prediction, Second Edition (2009)


Assessment

Exam (40%, duration: 2 hours) in the main exam period.
Project (50%) in the ST.
Coursework (10%) in the LT.

Key facts

Department: Mathematics

Total students 2016/17: Unavailable

Average class size 2016/17: Unavailable

Controlled access 2016/17: No

Value: Half Unit

Personal development skills

  • Self-management
  • Team working
  • Problem solving
  • Application of information skills
  • Communication
  • Application of numeracy skills
  • Commercial awareness
  • Specialist skills