MG4E1      Half Unit
Algorithmic Techniques for Data Mining

This information is for the 2015/16 session.

Teacher responsible

Dr Laszlo Vegh NAB 3.05

Availability

This course is available on the MSc in Management Science (Decision Sciences) and MSc in Management Science (Operational Research). This course is available as an outside option to students on other programmes where regulations permit.

Pre-requisites

Students are not permitted to take this course alongside ST443 Machine Learning and Data Mining.

Students must have a basic knowledge of mathematics and statistics, in particular familiarity with hypothesis testing and with linear and logistic regression.

Course content

Data mining is an interdisciplinary field that has developed over the last three decades. Vast quantities of data are available today in marketing, in other areas of business including demand forecasting, and in various fields of science and technology. The main goal of data mining is to extract previously unknown, useful information from such massive-scale data. The aim of the course is to equip students with theoretically grounded and practically applicable knowledge of data mining. The theoretical foundations of the field come from statistics, computer science and artificial intelligence.

The course introduces fundamental methods and algorithms for basic data analytics problems. These methods include algorithms for tree construction and for rule generation, instance-based learning, regression methods, support vector machines, nearest-neighbour methods, Bayesian networks, website ranking, principal component analysis, association rule mining, and distance-based and density-based clustering.

The methods are illustrated on practical problems arising from various fields. The course also gives an introduction to the usage of the data mining software package Weka.

Teaching

20 hours of lectures and 13 hours and 30 minutes of seminars in the LT. 1 hour and 30 minutes of seminars in the ST.

A reading week will take place in W6. There will be no teaching during this week.

Formative coursework

Students will be expected to produce 1 project in the LT and 1 problem set in the ST.

A mock exam and a mock project will be given. The mock project will be similar to the group project, but with the dataset provided.

Indicative reading

Main textbook:

I. H. Witten, E. Frank, M. A. Hall: Data Mining - Practical Machine Learning Tools and Techniques.

Further reading:

T. Hastie, R. Tibshirani, J. Friedman: The Elements of Statistical Learning - Data Mining, Inference and Prediction;

P. Flach: Machine Learning: The Art and Science of Algorithms that Make Sense of Data, Cambridge University Press, 2012.

Assessment

Exam (45%, duration: 2 hours) in the main exam period.
Project (45%) in the ST.
Coursework (10%) in the LT.

Key facts

Department: Management

Total students 2014/15: Unavailable

Average class size 2014/15: Unavailable

Controlled access 2014/15: No

Value: Half Unit


Personal development skills

  • Self-management
  • Team working
  • Problem solving
  • Application of information skills
  • Communication
  • Application of numeracy skills
  • Commercial awareness
  • Specialist skills