Introduction to Data Science and Machine Learning

  • Summer schools
  • Department of Methodology
  • Application code SS-ME314
  • Starting 2020
  • Short course: Open
  • Location: Houghton Street, London

UPDATE: Due to the global COVID-19 pandemic, we will no longer be offering this course in summer 2020. Please check our latest news for updates on this situation.

You can still register your interest in this course for 2021 using the ‘Sign up’ button to the right.  

Data Science and Big Data Analytics are exciting new areas that combine scientific inquiry, statistical knowledge, substantive expertise, and computer programming. One of the main challenges for businesses and policy makers when using big data is to find people with the appropriate skills. Good data science requires experts who combine substantive knowledge with data analytical skills, which makes it a prime area for social scientists with an interest in quantitative methods.

This course integrates prior training in quantitative methods (statistics) and coding with substantive expertise and introduces the fundamental concepts and techniques of Data Science and Big Data Analytics.

Typical students will be advanced undergraduate and postgraduate students from any field that requires the fundamentals of data science or involves working with large datasets and databases. Practitioners from industry, government, or research organisations with some basic training in quantitative analysis or computer programming are also welcome. Because this course surveys diverse techniques and methods, it makes an ideal foundation for more advanced or more specific training. Our applications are drawn from social, political, economic, legal, business, and marketing fields.

Session: Three
Dates: 3 August – 21 August 2020
Lecturers: Professor Kenneth Benoit and Dr Jack Blumenau

Programme details

Key facts

Level: 300 level. Read more information on levels in our FAQs

Fees:  Please see Fees and payments

Lectures: 36 hours 

Classes: 18 hours

Assessment*: Two take-home assessments

Typical credit**: 3-4 credits (US); 7.5 ECTS points (EU)

*Assessment is optional

**You will need to check with your home institution

For more information on exams and credit, read Teaching and assessment


Prerequisites

Students should already be familiar with quantitative methods at an introductory level, up to linear regression analysis. Familiarity with computer programming or database structures is a benefit, but not formally required.

Programme structure

The course will cover the following topics:

  • an overview of data science and the challenge of working with big data using statistical methods
  • how to integrate the insights from data analytics into knowledge generation and decision-making
  • how to acquire data, both structured and unstructured, and to process it, store it, and convert it into a format suitable for analysis
  • approaches to normalising data, using a database manager (SQLite), and working with SQL database queries
  • the basics of statistical inference, including probability and probability distributions, modelling, and experimental design
  • an overview of classification methods and related methods for assessing model fit and cross-validating predictive models
  • supervised learning approaches, including linear and logistic regression, decision trees, and naïve Bayes
  • unsupervised learning approaches, including clustering, association rules, and principal components analysis
  • quantitative methods of text analysis, including mining social media and other online resources
  • social network analysis, covering the basics of social graph data and analysing social networks
  • data visualisation through a variety of graphs.
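
To give a flavour of the database topic listed above, the following is a minimal sketch of a normalised two-table design queried with SQL through SQLite. It is an illustration only and is not taken from the course materials: the table names, column names, and values are invented, and Python's built-in sqlite3 module is an assumption made for the example (the course readings work in R).

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Normalised design (hypothetical schema): each country is stored once and
# referenced by id, rather than repeating its name on every survey response.
cur.executescript("""
CREATE TABLE country (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE response (
    id         INTEGER PRIMARY KEY,
    country_id INTEGER NOT NULL REFERENCES country(id),
    score      REAL NOT NULL
);
""")

# Invented example rows.
cur.executemany("INSERT INTO country (id, name) VALUES (?, ?)",
                [(1, "France"), (2, "Japan")])
cur.executemany("INSERT INTO response (country_id, score) VALUES (?, ?)",
                [(1, 4.0), (1, 2.0), (2, 5.0)])
conn.commit()

# Join the tables back together and summarise scores by country.
rows = cur.execute("""
    SELECT c.name, COUNT(*) AS n, AVG(r.score) AS mean_score
    FROM response AS r
    JOIN country AS c ON c.id = r.country_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

for name, n, mean_score in rows:
    print(name, n, mean_score)
```

Storing the country name once and joining on its id is what keeps the data normalised: a correction to a country name is made in one row rather than in every response.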

Course outcomes

This course aims to provide an introduction to the data science approach to the quantitative analysis of data using the methods of statistical learning, an approach blending classical statistical methods with recent advances in computational and machine learning. We will cover the main analytical methods from this field with hands-on applications using example datasets, so that students gain experience with and confidence in using the methods we cover. 
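
As a rough illustration of the hands-on style described above, the sketch below estimates the out-of-sample error of a one-predictor linear regression using k-fold cross-validation. It is not drawn from the course materials: the function names fit_line and k_fold_mse, the simulated data, and the choice of Python are all assumptions made for the example.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def k_fold_mse(xs, ys, k=5, seed=0):
    """Mean squared prediction error, averaged over k held-out folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal-sized folds
    fold_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Fit on the training folds only, then predict the held-out fold.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errs = [(ys[i] - (a + b * xs[i])) ** 2 for i in held_out]
        fold_errors.append(mean(errs))
    return mean(fold_errors)


# Simulated data (an assumption for illustration): y = 2 + 3x + noise.
rng = random.Random(42)
xs = [i / 10 for i in range(50)]
ys = [2 + 3 * x + rng.gauss(0, 1) for x in xs]

cv_mse = k_fold_mse(xs, ys, k=5)
print(round(cv_mse, 3))
```

Because each observation is predicted by a model that never saw it, the averaged error is a fairer guide to predictive performance than the error measured on the training data itself.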


LSE’s Department of Methodology is an internationally recognised centre of excellence in research and teaching in the area of social science research methodology. The Department coordinates and provides a focus for methodological activities at the School, in particular in the areas of graduate student (and staff) training and of methodological research.

Through its graduate programmes, and the Department's provision of courses for research students from all parts of the School, the aim is to make the School the pre-eminent centre for methodological training in the social sciences.

On this three-week intensive programme, you will engage with and learn from full-time lecturers from LSE’s methodology faculty.

Reading materials

  • James, G., Witten, D., Hastie, T. and Tibshirani, R. (2013). An Introduction to Statistical Learning: With Applications in R. Springer.
  • Zumel, N. and Mount, J. (2014). Practical Data Science with R. Manning Publications.

The following are supplemental texts which you may also find useful:

  • Lantz, B. (2013). Machine Learning with R. Packt Publishing.
  • Conway, D. and White, J. (2012). Machine Learning for Hackers. O'Reilly Media.
  • Leskovec, J., Rajaraman, A. and Ullman, J. (2011). Mining of Massive Datasets. Cambridge University Press.
  • Zafarani, R., Abbasi, M. A. and Liu, H. (2014). Social Media Mining: An Introduction. Cambridge University Press.

*A more detailed reading list will be supplied prior to the start of the programme

**Course content, faculty and dates may be subject to change without prior notice

Related Programmes

Machine Learning in Practice

Code(s) SS-ME315

Computational Methods in Financial Mathematics

Code(s) SS-ME200
