## ST444 Half Unit

Statistical Computing

**This information is for the 2015/16 session.**

**Teacher responsible**

Dr Yining Chen

**Availability**

This course is available on the MSc in Statistics, MSc in Statistics (Financial Statistics), MSc in Statistics (Financial Statistics) (Research) and MSc in Statistics (Research). This course is available with permission as an outside option to students on other programmes where regulations permit.

**Course content**

An introduction to the use of numerical linear algebra and optimisation in statistical computation, followed by their applications in parametric statistical methods, including least squares, maximum likelihood, generalized linear modelling, LASSO, etc. We then present selected topics in computational methods in nonparametric statistics, including kernel density estimation and splines. If time permits, more advanced topics such as EM and simulated annealing will also be covered. Throughout the course, students will gain practical experience of implementing these computational methods in a programming language. Learning support will be provided for at least one programming language, such as C++ or Python, but the choice of language supported may vary between years, depending on judged benefits to students, whether in terms of pedagogy or resulting skills.

**Teaching**

20 hours of lectures and 10 hours of computer workshops in the MT.

i. Introduction to Tools in Numerical Analysis: linear algebra (Gaussian elimination, Cholesky decomposition, singular value decomposition, QR decomposition, matrix inversion and conditioning); numerical optimization (golden section search, steepest descent, Newton’s method, quasi-Newton methods, stochastic search); convex optimization (coordinate descent, Shor’s algorithm, etc.)

ii. Applications to Parametric Statistics: linear regression and least squares (using numerical linear algebra); generalized linear models (using numerical optimization); iteratively reweighted least squares and generalized linear models; Lasso (using convex optimization)

iii. Applications to Nonparametric Statistics and beyond: density estimation (kernel density estimation, fast Fourier transform, choice of bandwidth); regression function estimation (spline smoothing, computation of splines using numerical linear algebra, principal component analysis using numerical linear algebra)

iv. Other more advanced topics if time permits: EM algorithm; simulated annealing; bootstrapping; cross-validation
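To give a flavour of the numerical optimization tools listed in item i, the following is a minimal Python sketch of golden section search for a unimodal function of one variable. It is illustrative only and is not taken from the course materials; the function name `golden_section_min`, its default tolerance and the quadratic test function are assumptions made for this example.

```python
import math


def golden_section_min(f, a, b, tol=1e-8, max_iter=200):
    """Approximate the minimiser of a unimodal function f on [a, b].

    Hypothetical helper for illustration. Each iteration shrinks the
    bracketing interval by a factor of 1/phi (about 0.618) and reuses
    one of the two interior function values.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - inv_phi * (b - a)  # left interior point
    d = a + inv_phi * (b - a)  # right interior point
    fc, fd = f(c), f(d)
    for _ in range(max_iter):
        if b - a < tol:
            break
        if fc < fd:
            # The minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # The minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0


# Example: minimise (x - 2)^2 + 1 on [0, 5]; the true minimiser is x = 2.
x_min = golden_section_min(lambda x: (x - 2.0) ** 2 + 1.0, 0.0, 5.0)
```

The method needs only function evaluations, not derivatives, which is why it is often introduced before steepest descent and Newton’s method.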

Workshops on implementation using a programming language, such as C++ or Python, including statements, expressions, data types, control flow statements, functions, strings, lists, input, output, and using external modules (libraries).
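As an indication of what a workshop-style implementation might look like, here is a minimal Python sketch of iteratively reweighted least squares (item ii) for a logistic regression with an intercept and a single predictor. It uses only functions, lists and control flow from the standard library. The names `sigmoid` and `irls_logistic`, the iteration limits and the toy data are assumptions for this example, not course material.

```python
import math


def sigmoid(eta):
    """Inverse logit link."""
    return 1.0 / (1.0 + math.exp(-eta))


def irls_logistic(x, y, n_iter=25, tol=1e-10):
    """Fit logit P(y = 1) = b0 + b1 * x by IRLS (hypothetical helper).

    Each pass solves the 2x2 weighted least squares normal equations
    (X^T W X) b = X^T W z, where z is the working response.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        s00 = s01 = s11 = t0 = t1 = 0.0
        for xi, yi in zip(x, y):
            eta = b0 + b1 * xi
            mu = sigmoid(eta)
            w = mu * (1.0 - mu)        # IRLS weight
            z = eta + (yi - mu) / w    # working response
            s00 += w
            s01 += w * xi
            s11 += w * xi * xi
            t0 += w * z
            t1 += w * xi * z
        det = s00 * s11 - s01 * s01
        new_b0 = (s11 * t0 - s01 * t1) / det
        new_b1 = (s00 * t1 - s01 * t0) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1


# Toy data (made up for illustration; the classes overlap, so the
# maximum likelihood estimate exists).
x_demo = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y_demo = [0, 0, 1, 0, 1, 0, 1, 1]
b0_hat, b1_hat = irls_logistic(x_demo, y_demo)
```

At convergence the fitted probabilities satisfy the likelihood score equations, which gives a simple check on the implementation without relying on an external library.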

**Formative coursework**

Students will be expected to produce 9 problem sets in the MT.

Weekly exercises, usually involving programming.

**Indicative reading**

Computational Statistics by Givens and Hoeting

Statistical Computing in C++ and R by Eubank and Kupresanin

Introduction to C++ for Financial Engineers: An Object-Oriented Approach by Duffy

The Art of R Programming: A Tour of Statistical Software Design by Matloff

Think Python: How to Think Like a Computer Scientist by Downey

**Assessment**

Exam (70%, duration: 1 hour and 50 minutes, reading time: 10 minutes) in the main exam period.

Project (30%) in the MT.

**Key facts**

Department: Statistics

Total students 2014/15: Unavailable

Average class size 2014/15: Unavailable

Controlled access 2014/15: No

Value: Half Unit

**Personal development skills**

- Self-management
- Team working
- Problem solving
- Application of information skills
- Communication
- Application of numeracy skills