Statistics Seminar Series

Seminars take place in the Leverhulme Library (6.15), located on the sixth floor of Columbia House.

Statistics is about collecting data, analysing it, and using it to answer questions about the world, whether in economics, finance or public opinion. The applications are numerous.

The Department of Statistics hosts Statistics seminars throughout the year. Seminars usually take place on Friday afternoons at 2pm in the Leverhulme Library (unless stated otherwise) with refreshments preceding at 1.30pm.

All are very welcome to attend! Please contact Kayleigh Brewer for further information about any of the seminars below.

Seminars in Lent Term 2020

Friday 13th March, 2pm - Dr Qingyuan Zhao, University of Cambridge

More details will be confirmed nearer the date.


Friday 28th February, 2pm - Dr Judith Rousseau, University of Oxford 

More details will be confirmed nearer the date.


Friday 14th February, 2pm - Dr Konstantinos Fokianos, Lancaster University

More details will be confirmed nearer the date. 


Friday 31st January, 2pm - Piotr Zwiernik, Universitat Pompeu Fabra, Barcelona

More details will be confirmed nearer the date.




Seminars in Michaelmas Term 2019

Friday 13th December, 2pm - Jinchi Lv, University of Southern California 


Friday 6th December, 2pm - Yoav Zemel, University of Cambridge


Friday 22nd November, 2pm - Paolo Zaffaroni, Imperial College London 

More details will be confirmed nearer the time. 



Seminars in Summer Term 2019

Wednesday 5th June 2019, 2pm - Moulinath Banerjee, University of Michigan

Title:  Recent Developments in the Study of Single-Index Type Models


Single-index type models are popular in statistics, biostatistics and economics as they alleviate the curse of dimensionality to a considerable extent while allowing for broad classes of models through the dependence on an unknown link function. Various classes of single index models are known: in regular models under a fixed dimension setting [i.e. fixed number of regression parameters], the regression parameter is $\sqrt{n}$ estimable; under current status type censoring of the response variable in a linear regression model -- which leads to the binary choice model -- one gets a generalized single-index model where the rate of estimation is at most $n^{1/3}$, a problem well-studied in the econometrics literature (by Manski and subsequent authors). Single index structures with a discontinuous link function arise in models involving change-planes in multidimensional space -- hyperplanes that separate two (or more) response or survival regimes -- and are relevant to applications in personalized medicine and dynamic treatment regimes. Here, rates of estimation can easily exceed $\sqrt{n}$ in the finite dimensional case. 
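As an illustrative sketch (mine, not the speaker's code), a single-index model takes the form E[Y|X] = g(β'X) for an unknown monotone link g. One classical fact that makes √n-estimation plausible in the regular fixed-dimension case: for a Gaussian design, the ordinary least squares coefficient is proportional to the index direction β (Li and Duan, 1989), even though g is unknown. The simulation below, with an assumed tanh link, illustrates this:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5000, 5
beta = np.array([3.0, 1.0, 2.0, 0.0, 0.0])
beta /= np.linalg.norm(beta)

X = rng.standard_normal((n, d))
g = np.tanh                                   # the "unknown" monotone link
y = g(X @ beta) + 0.1 * rng.standard_normal(n)

# For a Gaussian design the OLS slope is proportional to beta,
# so its normalized direction estimates the index direction.
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
direction = b_ols / np.linalg.norm(b_ols)
cos = abs(float(direction @ beta))            # alignment with the true index
```

With n = 5000 the estimated direction is nearly collinear with β; the hard part, which the talk addresses, is what happens when d grows with n or the link is discontinuous.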

I will talk about some of my recent work in the above class of models in growing and high-dimensional settings, focusing on how growing dimensions introduce significantly new challenges at both theoretical and computational levels, and present some recent results on convergence, minimax-optimal rates, and inference, as well as future challenges.

The talk is based on joint work with Ya'acov Ritov, Hamid Eftekhari, Zhiyuan Lu and Debarghya Mukherjee.


Seminars in Lent Term 2019

Friday 8th March 2019, 2pm - Roberto Casarin, Ca' Foscari University of Venice

Title: Bayesian Dynamic Tensor Regression


Multidimensional arrays (i.e. tensors) of data are becoming increasingly available and call for suitable econometric tools. We propose a new dynamic linear regression model for tensor-valued response variables and covariates that encompasses some well-known multivariate models, such as SUR, VAR, VECM, panel VAR and matrix regression models, as special cases. To deal with the over-parametrization and over-fitting issues caused by the curse of dimensionality, we exploit a suitable parametrization based on the parallel factor (PARAFAC) decomposition, which achieves parameter parsimony and allows sparsity effects to be incorporated. Our contribution is twofold: first, we extend multivariate econometric models to account for both tensor-valued responses and covariates; second, we show the effectiveness of the proposed methodology in defining an autoregressive process for time-varying real economic networks. Inference is carried out in the Bayesian framework via Markov chain Monte Carlo (MCMC). We demonstrate the efficiency of the MCMC procedure on simulated datasets with different sizes of the response and independent variables, showing that it remains computationally efficient even when the parameter space is high-dimensional. Finally, we apply the model to study the temporal evolution of real economic networks.
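To see why the PARAFAC decomposition delivers the parsimony the abstract refers to, consider a rank-R decomposition of a three-way coefficient tensor, B = Σ_r a_r ∘ b_r ∘ c_r. A minimal sketch (dimensions and rank are my own illustrative choices):

```python
import numpy as np

# Rank-R PARAFAC: the tensor is a sum of R outer products of vectors.
rng = np.random.default_rng(1)
I, J, K, R = 10, 10, 10, 2
A = rng.standard_normal((I, R))   # factor matrix, columns a_r
B = rng.standard_normal((J, R))   # factor matrix, columns b_r
C = rng.standard_normal((K, R))   # factor matrix, columns c_r

# Reconstruct the full tensor: T[i,j,k] = sum_r A[i,r] * B[j,r] * C[k,r]
tensor = np.einsum('ir,jr,kr->ijk', A, B, C)

full_params = I * J * K           # 1000 free entries in the raw tensor
cp_params = R * (I + J + K)       # only 60 parameters under rank-2 PARAFAC
```

The 1000-entry tensor is described by 60 numbers, which is the sense in which the decomposition tames the curse of dimensionality in tensor regression.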

Friday 8th February 2019, 2pm - Heather Battey, Imperial College London

Title: On sparsity scales and covariance matrix transformations


In many statistical contexts, for example in linear regression and discriminant analysis, a covariance or concentration matrix is a nuisance parameter, distinct from interest parameters, which should always have a direct subject-matter interpretation. It seems sensible to model explicitly only those aspects of direct concern and retain a level of agnosticism over other aspects. This has important implications in high-dimensional estimation problems, in which an assumption of sparsity is critical.

I will introduce continua of sparsity scales for covariance matrices, leading to sparsity on the original, inverse and matrix logarithmic scales as special cases. After discussing some special features of the logarithmic scale, I will present a theory of estimation appropriate for any given or estimated scale when the matrix dimension is larger than the sample size.
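As a toy illustration of the logarithmic scale mentioned above (my sketch, not material from the talk), the matrix logarithm of a positive definite covariance matrix can be computed from its eigendecomposition; sparsity can then be imposed on this transformed matrix rather than on the covariance or concentration matrix itself:

```python
import numpy as np

def logm_spd(S):
    """Matrix logarithm of a symmetric positive definite matrix,
    computed by taking logs of the eigenvalues."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.log(w)) @ V.T

def expm_sym(L):
    """Matrix exponential of a symmetric matrix (inverse transformation)."""
    w, V = np.linalg.eigh(L)
    return V @ np.diag(np.exp(w)) @ V.T

S = np.array([[2.0, 0.5],
              [0.5, 1.0]])        # a small covariance matrix
L = logm_spd(S)                   # the "logarithmic scale"
S_back = expm_sym(L)              # exponentiating recovers S exactly
```

A convenient feature of this scale is that any symmetric L maps back to a valid (positive definite) covariance matrix, so no positivity constraint is needed after thresholding or shrinkage on the log scale.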

A corollary of the work is that a constrained optimization-based approach is unnecessary for estimating a sparse concentration matrix. Some open theoretical problems over misspecified sparsity structure are highlighted, with insights from simulations.

Friday 25th January 2019, 2pm - Christophe Ley, Ghent University 

Title: Skew and multi-tailed multivariate distributions - a need in finance


It is a well-known fact that financial data exhibit heavy tails and, often, skewness. In response to the shortcomings of the multivariate normal distribution for modelling financial data, the class of elliptically symmetric distributions (including the multivariate t-distribution) has been widely adopted, as it allows for heavier-than-normal tails.

However, in situations where negative returns are much more extreme than positive returns, the assumption of elliptical symmetry is too restrictive. Moreover, a further restriction of elliptical distributions lies in the fact that they are governed by a scalar radial function, which implies that the tails are governed by a one-dimensional tail-weight parameter like in the multivariate t distribution.

In this talk I will first present new efficient tests for elliptical symmetry against skew-ellipticity based on the Le Cam theory of asymptotic experiments. With these new tests, I shall analyze financial data consisting of daily returns data of several major worldwide indexes. In the second part of my talk, I will present various models of flexible multivariate distributions from the literature and compare them in the light of the needs of financial data. This comparison is based both on properties of the distributions and a simulation study.

This is joint work with Sladjana Babic, Marc Hallin and David Veredas.


Seminars in Michaelmas Term 2018  

Friday 7th December 2018, 2pm - Alexandros Beskos, UCL 

Title: Geometric MCMC for Bayesian Inverse Problems


Bayesian Inverse Problems often involve sampling posterior distributions on infinite-dimensional function spaces. Traditional Markov chain Monte Carlo (MCMC) algorithms are characterized by deteriorating mixing times upon mesh-refinement, when the finite-dimensional approximations become more accurate. Such methods are typically forced to reduce step-sizes as the discretization gets finer, thus are expensive as a function of dimension. Recently, a new class of MCMC methods with mesh-independent convergence times has emerged. However, few of them take into account the geometry of the posterior informed by the data. At the same time, recently developed geometric MCMC algorithms have been found to be powerful in exploring complicated distributions that deviate significantly from elliptic Gaussian laws, but are in general computationally intractable for models defined in infinite dimensions. In this work, we combine geometric methods on a finite-dimensional subspace with mesh-independent infinite-dimensional approaches. Our objective is to speed up MCMC mixing times, without significantly increasing the computational cost per step (for instance, in comparison with the vanilla preconditioned Crank–Nicolson (pCN) method). This is achieved by using ideas from geometric MCMC to probe the complex structure of an intrinsic finite-dimensional subspace where most data information concentrates, while retaining robust mixing times as the dimension grows by using pCN-like methods in the complementary subspace. The resulting algorithms are demonstrated in the context of three challenging Inverse Problems arising in subsurface flow, heat conduction and incompressible flow control. The algorithms exhibit up to two orders of magnitude improvement in sampling efficiency when compared with the pCN method.
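The pCN method mentioned in the abstract has a very simple core: the proposal x' = √(1−β²)·x + β·ξ, with ξ drawn from the Gaussian prior, leaves the prior invariant, so the accept/reject step involves only the log-likelihood and the acceptance rate does not degrade as the discretization dimension grows. A minimal sketch on a toy Gaussian target (my illustration, not the paper's code):

```python
import numpy as np

def pcn(phi, d, n_iter=20000, beta=0.2, seed=0):
    """Preconditioned Crank-Nicolson MCMC for a posterior proportional to
    exp(-phi(x)) times an N(0, I_d) prior.  The proposal preserves the
    prior, so the acceptance ratio depends on phi (the log-likelihood
    potential) only -- the source of its mesh-independence."""
    rng = np.random.default_rng(seed)
    x = np.zeros(d)
    chain = np.empty((n_iter, d))
    for i in range(n_iter):
        prop = np.sqrt(1.0 - beta**2) * x + beta * rng.standard_normal(d)
        if np.log(rng.random()) < phi(x) - phi(prop):
            x = prop
        chain[i] = x
    return chain

# Toy Gaussian likelihood centred at 1 in each coordinate;
# with an N(0, I) prior the posterior is N(0.5, 0.5 I).
phi = lambda x: 0.5 * np.sum((x - 1.0) ** 2)
chain = pcn(phi, d=3)
```

The geometric methods of the talk replace this prior-driven proposal, on a data-informed subspace, with proposals that exploit posterior curvature.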

Friday 30th November 2018, 2pm - Ziwei Zhu, University of Cambridge 

Title: Distributed estimation of principal eigenspaces


Modern data sets are often decentralized: they are generated and stored across multiple sources, between which communication is constrained by bandwidth or privacy. This talk focuses on distributed estimation of principal eigenspaces of covariance matrices with decentralized data. We introduce and analyze a distributed algorithm that aggregates multiple principal eigenspaces by averaging the corresponding projection matrices. When the data distribution has sign-symmetric innovation, the distributed PCA is proved to be "unbiased", in the sense that its statistical error converges to zero as the number of data splits grows to infinity. For general distributions, when the number of data splits is not large, the algorithm is shown to achieve the same statistical efficiency as the full-sample oracle. We applied our algorithm to a distributed partition of the Manhattan traffic network; the distributed procedure delivered partition results similar to those of the centralized procedure, provided that the number of data splits is not large.
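The aggregation step described above can be sketched in a few lines: each machine computes the projection onto its local top-k eigenspace, the server averages these projection matrices, and the final estimate is the top-k eigenspace of the average. A toy version (dimensions, eigengap and split sizes are my own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
d, k, m, n_per = 20, 2, 10, 500   # dimension, target rank, splits, split size

# Population covariance with a dominant k-dimensional principal subspace.
U = np.linalg.qr(rng.standard_normal((d, d)))[0]
eigvals = np.r_[np.full(k, 10.0), np.ones(d - k)]
cov_sqrt = U @ np.diag(np.sqrt(eigvals)) @ U.T

def top_k_projection(X, k):
    """Projection matrix onto the top-k eigenspace of the sample covariance."""
    _, _, Vt = np.linalg.svd(X - X.mean(0), full_matrices=False)
    Vk = Vt[:k].T
    return Vk @ Vk.T

# Each machine sends only its d-by-d local projection; the server averages.
P_bar = np.mean(
    [top_k_projection(rng.standard_normal((n_per, d)) @ cov_sqrt.T, k)
     for _ in range(m)],
    axis=0)

# Distributed estimate: top-k eigenspace of the averaged projection.
w, V = np.linalg.eigh(P_bar)
V_hat = V[:, -k:]

# Subspace error against the true principal subspace U[:, :k].
err = np.linalg.norm(V_hat @ V_hat.T - U[:, :k] @ U[:, :k].T)
```

Only projection matrices cross the network, never raw data, which is what makes the scheme communication- and privacy-friendly.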

Friday 23rd November 2018, 2pm - Yi Yu, University of Bristol

Title: Optimal change point detection and localization in sparse dynamic networks

Abstract: We study the problem of change point detection and localization in dynamic networks. We assume that we observe a sequence of independent adjacency matrices of given size, each corresponding to one realization from an unknown inhomogeneous Bernoulli model. The underlying distribution of the adjacency matrices may change over a subset of the time points, called change points. Our task is to recover with high accuracy the unknown number and positions of the change points. Our generic model setting allows for all the model parameters to change with the total number of time points, including the network size, the minimal spacing between consecutive change points, the magnitude of the smallest change and the degree of sparsity of the networks. We first identify an impossible region in the space of the model parameters such that no change point estimator is provably consistent if the data are generated according to parameters falling in that region. We propose a computationally simple novel algorithm for network change point localization, called Network Binary Segmentation, which relies on weighted averages of the adjacency matrices. We show that Network Binary Segmentation is consistent over a range of the model parameters that nearly cover the complement of the impossibility region, thus demonstrating the existence of a phase transition for the problem at hand. Next, we devise a more sophisticated algorithm based on singular value thresholding, called Local Refinement, that delivers more accurate estimates of the change point locations. We show that, under appropriate conditions, Local Refinement guarantees a minimax optimal rate for network change point localization while remaining computationally feasible.
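Binary segmentation, the backbone of the Network Binary Segmentation algorithm above, is easiest to see in one dimension: scan the CUSUM statistic over candidate split points, declare a change at the largest peak if it exceeds a threshold, and recurse on the two halves. A univariate sketch (my illustration; the talk's algorithm applies the same recursion to weighted averages of adjacency matrices):

```python
import numpy as np

def cusum(x, s, e, t):
    """CUSUM statistic for a candidate change at t within [s, e)."""
    n1, n2 = t - s, e - t
    return np.sqrt(n1 * n2 / (n1 + n2)) * abs(x[s:t].mean() - x[t:e].mean())

def binary_segmentation(x, s, e, thresh, out):
    """Recursively split [s, e) at the largest CUSUM peak above thresh."""
    if e - s < 2:
        return
    stats = [cusum(x, s, e, t) for t in range(s + 1, e)]
    if max(stats) > thresh:
        t_best = s + 1 + int(np.argmax(stats))
        out.append(t_best)
        binary_segmentation(x, s, t_best, thresh, out)
        binary_segmentation(x, t_best, e, thresh, out)

# One mean shift at time 100 in a series of length 200.
rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(0, 1, 100), rng.normal(3, 1, 100)])
found = []
binary_segmentation(x, 0, len(x), 8.0, found)
```

The threshold (8.0 here, an ad hoc choice for the illustration) controls the trade-off between missed change points and false detections; the talk's theory pins down the thresholds that make the network version minimax rate-optimal.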

Friday 26th October 2018, 2pm - Tengyao Wang, Cambridge/UCL

Title: Isotonic regression in general dimensions

Abstract: We study the least squares regression function estimator over the class of real-valued functions on $[0,1]^d$ that are increasing in each coordinate.  For uniformly bounded signals and with a fixed, cubic lattice design, we establish that the estimator achieves the minimax rate of order $n^{-\min\{2/(d+2),1/d\}}$ in the empirical $L_2$ loss, up to poly-logarithmic factors.  Further, we prove a sharp oracle inequality, which reveals in particular that when the true regression function is piecewise constant on $k$ hyperrectangles, the least squares estimator enjoys a faster, adaptive rate of convergence of $(k/n)^{\min(1,2/d)}$, again up to poly-logarithmic factors.

Previous results are confined to the case $d \leq 2$.  Finally, we establish corresponding bounds (which are new even in the case $d=2$) in the more challenging random design setting.  There are two surprising features of these results: first, they demonstrate that it is possible for a global empirical risk minimisation procedure to be rate optimal up to poly-logarithmic factors even when the corresponding entropy integral for the function class diverges rapidly; second, they indicate that the adaptation rate for shape-constrained estimators can be strictly worse than the parametric rate.
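In the one-dimensional case ($d = 1$), the least squares estimator of the abstract is classical isotonic regression, computable exactly by the pool adjacent violators algorithm (PAVA). A short sketch of that special case (my illustration, not the paper's code):

```python
import numpy as np

def pava(y):
    """Pool Adjacent Violators: the least squares fit to y under a
    monotone non-decreasing constraint (isotonic regression, d = 1).
    Adjacent blocks whose means violate monotonicity are merged and
    replaced by their pooled mean."""
    out = []                                  # stack of [mean, size] blocks
    for v in map(float, y):
        out.append([v, 1])
        while len(out) > 1 and out[-2][0] > out[-1][0]:
            m2, n2 = out.pop()
            m1, n1 = out.pop()
            out.append([(m1 * n1 + m2 * n2) / (n1 + n2), n1 + n2])
    return np.concatenate([np.full(n, m) for m, n in out])

fit = pava([1.0, 3.0, 2.0, 4.0])   # the violating pair (3, 2) pools to 2.5
```

In $d \ge 2$ no such simple stack-based algorithm exists, and both the computation and the rates change qualitatively, which is the subject of the talk.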

Friday 12th October 2018, 2pm - Qi-Man Shao, The Chinese University of Hong Kong

Title: Are you sure you can use your estimated p-value? 

Abstract: The p-value is probably the most important figure in statistical hypothesis testing. However, the true p-value is unknown most of the time, and it is common practice to use the limiting distribution of a test statistic to estimate the p-value. How accurate is the estimated p-value? Are the true and estimated p-values really close? In this talk we reveal the secret of the relative error of the estimated p-value against the true p-value for some well-known statistics.
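A small worked example of the phenomenon (my illustration, not from the talk): for a sign-test statistic S_n ~ Binomial(n, 1/2) under the null, the exact p-value P(S_n >= k) can be computed directly and compared with the usual normal-approximation estimate. The absolute difference between the two is tiny in the tail, yet the relative error need not be:

```python
import math

n = 50

def exact_p(k):
    """Exact p-value P(S_n >= k) for S_n ~ Binomial(n, 1/2)."""
    return sum(math.comb(n, j) for j in range(k, n + 1)) / 2 ** n

def approx_p(k):
    """Estimated p-value from the N(0, 1) limit of the standardized
    statistic (no continuity correction)."""
    z = (k - n / 2) / math.sqrt(n / 4)
    return 0.5 * math.erfc(z / math.sqrt(2))

# Relative error of the estimated p-value at increasingly extreme cutoffs.
rel_err = {k: abs(approx_p(k) - exact_p(k)) / exact_p(k) for k in (28, 33, 38)}
```

Both p-values are small at k = 38, so the absolute error looks harmless, but it is the ratio of estimated to true p-value, studied precisely in this talk, that determines whether a reported significance level can be trusted.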