Department of Statistics

Columbia House

London School of Economics

Houghton Street




General enquiries about events and seminars in the Department of Statistics

Email: statistics.events@lse.ac.uk


Enquiries about undergraduate and postgraduate course programmes in the Department of Statistics


Online query form


Frequently asked questions


BSc Queries

+44 (0)20 7955 7650


MSc Queries

+44 (0)20 7955 6879 


MPhil/PhD Queries

+44 (0)20 7955 751


Statistics Seminar Series 2014-15

The Department of Statistics hosts statistics seminars throughout the year. Seminars take place on Friday afternoons at 2pm, unless otherwise stated, in the Leverhulme Library (COL 6.15, Columbia House). All are very welcome to attend. Please contact Events for further information about any of these seminars.

Details of the 2014-15 Statistics Seminar Series will be published here as they are confirmed.

Friday 17 October 2014, 2pm - 3pm, Room COL 6.15, Columbia House (sixth floor)
Maps and directions

George Ploubidis
Institute of Education, University of London

Title: Psychological distress in mid-life in 1958 and 1970 cohorts: the role of childhood experiences and behavioural adjustment

Abstract: This paper addresses the levels of psychological distress experienced in mid-life (age 42) by men and women born in 1958 and 1970, using two well-known population-based UK birth cohorts (NCDS and BCS70). Our aim was to empirically test whether psychological distress has increased, and if so whether this increase can be explained by differences between the cohorts in their childhood conditions (including birth and parental characteristics), as well as differences in their social and emotional adjustment during adolescence. The measurement equivalence of psychological distress between the two cohorts was formally established using methods within the generalised latent variable modelling framework. The potential role of childhood conditions and of social and behavioural adjustment in explaining between-cohort differences was investigated with modern causal mediation methods. Differences in psychological distress between the NCDS and BCS70 cohorts at age 42 were observed, with the BCS70 cohort being on average more psychologically distressed. These differences were more pronounced in men, with the magnitude of the effect being twice as strong as in women. For both men and women it appears this effect is not due to the hypothesised factors in early life and adolescence, since these accounted for only 15% of the between-cohort difference in men and 20% in women.

Friday 31 October 2014, 2pm - 3pm, Room COL 6.15, Columbia House (sixth floor)
Maps and directions

Lionel Truquet
Université de Rennes

Title: Statistical inference in semiparametric locally stationary ARCH models

Abstract: In this work, we consider semiparametric versions of the univariate time-varying ARCH(p) model introduced by Dahlhaus & Subba Rao (2006) and studied by Fryzlewicz, Sapatinas & Subba Rao (2008). For a given nonstationary data set, a natural question is to determine which coefficients capture the nonstationarity and then which coefficients can be assumed to be non-time-varying. For example, when the intercept is the single time-varying coefficient, the resulting model is close to a multiplicative volatility model in the sense of Engle & Rangel (2008) or Hafner & Linton (2010). Using kernel estimation, we will first explain how to estimate the parametric and the nonparametric components of the volatility and how to obtain an asymptotically efficient estimator of the parametric part when the noise is Gaussian. The problem of testing whether some coefficients are constant or not is also addressed. In particular, our procedure can be used to test the existence of a second-order dynamic in this nonstationary framework. Our methodology can be adapted to more general linear regression models with time-varying coefficients, in the spirit of Zhang & Wu (2012).

[1] Dahlhaus, R., Subba Rao, S. Statistical inference for time-varying ARCH processes. The Annals of Statistics (2006), Vol. 34, No. 3, 1075-1114.
[2] Engle, R. F., Rangel, J. G. The spline-GARCH model for low-frequency volatility and its global macroeconomic causes. Review of Financial Studies (2008), Vol. 21, No. 3.
[3] Fryzlewicz, P., Sapatinas, T., Subba Rao, S. Normalized least-squares estimation in time-varying ARCH models. The Annals of Statistics (2008), Vol. 36, No. 2, 742-786.
[4] Hafner, C. M., Linton, O. Efficient estimation of a multivariate multiplicative volatility model. Journal of Econometrics (2010), Vol. 159, No. 1, 55-73.
[5] Zhang, T., Wu, W. B. Inference of time-varying regression models. The Annals of Statistics (2012), Vol. 40, No. 3, 1376-1402.

Friday 14 November 2014, 2pm - 3pm, Room COL 6.15, Columbia House (sixth floor)
Maps and directions

Paul Nulty
LSE (Department of Methodology)

Title: Tools and Methods for Quantitative Text Analysis

Abstract: In this talk I present an overview of methods used for quantitative analysis of large text corpora. I begin by describing practical issues involved in using software to retrieve information from large text files, online text, and social media text streams. I discuss how text is transformed for quantitative analysis by extracting a word frequency matrix or other relevant features for machine learning, and describe software in development on the QUANTESS project to facilitate this process. Finally, I will discuss the statistical properties of natural language text, and present ongoing research on improving methods for extracting features from text for use with standard machine learning algorithms, with application to the scaling of political texts.

Please also see the Big Data Initiative Seminar Series page.

Friday 28 November 2014, 2pm - 3pm, Room COL 6.15, Columbia House (sixth floor)
Maps and directions

Yang Feng
Columbia University


Title: Model Selection in High-Dimensional Misspecified Models

Abstract: Model selection is indispensable to high-dimensional sparse modeling in selecting the best set of covariates among a sequence of candidate models. Most existing work assumes implicitly that the model is correctly specified or of fixed dimensions. Yet model misspecification and high dimensionality are common in real applications. In this paper, we investigate two classical Kullback-Leibler divergence and Bayesian principles of model selection in the setting of high-dimensional misspecified models. Asymptotic expansions of these principles reveal that the effect of model misspecification is crucial and should be taken into account, leading to the generalized AIC and generalized BIC in high dimensions. With a natural choice of prior probabilities, we suggest the generalized BIC with prior probability which involves a logarithmic factor of the dimensionality in penalizing model complexity. We further establish the consistency of the covariance contrast matrix estimator in a general setting. Our results and new method are supported by numerical studies.

Friday 12 December 2014, 2pm - 3pm, Room COL 6.15, Columbia House (sixth floor)
Maps and directions

Ismaël Castillo
Laboratoire de Probabilités et Modèles Aléatoires, Universities Paris VI and VII

Title: Multiscale Bayes in density estimation

Abstract: We present a nonparametric Bayesian analysis of the density estimation model with i.i.d. data on the unit interval. More specifically, using a multiscale approach, we derive results on convergence rates for the posterior distribution as well as limit theorems for functionals of the density, for certain families of prior distributions. We consider a few examples of such families, such as renormalized Gaussian processes and Pólya tree priors.

Friday 16 January 2015, 2pm - 3pm, Room COL 6.15, Columbia House (sixth floor)
Maps and directions

Bernard Silverman
University of Oxford

Title: Science and mathematics in the Home Office

Abstract: I will describe my role and work as Chief Scientific Adviser in the Home Office, and describe a range of examples where mathematics and science have a demonstrable impact on policy, with a focus on areas where statistical thinking and expertise have been useful. In Forensic Science alone, these range from Protection of Freedoms legislation about the retention of DNA profiles to an evaluation of the risks of new DNA profiling protocols. My major illustrative example, however, will be the novel use of multiple systems estimation to gain insight into the scale of Modern Slavery in the UK, the way this has fed into the Government's Modern Slavery Strategy, and the wider science/policy issues this work presented.

Friday 6 February 2015, 2pm - 3pm, Room COL 6.15, Columbia House (sixth floor)
Maps and directions

Dino Sejdinovic
University of Oxford

Title: Hypothesis testing with Kernel embeddings on big and interdependent data

Abstract: Embeddings of probability distributions into a reproducing kernel Hilbert space provide a flexible framework for non-parametric hypothesis tests, including two-sample, independence, and three-variable (Lancaster) interaction tests. In practice, two main limitations of this methodology are that it generally requires time (at least) quadratic in the number of observations and that the test correctness heavily relies on observations being independent. We overview how these tests can be scaled up to large datasets using mini-batch procedures, resulting in consistent tests suited to data streams or to situations when the observations cannot be stored in memory. Kernel selection can also be performed on-the-fly in order to maximize the asymptotic efficiency of these tests. Furthermore, we show consistency of a wild bootstrap procedure for kernel-based tests on random processes, and demonstrate its use in the study of dependence between time series across multiple time lags.

Friday 20 February 2015, 12pm - 1pm, Room COL 6.15, Columbia House (sixth floor)
Maps and directions

Panagiotis Merkouris
Athens University of Economics and Business

Title: On best linear unbiased estimation and calibration in survey sampling

Abstract: A unified theory of optimal composite estimation in survey sampling settings involving combination of independent or correlated estimates from various survey sources can be formulated using the principle of best linear unbiased estimation. This applies to traditional survey designs involving data combination, such as multiple-frame and multi-phase sampling, and to various forms of combining data from independent or dependent samples with overlapping survey content, as in split-questionnaire designs, rotating panel surveys, non-nested double sampling and supplement surveys. An equivalent practical formulation of optimal composite estimation involving micro-integration of data from different samples is possible through a suitable calibration scheme for the sampling weights of the combined sample. The calibrated weights can be used to calculate weighted statistics, including totals, means, ratios, quantiles and regression coefficients. In particular, they give rise to composite estimators of population totals that are asymptotically best linear unbiased estimators. This unified approach to constructing optimal composite estimators through calibration will be illustrated with three distinct survey paradigms.

(Sandwiches and refreshments will be available in CLM 3.02, Clement House, at 1pm after the conclusion of this seminar)

Friday 20 February 2015, 2pm - 3pm, Room CLM 3.02, Clement House (third floor)
(Sandwiches and refreshments available at 1pm)
Maps and directions

David Hand
Imperial College London

Title: From Big Data to Beyond Data: Extracting the Truth

Abstract: We are inundated with messages about the promise offered by big data. Economic miracles, scientific breakthroughs, technological leaps appear to be merely a matter of taking advantage of a resource which is increasingly widely available. But is everything as straightforward as these promises seem to imply? I look at the history of big data, distinguish between different kinds of big data, and explore whether we really are at the start of a revolution. No new technology is achieved without effort and without overcoming obstacles, and I describe some such obstacles that lie in the path of realising the promise of big data.

Friday 6 March 2015, 2pm - 3pm, Room COL 6.15, Columbia House (sixth floor)
Maps and directions

Ruggero Bellio
University of Udine

Title: Likelihood-based inference with many nuisance parameters: Some recent developments

Abstract: We review frequentist inference on parameters of interest in models with many nuisance parameters, suitable for data with a stratified structure. In particular, two different likelihood-based methods are illustrated. The first method is the modified profile likelihood, where the nuisance parameters are removed through maximization. The second method is the integrated likelihood, where the nuisance parameters are eliminated through integration, using a suitable weight function. The application to some special settings is considered in some detail. In particular, the focus is on fixed-effects panel data models, small-sample meta-analysis, and item response theory models.

Friday 13 March 2015, 2pm - 3pm, Room COL 6.15, Columbia House (sixth floor)
Maps and directions

Eric Kolaczyk
Boston University

Title: to be confirmed

Abstract: to be confirmed

Friday 20 March 2015, 2pm - 3pm, Room COL 6.15, Columbia House (sixth floor)
Maps and directions

Sofia Olhede
University College London

Title: to be confirmed

Abstract: to be confirmed