
Statistics takes the numbers that you have and summarises them into fewer numbers which are easily digestible by the human brain.

This seminar series is a joint partnership with the STICERD Econometrics programme.

All joint Statistics and Econometrics seminars during Lent Term 2018 will take place from 12.00pm to 1.00pm and will be preceded by refreshments from 11.45am. Unless otherwise specified, the seminars will take place in COL 6.15 (Leverhulme Library), 6th Floor of Columbia House.

Marc Hallin is a Professor at Université libre de Bruxelles. Details of his talk are below; see also his webpage.

**Title:** Optimal dimension reduction for vector and functional time series.

**Abstract:** Dimension reduction techniques are at the core of the statistical analysis of high-dimensional observations. Whether the data are vector- or function-valued, principal component techniques play a central role in this context. The success of principal components in the dimension reduction problem is explained by the fact that, for any K<=p, the first K coefficients in the expansion of a p-dimensional random vector X in terms of its principal components provide the best linear K-dimensional summary of X in the mean square sense. This optimality feature, however, no longer holds true in a time series context: when the observations are serially dependent, principal components lose their optimal dimension reduction property to the so-called "dynamic principal components" introduced by Brillinger in 1981 in the vector case and, in the functional case, to their functional extension proposed by Hormann, Kidzinski and Hallin (JRSS Ser.B, 2015). Principal components are similarly central tools in the estimation of factor models: traditional principal components in the approach proposed by Stock and Watson (JASA 2002) or Bai and Ng (Econometrica 2002); dynamic ones in that of Forni et al. (Review of Economics and Statistics 2000). The optimal dimension reduction properties of dynamic principal components explain why the latter, in general, are more parsimonious, and perform better, under less restrictive assumptions.
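As a small illustration of the static optimality property mentioned in the abstract (and not of the speaker's dynamic method), the sketch below checks numerically that projecting onto the first K principal components gives a mean squared reconstruction error equal to the sum of the discarded eigenvalues of the sample covariance, and no larger than the error of an arbitrary K-dimensional orthonormal projection. The function names, the simulated i.i.d. data and the values of n, p and K are assumptions made for this example.

```python
import numpy as np

def pca_summary_error(X, K):
    """Return the mean squared reconstruction error of the K-dimensional
    PCA summary of X (n rows, p columns) and the sorted eigenvalues."""
    Xc = X - X.mean(axis=0)                     # centre each column
    cov = Xc.T @ Xc / Xc.shape[0]               # sample covariance (1/n scaling)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]           # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    V = eigvecs[:, :K]                          # first K principal directions
    recon = (Xc @ V) @ V.T                      # K-dimensional linear summary
    mse = np.mean(np.sum((Xc - recon) ** 2, axis=1))
    return mse, eigvals

def projection_error(X, B):
    """Mean squared reconstruction error when projecting onto the
    orthonormal columns of B (p rows, K columns)."""
    Xc = X - X.mean(axis=0)
    recon = (Xc @ B) @ B.T
    return np.mean(np.sum((Xc - recon) ** 2, axis=1))

# Hypothetical simulated data: i.i.d. rows with correlated columns.
rng = np.random.default_rng(0)
n, p, K = 500, 6, 2
A = rng.normal(size=(p, p))
X = rng.normal(size=(n, p)) @ A.T

mse_pca, eigvals = pca_summary_error(X, K)

# An arbitrary competitor: a random K-dimensional orthonormal basis.
Q, _ = np.linalg.qr(rng.normal(size=(p, K)))
mse_random = projection_error(X, Q)

print("PCA error:", mse_pca)
print("Sum of discarded eigenvalues:", eigvals[K:].sum())
print("Random projection error:", mse_random)
```

The comparison uses only static (instantaneous) linear projections; the abstract's point is that, under serial dependence, filters that also use leads and lags of the series can do better.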

Title and abstract TBC

Majid Al-Sadoon's webpage.

Daniel Pena is a Professor at Universidad Carlos III de Madrid. Details of his talk are below; see also his webpage.

**Title:** Forecasting Multiple Time Series with One-Sided Dynamic Principal Components

**Abstract:** We define one-sided dynamic principal components (ODPC) for time series as linear combinations of the present and past values of the series that minimize the reconstruction mean squared error. Previous definitions of dynamic principal components depend on past and future values of the series. For this reason, they are not appropriate for forecasting purposes. On the contrary, it is shown that the ODPC introduced in this paper can be successfully used for forecasting high-dimensional multiple time series. An alternating least squares algorithm to compute the proposed ODPC is presented. We prove that for stationary and ergodic time series the estimated values converge to their population analogues. We also prove that asymptotically, when both the number of series and the sample size go to infinity, if the data follow a dynamic factor model, the reconstruction obtained with ODPC converges, in mean squared error, to the common part of the factor model. Monte Carlo results show that forecasts obtained by the ODPC compare favourably with other forecasting methods based on dynamic factor models.

Yingying Fan is an Associate Professor at the University of Southern California. Details of her talk are below; see also her webpage.

**Title:** RANK: Large-Scale Inference with Graphical Nonlinear Knockoffs

**Abstract:** Power and reproducibility are key to enabling refined scientific discoveries in contemporary big data applications with general high-dimensional nonlinear models. In this paper, we provide theoretical foundations on the power and robustness for the model-free knockoffs procedure introduced recently in Candès, Fan, Janson and Lv (2016) in high-dimensional setting when the covariate distribution is characterized by Gaussian graphical model. We establish that under mild regularity conditions, the power of the oracle knockoffs procedure with known covariate distribution in high-dimensional linear models is asymptotically one as sample size goes to infinity. When moving away from the ideal case, we suggest the modified model-free knockoffs method called graphical nonlinear knockoffs (RANK) to accommodate the unknown covariate distribution. We provide theoretical justifications on the robustness of our modified procedure by showing that the false discovery rate (FDR) is asymptotically controlled at the target level and the power is asymptotically one with the estimated covariate distribution. To the best of our knowledge, this is the first formal theoretical result on the power for the knockoffs procedure. Simulation results demonstrate that compared to existing approaches, our method performs competitively in both FDR control and power. A real data set is analyzed to further assess the performance of the suggested knockoffs procedure.

Carsten Jentsch is a Professor at Universität Mannheim. Details of his talk are below; see also his webpage.

**Title:** Statistical inference on party positions from texts: statistical modeling, bootstrap and adjusting for time effects

**Abstract:** One central task in comparative politics is to locate party positions in a certain political space. For this purpose, several empirical methods have been proposed using text as data sources. In general, the analysis of texts to extract information is a difficult task. Its data structure is very complex and political texts usually contain a large number of words such that a simultaneous analysis of word counts becomes challenging. In this paper, we consider Poisson models for each word count simultaneously and provide a statistical analysis suitable for political text data. In particular, we allow for multi-dimensional party positions and develop a data-driven way of determining the dimension of positions. Allowing for multi-dimensional political positions gives new insights into the evolution of party positions and helps our understanding of a political system. Additionally, we consider a novel model which allows the political lexicon to change over time and develop an estimation procedure based on LASSO and fused LASSO penalization techniques to address high-dimensionality via significant dimension reduction. The latter model extension gives more insights into the potentially changing use of words by left and right-wing parties over time. Furthermore, the procedure is capable of automatically identifying words having a discriminating effect between party positions. To address the potential dependence structure of the word counts over time, we included integer-valued time series processes into our modeling approach and we implemented a suitable bootstrap method to construct confidence intervals for the model parameters. We apply our approach to party manifesto data from German parties over all seven federal elections after German reunification. The approach is simply implemented as it does not require any a priori information (from external source) nor expert knowledge to process the data. The data studies confirm that our procedure is robust, runs stably and leads to meaningful and interpretable results.

***Please note that this seminar takes place in CLM.3.02, 3rd floor of Clement House instead of the Leverhulme Library***

During Michaelmas Term, the seminars take place on Fridays from 12pm to 1pm in 32L.LG.03 (Lower Ground Floor, LSE, 32 Lincoln's Inn Fields, London, WC2A 3PH) unless otherwise stated.

Please have a look at the STICERD website for details on the past seminars of MT 2017.

9th December 2016 - 32L.LG.03

**Speaker** - Javier Hidalgo (LSE)

**Title** - TBC

2nd December 2016 - 32L.LG.03

**Speaker** - Peter Robinson (LSE)

**Title** - Inference on trending panel data

25th November 2016 - 32L.LG.03

**Speaker** - Yungyoon Lee (Royal Holloway, University of London)

**Title** - Misspecification testing in spatial autoregressive models

18th November 2016 - 32L.LG.03

**Speaker** - Dongwoo Kim (UCL)

**Title** - Nonseparable unobserved heterogeneity and partial identification in IV models for count outcomes

11th November 2016 - 32L.LG.03

**Speaker** - Namhyun Kim (Exeter University)

10th November 2016 - 32L.LG.03

**Speaker** - Patrick Wongsa-Art (Newcastle University)

**Title** - TBC

4th November 2016 - 32L.LG.03

**Speaker** - Matt Masten (Duke University), joint with Alexandre Poirier

**Title** - Partial independence in nonseparable models


7th October 2016 - 32L.LG.03

**Speaker** - Emmanuel Guerre (QMW), joint with Nathalie Gimenes

**Title** - Quantile methods for first-price auctions: a signal approach

30th September 2016 - 32L.LG.03

**Speaker** - Marcelo Moreira (Fundação Getúlio Vargas (FGV/EPGE)), joint with Humberto Moreira.


**Past LT seminars in 2017**

24th March 2017 - 12-1pm in the Leverhulme Library COL.6.15

**Title** - The uncertainty of principal components in dynamic factor models

**Abstract** - Dynamic Factor Models (DFM) are often fitted to large systems of multivariate time series to represent the evolution of underlying factors. Given that these factors are usually unobserved, to correctly interpret their estimated counterparts, one needs a measure of their uncertainty. In the context of very large systems of economic and financial variables, it is popular to extract factors using the computationally easy although non-efficient Principal Components (PC) procedure.

The asymptotic distribution of factors extracted by PC is known. However, for the sample sizes and cross-sectional dimensions usually encountered in practice, the asymptotic distribution is not an appropriate approximation to the finite sample one. We propose using bootstrap procedures to approximate the finite sample distribution of the factors extracted by PC to have a realistic picture of their associated uncertainty.

The finite sample properties of the proposed procedure are analyzed and compared with those of the asymptotic distribution and alternative bootstrap procedures previously proposed in the context of DFM. The results are empirically illustrated obtaining confidence intervals of the underlying factor in a system of Spanish macroeconomic variables and in a system of in house process of advanced and emerging markets. Joint work with Javier de Vicente.

10th March 2017 - 12-1pm in the Leverhulme Library COL.6.15

**Title** - Sequential testing for structural stability in approximate factor models

**Abstract** - We develop a family of monitoring procedures to detect a change in a large factor model. Our statistics are based on the following property of the (r+1)-th eigenvalue of the sample covariance matrix of the data: whilst under the null the (r+1)-th eigenvalue is bounded, under the alternative of a change (either in the loadings, or in the number of factors itself) it becomes spiked. Given that the sample eigenvalue does not have a known limiting distribution under the null, we regularise the problem by randomising the test statistic in conjunction with sample conditioning, obtaining a sequence of i.i.d., asymptotically chi-squared statistics which are then employed to build the monitoring scheme. Numerical evidence shows that our procedure works very well in finite samples, with a very small probability of false detections and tight detection times in presence of a genuine change point. Joint with Matteo Barigozzi.
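The eigenvalue property that the abstract builds on can be seen in a small simulation. The sketch below is only an illustration under assumed settings (one factor, Gaussian loadings and noise, a loadings break halfway through the sample) and does not implement the randomised sequential test described in the talk; the function name and all parameter values are hypothetical.

```python
import numpy as np

def kth_largest_eigenvalue(X, k):
    """k-th largest eigenvalue of the sample covariance of X (T rows, N columns)."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / Xc.shape[0]
    # eigvalsh returns eigenvalues in ascending order, so index from the end.
    return np.linalg.eigvalsh(cov)[-k]

# Assumed design: T observations on N series driven by r common factors.
rng = np.random.default_rng(1)
T, N, r = 400, 100, 1
factors = rng.normal(size=(T, r))
loadings = rng.normal(size=(N, r))
noise = rng.normal(size=(T, N))

# No change: the same loadings apply over the whole sample.
X_null = factors @ loadings.T + noise

# Change in the loadings halfway through the sample.
new_loadings = rng.normal(size=(N, r))
common = factors @ loadings.T
common[T // 2:] = factors[T // 2:] @ new_loadings.T
X_alt = common + noise

eig_null = kth_largest_eigenvalue(X_null, r + 1)
eig_alt = kth_largest_eigenvalue(X_alt, r + 1)

print("(r+1)-th eigenvalue without a change:", eig_null)
print("(r+1)-th eigenvalue with a loadings change:", eig_alt)
```

Without a change the (r+1)-th eigenvalue stays at the scale of the noise, whereas after the break the pooled sample behaves as if it had more than r factors, so that eigenvalue grows with the number of series.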

24th February 2017 - 12-1pm in the Leverhulme Library COL.6.15

**Title** - Detection of periodicity in functional time series

**Abstract** - Periodicity is one of the most important characteristics of time series, and tests for periodicity go back to the very origins of the field. The importance of such tests has manifold reasons. One of them is that most inferential procedures require that the series be stationary, but classical stationarity tests (e.g. KPSS procedures) have little power against a periodic component in the mean.

In this account we respond to the need to develop periodicity tests for functional time series (FTS). Examples of FTS's include annual temperature or smoothed precipitation curves, daily pollution level curves, various daily curves derived from high frequency asset price data, daily bond yield curves, daily vehicle traffic curves and many others. One of the important contributions of this article is the development of a fully functional ANOVA test for stationary data. If the functional time series (Yt) satisfies a certain weak-dependence condition, then, using a frequency domain approach, we obtain the asymptotic null-distribution (for the constant mean hypothesis) of the functional ANOVA statistic.

The limiting distribution has an interesting form and can be written as a sum of independent hypoexponential variables whose parameters are eigenvalues of the spectral density operator of (Yt). To the best of our knowledge, there exists no comparable asymptotic result in FDA literature. Adapting ANOVA for dependence is one way to conduct periodicity analysis. It is suitable when the periodic component has no particular form. If, however, the alternative is more specific or the period is large then we can construct simpler and more powerful tests. We hence introduce three different models with increasing complexity and develop the appropriate test statistics.

The power-advantage will be illustrated in simulations and by a theoretical case study where we consider local consistency results for three specific alternatives. A common approach to inference for functional data is to project observations onto a low dimensional basis system and then to apply a suitable multivariate procedure to the vector of projections. This approach will also be explained and discussed.

The talk is based on joint work with Piotr Kokoszka (Colorado State University) and Gilles Nisol (ULB).

27th January 2017 - 12-1pm in the Leverhulme Library COL.6.15

**Title** - Testing uniformity on high-dimensional spheres against monotone rotationally symmetric alternatives

**Abstract** - We consider the problem of testing uniformity on high-dimensional unit spheres. We are primarily interested in non-null issues. We show that rotationally symmetric alternatives lead to two Local Asymptotic Normality (LAN) structures.

The first one is for fixed modal location θ and allows us to derive locally asymptotically most powerful tests under specified θ. The second one, which addresses the Fisher–von Mises–Langevin (FvML) case, relates to the unspecified-θ problem and shows that the high-dimensional Rayleigh test is locally asymptotically most powerful invariant. Under mild assumptions, we derive the asymptotic non-null distribution of this test, which allows us to extend away from the FvML case the asymptotic powers obtained there from Le Cam’s third lemma.

Throughout, we allow the dimension p to go to infinity in an arbitrary way as a function of the sample size n. Some of our results also strengthen the local optimality properties of the Rayleigh test in low dimensions. We perform a Monte Carlo study to illustrate our asymptotic results. Finally, we treat an application related to testing for sphericity in high dimensions.

Joint work with Christine Cutting and Thomas Verdebout.

13th January 2017 - 12-1pm in the Leverhulme Library COL.6.15

**Title** - Some recent progress on nonlinear spatial modelling: A personal review

**Abstract** - Larger amounts of spatial or spatiotemporal data with more complex structures collected at irregularly spaced sampling locations are prevalent in a wide range of disciplines. With few exceptions, however, practical statistical methods for nonlinear modeling and analysis of such data remain elusive. In this talk, I provide a review on some developments and progress of the research that my co-authors and I have recently done.

In particular, we will look at some nonparametric methods for probability, including joint, density estimation, and semiparametric models for a class of spatio-temporal nonlinear regression permitting possibly nonlinear relationship between response and covariates, with location-dependent spatial neighbouring and temporal lag effects taken account of. In the setting of semiparametric spatiotemporal modelling, a computationally feasible data-driven method is also shown for spatial weight matrix estimation. For illustration, our methodology is applied to investigate some land and housing prices data sets.

13th January 2017 - 12-1pm in the Leverhulme Library COL.6.15

**Title** - Bootstrap inference under random distributional limits

**Abstract** - Asymptotic bootstrap validity is usually understood and established as consistency of the distribution of a bootstrap statistic, conditional on the data, for the unconditional limit distribution of a statistic of interest. Cases where the limit measure induced by the bootstrap is random are therefore regarded as cases where bootstrap inference is invalid.

However, apart from possessing at most one unconditional limit distribution under a fixed asymptotic scheme, a statistic in general may possess a host of conditional (random) limit distributions, depending on the choice of the conditioning sets. We discuss the appropriate probabilistic tools for establishing asymptotic bootstrap validity, in terms of asymptotic distributional uniformity of bootstrap p-values, in the case where the distribution of the bootstrap statistic conditional on the data estimates consistently a conditional limit distribution of a statistic, in a sense weaker than the usual weak convergence in probability.

We provide two general sufficient conditions for bootstrap validity in cases where weak convergence in probability fails. Finally, we apply our result to tests of parameter constancy in a general regression model, providing a rigorous analysis of the validity of inference based on the fixed regressor bootstrap.

Joint work with Iliyan Georgiev.
