
Time Series and Statistical Learning

Statistics is about collecting data, analysing it and using it to answer questions about the world, whether in economics, finance or public opinion. The applications are numerous.

The LSE has a long and distinguished history in time series analysis, and the Department of Statistics has a developing interest in various aspects of statistical learning. Research in time series is concerned with the development of statistical methodologies for modelling, estimation, interpretation and forecasting of time series data. Complex time series data come in many forms and from many sources. Examples include low- and high-frequency financial or economic time series, temperature or rainfall records as functions of time (curves), and social media data viewed as a dynamic network of users whose structure changes over time. With advances in computing power and the growing complexity of data, both in volume and format, our research group has increased its focus on statistical learning techniques that can help visualise and interpret these data.

The group has a strong link with the Econometrics group in the Department of Economics, which includes several eminent time series analysts. Our members frequently collaborate with scientists around the world, exploring problems in fields including, but not limited to, finance, economics, political science, physical science and health science. Members of the group provide consultancy services on projects related to time series and statistical learning upon request. External consultancies include EDF (since 2010), Winton Capital Management (2011), Barclays Bank (since 2012), BBC (2012), BrandScience (2012), John Street Capital (since 2012), GfK (2013) and Bonamy Finch (----).

Research areas

Our research group is keenly involved in both theoretical developments and practical applications of various time series and statistical learning problems. 

Time series data can be complex in nature: high-dimensional and of huge volume, non-stationary, observed as functions or as networks, partial, arriving from multiple sources at different times, and noisy. Our research group develops statistical methods for high-dimensional inference and dimension reduction for various kinds of data, including panel and tensor time series data, as well as for inferential analysis of dynamic networks and spatio-temporal processes, functional data analysis, functional time series analysis, shape-constrained estimation, change-point detection, and multiscale modelling and estimation for high-dimensional non-stationary time series.

The demand for analysing complex data on an ever-increasing scale is part of the real challenge underlying the buzzword Big Data. Efficient computational methods and estimation procedures with statistical guarantees for massive data both play key roles in meeting this challenge. Our work includes: finding causality among huge numbers of variables; developing new tools for reducing the dimension and/or complexity of data by exploring latent low-dimensional structures; investigating reinforcement learning and data-analytic aspects of modern machine learning methods in various practical problems; proposing computationally efficient procedures for high-dimensional problems while understanding the potential statistical limitations imposed by computational constraints; utilising Markov chain Monte Carlo and sequential Monte Carlo techniques to facilitate inference tasks and causal inference via the interrupted time series design; and using interpretable machine learning methods such as factor analysis, mixture models, Gaussian processes and sequential methods.

Applications

The outputs of our research group find applications in neuroscience, astronomy, finance, economics, economic history, social science and medical science, to name but a few areas. Particular applications include, but are not limited to: covariance and graphical models to characterise functional connectivity for different types of neuroimaging datasets; modelling and predicting annual age-specific mortality rates for different prefectures; recognising latent political affiliation through voting patterns; spatial econometric modelling for financial markets; modelling dynamic predator-prey interactions in animal populations; modelling and forecasting daily electricity load curves; quantifying and predicting counterparty credit risk in financial markets; applying reinforcement learning algorithms to improve patients’ health status and to increase revenue and customer satisfaction for ride-sharing companies; unemployment rate and oil price modelling; social network analysis; machine-learning-assisted material discovery and medical dataset analysis; infectious disease modelling; and exploring the use of stochastic epidemic models for Covid-19, influenza, HIV and sheeppox.

Selected publications

Lam, Clifford and Feng, Phoenix (2018) A nonparametric eigenvalue-regularized integrated covariance matrix estimator for asset return data. Journal of Econometrics, 206 (1). pp. 226-257. ISSN 0304-4076 http://eprints.lse.ac.uk/88375/

Lam, Clifford and Souza, Pedro C.L. (2019) Estimation and selection of spatial weight matrix in a spatial lag model. Journal of Business and Economic Statistics. ISSN 0735-0015. http://eprints.lse.ac.uk/91501/

Baranowski, Rafal, Chen, Yining and Fryzlewicz, Piotr (2019) Narrowest-over-threshold detection of multiple change points and change-point-like features. Journal of the Royal Statistical Society. Series B: Statistical Methodology, 81 (3). 649 - 672. ISSN 1369-7412. http://eprints.lse.ac.uk/100430/

Fryzlewicz, Piotr (2018) Tail-greedy bottom-up data decompositions and fast multiple change-point detection. Annals of Statistics, 46 (6B). pp. 3390-3421. ISSN 0090-5364. http://eprints.lse.ac.uk/85647/

Qiao, Xinghao, Guo, Shaojun and James, Gareth M. (2019) Functional graphical models. Journal of the American Statistical Association, 114 (525). 211 - 222. ISSN 0162-1459. http://eprints.lse.ac.uk/84856/

Guo, Shaojun and Qiao, Xinghao (2022) On consistency and sparsity for high-dimensional functional time series with application to autoregressions. Bernoulli. ISSN 1350-7265 (In Press). http://eprints.lse.ac.uk/114638/

Chang, J., Kolaczyk, E. D. and Yao, Q. (2022). Estimation of subgraph densities in noisy networks. Journal of the American Statistical Association, 117, 361-374. http://eprints.lse.ac.uk/104684/

Chang, J., Cheng, G. and Yao, Q. (2022). Testing for unit roots based on sample autocovariances. Biometrika, to appear. http://eprints.lse.ac.uk/114620/

Shi, Chengchun, Wang, Xiaoyu, Luo, Shikai, Zhu, Hongtu, Ye, Jieping and Song, Rui (2022) Dynamic causal effects evaluation in A/B testing with a reinforcement learning framework. Journal of the American Statistical Association. 1 - 13. ISSN 0162-1459. http://eprints.lse.ac.uk/113310/

Shi, Chengchun, Wan, Runzhe, Chernozhukov, Victor and Song, Rui (2021) Deeply-debiased off-policy interval estimation. In: International Conference on Machine Learning, 2021-07-18 - 2021-07-24, Online. (In Press). http://eprints.lse.ac.uk/110920/

Chen, Yining (2020) Jump or kink: note on super-efficiency in segmented linear regression break-point estimation. Biometrika. ISSN 0006-3444. http://eprints.lse.ac.uk/103488/

Feng, Oliver Y., Chen, Yining, Han, Qiyang, Carroll, Raymond J and Samworth, Richard J. (2022) Nonparametric, tuning-free estimation of S-shaped functions. Journal of the Royal Statistical Society. Series B: Statistical Methodology. ISSN 1369-7412. http://eprints.lse.ac.uk/111889/

A. Beskos, J. Dureau and K. Kalogeropoulos (2015) Bayesian inference for partially observed stochastic differential equations driven by fractional Brownian motion.  http://eprints.lse.ac.uk/64806/ 

Dureau, J., Kalogeropoulos, K., Vickerman, P., Pickles, M. and Boily, M. C. (2016) A Bayesian approach to estimate changes in condom use from limited human immunodeficiency virus prevalence data. Journal of the Royal Statistical Society. Series C: Applied Statistics, 65 (2). 237 - 257. ISSN 0035-9254 http://eprints.lse.ac.uk/47602/

Chen, Y., Wang, T. and Samworth, R. J. (2021) High-dimensional, multiscale online changepoint detection. J. Roy. Statist. Soc., Ser. B., to appear. 

Chen, C. Y.-H., Okhrin, Y. and Wang, T. (2021) Monitoring network changes in social media. J. Bus. Econ. Statist., to appear

Academic and research staff


Yining Chen - Assistant Professor

Change-point detection, shape constraints, computing, nonparametric statistics, time series.

Staff page


Piotr Fryzlewicz - Professor

Time series, change-points, multiscale methods, causality, machine learning.

Staff page


Kostas Kalogeropoulos - Associate Professor

Bayesian inference, latent stochastic processes, sequential learning, stochastic epidemic modelling, volatility estimation, bond risk premia.

Staff page


Clifford Lam - Professor

Financial time series, spatial econometrics, tensor time series, dimension reduction, factor modelling.

Staff page


Xinghao Qiao - Assistant Professor

Functional data analysis, functional time series, high-dimensional time series, high-dimensional statistical inference, large-scale multiple testing.

Staff page


Chengchun Shi - Assistant Professor

Reinforcement learning, causal inference, statistical inference.

Staff page

 


Tengyao Wang - Associate Professor

High-dimensional data, change-point analysis, sparse signal detection, robust statistics, dimension reduction.

Staff page


Qiwei Yao - Professor

High-dimensional time series analysis, dimension reduction and factor modelling, dynamic networks, spatio-temporal processes, nonstationary processes and cointegration.

Staff page

Research students