Complex Systems in Time Series

4 and 5 December 2015

 

Conference organisers:

 

Clifford Lam 

Time Series group

Email: c.lam2@lse.ac.uk

Matteo Barigozzi

Time Series group

Email: m.barigozzi@lse.ac.uk

 

Administration support and general enquiries:

 

Ian Marshall

Research Administrator

Email: i.marshall@lse.ac.uk

Tel: +44 (0)20 7955 751

Titles and abstracts

John Aston

 

 

Title: Change point detection in neuroimaging functional time series

Abstract: Neuroimaging time series, such as those which arise from functional magnetic resonance imaging (fMRI), are both high dimensional and contain complex dependencies in both time and space. In particular, in resting state fMRI, analyses of these dependencies are used to try to infer neural connections. However, it can be fairly easily shown that if change points are present in the time series, either in the mean or in the covariance structure, they can severely distort conclusions if not taken into account. In this presentation we will examine tests for stationarity of both the mean and the covariance of fMRI time series where each brain scan is treated as a three-dimensional function. We will describe various methods to make the problem computationally tractable, and then develop both theoretical and empirical methods to determine whether change points are present. These will then be used to gain further insight into the dynamic structure of brain connectivity.

Joint work with Andrew Davison, Claudia Kirch, Davide Pigoli, Christina Stoehr, Shahin Tavakoli, Wenda Zhou.
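
As a concrete illustration of the kind of test involved, the following is a minimal sketch of a CUSUM-type scan for a mean change, assuming the functional scans have already been reduced to a few principal component scores (one possible route to computational tractability). The function and variable names are illustrative, not the authors' implementation.

```python
# Minimal CUSUM scan for a single mean change point in projected scores.
import numpy as np

def cusum_changepoint(scores):
    """scores: (T, d) array of projected scans; returns the CUSUM statistic
    at each candidate change point and the location of its maximum."""
    T, d = scores.shape
    total = scores.sum(axis=0)
    cum = np.cumsum(scores, axis=0)
    stats = np.empty(T - 1)
    for k in range(1, T):
        dev = cum[k - 1] - (k / T) * total   # partial-sum deviation at time k
        stats[k - 1] = np.sum(dev ** 2) / T
    return stats, int(np.argmax(stats)) + 1

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1, (100, 3)),
                    rng.normal(0.8, 1, (100, 3))])  # mean shift at t = 100
stats, khat = cusum_changepoint(x)
print("estimated change point:", khat)
```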


Eric D Kolaczyk

Title: Dynamic causal networks with multi-scale temporal structure

Abstract: I will discuss a novel method to model multivariate time series using dynamic causal networks. This method combines traditional multi-scale modelling and network-based neighbourhood selection, aiming to capture the temporally local structure of the data while maintaining the sparsity of the potential interactions. Our multi-scale framework is based on recursive dyadic partitioning, which recursively partitions the temporal axis into finer intervals and allows us to detect local network structural changes at varying temporal resolutions. The dynamic neighbourhood selection is achieved through penalized likelihood estimation, where the penalty seeks to limit the number of neighbours used to model the data. Theoretical and numerical results describing the performance of our method will be presented, as well as applications in financial economics and/or neuroscience.

Joint work with Xinyu Kang and Apratim Ganguly.
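
A minimal sketch, under illustrative assumptions, of the two ingredients described above: a recursive dyadic partition of the time axis, and l1-penalized neighbourhood selection on each interval via lag-1 regressions. This is a toy stand-in (using scikit-learn's Lasso), not the authors' estimator or penalty.

```python
import numpy as np
from sklearn.linear_model import Lasso

def dyadic_intervals(t0, t1, min_len):
    """Recursively split [t0, t1) into dyadic intervals of length >= min_len."""
    if t1 - t0 < 2 * min_len:
        return [(t0, t1)]
    mid = t0 + (t1 - t0) // 2
    return dyadic_intervals(t0, mid, min_len) + dyadic_intervals(mid, t1, min_len)

def local_neighbourhoods(X, intervals, lam=0.1):
    """On each interval, regress each series on the lag-1 values of all series
    with an l1 penalty; nonzero coefficients define the local neighbourhood."""
    T, p = X.shape
    nets = {}
    for (a, b) in intervals:
        Y, Z = X[a + 1:b], X[a:b - 1]          # responses and lag-1 predictors
        coef = np.zeros((p, p))
        for j in range(p):
            coef[j] = Lasso(alpha=lam).fit(Z, Y[:, j]).coef_
        nets[(a, b)] = (np.abs(coef) > 1e-8)   # estimated directed edges
    return nets

rng = np.random.default_rng(1)
X = rng.normal(size=(256, 5))                  # placeholder data
nets = local_neighbourhoods(X, dyadic_intervals(0, 256, 64))
```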


Marc Hallin

Title: Rank-based estimation for dynamic location-scale models

Abstract: Dynamic location-scale processes are essential tools in time series econometrics, and have motivated the study of increasingly diverse and sophisticated classes of continuous- and discrete-time models such as ARCH, AR-ARCH or AR-LARCH models, AR conditional duration models, or discretely observed diffusions with jumps. Those models all involve some unobserved driving white noise, the distribution of which is often specified to be Gaussian, although Gaussian assumptions, quite admittedly, are unrealistic in most applications. An all too common belief, however, is that violations of Gaussian assumptions are essentially harmless, and that Gaussian Quasi-Likelihood (QL) methods remain safe and valid as soon as finite second- or fourth-order moments exist. As a consequence, Gaussian QL estimators remain the dominant daily practice. The pitfalls of Gaussian QL estimators, however, are well documented: see Engle and González-Rivera (1991), Linton (1993), Drost and Klaassen (1997), Hall and Yao (2003), Drost and Werker (2004), Francq and Zakoïan (2010), and Fan et al. (2014), among others. A semiparametric approach, along the standard lines of Bickel, Klaassen, Ritov, and Wellner, looks like the right thing to do. However, we show that such an approach also raises problems, among them the difficulty of choosing the “right” semiparametric extension of the Gaussian model. We therefore propose rank-based estimation (R-estimators) as a substitute for problematic standard semiparametric methods. We show how to construct easy-to-implement R-estimators based on data-driven scores, preserving root-n consistency, irrespective of the actual density, for a variety of linear and nonlinear processes. Contrary to the standard semiparametric estimators, our R-estimators require neither tedious tangent space calculations nor computationally heavy innovation density estimation. Numerical examples and an empirical analysis of the log-return and log-transformed two-scale realized volatility of the USD/CHF exchange rate illustrate the good performance of the proposed estimators.

Based on joint work with Davide La Vecchia.
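
To make the rank-based idea concrete, here is a minimal sketch of an R-estimator for an AR(1) coefficient, using a Jaeckel-type rank dispersion with fixed van der Waerden (normal) scores and a grid search. It only illustrates the principle of replacing residuals by scores of their ranks; the data-driven-score estimators of the talk are more sophisticated, and all names here are illustrative.

```python
import numpy as np
from scipy.stats import norm, rankdata

def rank_dispersion(theta, x):
    """Jaeckel-type dispersion of AR(1) residuals under candidate theta."""
    e = x[1:] - theta * x[:-1]             # residuals under theta
    r = rankdata(e)                        # their ranks
    a = norm.ppf(r / (len(e) + 1))         # van der Waerden (normal) scores
    return np.sum(a * e)                   # minimized at the R-estimate

rng = np.random.default_rng(2)
n, theta0 = 500, 0.6
x = np.zeros(n)
for t in range(1, n):                      # heavy-tailed (non-Gaussian) noise
    x[t] = theta0 * x[t - 1] + rng.standard_t(df=3)

grid = np.linspace(-0.95, 0.95, 381)
theta_hat = grid[np.argmin([rank_dispersion(th, x) for th in grid])]
print("R-estimate of the AR(1) coefficient:", theta_hat)
```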


Wolfgang K Härdle

Title: Distillation of news flow into analysis of stock reactions

Abstract: News carries information about market moves. The gargantuan plethora of opinions, facts and tweets on financial business offers the opportunity to test and analyse the influence of such text sources on future directions of stocks. It also creates, though, the necessity to distil, via statistical technology, the informative elements of this prodigious and indeed colossal data source. Using mixed text sources from professional platforms, blog fora and stock message boards, we distil sentiment variables via different lexica. These are employed for an analysis of stock reactions: volatility, volume and returns. Increased (negative) sentiment will influence volatility as well as volume. This influence is contingent on the lexical projection and differs across GICS sectors. Based on review articles on 100 S&P 500 constituents for the period of October 20, 2009 to October 13, 2014, we project into the BL, MPQA and LM lexica and use the distilled sentiment variables to forecast individual stock indicators in a panel context. Exploiting different lexical projections, and using different stock reaction indicators, we aim to answer the following research questions:

(i) Are the lexica consistent in their analytic ability to produce stock reaction indicators, including volatility, detrended log trading volume and return?

(ii) To what degree is there an asymmetric response given the sentiment scales (positive vs. negative)?

(iii) Does news about high-attention firms diffuse faster and result in more timely and efficient stock reactions?

(iv) Is there a sector specific reaction from the distilled sentiment measures?

We find that there is significant incremental information in the distilled news flow. The three lexica, though, are not consistent in their analytic ability. Based on confidence bands, an asymmetric, attention-specific and sector-specific response of stock reactions is diagnosed.

Joint work with Junni L Zhang, Cathy Y Chen and Elisabeth Bommes.
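
A minimal sketch of the lexical projection step: each document is scored by the share of positive and negative lexicon words it contains. The toy word lists below are placeholders standing in for the BL, MPQA and LM lexica used in the paper.

```python
import re

POS = {"gain", "growth", "beat", "upgrade", "strong"}    # illustrative only
NEG = {"loss", "miss", "downgrade", "weak", "lawsuit"}   # illustrative only

def sentiment(doc):
    """Return (positive share, negative share) of lexicon hits in doc."""
    tokens = re.findall(r"[a-z']+", doc.lower())
    n = max(len(tokens), 1)
    pos = sum(t in POS for t in tokens)
    neg = sum(t in NEG for t in tokens)
    return pos / n, neg / n

print(sentiment("Strong quarter: earnings beat estimates despite one lawsuit."))
```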


Olivier Ledoit

Title: Nonlinear shrinkage of the covariance matrix for portfolio selection: Markowitz meets Goldilocks

Abstract: Markowitz (1952) portfolio selection requires estimates of (i) the vector of expected returns and (ii) the covariance matrix of returns. Many successful proposals to address the first estimation problem exist by now. This paper addresses the second estimation problem. We promote a nonlinear shrinkage estimator of the covariance matrix that is more flexible than previous linear shrinkage estimators and has ‘just the right number’ of free parameters to estimate (that is, the Goldilocks principle). It turns out that this number is the same as the number of assets in the investment universe. Under certain high-level assumptions, we show that our nonlinear shrinkage estimator is asymptotically optimal for portfolio selection in the setting where the number of assets is of the same magnitude as the sample size. For example, this is the relevant setting for mutual fund managers who invest in a large universe of stocks. In addition to theoretical analysis, we study the real-life performance of our new estimator using backtest exercises on historical stock return data. We find that it performs better than previous proposals for portfolio selection from the literature and, in particular, that it dominates linear shrinkage.
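
A minimal simulation sketch of the target of nonlinear shrinkage: keep the sample eigenvectors u_i but replace each sample eigenvalue by the oracle value u_i' Sigma u_i, one free parameter per asset as in the ‘Goldilocks’ count. The feasible estimator in the paper replaces this oracle, which requires the unknown Sigma, by a consistent data-driven version; all names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
p, n = 100, 200                                  # p of the same magnitude as n
sigma = np.diag(np.linspace(1.0, 5.0, p))        # true covariance (known in simulation)
X = rng.multivariate_normal(np.zeros(p), sigma, size=n)
S = X.T @ X / n                                  # sample covariance
vals, U = np.linalg.eigh(S)

# Oracle rotation-equivariant shrinkage: d_i = u_i' Sigma u_i for each
# sample eigenvector u_i (column i of U).
d = np.sum(U * (sigma @ U), axis=0)
S_oracle = U @ np.diag(d) @ U.T

frob = lambda A: np.linalg.norm(A - sigma)       # Frobenius distance to truth
print("sample cov error:", frob(S), " oracle shrinkage error:", frob(S_oracle))
```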


Alexei Onatski

Title: Testing in high-dimensional spiked models

Abstract: We consider five different classes of multivariate statistical problems identified by James (1964). Each of these problems is related to the eigenvalues of $E^{-1}H$ where $H$ and $E$ are proportional to high-dimensional Wishart matrices. Under the null hypothesis, both Wisharts are central with identity covariance. Under the alternative, the non-centrality or the covariance parameter of $H$ has a single eigenvalue, a spike, that stands alone. When the spike is larger than a case-specific phase transition threshold, one of the eigenvalues of $E^{-1}H$ separates from the bulk. This makes the alternative easily detectable, so that reasonable statistical tests have asymptotic power one. In contrast, when the spike is sub-critical, that is, lies below the threshold, none of the eigenvalues separates from the bulk, which makes the testing problem more interesting from the statistical perspective. In such cases, we show that the log-likelihood ratio processes parameterized by the value of the sub-critical spike converge to Gaussian processes with logarithmic correlation. We use this result to derive the asymptotic power envelopes for tests for the presence of a spike in the data representing each of the five cases in James' classification.
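
As a toy illustration of the phase transition (in the simplest single-Wishart spiked covariance case, not the general $E^{-1}H$ settings of James' classification), the following simulation shows the top sample eigenvalue separating from the Marchenko-Pastur bulk edge $(1+\sqrt{\gamma})^2$ only when the spike $h$ exceeds the critical threshold $\sqrt{\gamma}$.

```python
import numpy as np

rng = np.random.default_rng(4)
p, n = 200, 400
gamma = p / n                                    # dimension-to-sample ratio
for h in [0.3, np.sqrt(gamma), 2.0]:             # sub-critical, critical, super-critical
    sqrt_sigma = np.eye(p)                       # Sigma = I + h e_1 e_1'
    sqrt_sigma[0, 0] = np.sqrt(1 + h)
    X = rng.standard_normal((n, p)) @ sqrt_sigma
    top = np.linalg.eigvalsh(X.T @ X / n)[-1]    # largest sample eigenvalue
    edge = (1 + np.sqrt(gamma)) ** 2             # Marchenko-Pastur bulk edge
    print(f"h={h:.3f}: top eigenvalue {top:.3f}, bulk edge {edge:.3f}")
```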


Rainer von Sachs

Title: Functional mixed effect models for spectra of subject-replicated time series

Abstract: In this work in progress we treat a functional mixed effects model in the setting of spectral analysis of subject-replicated time series data. We assume that the time series subjects share a common population spectral curve (functional fixed effect), in addition to random subject-specific deviations around this curve (functional random effects), which model the variability within the population. In contrast to existing work, we allow this variability to be non-diagonal, i.e. there may be explicit correlation between the different subjects in the population.

To estimate the common population curve we project the subject-curves onto an appropriate orthonormal basis (such as a wavelet basis) and continue working in the coefficient domain instead of the functional domain. In a sampled data model, with discretely observed noisy subject-curves, the model in the coefficient domain reduces to a finite-dimensional linear mixed model. This allows us, for estimation and prediction of the fixed and random effect coefficients, to apply both traditional linear mixed model methods and, where warranted by the spatially variable nature of the spectral curves, an appropriate non-linear thresholding approach.

We derive some theoretical properties of our methodology highlighting the influence of the correlation in the subject population. To illustrate the proposed functional mixed model, we show some examples using simulated time series data, and an analysis of empirical subject-replicated EEG data.

We conclude with some possible extensions, among them situations where the data show potential breakpoints in their second-order (spectral) structure over time.

The presented work is joint with Joris Chau (ISBA, UCL).
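
A minimal sketch of the projection step under illustrative assumptions: noisy subject curves are mapped to coefficients of an orthonormal basis (an orthonormal DCT stands in for a wavelet basis), the population curve is estimated from the averaged coefficients, and spatially variable features are handled by soft-thresholding. The linear mixed model machinery for predicting the subject-level random effects is omitted.

```python
import numpy as np
from scipy.fft import dct, idct

def estimate_population_curve(curves, thresh=0.1):
    """curves: (n_subjects, n_points) noisy discretised subject curves."""
    coef = dct(curves, norm='ortho', axis=1)      # to the coefficient domain
    mean_coef = coef.mean(axis=0)                 # fixed-effect coefficients
    shrunk = np.sign(mean_coef) * np.maximum(np.abs(mean_coef) - thresh, 0.0)
    return idct(shrunk, norm='ortho')             # back to the curve domain

rng = np.random.default_rng(5)
t = np.linspace(0, 1, 256)
truth = np.sin(2 * np.pi * t) + (t > 0.5)         # curve with a sharp feature
curves = truth + rng.normal(0, 0.5, (20, 256))    # 20 noisy subject replicates
fhat = estimate_population_curve(curves)
```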


Richard Samworth

Title: Statistical and computational trade-offs in estimation of sparse principal components

Abstract: In recent years, Sparse Principal Component Analysis has emerged as an extremely popular dimension reduction technique for high-dimensional data. The theoretical challenge, in the simplest case, is to estimate the leading eigenvector of a population covariance matrix under the assumption that this eigenvector is sparse. An impressive range of estimators have been proposed; some of these are fast to compute, while others are known to achieve the minimax optimal rate over certain Gaussian or subgaussian classes. We show that, under a widely-believed assumption from computational complexity theory, there is a fundamental trade-off between statistical and computational performance in this problem. More precisely, working with new, larger classes satisfying a Restricted Covariance Concentration condition, we show that there is an effective sample size regime in which no randomised polynomial time algorithm can achieve the minimax optimal rate. We also study the theoretical performance of a (polynomial time) variant of the well-known semidefinite relaxation estimator, revealing a subtle interplay between statistical and computational efficiency.
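
To fix ideas, here is a minimal sketch of one fast polynomial-time member of the family discussed above: truncated power iteration for a k-sparse leading eigenvector (a heuristic in the spirit of Yuan and Zhang's truncated power method, not the semidefinite relaxation estimator analysed in the talk).

```python
import numpy as np

def truncated_power(S, k, iters=200):
    """Approximate the k-sparse leading eigenvector of a covariance matrix S."""
    p = S.shape[0]
    v = np.ones(p) / np.sqrt(p)
    for _ in range(iters):
        w = S @ v
        w[np.argsort(np.abs(w))[:-k]] = 0.0   # hard-threshold to the top k entries
        v = w / np.linalg.norm(w)
    return v

rng = np.random.default_rng(6)
p, n, k = 200, 100, 10
u = np.zeros(p); u[:k] = 1 / np.sqrt(k)       # true sparse leading eigenvector
sigma = np.eye(p) + 4.0 * np.outer(u, u)
X = rng.multivariate_normal(np.zeros(p), sigma, size=n)
vhat = truncated_power(X.T @ X / n, k)
print("overlap with truth:", abs(vhat @ u))
```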


Patrick Wolfe

 

 

Title: Aspects of time-varying network modelling

Abstract: In recent years much progress has been made in modelling static networks.  Dynamic networks, in which linkages and connections evolve over time, represent a new set of challenges for statistical modelling and inference.  In this talk I will describe some key aspects of time-varying network modelling, as well as some specific instances in which progress has been made. 

Joint work with Sofia Olhede and colleagues.


Jeff Yao

Title: Identifying the number of factors from singular values of a large sample auto-covariance matrix

Abstract: Identifying the number of factors in a high-dimensional factor model has attracted much attention in recent years, and a general solution to the problem is still lacking. A promising ratio estimator based on the singular values of the lagged autocovariance matrix has recently been proposed in the literature and shown to perform well under specific assumptions on the strength of the factors. Inspired by this ratio estimator, and as a first main contribution, this paper proposes a complete theory of such sample singular values for both the factor part and the noise part under the large-dimensional scheme where the dimension and the sample size grow proportionally to infinity. In particular, we provide an exact description of the phase transition phenomenon that determines whether a factor is strong enough to be detected from the observed sample singular values. Based on these findings, and as a second main contribution of the paper, we propose a new estimator of the number of factors which is strongly consistent for the detection of all significant factors (which are the only theoretically detectable ones). In particular, factors are only assumed to have strength above the phase transition boundary, which is of the order of a constant; they are thus not required to grow to infinity together with the dimension (as assumed in most of the existing papers on high-dimensional factor models). A Monte Carlo study as well as an analysis of stock returns data attest to the very good performance of the proposed estimator. In all the tested cases, the new estimator largely outperforms the existing estimator based on the same ratios of singular values.

This is joint work with Zeng Li and Qinwen Wang.
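
A minimal sketch of the singular-value ratio idea on which the talk builds: compute the singular values of the lag-1 sample autocovariance matrix and estimate the number of factors by the location of the sharpest relative drop between consecutive singular values. The phase-transition-based refinement that makes the paper's new estimator strongly consistent is omitted; all names are illustrative.

```python
import numpy as np

def ratio_estimator(X, kmax=10):
    """X: (T, p) observed series; returns the estimated number of factors."""
    Xc = X - X.mean(axis=0)
    T = Xc.shape[0]
    S1 = Xc[1:].T @ Xc[:-1] / (T - 1)            # lag-1 sample autocovariance
    sv = np.linalg.svd(S1, compute_uv=False)     # singular values, descending
    ratios = sv[1:kmax + 1] / sv[:kmax]          # consecutive ratios
    return int(np.argmin(ratios)) + 1            # sharpest relative drop

rng = np.random.default_rng(7)
T, p, r = 500, 40, 3
F = np.zeros((T, r))
for t in range(1, T):                            # r autocorrelated factors
    F[t] = 0.7 * F[t - 1] + rng.normal(size=r)
L = rng.normal(size=(p, r))
X = F @ L.T + rng.normal(size=(T, p))            # factors plus white noise
print("estimated number of factors:", ratio_estimator(X))
```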


   
