Research Showcase 2022

Details of our 2022 presentations and
student posters


The speakers are ordered according to surname.

14 June

Umut Cetin - 'Power laws in market microstructure'

Abstract: We develop an equilibrium model for the market impact of trades when investors with private signals execute via a trading desk. Fat tails in the distribution of the fundamental value lead to a power law for price impact, while the impact is logarithmic for lighter tails. Moreover, the tail distribution of the equilibrium trade volume obeys a power law, consistent with numerous empirical studies. The spread decreases with the degree of noise trading and increases with the number of insiders. However, competition among insiders leads to aggressive trading, hence vanishing profit in the limit. The model also predicts that the order book flattens as the amount of noise trading increases, converging to a model with proportional transaction costs and a non-vanishing spread.

Take a look at Umut's slides (PDF).

Yunxiao Chen - 'Compound Decision for Parallel Sequential Change Detection'

Abstract: This talk will introduce the problem of parallel sequential change detection, which has wide real-world applications in education, marketing, and cloud computing, among many others. This problem concerns detecting change points in parallel data streams, where each stream has its own change point, at which its data undergoes a distributional change. With sequentially observed data, a decision-maker needs to declare at each time point whether changes have already occurred in the streams. Once a stream is declared to have changed, the decision-maker will intervene, for example by deactivating the stream so that its future data will no longer be collected. We argue that for many applications, it is more sensible to optimise certain compound performance metrics that aggregate over all the streams. Consequently, the decisions for different streams become dependent. We propose a general compound decision framework for parallel sequential change detection, under which different performance metrics are given. In addition, data-driven decision procedures are developed, and optimality results are established for them. Some simulation results will be given to show the power of the proposed method.

Take a look at Yunxiao's slides (PDF).
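As a deliberately simplified illustration of the parallel setting, the sketch below runs an independent one-sided CUSUM detector on each stream; the compound procedures in the talk would instead couple these per-stream declarations through an aggregate performance metric. All names and parameter values here are illustrative, not taken from the talk.

```python
import numpy as np

def cusum_streams(data, drift=0.5, threshold=8.0):
    """Run an independent one-sided CUSUM detector on each row (stream).

    Returns, per stream, the first time its CUSUM statistic crosses
    `threshold` (or -1 if it never does). A compound procedure would
    couple these declarations across streams via an aggregate metric;
    this baseline treats every stream separately.
    """
    n_streams, n_time = data.shape
    stat = np.zeros(n_streams)
    alarm = np.full(n_streams, -1)
    for t in range(n_time):
        # CUSUM recursion for an upward mean shift
        stat = np.maximum(0.0, stat + data[:, t] - drift)
        newly = (alarm == -1) & (stat > threshold)
        alarm[newly] = t
    return alarm

# Five streams; the first two shift from mean 0 to mean 2 at t = 100
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 200))
x[:2, 100:] += 2.0
alarms = cusum_streams(x)  # streams 0 and 1 alarm shortly after t = 100
```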

Kostas Kalogeropoulos - 'Sequential Learning of Bond Risk Premia'

Abstract: Dynamic term structure models offer standard tools for pricing and forecasting bond excess returns. A critical aspect of the process is to impose restrictions to tighten the link between cross-sectional and time-series variation of interest rates, and help resolve the puzzle of implausibly stable short-rate expectations. From a machine learning viewpoint the problem may be cast as identifying the sparse signal of the market price of risk. We adopt a Bayesian approach to achieve sparsity, utilising spike and slab priors to disentangle the signal from noise in bond risk premia. Our methodological framework successfully handles sequential model search by applying stochastic search variable selection (SSVS) over the restriction space landscape. At the same time, it can be linked with portfolio optimisation in real time, allowing investors to revise their beliefs when new information arrives. Empirical results reveal strong evidence of out-of-sample predictability in the case of sparse models that only allow level risk to be priced. Most importantly, such statistical evidence is turned into economically significant utility gains across prediction horizons. Finally, the sequential version of the SSVS scheme developed in this work offers an important diagnostic, allowing investors to monitor potential changes over time spanning periods of macroeconomic uncertainty and monetary policy changes.

Take a look at Kostas's slides (PDF).
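The spike-and-slab machinery behind SSVS can be sketched in a few lines. The toy sampler below is the static George and McCulloch scheme for a linear regression with known noise variance, not the sequential version developed in the talk; all parameter choices are illustrative.

```python
import numpy as np

def ssvs(X, y, n_iter=2000, tau0=0.05, tau1=2.0, sigma2=1.0, seed=0):
    """Minimal stochastic search variable selection (SSVS) Gibbs sampler.

    Each coefficient gets a spike N(0, tau0^2) or slab N(0, tau1^2) prior,
    and the sampler alternates beta | gamma and gamma | beta. Returns the
    posterior inclusion frequencies over the second half of the chain.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    gamma = np.ones(p, dtype=bool)
    XtX, Xty = X.T @ X, X.T @ y
    incl, kept = np.zeros(p), 0
    for it in range(n_iter):
        # beta | gamma, y: Gaussian with prior precision diag(1 / tau^2)
        tau2 = np.where(gamma, tau1 ** 2, tau0 ** 2)
        cov = np.linalg.inv(XtX / sigma2 + np.diag(1.0 / tau2))
        beta = rng.multivariate_normal(cov @ Xty / sigma2, cov)
        # gamma_j | beta_j: Bernoulli via the spike/slab density ratio
        log_slab = -0.5 * beta ** 2 / tau1 ** 2 - np.log(tau1)
        log_spike = -0.5 * beta ** 2 / tau0 ** 2 - np.log(tau0)
        gamma = rng.random(p) < 1.0 / (1.0 + np.exp(log_spike - log_slab))
        if it >= n_iter // 2:
            incl += gamma
            kept += 1
    return incl / kept

# Toy example: only the first of five coefficients is nonzero
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = X @ np.array([2.0, 0, 0, 0, 0]) + rng.normal(size=200)
freq = ssvs(X, y)  # high inclusion frequency only for coefficient 0
```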

Jouni Kuha - 'Modelling covariance matrices in multivariate dyadic data'

Abstract: We consider latent variable models for the joint distribution of variables within a dyad of two interacting units, where the variables of interest are measured by multiple binary indicators. They are applied to analyse exchanges of practical and financial help between adult individuals and their non-coresident parents, using survey data from the UK Household Longitudinal Study. This is motivated by research questions in sociology and social policy about the levels of help given and on reciprocity between them. A particular focus here is on modelling the correlations between help given and received, and how they are associated with individual and family-level covariates. We implement the estimation of these models using a bespoke MCMC algorithm.

Take a look at Jouni's slides (PDF).

Chengchun Shi - 'A reinforcement learning framework for dynamic causal effects evaluation in A/B testing'

Abstract: A/B testing, or online experimentation, is a standard business strategy for comparing a new product with an old one in pharmaceutical, technological, and traditional industries. Major challenges arise in online experiments on two-sided marketplace platforms (e.g., Uber), where there is only one unit that receives a sequence of treatments over time. In those experiments, the treatment at a given time impacts current outcomes as well as future outcomes. In this talk, we introduce a reinforcement learning framework for carrying out A/B testing in these experiments, while characterizing the long-term treatment effects. Our proposed testing procedure allows for sequential monitoring and online updating. It is generally applicable to a variety of treatment designs in different industries. In addition, we systematically investigate the theoretical properties of our testing procedure. Finally, we apply our framework to both simulated data and a real-world data example obtained from a ridesharing company to illustrate its advantage over the current practice.

Take a look at Chengchun's slides (PDF).

Tengyao Wang - 'Sparse change detection in high-dimensional linear regression'

Abstract: We introduce a new method for estimating the location of sparse changes in high-dimensional linear regression coefficients, without assuming that those coefficients are individually sparse. The procedure works by constructing different sketches (projections) of the design matrix at each time point, where consecutive projection matrices differ in sign in exactly one column. The sequence of sketched design matrices is then compared against a single sketched response vector to form a sequence of test statistics whose behaviour shows a surprising link to the well-known CUSUM statistics of univariate changepoint analysis. Strong theoretical guarantees are derived for the estimation accuracy of the procedure, which is computationally attractive, and simulations confirm that our methods perform well in a broad class of settings.

Take a look at Tengyao's slides (PDF).
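For readers unfamiliar with the univariate CUSUM statistic that the sketched test statistics connect to, a minimal version for a single mean change is sketched below (illustrative data and thresholds, not the procedure from the talk):

```python
import numpy as np

def cusum_changepoint(x):
    """Classical univariate CUSUM estimate of a single mean change.

    The statistic at split k is
        sqrt(k * (n - k) / n) * |mean(x[:k]) - mean(x[k:])|,
    and its argmax over k estimates the changepoint location.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    k = np.arange(1, n)
    left = np.cumsum(x)[:-1]  # sums of x[:k] for k = 1, ..., n - 1
    stat = np.sqrt(k * (n - k) / n) * np.abs(
        left / k - (x.sum() - left) / (n - k))
    return int(np.argmax(stat)) + 1, float(stat.max())

# Mean shifts from 0 to 3 at position 50 of 100
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 50), rng.normal(3, 1, 50)])
khat, tmax = cusum_changepoint(x)  # khat lands near the true changepoint
```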

Qixuan Wu & Xixiang Hu - 'Rapid seismic detection by reformatting Phasenet'

Abstract: The aim of this project is to design and train a neural network for earthquake wave detection at Grillo seismic sensors, with the goal of recognising the P-wave onset. The trained neural network is able to distinguish signal sections from noise sections and, for signals, to identify the P-wave arrival time to a precision of ±5 seconds, doing so within 3 seconds of the P-wave's arrival.

Take a look at Qixuan and Xixiang's slides (PDF).

Yajing Zhu - 'Actionable innovations: from data science to personalised healthcare'

Abstract: The frontier of healthcare has been shifting away from a one-size-fits-all approach towards delivering the best care for each person. In the era of big medical data, advanced analytics and technology, we at Roche are committed to creating and delivering data-driven and/or technology-enabled innovations across the patient care continuum: from early detection and diagnosis, to remote care and monitoring. In this talk, we will share a few examples of Roche innovations in this space, our challenges and ambitions, and potential R&D ideas that welcome future collaborations.


15 June

Kenneth Benoit - 'A Better Wordscores: Scaling Text with the Class Affinity Model'

Abstract: Probabilistic methods for classifying text form a rich tradition in machine learning and natural language processing. For many important problems, however, class prediction is uninteresting because the class is known, and instead the focus shifts to estimating latent quantities related to the text, such as affect or ideology. We focus on one such problem of interest, estimating the ideological positions of 55 Irish legislators in the 1991 Dáil confidence vote, a challenge brought by opposition party leaders against the then-governing Fianna Fáil party in response to corruption scandals. In this application, we clearly observe support or opposition from the known positions of party leaders, but have only information from speeches from which to estimate the relative degree of support from other legislators. To solve this scaling problem and others like it, we develop a text modeling framework that allows actors to take latent positions on a “gray” spectrum between “black” and “white” polar opposites. We are able to validate results from this model by measuring the influences exhibited by individual words, and we are able to quantify the uncertainty in the scaling estimates by using a sentence-level block bootstrap. Applying our method to the Dáil debate, we are able to scale the legislators between extreme pro-government and pro-opposition in a way that reveals nuances in their speeches not captured by their votes or party affiliations.

Take a look at Kenneth's slides (PDF).

Wicher Bergsma - 'Selecting interaction effects in additive models using I-priors'

Abstract: Additive models with interactions have been considered extensively in the literature, using estimation methods such as maximum likelihood, Tikhonov regularization or Gaussian process regression. We present an alternative empirical-Bayes approach to selecting interaction effects using I-priors (WB, 2020). Using a parsimonious specification of hierarchical interaction spaces, model selection is simplified. Furthermore, we present an efficient EM algorithm for estimating the key hyperparameters, which is not available for competing approaches. The EM algorithm facilitates finding the global maximizer of the marginal likelihood.

Simulations for linear regressions indicate competitive performance with methods such as the lasso and Bayesian variable selection using spike and slab priors or g-priors. However, the proposed methodology is more general and can also be used with interacting nonlinear regression functions.

The regression functions live in hierarchical interaction spaces, for which we consider reproducing kernel Krein spaces (RKKSs), which generalize the too-restrictive reproducing kernel Hilbert spaces (RKHSs) by loosening the positive definiteness requirement. Regardless of this, the Fisher information for the regression function is positive definite, hence defining an RKHS; loosely speaking, the norm of a function in this RKHS measures the difficulty of estimating it. Then, the I-prior maximizes entropy subject to a constant difficulty of estimating the regression function, leading to a proper Gaussian prior, i.e. its paths are a.s. in the required RKKS, thereby obviating the need for a representer theorem.

Take a look at Wicher's slides (PDF).

Sara Geneletti - 'Exploring discrimination in mental health and sentencing using causal inference methods'

Abstract: The first project is a collaborative award funded by the Wellcome Trust entitled Evaluating Policy Implementations TO Predict MEntal health (EPITOME): a Bayesian hierarchical framework for quasi-experimental designs in longitudinal settings. I am collaborating with colleagues Gianluca Baio and James Kirkbride from UCL and Marta Blangiardo from IC.

In this project the main aim is to use the interrupted time series design for causal inference to attempt to evaluate the effect of various austerity policies, introduced by the Conservative/coalition governments in the UK starting in 2010, on the mental health of minority communities in London. We also hope to be able to get a handle on the effect of the Hostile Environment Policy, introduced in 2012, which led to the Windrush Scandal. We will be using novel Bayesian approaches to estimate synthetic controls and negative outcome controls, amongst others. We have access to secondary referrals to mental health services in various trusts in London.

The second project is in collaboration with a former teaching fellow of the department, now working as quantitative criminologist at Leeds, Jose Pina-Sanchez. It is entitled Exploring the Nature of Ethnic Disparities in Sentencing through Causal Inference.

This is very much an applied project which aims to use DAG models, causal inference techniques and sensitivity analysis to explore how discrimination enters the judicial system. It is apparently a widespread belief in the UK judicial system that there is no discrimination, and we would like to show that the data we observe are in fact consistent with the presence of discrimination, and that it cannot be ruled out. We will have access to individual-level sentencing data, via secure access facilities, which the UK government is making available for the first time.

Take a look at Sara's slides (PDF).

Sahoko Ishida - 'Additive Gaussian process models for spatial and spatio-temporal analysis'

Abstract: Regression with a Gaussian Process (GP) prior is a powerful statistical tool for modelling a wide variety of data with both Gaussian and non-Gaussian likelihoods. In the spatial statistics community, GP regression, also known as Kriging, has a long-standing history. It has proven useful since its introduction, due to its capability of modelling the autocorrelation of spatial and spatio-temporal data.

Other than space and time, real-life applications often contain additional information with different characteristics. In applied research, interest often lies in exploring whether there exists a space-time interaction, or in investigating relationships between covariates and the outcome while controlling for space and time effects.

Additive GP regression allows such flexible relationships to be modelled by exploiting the structure of the GP covariance function (kernel), adding and multiplying different kernels for different types of covariates. This has so far only partially been adopted in spatial and spatio-temporal analysis.
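A minimal numerical sketch of this kernel arithmetic (illustrative data and lengthscales, not from the study): summing kernels yields additive main effects, multiplying them yields an interaction, and any such combination remains a valid, positive semi-definite kernel.

```python
import numpy as np

def rbf(a, b, lengthscale):
    """Squared-exponential kernel matrix between two sets of 1-d inputs."""
    d = np.asarray(a)[:, None] - np.asarray(b)[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

rng = np.random.default_rng(1)
s = rng.uniform(0, 1, 20)   # "space" coordinate (1-d here for simplicity)
t = rng.uniform(0, 1, 20)   # "time" coordinate

K_s = rbf(s, s, 0.3)        # main effect of space
K_t = rbf(t, t, 0.3)        # main effect of time
K = K_s + K_t + K_s * K_t   # additive main effects plus their interaction
```

The sum and the elementwise product of positive semi-definite kernel matrices are themselves positive semi-definite, which is what licenses building models this way.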

In this study, we use an ANOVA decomposition of kernels and introduce a unified approach to modelling spatio-temporal data, using the full flexibility of additive GP models. Not only does this permit modelling of the main effects and interactions of space and time, but it also allows covariates to be included, letting their effects vary with time and space. We consider various types of outcomes, including continuous, categorical and count data. By exploiting kernels for graphs and networks, we show that areal data can be modelled in the same manner as data that are geo-coded using coordinates.

Take a look at Sahoko's slides (PDF).

Clifford Lam - 'Rank and Factor Loadings Estimation in Time Series Tensor Factor Model by Pre-averaging'

Abstract: Tensor time series data appear naturally in many fields, including finance and economics. As with its vector factor model counterpart, a major dimension reduction tool, the idiosyncratic components of a tensor time series factor model can exhibit serial correlations, especially in financial and economic applications. This rules out many state-of-the-art methods that assume white idiosyncratic components, or even independent/Gaussian data.

While the traditional higher order orthogonal iteration (HOOI) is proven to converge to a set of factor loading matrices, their closeness to the true underlying factor loading matrices is in general not established, except under some strict conditions such as i.i.d. Gaussian noise (Zhang and Xia, 2018). In the presence of serial and cross-correlations in the idiosyncratic components, and for time series variables with only bounded fourth-order moments, we propose a pre-averaging method that accumulates information from tensor fibres to better estimate all the factor loading spaces. The estimated directions corresponding to the strongest factors are then used to project the data for a potentially improved re-estimation of the factor loading spaces themselves, with theoretical guarantees and rates of convergence spelt out. We also propose a new rank estimation method which utilizes correlation information from the projected data, in the same spirit as Fan et al. (2020) for factor models with independent data. Extensive simulation results reveal competitive performance of our rank and factor loading estimators relative to other state-of-the-art or traditional alternatives. A set of matrix-valued portfolio return data is also analysed.

Take a look at Clifford's slides (PDF).

Irini Moustaki - 'Past and new developments in pairwise likelihood estimation for latent variable models'

Abstract: In the talk, I will discuss composite likelihood estimation and testing for latent variable models for categorical data. Latent variable models are widely used in the social sciences for measuring unobserved constructs such as ability, attitudes, health state, etc. The constructs are measured using a number of observed variables (items) that can be continuous, categorical or of mixed type. The computational complexity of the models depends on the number of observed variables and latent variables. Frequentist and Bayesian estimation methods have been proposed in the literature for estimating the model parameters. In addition, full and limited information methods have been developed within the frequentist approach. One of the limited information methods is composite likelihood and, in particular in our case, pairwise likelihood estimation. I will review the pairwise likelihood framework for models for ordinal and binary data and discuss some recent advances.

Take a look at Irini's slides (PDF).
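To make the pairwise likelihood idea concrete, the sketch below evaluates it for a hypothetical one-factor probit model for binary items, integrating the latent factor out by Gauss-Hermite quadrature. The model and all parameter names are illustrative, not the specific models discussed in the talk.

```python
import numpy as np
from math import erf, sqrt, pi

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def pair_prob(a, b, lam_i, lam_j, tau_i, tau_j, n_quad=40):
    """P(y_i = a, y_j = b) under a one-factor probit model with
    P(y = 1 | f) = Phi(lam * f - tau) and factor f ~ N(0, 1),
    integrating f out by Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_quad)
    f = sqrt(2.0) * nodes  # rescale physicists' nodes for a N(0, 1) factor
    p_i = np.array([Phi(lam_i * fk - tau_i) for fk in f])
    p_j = np.array([Phi(lam_j * fk - tau_j) for fk in f])
    ci = p_i if a == 1 else 1.0 - p_i
    cj = p_j if b == 1 else 1.0 - p_j
    return float(np.sum(weights * ci * cj) / sqrt(pi))

def pairwise_loglik(Y, lam, tau):
    """Pairwise log-likelihood: sum of bivariate log-probabilities
    over all item pairs, weighted by observed cell counts."""
    n, p = Y.shape
    ll = 0.0
    for i in range(p):
        for j in range(i + 1, p):
            for a in (0, 1):
                for b in (0, 1):
                    count = int(np.sum((Y[:, i] == a) & (Y[:, j] == b)))
                    if count:
                        ll += count * np.log(
                            pair_prob(a, b, lam[i], lam[j], tau[i], tau[j]))
    return ll
```

A convenient sanity check is that the four cell probabilities for any item pair sum to one; only bivariate margins are ever needed, which is what keeps the computation tractable as the number of items grows.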

Francesca Panero - 'Stepping into my PhD research: network models, disclosure risk assessment and a bit of fairness'

Abstract: After introducing myself, I will tell you about the research I did during my PhD. In the first half of the talk, you can expect stories about models for sparse networks, underpinned by Bayesian nonparametric theory, and a discussion of their asymptotic properties. In the second half I will concentrate on statistical methods to assess disclosure risk, a way of quantifying the level of privacy of a non-perturbed dataset. I will wrap up by briefly mentioning some work on fair generalised linear models using a simple ridge regression.

Take a look at Francesca's slides (PDF).

Andreas Sojmark - 'Functional Weak Convergence of Stochastic Integrals'

Abstract: For many problems in statistics, finance, and insurance, there is a significant interest in understanding so-called invariance principles, which capture the approximate behavior of a given model under suitable scaling limits. Often, this leads to questions of functional convergence for stochastic integrals. In this talk, I will address how to achieve such convergence when the models of interest have certain autoregressive features that take us outside the realm of the classical literature.

Take a look at Andreas's slides (PDF).

Zoltan Szabo - 'Continuous Emotion Transfer'

Abstract: Style transfer is a central problem in data science with numerous successful applications. I will present a novel style transfer framework relying on vector-valued reproducing kernel Hilbert spaces. The idea is instantiated in emotion transfer where the goal is to transform facial images to different target emotions, with explicit control over the continuous style space. The proposed approach achieves low reconstruction cost and high emotion classification accuracy on various popular facial emotion benchmarks.

Take a look at Zoltan's slides (PDF).

Yufei Zhang - 'Some recent progress for continuous-time reinforcement learning'

Abstract: Recently, reinforcement learning (RL) has attracted substantial research interest. Much of the attention and success, however, has been in the discrete-time setting. Continuous-time RL, despite its natural analytical connection to stochastic control, has been largely unexplored and with limited progress. In particular, characterising sample efficiency for continuous-time RL algorithms remains a challenging and open problem. In this talk, we will discuss some recent advances in the regret analysis for the episodic linear-convex RL problem, and report a sublinear (or even logarithmic) regret bound for a learning algorithm inspired by filtering theory. The approach is probabilistic, involving quantifying the precise performance gap between applying greedy policies derived from estimated and true models, and exploring the concentration properties of sub-Weibull random variables.

Take a look at Yufei's slides (PDF).

Student Poster Session

Take a look at our poster session PDFs below:

Xixiang Hu, Zhiyu Li, Qixuan Wu, Wen Xiang and Jiangchen Zhao - 'Rapid seismic detection by reformatting Phasenet'.

Xinyi Liu, Gabriel Wallin, Yunxiao Chen, and Irini Moustaki - 'Rotation To Sparse Loadings Using Lp Losses and Related Inference Problems'.

Yiliu Wang and Milan Vojnovic - 'Sketching stochastic valuation functions'.

Jialin Yi and Milan Vojnovic - 'Distributed Learning with Bandit and Delayed Feedbacks'.