The DSI aims to host, facilitate and promote research in social and economic data science.
This research is initially organised under six themes and is integral to realising the ambitions at the heart of LSE 2030. By keeping the social sciences at the forefront of global thinking, solutions and impact, we will continue LSE's founding mission to know the causes of things for the betterment of society.
Recent DSI publications
- Pulkkinen, Undorf, Bender, Wikman-Svahn, Doblas-Reyes, Flynn, Hegerl, Jönsson, Leung, Roussos, Shepherd & Thompson (2022) The value of values in climate science. Nature Climate Change.
https://doi.org/10.1038/s41558-021-01238-9
This article describes how values shape choices at all levels from methodology to communication. The authors call for wider reflection on management of social values in climate science.
- Katzav, Thompson, Risbey, Stainforth, Bradley, and Frisch (2021). On the appropriate and inappropriate uses of probability distributions in climate projections, and some alternatives. Climatic Change 169 (15).
https://doi.org/10.1007/s10584-021-03267-x
This article argues that probability distributions of future climate change do not accurately represent genuine levels of uncertainty. The authors present some considerations for the use of probabilistic representations, and discuss alternatives.
- Andrade Junior, Cardoso-Silva, Bezerra (2021). Comparing Contextual Embeddings for Semantic Textual Similarity in Portuguese. BRACIS 2021: Intelligent Systems, 389-404, Springer.
http://dx.doi.org/10.1007/978-3-030-91699-2_27
This paper uses several deep learning architectures to analyse text written in the Portuguese language. It compares pre-trained deep learning models (transfer learning) and provides examples in which fine-tuning such models has made predictions worse or better.
- De Bacco, Contisciani, Cardoso-Silva, Safdari, Baptista, Sweet, Young, Koster, Ross, McElreath, and Redhead (2021). Latent Network Models to Account for Noisy, Multiply-Reported Social Network Data.
arXiv preprint arXiv:2112.11396. (under review)
This article proposes a new Bayesian statistical model to infer a network of social support from collected data. The paper shows that the model can recover the “ground truth” of social ties much more accurately than common naïve approaches in simulation experiments, and it demonstrates its applicability to two real data sets.
- Costa Avelar, Lamb, Tsoka, Cardoso-Silva (2021). Weekly Bayesian modelling strategy to predict deaths by COVID-19: a model and case study for the state of Santa Catarina, Brazil. (under review)
Preprint: https://arxiv.org/abs/2104.01133
This article proposes an extension to a Bayesian statistical model used to forecast deaths caused by COVID-19. The paper incorporates new equations to take reported cases into account and introduces a strategy to re-calibrate the model every week.
- Hendrickx, Arcucci, Amador Díaz López, Guo, Kennedy (2021) Correcting public opinion trends through Bayesian data assimilation. CoRR abs/2105.14276.
This paper aims to merge data from traditional survey polling and Twitter opinion mining techniques using Bayesian data assimilation to arrive at a more accurate estimate of true public opinion for the Brexit referendum.
- Wilson, Guivarch, Kriegler, van Ruijven, van Vuuren, Krey, Schwanitz and Thompson (2021). Evaluating Process-Based Integrated Assessment Models of Climate Change Mitigation, Climatic Change, 166(1), 1-22.
https://doi.org/10.1007/s10584-021-03099-9
This article is the first comprehensive synthesis of research on the evaluation of process-based Integrated Assessment Models (IAMs) of climate change mitigation pathways. The authors propose a systematic evaluation framework to establish the appropriateness, interpretability, credibility, and relevance of process-based IAMs as useful scientific tools for informing climate policy.
- Amador Díaz López, Madhyastha (2021). A Focused Analysis of Twitter-based Disinformation from Foreign Influence Operations. Proceedings of the 1st International Workshop on Knowledge Graphs for Online Discourse Analysis (KnOD 2021), 30th The Web Conference (WWW 2021).
This paper presents a focused study on disinformation from a foreign influence campaign over Twitter during the 2016 US presidential election. The authors introduce a new dataset of political disinformation related to a foreign influence operation.
- Buizza, Quilodrán Casas, Nadler, Mack, Marrone, Titus, Le Cornec, Heylen, Dur, Baca Ruiz, Heaney, Amador Díaz López, Kumar, Arcucci (2022). Data Learning: Integrating Data Assimilation and Machine Learning. Journal of Computational Science, Volume 58, 101525, ISSN 1877-7503.
https://doi.org/10.1016/j.jocs.2021.101525
This paper provides an introduction to Data Learning, a field that integrates Data Assimilation and Machine Learning to overcome limitations in applying these fields to real-world data.