Art of the possible: LLM-enhanced research methods for large scale qualitative data
LLMs provide unprecedented potential for analysing qualitative data at scale, but that potential cannot be achieved with ad-hoc chatbot interactions. We need structured, rigorous and auditable pipelines.
In this talk, Lee Mager, DSI Faculty Affiliate and Digital Innovation Lead at LSE Law School, draws on three large-scale research projects which:
- classify 100,000+ foreign investment projects against EU sustainability criteria
- code corporate governance documents from S&P 500 companies’ SEC filings across 25 years
- compare news media discourse on critical raw materials
The talk addresses the practical and methodological challenges of deploying LLMs as automated 'method execution' assistants, carefully guided and evaluated by human researchers.
Key topics include:
- data curation (practical and legal constraints)
- designing pipelines that combine deterministic code with targeted, researcher-guided and tightly-defined LLM micro-tasks
- keeping context windows small and tasks well-defined to minimise errors
- building audit trails through structured outputs and verbatim quotes
- continual iteration and evaluation for accuracy, validity and reproducibility through human review, internal consistency checks, cross-model verification and prompt sensitivity testing
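As a rough illustration of how structured outputs and verbatim quotes can support an audit trail, the sketch below checks one LLM response with deterministic code. It is not taken from the talk or the projects it describes: the JSON fields (`label`, `quotes`), the function names, the label set and the sample text are all hypothetical, and the LLM call itself is omitted.

```python
import json
import re


def normalise(text: str) -> str:
    """Collapse whitespace and lowercase so trivial formatting differences do not fail the check."""
    return re.sub(r"\s+", " ", text).strip().lower()


def audit_record(source_text: str, llm_json: str, allowed_labels: set) -> dict:
    """Check one structured LLM output against its source document.

    Assumed (hypothetical) output schema: {"label": str, "quotes": [str, ...]}.
    """
    record = json.loads(llm_json)
    label = record.get("label")
    quotes = record.get("quotes", [])
    haystack = normalise(source_text)

    # A quote fails if it is empty or does not appear verbatim in the source.
    missing = [q for q in quotes if not normalise(q) or normalise(q) not in haystack]

    return {
        "label": label,
        "label_valid": label in allowed_labels,
        "quotes_checked": len(quotes),
        "quotes_missing": missing,
        # Anything unsupported or out of schema is routed to a human reviewer.
        "needs_human_review": label not in allowed_labels or not quotes or bool(missing),
    }


# Hypothetical example: one supported quote and one the model invented.
source = "The board adopted a majority voting standard for director elections in 2009."
llm_json = (
    '{"label": "majority_voting", '
    '"quotes": ["adopted a majority voting standard", "staggered board terms"]}'
)
result = audit_record(source, llm_json, {"majority_voting", "plurality_voting"})
print(result)
```

Here the second quote does not appear in the source, so the record is flagged for human review rather than accepted. Storing the returned dictionary alongside the model's raw output gives each coding decision a traceable record.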