AI tools risk downplaying women’s health needs in social care

Large language models (LLMs), used by over half of England’s local authorities to support social workers, may be introducing gender bias into care decisions, according to new research from LSE's Care Policy & Evaluation Centre (CPEC), funded by the National Institute for Health and Care Research.
Published in the journal BMC Medical Informatics and Decision Making, the research found that Google’s widely used AI model ‘Gemma’ downplays women’s physical and mental health issues in comparison with men’s when used to generate and summarise case notes.
Terms associated with significant health concerns, such as “disabled,” “unable,” and “complex,” appeared significantly more often in descriptions of men than of women. Similar care needs among women were more likely to be omitted or described in less serious terms.
Large language models are increasingly being used to ease the administrative workload of social workers and the public sector more generally. However, it remains unclear which specific models are being deployed by councils—and whether they may be introducing bias.
Dr Sam Rickman, lead author of the report and a researcher in CPEC, said: “If social workers are relying on biased AI-generated summaries that systematically downplay women’s health needs, they may assess otherwise identical cases differently based on gender rather than actual need. Since access to social care is determined by perceived need, this could result in unequal care provision for women.”
To investigate potential gender bias, Dr Rickman used large language models to generate 29,616 pairs of summaries based on real case notes from 617 adult social care users. Each pair described the same individual, with only the gender swapped, allowing for a direct comparison of how male and female cases were treated by the AI. The analysis revealed statistically significant gender differences in how physical and mental health issues were described.
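To make the design concrete, a minimal sketch of this kind of counterfactual comparison is shown below. It is an illustration only, not the study’s code: the gender-swap rules, the tracked terms and the summarise callable (which stands in for whichever LLM is being evaluated) are all assumptions.

```python
# Minimal sketch of a counterfactual gender-swap comparison (illustration only;
# the swap rules, term list and model interface are assumptions, not the study's code).
import re
from collections import Counter
from typing import Callable, Tuple

# Hypothetical word-level swap map; real case notes would need far more careful
# handling of names, pronouns and context.
SWAP = {
    "he": "she", "she": "he",
    "his": "her", "him": "her", "her": "his",
    "himself": "herself", "herself": "himself",
    "mr": "mrs", "mrs": "mr",
    "male": "female", "female": "male",
}

# Terms the article highlights as health-related; purely illustrative here.
HEALTH_TERMS = {"disabled", "unable", "complex"}


def swap_gender(text: str) -> str:
    """Return a copy of the case note with gendered words swapped."""
    def repl(match: re.Match) -> str:
        word = match.group(0)
        swapped = SWAP.get(word.lower(), word)
        return swapped.capitalize() if word[0].isupper() else swapped
    return re.sub(r"\b\w+\b", repl, text)


def term_counts(summary: str) -> Counter:
    """Count how often the tracked health terms appear in a summary."""
    tokens = re.findall(r"\b\w+\b", summary.lower())
    return Counter(t for t in tokens if t in HEALTH_TERMS)


def compare(case_note: str, summarise: Callable[[str], str]) -> Tuple[Counter, Counter]:
    """Summarise the original and the gender-swapped note with the supplied
    model callable, then count health-related terms in each summary."""
    original = term_counts(summarise(case_note))
    swapped = term_counts(summarise(swap_gender(case_note)))
    return original, swapped
```

In the study itself, this kind of pairing was repeated across the 617 real case records and several models, producing 29,616 pairs of summaries whose differences in language were then tested for statistical significance.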
Among the models tested, Google’s AI model, Gemma, exhibited more pronounced gender-based disparities than benchmark models developed by either Google or Meta in 2019. Meta’s Llama 3 model, which is of the same generation as Google’s Gemma, did not use different language based on gender.
Dr Rickman said: “Large language models are already being used in the public sector, but their use must not come at the expense of fairness. While my research highlights issues with one model, more are being deployed all the time, making it essential that all AI systems are transparent, rigorously tested for bias and subject to robust legal oversight.”
The study is the first to quantitatively measure gender bias in LLM-generated case notes from real-world care records, using both state-of-the-art and benchmark models. It offers a detailed, evidence-based evaluation of the risks of AI in practice, specifically in the context of adult social care.
More information
The research paper Evaluating Gender Bias in Large Language Models in Long-term Care, authored by Sam Rickman from the Care Policy and Evaluation Centre (CPEC) at LSE, was accepted for publication by BMC Medical Informatics and Decision Making in July 2025. It will be published in the Natural language processing in medical informatics special collection.
DISCLAIMER
This is independent research partly funded through the NIHR Policy Research Unit in Adult Social Care (reference NIHR206126). The views expressed are those of the author and not necessarily those of the NIHR or the Department of Health and Social Care.
The mission of the National Institute for Health and Care Research (NIHR) is to improve the health and wealth of the nation through research. We do this by:
- Funding high quality, timely research that benefits the NHS, public health and social care;
- Investing in world-class expertise, facilities and a skilled delivery workforce to translate discoveries into improved treatments and services;
- Partnering with patients, service users, carers and communities, improving the relevance, quality and impact of our research;
- Attracting, training and supporting the best researchers to tackle complex health and social care challenges;
- Collaborating with other public funders, charities and industry to help shape a cohesive and globally competitive research system;
- Funding applied global health research and training to meet the needs of the poorest people in low and middle income countries.
NIHR is funded by the Department of Health and Social Care. Its work in low and middle income countries is principally funded through UK international development funding from the UK government.