Census strand abstracts

Census issues: Beyond 2011: Monday 9 September 1.30pm

Variability and Accuracy of Scottish population estimates from administrative data sources
Alastair Greig, National Records of Scotland

For Scotland, a credible alternative to census-based population estimates hinges on the feasibility of applying a competent coverage and adjustment methodology that can produce population estimates comparable to those that would be achieved from a full census. Using responses to the 2011 census to infer coverage of administrative datasets, analysis is presented on the likely precision of an alternative population estimate under various sampling regimes and data quality assumptions. Altogether, the paper provides an overview of current thinking from Scotland’s Beyond 2011 project before speculating on the likely avenues of further development in the months and years ahead.


Estimating Scottish local population sizes from administrative data sets using Bayesian Monte Carlo methods
Stephen Sharp, Demography Division, National Records of Scotland; Peter Congdon, Department of Geography, Queen Mary University of London

This presentation reports research using Bayesian inferential methods to produce small area population estimates in Scotland using administrative data sources available for these areas. The main sources currently available are from the NHS Central Register, the Department for Work and Pensions (DWP) Customer Information Service, HMRC child benefit data, and the annual “Pupils in Scotland” census. The first two of these sources provide age-gender breakdowns across all ages and are thus of major relevance for any scheme to produce annual small area population estimates for adult ages if the Census were to be discontinued. Because there are a number of alternative sources of information available and none can be regarded as more definitive or accurate than others, the Bayesian approach using the WinBUGS software was a suitable methodology for combining information over the datasets to produce a single estimate. Random effects models were used to provide estimates for 2010 and 2011 for 6,505 Datazones (DZs) and 1,235 Intermediate Zones (IZs). Comparison with midyear population estimates for Scotland and for its constituent local authorities (LAs) shows differences in age structure between the administrative datasets, such as undercounting of children under five by the DWP data. These age structure differences persist in DZ and IZ population estimates based on administrative data. Comparison at LA level shows that DZ and IZ estimates based on administrative data are also underestimates (relative to midyear estimates) of the populations of the two major Scottish cities. Constraining DZ and IZ estimates within LA midyear estimates would avoid systematic discrepancies for particular age bands or LAs. This would require that LA (and national) midyear estimates continue, with input from recurrent sample censuses that can provide revised benchmark population information for larger areas such as LAs.
After constraining the DZ and IZ estimates (based on administrative data) to LA midyear estimates for 2010 and 2011, these small area estimates were then compared with the existing SAPE estimates for DZs made by National Records of Scotland which are necessarily constrained within midyear estimates. Comparison at this finer disaggregated scale shows that DZ estimates based on administrative data tend to produce lower population estimates for the inner city areas of Glasgow and Edinburgh, and higher estimates for outer city areas.
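The constraining step described above can be illustrated with a minimal sketch. It assumes simple pro rata scaling, which the abstract does not confirm as the authors' method; the function name, zone labels and figures are all hypothetical.

```python
def constrain_to_total(estimates, control_total):
    """Scale small area estimates pro rata so that they sum to control_total.

    Assumption: simple proportional scaling. The abstract does not state
    which constraining method was actually used.
    """
    current_total = sum(estimates.values())
    if current_total <= 0:
        raise ValueError("estimates must sum to a positive number")
    factor = control_total / current_total
    return {area: value * factor for area, value in estimates.items()}


# Hypothetical data zone estimates within one local authority (made-up numbers).
dz_estimates = {"DZ_A": 480.0, "DZ_B": 720.0, "DZ_C": 800.0}
la_midyear_estimate = 2100.0  # hypothetical LA control total

constrained = constrain_to_total(dz_estimates, la_midyear_estimate)
for area, value in constrained.items():
    print(area, round(value, 1))
```

Scaling of this kind removes any LA-level discrepancy by construction, but it leaves the relative pattern between data zones within the LA unchanged, which is why differences between inner and outer city areas can still appear after constraining.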


Population statistics in an administrative data based world
Charlie Wroth-Smith, Amy Large, Office for National Statistics

The Office for National Statistics is currently taking a fresh look at options for the production of population and small area socio-demographic statistics for England and Wales. The Beyond 2011 Programme has been established to carry out research on the options and to recommend the best way forward to meet future user needs. Beyond 2011 is considering a range of options including census, survey and administrative data solutions. Since 'census-type' solutions are relatively well understood, most of the research is focussing on how surveys can be supplemented by better re-use of 'administrative' data already collected from the public. This presentation will focus on research that has been undertaken to investigate administrative data-based models as possible providers of local authority estimates of the population by age and sex. These use anonymously linked administrative data in combination with a population coverage survey. The presentation describes the framework for producing population estimates under this approach, and will cover the design of statistical population datasets, sample design issues and the estimation framework. We will present results of both a simulation exercise and early trial estimates to assess the likely quality of estimates arising from the different options. We will discuss these potential methods in the context of the wider set of population statistics and the opportunities this may provide. The plans for further work will also be described.


2011 Census in Scotland: Tuesday 10 September 9.00am

Scotland’s 2011 Census: Quality Assurance of small area population and household estimates and implications for future methods
Michael Hunter, Cecilia MacIntyre, National Records of Scotland

This study looks at the relationship between results from Scotland’s 2011 Census and the National Records of Scotland’s (NRS) annual Small Area Population and Household Estimates at data zone level: from quality assurance of the census and lessons learned, through to how we can improve our estimates in the future. This study concentrates on the population and household figures published at data zone level. Data zones cover the whole of Scotland and nest within local authority boundaries. Data zones are groups of 2001 Census output areas which originally had populations of between 500 and 1,000 household residents, and some effort was made to respect physical boundaries. In addition, they have a compact shape and contain households with similar social characteristics. The publication of the population and household count at data zone level allows a greater insight into where the differences seen at council level have occurred.


The Census/Census Coverage Survey Matching Project for Scotland’s Census 2011
Neil Bowie, presented by Alex Stannard, National Records of Scotland

The importance of an accurate and reliable linkage between the Census enumeration results and the Census Coverage Survey (CCS) results cannot be overstated. The CCS is a sample survey of about 1.5% of the Scottish population, taken 5 or 6 weeks after Census day (27th March 2011). The objective was to use these linkage results to derive an estimate of the entire Scottish population. For several years prior to the Census, National Records of Scotland (NRS) had been working on developing strategies and techniques in data linkage, and was therefore well positioned to design and conduct the Census/CCS matching project. The project was unusual in that household level information was available to be incorporated into the more usual individual linkage methodologies. A staged process using a combination of automatic matching, clerical matching and data interrogation was designed to ensure that maximum use could be made of the available data. Specialised software was developed to manage these stages as well as formalising the presentation of data to encourage accurate, reliable and consistent decision making. Furthermore, the software was designed to fully integrate the linkage process into the wider Census data processing systems. This paper describes the methodology employed for this project as well as some of the outcomes from the linkage. It also briefly describes ways in which the methodologies and tools are being further developed in Scotland so that they can be applied to other linkage projects in the future.
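To show why the quality of the matching matters for the population estimate, the sketch below applies the basic dual-system (capture-recapture) estimator to matched counts. The abstract does not specify the estimator used, so this is an assumption for illustration only; the function name and all counts are hypothetical.

```python
def dual_system_estimate(census_count, survey_count, matched_count):
    """Basic dual-system (Lincoln-Petersen) estimate of the true population.

    Assumption: the two counts are independent and matching is error-free.
    The abstract does not state that this exact estimator was used.
    """
    if matched_count <= 0:
        raise ValueError("matched_count must be positive")
    return census_count * survey_count / matched_count


# Hypothetical counts for one sampled area (made-up numbers).
census_count = 900   # people counted by the census
survey_count = 850   # people counted by the coverage survey
matched_count = 765  # people linked between the two sources

estimate = dual_system_estimate(census_count, survey_count, matched_count)
print(estimate)

# A missed match lowers matched_count and so inflates the estimate.
print(dual_system_estimate(census_count, survey_count, matched_count - 15))
```

Because the matched count sits in the denominator, missed links bias the estimate upwards and false links bias it downwards, which is one reason clerical matching and data interrogation are used alongside automatic matching.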


Scotland's Census 2011 - Quality assurance
Cecilia MacIntyre, National Records of Scotland

This talk will present the process used to quality assure the population and household estimates outputs from Scotland’s Census. The quality assurance of the published summary tables on population and households included looking at the following aspects:
• Comparisons with published sources:
    - age by sex profile by council area
    - household numbers
    - household size
    - student numbers
• Fertility rates
• The inferred sex ratio distribution by age
• The pattern of response by age and gender.
An External Data Quality Advisory Group examined census estimates for Scotland and council areas at stages throughout the process. The advisory group provided comments on the quality assurance process, the initial figures and also commentary on the emerging trends. The census estimates were examined by the QA team in NRS using an agreed list of checks. The results of these checks, along with the comments from the External Data Quality Advisory Group, were discussed with an Internal Quality Assurance Panel, drawn from topic experts from National Records of Scotland. This group examined the issues highlighted in the quality assurance, and provided an assessment of the quality of the census estimates for each age-sex group. It suggested supplementary work to be carried out by the QA and coverage and adjustment teams in NRS, and identified where there were differences with comparator sources. Quality assurance packs on the census population and household estimates for each council area in Scotland are published as part of the release.


Beyond 2011 (2): Tuesday 10 September 1.30pm

Integrating surveys and administrative data to estimate population characteristics
Louise Morris, Alan Taylor, Salah Merad, Martin Ralphs, Office for National Statistics

The Office for National Statistics is currently taking a fresh look at options for the production of population and small area socio-demographic statistics for England and Wales through its Beyond 2011 Programme. A range of options including census, survey and administrative data solutions are being considered. A key focus of ongoing research by Beyond 2011 is the approach to production of socio-demographic outputs (statistics about population and household characteristics) under an administrative data based approach. This paper sets out proposals for the design of an integrated system to deliver socio-demographic outputs, bringing together administrative information with data collected directly via a survey. Research has shown that in the short term it is likely that surveys will form the basis of a system design, but research is also being undertaken into the application of small area estimation models to supplement surveys for the production of outputs for small geographic areas or population groups. Over time, as topic and population coverage by administrative sources improves, there may be increased opportunities for more direct production of estimates from administrative sources or for use of modelling. Administrative data may also be used to monitor change in geographic areas or population groups longitudinally, allowing for a more targeted survey design. Results from work to look at initial survey design options will be discussed, along with findings from initial research to explore the scope to further improve the system design in the longer term by making use of small area estimation and administrative data with targeted surveys.


Public acceptability: the challenges for Beyond 2011
Genevieve Groom, Office for National Statistics

The Office for National Statistics is currently taking a fresh look at options for the production of population and small area socio-demographic statistics for England and Wales. The Beyond 2011 Programme has been established to carry out research on the options and to recommend the best way forward to meet future user needs. Improvements in technology and administrative data sources offer opportunities to either modernise the existing census process, or to develop an alternative by re-using existing data already held within government. The final recommendation, which will be made in 2014, will not only balance user needs, cost, benefit and statistical quality, but will also consider the public acceptability of all of the options. This paper draws upon the research that has been undertaken to investigate public attitudes and opinions relating to the use of personal data for statistical purposes. It will discuss the main areas of public concern identified by the research and the challenges faced by the Programme in ameliorating those concerns.


Making a case for small area Census statistics: exploring the spatial scale of population variables in England and Wales in 2011
Christopher D. Lloyd, University of Liverpool

Debates about the future of the Census in the UK have considered how far alternative forms of data collection could replace the functions of Census surveys. Defenders of the Census have argued that no other form of survey will be able to replicate closely the richness of the data available as outputs from the Census, both in terms of the array of variables and the level of geographical detail. This paper explores the scale of spatial variation in Output Area level data from the 2011 Census for England and Wales. The analysis considers the spatial structure of a set of demographic and socioeconomic variables (including age, housing tenure, NSSEC, LLTI, and ethnicity) and demonstrates how these population characteristics vary at multiple scales. If one population subgroup tends to be clustered over small areas (e.g., manual or professional NSSEC groups in urban areas), while another is quite homogeneous over large areas (e.g., owner occupiers), then we must have data at a fine enough scale to capture the clustering in the first group. Otherwise, it is impossible to properly explore the first variable, or the relationship between the first variable and the second. This paper considers questions such as ‘does the loss of geographically-detailed cross-tabulations prevent analysis of local differences in relationships between, for example, LLTI and ethnicity?’. Making use of measures of spatial autocorrelation including Moran’s I and the variogram, it is shown that meaningful analysis of many variables is only likely to be possible using the kinds of small area data available from the Census. The results provide compelling evidence that the termination of the UK Census would mean an end to much academic research, as well as a profound reduction in the ability of (local) government and policy makers to answer fundamental questions about the population of the UK, and the needs and characteristics of individual groups.
The paper concludes by arguing that the Census is an irreplaceable tool for understanding the population of the UK, and not merely the luxury that some commentators believe.
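For readers unfamiliar with Moran’s I, the measure of spatial autocorrelation named above, the following is a minimal sketch of the global statistic using binary contiguity weights. The four-area layout and the values are hypothetical and are not taken from the paper; the paper's own weighting scheme is not stated in the abstract.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weights matrix.

    I = (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i ** 2),
    where z_i are deviations from the mean and S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    denom = sum(d * d for d in z)
    return (n / s0) * (cross / denom)


# Hypothetical example: four areas in a row, each adjacent to its neighbours.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth_trend = [1.0, 2.0, 3.0, 4.0]  # similar values sit next to each other
alternating = [1.0, 4.0, 1.0, 4.0]   # neighbours are always dissimilar

print(morans_i(smooth_trend, adjacency))  # positive: spatial clustering
print(morans_i(alternating, adjacency))   # negative: neighbours differ
```

Positive values indicate that neighbouring areas tend to hold similar values, and negative values that they tend to differ. Computing the statistic at several levels of aggregation is one way to see at what scale a variable's clustering is lost.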


Assessing the value of small area census data: an input to 'Beyond 2011' discussions
Paul Norman, University of Leeds; David Martin, University of Southampton

Placing a ‘value’ on academic research is far from easy. To date we might have been writing in a conclusion, ‘this research has implications for policy applications’, counting citations to our publications in other papers (straightforward now with tools like Google Scholar) and revelling in feedback from people who found our work interesting. Over the last few years we have been learning what ‘impact’ might be and that this and ‘knowledge exchange’ should be demonstrated clearly to funding councils (like ESRC). The HE ‘Research Excellence Framework’ 2014 will include submissions from academic departments of ‘Impact Case Studies’ for the first time. However, ‘value’ is also financial. Our National Statistics Agencies make a business case for the census, and the options which Beyond 2011 will recommend, regarding whether or not we have a 2021 Census and any alternatives, will include a business case. As an input to this, David Martin and Paul Norman carried out a trawl of academic research and have attempted to estimate the financial value of small area research carried out using 2001 Census data. This evidence has been presented to ONS (end of February 2013) and will be updated on 1st May 2013 in conjunction with a Beyond 2011 conference. This presentation to BSPS will provide information on the above and will update on anything related which transpires by September 2013.


Census analysis and dissemination: Wednesday 11 September, 11.00am

Census Analysis Work Programme
Tristan Browne, Craig Taylor, Office for National Statistics

ONS are conducting a Census Analysis work programme, one of the main aims of which is to provide a coordinated and timely set of analytical products to meet the needs of a wide-ranging set of users. The work programme has now started to deliver a varied set of products, and this presentation aims to provide a high-level overview of the analysis produced to date by ONS, together with information on planned upcoming analysis. Details of the analysis undertaken so far, together with some of the key findings on topics such as ethnicity, religion and the labour market, will be highlighted and presented using some of the various dissemination options utilised. These include various data visualisations, interactive content and podcasts, which enable the analysis to reach a wide-ranging set of users, from policy makers to the general public. The Census Analysis work programme is not just about the analysis produced by ONS but also about having a joined-up approach to analysis across the UK. The aim has been to have a central portal through which the majority of Census-based analysis can be accessed, helping to showcase this whilst also making users aware of the wide-ranging analytical work produced across the UK. The presentation will focus on the approach that has been taken, detailing the various functions and processes in place to coordinate the work and how analysts and academics can be part of this.


Origin Destination Products from the 2011 Census
Johannes Hechler, Johanna Hutchinson, Office for National Statistics

Origin-Destination (or flow) data are a unique, highly sought-after resource for research and planning. With an interactive look-up facility, this product maps the ‘flows’ of people between any two areas. Origin-Destination data may track the path of commuters from home to work, those moving residence between different areas, those travelling from their main to second residence, or people formerly at a student term-time address. With the inclusion of geography as detailed as individual output areas, coupled with an array of other socio-demographic variables, 2011 Census Origin-Destination data allow a specific, detailed look at the movement of people across the UK. This presentation will detail the creation of flow products, demonstrating their use and accessibility.


2011 Census Microdata Products
Marcus Lewin, Johanna Hutchinson, Office for National Statistics

2011 Census Microdata provide samples of anonymised records. These unique datasets will enable further detailed and bespoke analyses of UK census data. Census Microdata products boast a large sample size, high response rate, wide range of socio-demographic topics and coverage of sub-populations traditionally under-sampled in other surveys, e.g. communal establishments. Microdata will be disseminated via a number of routes tailored to a wide variety of users, from secure access files containing large samples and a high level of detail for ‘approved researchers’ to a publicly available file with a smaller sample and lower level of detail in accordance with statistical disclosure procedures. This presentation gives an in-depth look at the production of census Microdata, from the inclusion of data to security measures and determining sample size. Incorporating user feedback on the 2001 Census Microdata product, we demonstrate how incorporating new technology and methods helps to increase value to users, creating an accessible, specialised product.


Innovations in census dissemination at the UK Data Service
Justin Hayes, University of Manchester

UK Data Service Census Support provides easy and comprehensive access to a range of outputs from the five UK censuses from 1971 to 2011, as well as supporting their use. It aims to make the outputs easier to find, understand and use appropriately to facilitate high quality social and economic research and education. It is part of the ESRC’s UK Data Service, combining the units of the former ESRC Census Programme, and is delivered by data experts at some of the UK's leading universities. UK Data Service Census Support builds upon a long history of research and innovation in data dissemination that has created interfaces such as Casweb (aggregate outputs), WICID (flow outputs), and Boundary Data Selector. More recently, the InFuse interface uses a fundamentally restructured version of the census aggregate outputs to permit simplified access based directly on the selection of combinations of variables, without the need to search through published tables. This presentation will provide an update on the UK Data Service Census Support and its interfaces and other services, with a particular focus on dissemination of outputs from the UK 2011 Census. It will also cover recent moves to lift academic-only restrictions on the interfaces to make them publicly available in line with relaxed licensing conditions and ESRC policy.