The LSE Social Statistics group is delighted to announce this one-day workshop.
This workshop takes place in the 'BOX' on the fifth floor, Tower 3 on the LSE campus on Tuesday 27 November 2012, starting at 10.30am. Attendance is free of charge, but registration is essential. Please email Sara Geneletti to register.
Researchers who use statistical models to analyse survey data in social and medical research are often unsure how to make use of survey weights. In recent years a number of new methods have been proposed for incorporating survey weights into statistical models. This one-day workshop looks at novel approaches to this problem in the context of small area estimation, multiple imputation and evidence synthesis. The talks will focus on how to apply these methods, using real examples.
The provisional timetable for the meeting is:
10:45-10:55 Sara Geneletti (LSE): Introduction
10:55-11:25 Chris Skinner (LSE): Incorporating survey weights into modelling: an overview
11:25-12:05 Michael R Elliott (University of Michigan School of Public Health): Bootstrapping Bayes estimators in complex sample designs
13:00-13:40 James Carpenter (London School of Hygiene and Tropical Medicine): Using survey weights with multiple imputation: a multilevel approach
13:40-14:20 Shaun Seaman (MRC Biostatistics Unit): Combining multiple imputation and inverse probability weighting
14:35-15:15 Jouni Kuha (LSE): Group means as explanatory variables in multilevel models
15:15-15:35 Discussants Sara Geneletti (LSE) and Nicky Best (Imperial College London)
15:35-16:00 Open discussion
James Carpenter
Using survey weights with multiple imputation: a multilevel approach
Multiple imputation for partially observed data with survey weights has been criticized, because Rubin's variance formula may be invalid. Focusing on a regression setting, we discuss this criticism, and argue that multilevel imputation can address some of the issues. We illustrate with some simulations and an analysis of the Youth Cohort Study.
Michael R Elliott
Bootstrapping Bayes estimators in complex sample designs
Bayesian applications typically postulate independent observations in order to obtain tractable forms of likelihoods. Sample designs that involve clustering are easily handled in this setting by assuming independence conditional on latent random effects associated with each cluster, which can then be easily integrated out, typically through the use of numerical methods (e.g., Markov Chain Monte Carlo). However, other complex sample design features such as sampling weights and stratification are not so easily incorporated into Bayesian estimators (Gelman 2007). In the regression setting weights can be incorporated in model estimation by including interaction terms between weights and regression parameters (Elliott 2007); in other settings such as small area estimation summary statistics can be computed that incorporate sample design (Raghunathan et al. 2007). But a general solution to the problem of Bayesian estimation using data from complex sample designs has not been proposed. In this talk I will propose the use of posterior predictive distributions of a population from finite population Bayesian bootstraps (FPBB) (Dong, Elliott, Raghunathan 2012) as one such possible general solution. FPBB produce posterior predictive distributions that have simple random sample properties that can be used to derive posterior estimates of the parameters of interest by reweighting simulated draws from the original sample. I will consider the repeated sampling properties of this method in a simulation study.
Jouni Kuha
Group means as explanatory variables in multilevel models
Research questions for clustered data often concern the effects of cluster-level averages of individual-level variables. Typically such averages are estimated using just data on respondents in a survey. This incurs a measurement error bias when these estimates are used as explanatory variables in modelling. The error variance can, however, be estimated, and we can then apply statistical measurement error methods to adjust for the error. This talk considers such estimation for generalised linear mixed models. The proposed approach involves two stages of estimation where survey weights and other information on sampling design may need to be used in the first stage even when they will not be in the second.
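As a rough illustration of the measurement error problem described above, the sketch below uses the classical reliability-ratio correction for a simple linear model. This is a textbook simplification and not the method proposed in the talk, which concerns generalised linear mixed models. It assumes simple random sampling within each cluster, so it ignores survey weights and finite population corrections. The function names are hypothetical.

```python
def reliability(between_var, within_var, n_respondents):
    """Share of the variance of an estimated cluster mean that is true signal.

    Assumes the cluster mean is a simple average of n_respondents
    independent responses, so its error variance is within_var / n.
    """
    error_var = within_var / n_respondents
    return between_var / (between_var + error_var)


def corrected_slope(naive_slope, rel):
    """Undo the attenuation of a linear regression slope (classical error model)."""
    return naive_slope / rel


# Example: true cluster means have variance 1.0, individuals vary around
# their cluster mean with variance 4.0, and 4 respondents are sampled per cluster.
rel = reliability(between_var=1.0, within_var=4.0, n_respondents=4)
print(rel)                        # 0.5
print(corrected_slope(0.3, rel))  # 0.6
```

With few respondents per cluster the reliability is low, so the naive coefficient on the estimated group mean can be badly attenuated.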
Shaun Seaman
Combining multiple imputation and inverse probability weighting
Multiple imputation (MI) is a technique commonly used to deal with missing values when analysing data. Another technique used for the same purpose is inverse probability weighting (IPW). Inverse probability weights are also used to deal with differential sampling fractions in survey data. These latter weights are known as sampling (or survey) weights. MI may be used in combination with IPW for either or both of two reasons. First, MI may be used to deal with missing data in a study with sampling weights. Second, MI may be used only to impute isolated missing values and then inverse probability weights used to account for the remaining larger blocks of missing data. I shall compare MI and IPW, discuss why and how the two may be combined, and present theoretical and simulation results concerning the unbiasedness of parameter and variance estimates based on Rubin's Rules when MI and IPW are thus combined. An example involving data from the National Child Development Study will be presented.
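For readers unfamiliar with Rubin's Rules, the following minimal sketch shows the standard pooling step for a scalar parameter. The function name and the example numbers are illustrative assumptions. When MI is combined with IPW, each per-imputation estimate and variance would come from a weighted analysis of one completed dataset; whether the pooled variance remains valid in that case is the question the talk addresses, and the sketch does not settle it.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates and their variances with Rubin's Rules.

    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Example with three imputed datasets (made-up numbers).
estimate, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, total_var)
```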
Chris Skinner
Incorporating survey weights into modelling: an overview
This talk will aim to provide an overview of: types of survey weights and why weights are needed; standard methods for incorporating weights into modelling, such as weighted maximum likelihood methods and weights as covariates; challenges for incorporating weights into Bayesian modelling; weight modification and alternatives to weighting.
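To illustrate why weights are needed, here is a minimal sketch of an inverse probability weighted mean, in which each respondent is weighted by the reciprocal of their selection probability. The function name and data are made up for illustration, and the sketch assumes the selection probabilities are known and that there is no nonresponse.

```python
def weighted_mean(values, selection_probs):
    """Weighted mean with survey weights equal to 1 / selection probability."""
    weights = [1.0 / p for p in selection_probs]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


# Two respondents: the second was sampled at half the rate of the first,
# so it stands in for twice as many people in the population.
values = [10.0, 20.0]
probs = [0.5, 0.25]
print(sum(values) / len(values))     # unweighted mean: 15.0
print(weighted_mean(values, probs))  # weighted mean: about 16.67
```

The unweighted mean treats both respondents equally, whereas the weighted mean moves towards the value from the under-sampled group.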