Propensity score methods are broadly employed with observational data as a tool to achieve covariate balance, but how to implement them in complex surveys is less studied – in particular, when the survey weights depend on the group variable under comparison.

In this package, we focus on the specific case when sample selection depends the comparison groups of interest. We implement identification formulas to properly estimate the average controlled difference (ACD), or under stronger assumptions, the population average treatment effect (ATE) in outcomes between groups, with appropriate weighting for both covariate imbalance and generalizability.

This packages also contains the code necessary to reproduce the motivating data analysis in “What’s the weight? Estimating controlled outcome differences in complex surveys for health disparities research.” This analysis focuses on data from the National Health and Nutrition Examination Survey (NHANES), investigating the interplay of race and social determinants of health when our interest lies in estimating racial differences in mean telomere length.

Installation

You can install the development version of svycdiff like so:

#--- CRAN Version
install.packages("svycdiff")
#--- Development Version
# install.packages("devtools")
devtools::install_github("salernos/svycdiff")

Example

This is a basic example usage via simulated data:

library(svycdiff)

N <- 1000

dat <- simdat(N)

S <- rbinom(N, 1, dat$P_S_cond_AX)

samp <- dat[S == 1,]

y_mod <- Y ~ A * X

a_mod <- A ~ X

s_mod <- P_S_cond_AX ~ A + X

fit <- svycdiff(samp, "OM", a_mod, s_mod, y_mod, "gaussian")

fit

Vignette

Once you have svycdiff installed, you can type

vignette("svycdiff")

in R to bring up a tutorial on svycdiff and how to use it. To access the vignettes in the developer version, please install the package with

devtools::install_github("salernos/svycdiff", build_vignettes = TRUE)

Questions

For technical details on the method, see please refer to Salerno et al. (2024+) “What’s the weight? Estimating controlled outcome differences in complex surveys for health disparities research.” To reproduce the analysis results for the main paper, see inst/nhanes.Rmd. For questions and comments, please contact Stephen Salerno ().