Comparing small area estimation (SAE) models using sample splitting. Our approach introduces cross-validation schemes that preserve the stratification and clustering of complex survey designs. We define error metrics that assess model accuracy in estimating finite population prevalence, using direct estimates as the reference. We further develop a decomposition formula for the naive squared error estimates from the cross-validation, which leads to finite sample adjustments that improve the scoring metrics.
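As a rough illustration of what a design-preserving split can look like, the sketch below assigns whole clusters to folds separately within each stratum, so that no cluster is divided across folds and every stratum contributes to every fold. It is a minimal sketch under assumptions, not the scheme developed in this work: the function name `assign_cluster_folds`, the toy strata, and the round-robin allocation are all hypothetical.

```python
import random
from collections import defaultdict


def assign_cluster_folds(cluster_strata, n_folds, seed=0):
    """Assign each cluster to a fold, balancing folds within strata.

    cluster_strata: dict mapping cluster id -> stratum id (hypothetical input).
    Returns a dict mapping cluster id -> fold index in 0..n_folds-1.
    """
    rng = random.Random(seed)

    # Group clusters by stratum so the split is done stratum by stratum.
    by_stratum = defaultdict(list)
    for cluster, stratum in cluster_strata.items():
        by_stratum[stratum].append(cluster)

    folds = {}
    for stratum in sorted(by_stratum):
        clusters = sorted(by_stratum[stratum])
        rng.shuffle(clusters)
        # Round-robin allocation: whole clusters move together, and each
        # stratum is spread as evenly as possible over the folds.
        for i, cluster in enumerate(clusters):
            folds[cluster] = i % n_folds
    return folds


# Toy design (made up for illustration): 12 clusters in two strata.
cluster_strata = {f"c{i}": ("urban" if i < 6 else "rural") for i in range(12)}
folds = assign_cluster_folds(cluster_strata, n_folds=3)
```

Every respondent inherits the fold of their cluster, so held-out direct estimates are computed from complete clusters only.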
Prevalence mapping for categorical data. For area-level models, we apply continuation-ratio logit transformations to design-based direct estimates and model the transformed outcomes using a multivariate Fay–Herriot model. We also explore extending a similar random-effect construction to unit-level models, where we directly model cluster-level categorical responses.
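For reference, one standard form of the continuation-ratio logit is written out below. The notation is an assumption made for illustration: $p_{ik}$ denotes the prevalence of category $k$ in area $i$, with $K$ ordered categories, and the category ordering and direction used in this work may differ.

```latex
\theta_{ik}
  = \operatorname{logit}\!\left( \frac{p_{ik}}{\sum_{j=k}^{K} p_{ij}} \right)
  = \log\!\left( \frac{p_{ik}}{\sum_{j=k+1}^{K} p_{ij}} \right),
  \qquad k = 1, \dots, K-1 .
```

Under this form, each $\theta_{ik}$ is the log odds of falling in category $k$ given that the outcome is in category $k$ or higher, which maps the $K$ constrained prevalences to $K-1$ unconstrained values.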
Software I have participated in developing:
surveyPrev: R package for processing, modeling, and visualizing the prevalence of binary health indicators in Demographic and Health Surveys (DHS). [cran] [github]
sae4health: R ‘shiny’ application for generating subnational estimates and prevalence maps of 150+ binary indicators. [shiny] [cran]
Multi-Indicator SAE Estimates: website providing pre-calculated estimates for selected countries and indicators. [website]
More software details can be found here.