Title: | Neural Network for Complex Survey Data |
---|---|
Description: | The surveynnet package extends nnet (Venables and Ripley, 2002), which already supports survey weights, so that it can also handle clustered and stratified data. It incorporates design effects through effective sample sizes, computed with the package described in Valliant et al. (2023) and following the methods of Chen and Rust (2017) and Valliant et al. (2018). |
Authors: | Aaron Cohen [aut, cre] , Raul Cruz-Cano [aut] |
Maintainer: | Aaron Cohen <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.0.0 |
Built: | 2024-12-10 05:49:16 UTC |
Source: | https://github.com/237triangle/surveynnet |
Simple body fat example data
body_fat
A data frame with 12 rows and 9 columns:
Subject ID
group
weight
height
age
percent body fat
survey weight
stratum
cluster
Nhanes example
nhanes.demo
A data frame with 6230 rows and 9 columns:
Respondent sequence number
Masked variance pseudo-PSU
Masked variance pseudo-stratum
Full sample 2 year MEC exam weight
Standing height (cm)
Weight (kg)
Body mass index (kg/m**2)
Systolic blood pressure
Diastolic blood pressure
Predict response from fitted nnet, using new data
## S3 method for class 'surveynnet'
predict(object, newdat, ...)
object |
The surveynnet object (returned by |
newdat |
The matrix or data frame of test examples. Must be of the same structure as the data matrix used to fit the surveynnet object. |
... |
arguments passed to or from other methods |
The matrix/vector of values returned by the trained network. Note: it is possible
to pass type = "raw" or type = "class" as appropriate. See predict.nnet()
for more details.
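The `type` argument mentioned above belongs to `predict.nnet()`. As a minimal sketch of what the two options return, the following fits a plain `nnet` classifier on the built-in `iris` data (an illustrative choice, not one of this package's datasets) and calls `predict()` on it directly. Note that `predict.nnet()` names its data argument `newdata`, whereas the surveynnet method documented here uses `newdat`.

```r
library(nnet)  # recommended package shipped with standard R distributions

set.seed(1)

# Illustrative classification fit on iris (not a surveynnet object)
fit <- nnet(Species ~ ., data = iris, size = 2, trace = FALSE)

# type = "raw": matrix of network outputs, one column per class
raw_out <- predict(fit, newdata = iris, type = "raw")

# type = "class": character vector of predicted class labels
class_out <- predict(fit, newdata = iris, type = "class")

dim(raw_out)
head(class_out)
```

For a numeric response, as in the body fat example, only `type = "raw"` is meaningful.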
# From the example in `surveynnet` help file:
y <- body_fat$pct_body_fat
x <- body_fat[, c("Weight_kg", "Height_cm", "Age")]
weight <- body_fat$survey_wt
strat <- body_fat$stratum
clust <- body_fat$cluster
y[strat == 1] <- y[strat == 1] + 30 * 0.00015 * rnorm(sum(strat == 1))
y[strat == 2] <- y[strat == 2] + 30 * 0.15 * rnorm(sum(strat == 2))
myout <- surveynnet(x, y, weight = weight, strat = strat, clust = clust)
newdat <- 2 * x + rnorm(dim(x)[1])
predict(myout, newdat = newdat)
The surveynnet package extends nnet (Venables and Ripley, 2002), which already supports survey weights, so that it can also handle clustered and stratified data. It incorporates design effects through effective sample sizes, computed with the package described in Valliant et al. (2023) and following the methods of Chen and Rust (2017) and Valliant et al. (2018).
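To give a feel for what an effective sample size is, here is a minimal base R sketch of Kish's classic approximation for unequal weights, n_eff = (sum of w)^2 / (sum of w^2). This is a simplification for illustration only: the package relies on the Chen and Rust (2017) extension, which also accounts for strata and clusters, and this sketch does not reproduce that calculation. The function name `kish_neff` and the weights are hypothetical.

```r
# Kish's approximate effective sample size for unequal weights:
#   n_eff = (sum of w)^2 / (sum of w^2)
kish_neff <- function(w) sum(w)^2 / sum(w^2)

# Hypothetical survey weights (not taken from body_fat or nhanes.demo)
w <- c(1, 1, 1, 1, 4, 4, 8, 8)

n_eff <- kish_neff(w)       # effective sample size, smaller than length(w)
deff  <- length(w) / n_eff  # design effect due to unequal weighting

# With equal weights the effective sample size equals the actual one
n_equal <- kish_neff(rep(2, 10))

c(n_eff = n_eff, deff = deff, n_equal = n_equal)
```

The more the weights vary, the smaller the effective sample size and the larger the design effect.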
surveynnet(x, y, weight, strat, clust, comp_cases = F, ...)
x |
Matrix or data frame of predictors. Must not contain any missing values. |
y |
Vector of targets / response values. Must not contain any missing values. |
weight |
The weights for each sample. |
strat |
The stratum for each sample. |
clust |
The cluster for each sample. |
comp_cases |
If TRUE, filter out missing values from x, y, weight, strat, and clust. Default FALSE. Note that in either case, the dimensions of all data mentioned above must agree. |
... |
Additional arguments to be passed into |
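The `comp_cases` description above amounts to keeping only rows that are complete across all five inputs. The following base R sketch shows one way to do that filtering by hand before calling the function; it is an assumption about the intended behavior, not the package's internal code, and the small inputs are hypothetical.

```r
# Hypothetical inputs with a few missing values
x      <- data.frame(a = c(1, 2, NA, 4, 5), b = c(10, 20, 30, 40, 50))
y      <- c(0.1, NA, 0.3, 0.4, 0.5)
weight <- c(1, 1, 2, 2, 3)
strat  <- c(1, 1, 2, 2, 2)
clust  <- c(1, 2, 1, 2, NA)

# TRUE for rows with no missing value in any of the inputs
keep <- complete.cases(x, y, weight, strat, clust)

# Subset every input with the same index so the dimensions still agree
x_cc      <- x[keep, , drop = FALSE]
y_cc      <- y[keep]
weight_cc <- weight[keep]
strat_cc  <- strat[keep]
clust_cc  <- clust[keep]

keep
```

Using a single logical index for every input keeps the rows aligned, which matters because the dimensions of all inputs must agree.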
A list containing two objects:
A data frame with the fitted values of the neural nets, using: no weights ("fitted"), the user-supplied weights ("fitted_weighted"), and the new method that adjusts the weights with a design effect incorporating clusters and strata ("fitted_deff").
The fitted neural network object (from nnet), using the novel design-effect based weights; this can be used to predict the outcomes for new observations.
Venables WN, Ripley BD (2002). Modern Applied Statistics with S, Fourth edition. Springer, New York. ISBN 0-387-95457-0, https://www.stats.ox.ac.uk/pub/MASS4/.
Chen, S., and K. F. Rust. 2017. "An Extension of Kish’s Formula for Design Effects to Two- and Three-Stage Designs with Stratification." Journal of Survey Statistics and Methodology 5 (2): 111–30.
Valliant, R., J. A. Dever, and F. Kreuter. 2018. Practical Tools for Designing and Weighting Survey Samples. 2nd ed. New York: Springer-Verlag.
Valliant, R., J. A. Dever, and F. Kreuter. 2023. PracTools: Tools for Designing and Weighting Survey Samples, Version 1.4. https://CRAN.R-project.org/package=PracTools
# short example with body fat dataset
y <- body_fat$pct_body_fat
x <- body_fat[, c("Weight_kg", "Height_cm", "Age")]
weight <- body_fat$survey_wt
strat <- body_fat$stratum
clust <- body_fat$cluster
y[strat == 1] <- y[strat == 1] + 30 * 0.00015 * rnorm(sum(strat == 1))
y[strat == 2] <- y[strat == 2] + 30 * 0.15 * rnorm(sum(strat == 2))
myout <- surveynnet(x, y, weight = weight, strat = strat, clust = clust)
myout

# NHANES example
# Predicting Diastolic BP from BMI, Systolic BP and Height
# PLEASE NOTE: for this example, pass "nest=TRUE" into the
# "..." parameters of the main function `surveynnet`
x <- nhanes.demo[, c("BMXBMI", "BPXSY1", "BMXHT")]
weight <- nhanes.demo$WTMEC2YR
strat <- nhanes.demo$SDMVSTRA
clust <- nhanes.demo$SDMVPSU
y <- nhanes.demo$BPXDI1
myout <- surveynnet(x, y, weight = weight, strat = strat, clust = clust, nest = TRUE)
head(myout$results, 15)