- ```{r eval = TRUE , echo = FALSE , results = "hide" , warning = FALSE , message = FALSE }
- objects_in_memory_for_local_build <- ls()
- ```
-
# American National Election Studies (ANES) {-}

- [![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
-
- ```{r , echo = FALSE }
-
- most_recent_build_date <- gsub( "-" , " " , if( dir.exists( "_bookdown_files/" ) ) as.Date( file.info( "_bookdown_files/" )$ctime ) else Sys.Date() )
-
- anes_badge <- paste0( "<img src='https://img.shields.io/badge/tested%20on%20my%20laptop:-" , most_recent_build_date , "-brightgreen' alt='Local Testing Badge'>" )
-
- ```
-
- `r anes_badge`
+ [![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0) <img src='https://img.shields.io/badge/Tested%20Locally-Windows%20Laptop-brightgreen' alt='Local Testing Badge'>

A time series recording belief, public opinion, and political participation back to Dewey vs. Truman.

@@ -48,7 +34,7 @@ Please skim before you begin:

Define a function to import a stata file as a data.frame:

- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
library(haven)

anes_import_dta <-
@@ -72,7 +58,7 @@ anes_import_dta <-

3. Download the `STATA` version of the February 10, 2022 file:

- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
library(haven)

anes_fn <-
@@ -84,25 +70,25 @@ anes_fn <-
anes_df <- anes_import_dta( anes_fn )
```

- ### Save locally \ {-}
+ ### Save Locally \ {-}

Save the object at any point:

- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
# anes_fn <- file.path( path.expand( "~" ) , "ANES" , "this_file.rds" )
# saveRDS( anes_df , file = anes_fn , compress = FALSE )
```

Load the same object:

- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
# anes_df <- readRDS( anes_fn )
```
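
The two commented-out chunks above describe a save-and-reload round trip. A minimal sketch of that pattern, assuming a small stand-in data.frame (`example_df`, hypothetical) and a temporary file rather than the real `anes_df` and the `~/ANES` path:

```r
# hypothetical stand-in for anes_df, used only for illustration
example_df <- data.frame( id = 1:3 , rating = c( 15 , 50 , 85 ) )

# a temporary file stands in for the path under ~/ANES
example_fn <- tempfile( fileext = ".rds" )

# compress = FALSE trades a larger file for faster writes and reads
saveRDS( example_df , file = example_fn , compress = FALSE )

# readRDS returns the object, so it can be assigned to any name
reloaded_df <- readRDS( example_fn )

# the reloaded object should match what was saved
stopifnot( identical( example_df , reloaded_df ) )
```

Because `readRDS` returns the object instead of restoring it under its old name, the reloaded copy can live alongside other objects in the session.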

### Survey Design Definition {-}
Construct a complex sample survey design:

- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
library(survey)

anes_design <-
@@ -118,7 +104,7 @@ anes_design <-
### Variable Recoding {-}

Add new columns to the data set:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
anes_design <-
	update(
		anes_design ,
@@ -156,15 +142,15 @@ anes_design <-
### Unweighted Counts {-}

Count the unweighted number of records in the survey sample, overall and by groups:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
sum( weights( anes_design , "sampling" ) != 0 )

svyby( ~ one , ~ undoc_kids , anes_design , unwtd.count )
```
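
For readers without the ANES file on disk, the same counting pattern can be tried on the `apiclus1` example data that ships with the `survey` package. This is a sketch under that assumption: `api_design` and the grouping variable `stype` are stand-ins and not ANES objects.

```r
library(survey)

# example cluster sample bundled with the survey package (not ANES data)
data(api)

api_design <-
	svydesign(
		id = ~ dnum ,
		weights = ~ pw ,
		fpc = ~ fpc ,
		data = apiclus1
	)

# a column of ones, mirroring the `one` variable the counts above rely on
api_design <- update( api_design , one = 1 )

# records with a non-zero sampling weight
n_records <- sum( weights( api_design , "sampling" ) != 0 )

# unweighted record counts by school type
counts_by_type <- svyby( ~ one , ~ stype , api_design , unwtd.count )

# the group counts should add up to the overall count
stopifnot( sum( coef( counts_by_type ) ) == n_records )
```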

### Weighted Counts {-}
Count the weighted size of the generalizable population, overall and by groups:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
svytotal( ~ one , anes_design )

svyby( ~ one , ~ undoc_kids , anes_design , svytotal )
@@ -173,35 +159,35 @@ svyby( ~ one , ~ undoc_kids , anes_design , svytotal )
### Descriptive Statistics {-}

Calculate the mean (average) of a linear variable, overall and by groups:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
svymean( ~ republican_party_rating , anes_design , na.rm = TRUE )

svyby( ~ republican_party_rating , ~ undoc_kids , anes_design , svymean , na.rm = TRUE )
```

Calculate the distribution of a categorical variable, overall and by groups:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
svymean( ~ think_gov_spend_least , anes_design , na.rm = TRUE )

svyby( ~ think_gov_spend_least , ~ undoc_kids , anes_design , svymean , na.rm = TRUE )
```

Calculate the sum of a linear variable, overall and by groups:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
svytotal( ~ republican_party_rating , anes_design , na.rm = TRUE )

svyby( ~ republican_party_rating , ~ undoc_kids , anes_design , svytotal , na.rm = TRUE )
```

Calculate the weighted sum of a categorical variable, overall and by groups:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
svytotal( ~ think_gov_spend_least , anes_design , na.rm = TRUE )

svyby( ~ think_gov_spend_least , ~ undoc_kids , anes_design , svytotal , na.rm = TRUE )
```

Calculate the median (50th percentile) of a linear variable, overall and by groups:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
svyquantile( ~ republican_party_rating , anes_design , 0.5 , na.rm = TRUE )

svyby(
@@ -215,7 +201,7 @@ svyby(
```

Estimate a ratio:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
svyratio(
	numerator = ~ republican_party_rating ,
	denominator = ~ democratic_party_rating ,
@@ -227,18 +213,18 @@ svyratio(
### Subsetting {-}

Restrict the survey design to party id: independent:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
sub_anes_design <- subset( anes_design , v201231x == 4 )
```
Calculate the mean (average) of this subset:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
svymean( ~ republican_party_rating , sub_anes_design , na.rm = TRUE )
```
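
Calling `subset` on the design object, and not on the underlying data.frame, keeps the sampling information attached to the subpopulation. A rough sketch of the same two steps, assuming the `apiclus1` example data from the `survey` package as a stand-in (`api_design`, `stype`, and `api00` are not ANES objects):

```r
library(survey)

# example cluster sample bundled with the survey package (not ANES data)
data(api)

api_design <-
	svydesign(
		id = ~ dnum ,
		weights = ~ pw ,
		fpc = ~ fpc ,
		data = apiclus1
	)

# restrict the design itself to elementary schools
sub_api_design <- subset( api_design , stype == "E" )

# mean and standard error within that subpopulation
sub_mean <- svymean( ~ api00 , sub_api_design , na.rm = TRUE )

# records still carrying a non-zero weight after subsetting
n_sub <- sum( weights( sub_api_design , "sampling" ) != 0 )

stopifnot( n_sub == sum( apiclus1$stype == "E" ) )
```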

### Measures of Uncertainty {-}

Extract the coefficient, standard error, confidence interval, and coefficient of variation from any descriptive statistics function result, overall and by groups:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
this_result <- svymean( ~ republican_party_rating , anes_design , na.rm = TRUE )

coef( this_result )
@@ -262,17 +248,17 @@ cv( grouped_result )
```

Calculate the degrees of freedom of any survey design object:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
degf( anes_design )
```
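
For a design built with `svydesign`, `degf` is the number of primary sampling units minus the number of strata. The sketch below checks that relationship on the `apiclus1` example data from the `survey` package, assumed here as a stand-in because the ANES design is not rebuilt in this snippet (`api_design` is a hypothetical name):

```r
library(survey)

# example cluster sample bundled with the survey package (not ANES data)
data(api)

# unstratified one-stage cluster design: districts (dnum) are the PSUs
api_design <-
	svydesign(
		id = ~ dnum ,
		weights = ~ pw ,
		fpc = ~ fpc ,
		data = apiclus1
	)

# count the sampled clusters by hand
n_psu <- length( unique( apiclus1$dnum ) )

# with no strata argument, the whole sample is treated as one stratum
n_strata <- 1

stopifnot( degf( api_design ) == n_psu - n_strata )
```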

Calculate the complex sample survey-adjusted variance of any statistic:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
svyvar( ~ republican_party_rating , anes_design , na.rm = TRUE )
```

Include the complex sample design effect in the result for a specific statistic:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
# SRS without replacement
svymean( ~ republican_party_rating , anes_design , na.rm = TRUE , deff = TRUE )

@@ -281,28 +267,28 @@ svymean( ~ republican_party_rating , anes_design , na.rm = TRUE , deff = "replac
```

Compute confidence intervals for proportions using methods that may be more accurate near 0 and 1. See `?svyciprop` for alternatives:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
svyciprop( ~ primary_voter , anes_design ,
	method = "likelihood" , na.rm = TRUE )
```
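
To see how the choice of `method` changes the interval, two of the documented options can be placed side by side. A sketch assuming the `apiclus1` example data from the `survey` package as a stand-in; the indicator `I( ell == 0 )` and the name `api_design` are illustrative and not part of the ANES workflow:

```r
library(survey)

# example cluster sample bundled with the survey package (not ANES data)
data(api)

api_design <-
	svydesign(
		id = ~ dnum ,
		weights = ~ pw ,
		fpc = ~ fpc ,
		data = apiclus1
	)

# proportion of schools with no English language learners
likelihood_ci <- svyciprop( ~ I( ell == 0 ) , api_design , method = "likelihood" )

logit_ci <- svyciprop( ~ I( ell == 0 ) , api_design , method = "logit" )

# same point estimate, different interval endpoints
rbind(
	likelihood = confint( likelihood_ci ) ,
	logit = confint( logit_ci )
)
```

Both calls estimate the same proportion; only the interval construction differs, which matters most when the estimate sits close to 0 or 1.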

### Regression Models and Tests of Association {-}

Perform a design-based t-test:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
svyttest( republican_party_rating ~ primary_voter , anes_design )
```

Perform a chi-squared test of association for survey data:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
svychisq(
	~ primary_voter + think_gov_spend_least ,
	anes_design
)
```

Perform a survey-weighted generalized linear model:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
glm_result <-
	svyglm(
		republican_party_rating ~ primary_voter + think_gov_spend_least ,
@@ -326,7 +312,7 @@ This example matches statistics and standard errors in the Age rows of the `ANES

4. Download the `DTA` version of the `Methodology File December 10, 2018`

- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
anes2016_fn <-
	file.path(
		path.expand( "~" ) ,
@@ -389,7 +375,7 @@ This example matches statistics and standard errors in the Age rows of the `Desi

5. Download the `DTA` version of the April 26, 2007 file

- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
anes2004_fn <-
	file.path(
		path.expand( "~" ) ,
@@ -447,12 +433,12 @@ stopifnot( all( round( SE( results ) , 4 ) == published_standard_errors ) )

The R `srvyr` library calculates summary statistics from survey data, such as the mean, total or quantile using [dplyr](https://github.com/tidyverse/dplyr/)-like syntax. [srvyr](https://github.com/gergness/srvyr) allows for the use of many verbs, such as `summarize`, `group_by`, and `mutate`, the convenience of pipe-able functions, the `tidyverse` style of non-standard evaluation and more consistent return types than the `survey` package. [This vignette](https://cran.r-project.org/web/packages/srvyr/vignettes/srvyr-vs-survey.html) details the available features. As a starting point for ANES users, this code replicates previously-presented examples:

- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
library(srvyr)
anes_srvyr_design <- as_survey( anes_design )
```
Calculate the mean (average) of a linear variable, overall and by groups:
- ```{r cache = TRUE , warning = FALSE , message = FALSE , results = "hide" }
+ ```{r eval = FALSE , results = "hide" }
anes_srvyr_design %>%
	summarize( mean = survey_mean( republican_party_rating , na.rm = TRUE ) )

@@ -462,7 +448,3 @@ anes_srvyr_design %>%
```


- ```{r eval = TRUE , echo = FALSE , results = "hide" , warning = FALSE , message = FALSE }
- rm( list = setdiff( ls() , objects_in_memory_for_local_build ) ) ; gc()
- ```
-