-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
123 lines (82 loc) · 4.52 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r setup, echo = FALSE, include=FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "README-"
)
options(width = 110)
library(fetchdhs)
library(tidyverse)
```
[](https://travis-ci.org/murphy-xq/fetchdhs)
# fetchdhs
The [Demographic and Health Surveys (DHS) Program](https://www.dhsprogram.com/) has conducted more than 400 surveys in over 90 countries since 1984. It remains a critical resource in global health research and analytics as it provides nationally representative data on fertility, family planning, maternal and child health, gender, HIV/AIDS, malaria, and nutrition. The objective of this package is to enable R users to plug into the [DHS API](https://api.dhsprogram.com/) and retrieve tidy survey data.
## Installation
```{r installation, eval = FALSE}
# install.packages("devtools")
devtools::install_github("murphy-xq/fetchdhs")
```
## Basic example
Let's say you quickly need DHS survey data for:
* immunization-related indicators
* since 2000
* for India and Nigeria
First, use `fetch_countries()` to locate the 2-letter DHS country codes for India and Nigeria needed for the api call
```{r country_list_filter}
fetch_countries() %>%
filter(country_name %in% c("India", "Nigeria"))
```
Next, make use of [DHS API tags](https://api.dhsprogram.com/rest/dhs/tags?f=html) that categorize survey indicators by topic. In this example, we are looking for all immunization-related indicators using `fetch_tags()` and identify tag `32`
```{r tags_list}
fetch_tags() %>%
filter(str_detect(tag_name, "[Ii]mmunization"))
```
Finally, use `fetch_data()` to call the DHS API using the parameters just identified and receive a tidy dataframe as well as the api call:
* Note only 1 tag may be specified
```{r fetch_dta}
fetch_data(countries = c("IA","NG"), tag = 32, years = 2000:2017)
```
For specific indicators, we can peek at a dataframe of all available indicators to identify which `indicator_id` codes should be included with `fetch_data()`. Let's try pulling only DPT3 and Measles indicators:
```{r find_indicators}
fetch_indicators() %>%
filter(str_detect(definition, "Measles|DPT3"))
```
Upon investigating the DPT3 and Measles indicators and their associated [attributes](https://api.dhsprogram.com/rest/dhs/indicators/fields), we see that we need to use `CH_VACC_C_DP3` and `CH_VACC_C_MSL`:
```{r fetch_indicator}
fetch_data(countries = c("IA","NG"), indicators = c("CH_VACC_C_DP3", "CH_VACC_C_MSL"), years = 2000:2017)
```
## Additional features
#### Level of disaggregation
We have been using the default level of disaggregation which returns national-level data only. In order to pull subnational, background characteristic, or all available data, we need to specify the `breakdown_level` in `fetch_data()`
* Note the increasing amount of records returned as we move from the default `breakdown_level == "national"` (205 records) to `breakdown_level == "all"` (3,270 records)
```{r breakdown}
# national (default)
fetch_data(countries = c("IA","NG"), tag = 32, years = 2000:2017, breakdown_level = "national")
# subnational
fetch_data(countries = c("IA","NG"), tag = 32, years = 2000:2017, breakdown_level = "subnational")
# background
fetch_data(countries = c("IA","NG"), tag = 32, years = 2000:2017, breakdown_level = "background")
# all
fetch_data(countries = c("IA","NG"), tag = 32, years = 2000:2017, breakdown_level = "all")
```
#### Return fields
[Return fields](https://api.dhsprogram.com/rest/dhs/data/fields) are the various dimensions of survey data that can be returned. `set_return_fields()` allows the user to specify which fields should comprise the dataframe returned from the api
```{r return_field}
set_return_fields(c("Indicator", "CountryName", "SurveyYear", "SurveyType", "Value"))
fetch_data(countries = c("IA","NG"), tag = 32, years = 2000:2017)
```
#### Add geospatial data
We can also include polygon coordinates with our call with `add_geometry`
```{r add_geom}
fetch_data(countries = c("IA","NG"), tag = 32, years = 2000:2017, add_geometry = TRUE)
```
#### API Key
Authenticated users can query more records per page -- 5,000 versus 1,000 maximum records per page. Please see [here](https://api.dhsprogram.com/#/introdevelop.html) for authentication details.
Users can input their api key with `set_api_key()` for inclusion in any subsequent `fetch_data()` calls.
```{r set_api_key}
set_api_key("YOURKEY-GOESHERE")
```