-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy path04-dataR.Rmd
610 lines (369 loc) · 23.6 KB
/
04-dataR.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
---
---
# (APPENDIX) Appendix {-}
# Data Wrangling in R
In this part of our toolkit, we're going to learn how to do the same things we did with Chapter 4 - Spatial Data Wrangling, but this time, we'll use R code to handle our spatial data.
### Getting Started {-}
R is a great choice for starting in data science because it's built for it. It's not just a programming language, it is a whole system with tools and libraries made to help you think and work like a data scientist easily.
We assume a basic knowledge of R and coding languages for these toolkits. For most of the tutorials in this toolkit, you’ll need to have R and RStudio downloaded and installed on your system. You should be able to install packages, know how to find the address to a folder on your computer system, and have very basic familiarity with R.
### Tutorials for R {-}
If you are new to R, we recommend the following [intro-level tutorials](https://learn.datacamp.com/courses/free-introduction-to-r) provided through [installation guides](https://rspatial.org/intr/1-introduction.html). You can also refer to this [R for Social Scientists](https://datacarpentry.org/r-socialsci/) tutorial developed by Data Carpentry for a refresher.
You can also visit the [RStudio Education](https://education.rstudio.com/learn/) page to select a learning path tailored to your experience level ([Beginners](https://education.rstudio.com/learn/beginner/), [Intermediates](https://education.rstudio.com/learn/intermediate/), [Experts](https://education.rstudio.com/learn/expert/)). They offer detailed instructions to learners at different stages of their R journey.
## Environmental Setup
Getting started with data analysis in R involves a few preliminary steps, including downloading datasets and setting up a working directory. This introduction will guide you through these essential steps to ensure a smooth start to your data analysis journey in R.
:::tools
**Download the Activity Datasets**
Please download and unzip this file to get started: [SDOHPlace-DataWrangling.zip](https://github.com/healthyregions/sdohplace-toolkit/raw/main/data/SDOHPlace-DataWrangling.zip)
:::
### Setting Up the Working Directory {-}
Setting up a working directory in R is crucial as it defines the location on your computer where your files and scripts will be saved and accessed. You can set the working directory to any folder on your system where you plan to store your datasets and R scripts. To set your working directory, use the `setwd("/path/to/your/directory")` and specify the path to your desired directory.
### Installing & Working with R Libraries {-}
Before starting operations related to spatial data, we need to complete an environmental setup. This workshop requires several packages, which can be installed from CRAN:
- `sf`: simplifies spatial data manipulation
- `tmap`: streamlines thematic map creation
- `dplyr`: facilitates data manipulation
- `tidygeocoder`: converts addresses to coordinates easily
Uncomment to install packages with code snippet below. You only need to install packages once in an R environment.
```{r eval=FALSE}
#install.packages("sf", "tmap", "tidygeocoder", "dplyr")
```
:::tip
**Installation Tip**
For Mac users, check out https://github.com/r-spatial/sf for additional tips if you run into errors when installing the `sf` package. Using homebrew to install `gdal` usually fixes any remaining issues.
:::
Now, loading the required libraries for further steps:
```{r load-libraries, warning=FALSE, message=FALSE}
library(sf)
library(dplyr)
library(tmap)
```
## Intro to Spatial Data
Spatial data analysis in R provides a robust framework for understanding geographical information, enabling users to explore, visualize, and model spatial relationships directly within their data. Through the integration of specialized packages like sf for spatial data manipulation, ggplot2 and tmap for advanced mapping, and tidygeocoder for geocoding, R becomes a powerful tool for geographic data science. This ecosystem allows researchers and analysts to uncover spatial patterns, analyze geographic trends, and produce detailed maps that convey complex information intuitively.
### Load Spatial Data {-}
We need to load the spatial data (shapefile). Remember, this type of data is actually comprised of multiple files. All need to be present in order to read correctly. Let's use **chicagotracts.shp** for practice, which includes the census tracts boundary in Chicago.
First, we need to read the shapefile data from where you save it.
```{r eval=FALSE}
Chi_tracts = st_read("SDOHPlace-DataWrangling/chicagotracts.shp")
```
Your output will look something like:
```
## Reading layer `chicagotracts' from data source `./SDOHPlace-DataWrangling/chicagotracts.shp' using driver `ESRI Shapefile'
## Simple feature collection with 801 features and 9 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: -87.94025 ymin: 41.64429 xmax: -87.52366 ymax: 42.02392
## Geodetic CRS: WGS 84
```
Always inspect data when loading in. Let's look at a non-spatial view.
```{r non-spatial-view}
head(Chi_tracts)
```
Check out the data structure of this file.
```{r data-structure}
str(Chi_tracts)
```
The data is no longer a shapefile but an `sf` object, comprised of polygons. The `plot()` command in R help to quickly visualizes the geometric shapes of Chicago's census tracts. The output includes multiple maps because the `sf` framework enables previews of each attribute in our spatial file.
```{r spatial-view}
plot(Chi_tracts)
```
### Adding a Basemap
Then, we can use `tmap`, a mapping library, in interactive mode to add a basemap layer. It plots the spatial data from `Chi_tracts`, applies a minimal theme for clarity, and labels the map with a title, offering a straightforward visualization of the area's census tracts.
We stylize the borders of the tract boundaries by making it transparent at 50% (which is equal to an alpha level of 0.5).
```{r tmap-view}
library(tmap)
tmap_mode("view")
tm_shape(Chi_tracts) + tm_borders(alpha=0.5) +
tm_layout(title = "Census Tract Map of Chicago")
```
Still in the interactive mode (`view'), we can switch to a different basemap. Here we bring in a "Voyager" style map from Carto, a cartographic firm. We'll make the borders less transparent by adjusting the alpha level.
```{r add-basemap}
tmap_mode("view")
tm_basemap("CartoDB.Voyager") +
tm_shape(Chi_tracts) + tm_borders(alpha=0.8, col = "gray40") +
tm_layout(title = "Census Tract Map of Chicago")
```
::: tip
**Tip**
For additional options, you can preview basemaps at the [Leaflet Providers Demo](https://leaflet-extras.github.io/leaflet-providers/preview/). Some basemaps we recommend that work consistently are by:
- CartoDB (Carto)
- Open Street Map
- ESRI
Not all basemaps are available anymore, and some require keys that you'd need to add on your own.
:::
## Coordinate Reference Systems
For this exercise we will use `chicagotracts.shp` to explore how to change the projection of a spatial dataset in R. First, let's check out the current coordinate reference system.
```{r inspect-crs}
st_crs(Chi_tracts)
```
We can use the `st_transform` function to transform CRS. When projecting a dataset of Illinois, the most appropriate NAD83 projection would be NAD83 UTM zone 16N. Chicago sits within the area best covered by **NAD83 / Illinois East (ftUS)** (EPSG:3435).
```{r }
Chi_tracts.3435 <- st_transform(Chi_tracts, "EPSG:3435")
# Chi_tracts.3435 <- st_transform(Chi_tracts, 3435)
st_crs(Chi_tracts.3435)
```
After change the projection, we can plot the map. We'll swith to the static version of tmap, using the `plot` mode.
```{r}
tmap_mode("plot")
tm_shape(Chi_tracts.3435) + tm_borders(alpha=0.5) +
tm_layout(main.title ="EPSG:3435 (ft)",
main.title.position = "center")
```
What if we had used the wrong EPSG code, referencing the wrong projection? Here we'll transform and plot EPSG Code 3561, a coordinate reference system in Haw'aii.
```{r transform-3561}
Chi_tracts.3561 <- st_transform(Chi_tracts, "EPSG:3561")
tm_shape(Chi_tracts.3561) + tm_borders(alpha=0.5) +
tm_layout(main.title ="EPSG:3561 (ft)",
main.title.position = "center")
```
It's obviously not correct -- the wrong CRS can cause a load of trouble. Make sure you specify carefully!
### Refine Basic Map {-}
Let's take a deeper look at the cartographic mapping package, `tmap`. We approach mapping with one layer at a time. Always start with the object you want to map by calling it with the `tm_shape` function. Then, at least one descriptive/styling function follows. There are hundreds of variations and paramater specification.
Here we style the tracts with some semi-transparent borders.
```{r }
library(tmap)
tm_shape(Chi_tracts) + tm_borders(alpha=0.5)
```
Next we fill the tracts with a light gray, and adjust the color and transparency of borders. We also add a scale bar, positioning it to the left and having a thickness of 0.8 units, and turn off the frame.
```{r }
tm_shape(Chi_tracts) + tm_fill(col = "gray90") + tm_borders(alpha=0.2, col = "gray10") +
tm_scale_bar(position = ("left"), lwd = 0.8) +
tm_layout(frame = F)
```
Check out https://rdrr.io/cran/tmap/man/tm_polygons.html for more ideas.
## Converting to Spatial Data
### Convert CSVs to Spatial Data
We are using the `Affordable_Rental_Housing_Developments.csv` in the dataset to show how to convert a csv Lat/Long data to points. First, we need to load the CSV data.
```{r load-csv-data}
housing = read.csv("SDOHPlace-DataWrangling/Affordable_Rental_Housing_Developments.csv")
```
Then, we need to ensure that no column (intended to be used as a coordinate) is entirely empty or filled with NA values.
```{r }
cleaned_housing <- na.omit(housing)
```
Inspect the data to confirm it's doing what you expect it to be doing. What columns will you use to specify the coordinates? In this dataset, we have multiple coordinate options. We'll use latitude and longitude, or rather, longitude as our X value, and latitude as our Y value.
In the data, it's specified as "Longitude" and "Latitude."
```{r }
head(cleaned_housing)
```
Finally, we start to convert it to points. Be sure you use the CRS of the original coordinates recorded. In this case we weren't sure what CRS that was, so we use `EPSG:4326` to test.
```{r }
points_housing <- st_as_sf(cleaned_housing, coords = c("Longitude", "Latitude"), crs = 4326)
```
View the resulting sf object with a basemap to confirm they are in the right place. Overlay them on top of the tract data, to confirm they are plotting correctly.
```{r }
### First Layer
tm_shape(Chi_tracts) + tm_borders(lwd = 0.5) +
### Second Layer
tm_shape(points_housing) + tm_dots(size = 0.1 )
```
You can change the `tmap_mode` to "view" to add a basemap in an interactive setting, and then switch back to "plot" when complete. Because we'e plotting dots using *tmap*, we'll use the `tm_dots` parameter for styling.
```{r }
tmap_mode("view")
tm_shape(Chi_tracts) + tm_borders(lwd = 0.5) +
tm_shape(points_housing) + tm_dots(size = 0.01)
tmap_mode("plot")
```
We'll reproject to `EPSG:3435`, our system standard
```{r }
housing.3435 <- st_transform(points_housing, "EPSG:3435")
```
#### Write Spatial Data
Finally, we can save our points as a spatial dataset. Use 'st_write' to write your spatial object in R to a data format of your choice. Here, we'll write to a geojson file.
Uncomment to run this line.
```{r }
#st_write(housing.3435, "housing.geojson", driver = "GeoJSON")
```
You could also save as a "housing.shapefile" to get a shapefile format, however you'll get an error noting that some column names are too long and must be shortened. Shapefile formats have a limit of 10 characters for field names.
```{r }
#st_write(housing.3435, "housing.shp", driver = "ESRI Shapefile")
```
The file may still write, but the column names that were too long may be shortened automatically.
To change column or field names in R objects, there are dozens of options. Try searching and "googling" different search terms to identify solutions on your own.
### Geocode Addresses
Here, we will use `chicago_methadone_nogeometry.csv` for practice, which includes methadone centers in Chicago (center names and addresses). First we load the `tidygeocoder` to get our geocoding done.
```{r warning=FALSE}
library(tidygeocoder)
```
Let's read in and inspect data for methadone maintenance providers. Note, these addresses were made available by SAMSHA, and are known as publicly available information. An additional analysis could call each service to check on access to medication during COVID in Septmber 2020, and the list would be updated further.
```{r}
methadoneClinics <- read.csv("SDOHPlace-DataWrangling/chicago_methadone_nogeometry.csv")
head(methadoneClinics)
```
Let's geocode one address first, just to make sure our system is working. We'll use the "cascade" method which use the US Census and OpenStreetMap geocoders. These two services are the main options with `tidygeocoder`.
```{r}
sample <- geo("2260 N. Elston Ave. Chicago, IL", lat = latitude, long = longitude, method = 'cascade')
head(sample)
```
As we prepare for geocoding, check out the structure of the dataset. The data should be a character to be read properly.
```{r}
str(methadoneClinics)
```
We need to clean the data a bit. We'll add a new column for a full address, as required by the geocoding service. When you use a geocoding service, be sure to read the documentation and understand how the data needs to be formatted for input.
```{r}
methadoneClinics$fullAdd <- paste(as.character(methadoneClinics$Address),
as.character(methadoneClinics$City),
as.character(methadoneClinics$State),
as.character(methadoneClinics$Zip))
```
We're ready to go! Batch geocode with one function, and inspect:
```{r}
geoCodedClinics <- geocode(methadoneClinics,
address = 'fullAdd', lat = latitude, long = longitude, method = 'cascade')
head(geoCodedClinics)
```
There were two that didn't geocode correctly. You can inspect further. This could involve a quick check for spelling issues; or, searching the address and pulling the lat/long using Google Maps and inputting manually. Or, if we are concerned it's a human or unknown error, we could omit. For this exercise we'll just omit the two clinics that didn't geocode correctly.
```{r}
geoCodedClinics2 <- na.omit(geoCodedClinics)
```
## Convert to Spatial Data
This is not spatial data yet! To convert a static file to spatial data, we use the powerful `st_as_sf` function from `sf`. Indicate the x,y parameters (=longitude, latitude) and the coordinate reference system used. Our geocoding service used the standard **EPSG:4326**, so we input that here.
```{r warning = FALSE}
methadoneSf <- st_as_sf(geoCodedClinics2,
coords = c( "longitude", "latitude"),
crs = 4326)
```
### Basic Map of Points {-}
For a really simple map of points -- to ensure they were geocoded and converted to spatial data correctly, we use `tmap`. We'll use the interactive version to view.
```{r warning = FALSE, message=FALSE}
tmap_mode("view")
tm_shape(methadoneSf) + tm_dots()
```
If your points didn't plot correctly:
- Did you flip the longitude/latitude values?
- Did you input the correct CRS?
Those two issues are the most common errors.
## Merge Data sets
### Reshape Data {-}
Here, we are trying to use the `COVID-19_Cases__Tests__and_Deaths_by_ZIP_Code.csv` dataset to practice how to convert long data to a wide data format.
We subset to the first two columns, and the sixth column. That gives us the zip code, the reporting week, and cumulative cases of Covid-19. We want each zip code to be a unique row, with cases by week as a columns. Choose whatever subset function you prefer best!
```{r}
covid = read.csv("SDOHPlace-DataWrangling/COVID-19_Cases__Tests__and_Deaths_by_ZIP_Code.csv")
covid_clean = covid[,c(1:2, 6)]
head(covid_clean)
```
Now, we are trying to create a wide data set with the cumulative cases for each week for each zip code. Enter the code and you will see the new wide data format.
```{r}
covid_wide <- reshape(covid_clean, direction = "wide",
idvar = "ZIP.Code", timevar = "Week.Number")
head(covid_wide)
```
### Join by Attribute {-}
Here, we’ll merge data sets with a common variable in R. Merging the cumulative case data set you created in the last section to zip code spatial data (`ChiZipMaster1.geojson`) will allow you to map the case data. You’ll be merging the case data and spatial data using the zip codes field of each dataset.
We've cleaned our covid case data already, but not all values under the zip code column are valid. There is a row has a value of "unknown", so let's remove that.
```{r}
covid_wide_clean <- covid_wide %>%
filter(ZIP.Code != "unknown" & !is.na(ZIP.Code))
```
Then, we need to load the zip code data.
```{r eval=FALSE}
zipcode <- st_read("SDOHPlace-DataWrangling/ChiZipMaster1.geojson")
```
Your output will look something like:
```
## Reading layer `ChiZipMaster1' from data source `./SDOHPlace-DataWrangling/ChiZipMaster1.geojson' using driver `GeoJSON'
## Simple feature collection with 540 features and 31 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -87.87596 ymin: 41.64454 xmax: -87.52414 ymax: 42.02304
## Geodetic CRS: WGS 84
```
You’ll notice that the zip codes are repeated in the zip code data set, and needs to be cleaned before we can continue with merging the data.
```{r}
zipcode_unique <- distinct(zipcode)
zipcode_unique <- zipcode %>%
group_by(zip) %>%
slice(1) %>%
ungroup()
```
Now, the two datasets are ready to join together by the zip code. Make sure to check they have been joined successully.
We'll have these joined datasets be called Chi_Zipsf, to denote a final zip code master dataset.
```{r}
Chi_Zipsf <- zipcode_unique %>%
left_join(covid_wide_clean, by = c("zip" = "ZIP.Code"))
```
We'll reproject to EPSG:3435, the standard used in our study area.
```{r}
Chi_Zipsf.3435 <- st_transform(Chi_Zipsf, 3435)
```
### Join by Location {-}
We’ll create a spatial join with the housing and zip code data we've brought in.
In this example, we want to join zip-level data to the Rental Housing Developments, so we can identify which zips they are within.
First, let's try "sticking" all the zip code data too the housing points, intersecting zip codes with housing developments.
To do this, both datasets will need to be in the same CRS. We have already standardized both using EPSG:3435.
```{r}
Housing.f <- st_join(housing.3435, Chi_Zipsf.3435, join = st_intersects)
```
Don't forget to inspect the data. Uncomment to explore!
```{r}
#head(Housing.f)
```
We could also flip things around, and try to count how many developments intersect each zip code. We can use lengths() to find out how many items are present in a vector. Here,
```{r}
Chi_Zipsf.3435$TotHousing <- lengths(st_intersects(Chi_Zipsf.3435, housing.3435))
head(Chi_Zipsf.3435)
```
## Inspect Data
### Thematic Maps
To inspect data from a spatial perspective, we can create a series of choropleth maps.
#### Example 1. Number of Affordable Housing Developments per Zip Code {-}
Choose the variable "TotHousing" to map total developments per zip coode, as we calculated previously. Here we'll map using Jenks data breaks, with a Blue to Purple palette, and four bins (or groups of data to be plotted). A histogram of the data is plotted, visualizing where breaks occured in the data to generate the map.
```{r, message=FALSE, warning=FALSE}
tmap_mode("plot")
tm_shape(Chi_Zipsf.3435) + tm_polygons("TotHousing", legend.hist = TRUE, style="jenks", pal="BuPu", n=4, title = "Housing Dev.") +
tm_layout(legend.outside = TRUE, legend.outside.position = "right")
```
#### Example 2. Number of COVID-19 Cases per Zip Code {-}
Let's do the same, but plut a different variable. Select a different variable name as your parameter in the 'tm_fill' parameter.
```{r, message=FALSE, warning=FALSE}
tm_shape(Chi_Zipsf.3435) + tm_polygons("Case.Rate...Cumulative",
legend.hist = TRUE, style="jenks",
pal="BuPu", n=4, title = "COVID Case Rate") +
tm_layout(legend.outside = TRUE, legend.outside.position = "right")
```
### Map Overlay
#### Example 1. Afforfable Housing Developments & Zip Code Boundaries {-}
We previously translated the housing dataset from a CSV to a spatial object. Let's take an attribute connect with each point, the total number of units per housing development, and visualize as a graduated symbology. Points with more units will be bigger, and not all places are weighted the same visually.
We use the "style" parameter to add a standard deviation data classification break. Check out tmap documentation for more options, like quantiles, natural jenks, or other options.
```{r}
tm_shape(housing.3435) + tm_bubbles("Units", col = "purple", style = "sd")
```
Then, let's overlay that layer to the zip code boundary.
```{r}
tm_shape(Chi_Zipsf.3435) + tm_polygons(col = "gray80") +
tm_shape(housing.3435) + tm_bubbles("Units", col = "purple")
```
You can also color-code according to the total number of units. Here, we'll add a palette using a "viridis" color scheme, as a graduate color point map. For extra style, we'll add labels to each zip code, update with a new basemap, and make the whole map interactive.
```{r}
tmap_mode("view")
#Add Basemap
tm_basemap("Esri.WorldGrayCanvas") +
#Add First Layer, Style
tm_shape(Chi_Zipsf.3435) + tm_borders(col = "gray10") +
tm_text("zip", size = 0.7) +
#Add Second Layer, Style
tm_shape(housing.3435) +
tm_bubbles( col = "Units", style = "quantile",
pal = "viridis", size = 0.1)
```
#### Example 2. COVID-19 & Methadone {-}
In the first example, let create a map showing both COVID-19 and methadone clinic data (used in A.3). First, let's add our zip code map.
With this overlay, we'll add a "hack" to include the methadone clinic points in a legend.
```{r}
tmap_mode("plot")
##Add and style First Layer
tm_shape(Chi_Zipsf) + tm_polygons("Case.Rate...Cumulative",
style="jenks", pal="BuPu", n=4, title = "COVID Rt") +
##Add Second Layer
tm_shape(methadoneSf) + tm_dots(size = 0.2, col = "gray20") +
##"Hack" a manual symbology for dots in the legend
tm_add_legend("symbol", col = "gray20", size = .2, labels = "Methadone MOUD") +
##Final Cartographic Styling
tm_layout(legend.outside = TRUE, legend.outside.position = "right")
```
## Resources {-}
For a deeper dive on topics discussed in this chapter, please check out the following. If you have a resource to add, feel free to suggest one by submitting an [issue](https://github.com/healthyregions/sdhoplace-toolkit/issues) to our toolkit repository.
- For tips on using `tmap`, check out the online text, [Elegant and informative maps with tmap](https://r-tmap.github.io/tmap-book/index.html) by Tennekes and Nowosad.
- Try out more mapping with the ggplot2 library. The [Maps](https://ggplot2-book.org/maps) chapter will give you a head start.
- We highly recommend Chapters 3-5 as mandatory reading in this classic, [Geocomputation with R](https://r.geocompx.org/) by Lovelace, Nowosad, and Muenchow. Perfecting selections and filters in the *Attribute Data Operations* chapter will help you become a data wrangling master. Perfect distance metrics and essential GIS operations in subsequent chapters.
- The Appendix in [Gimond's Intro to GIS](https://mgimond.github.io/Spatial/introGIS.html) online book has a super overview of R examples, not to be missed.
- Another superb resource is [Analyzing US Census Data](https://walker-data.com/census-r/index.html) by Kyle Walker, with some of our favorite examples of extracing & reshaping data directly from the Census using code. Highly recommended!