Commit bda5cef

Merge pull request #82 from NIEHS/dariusmb_0219
Dariusmb 04/29
2 parents fd178a1 + 119b41f commit bda5cef

2 files changed

+95
-2
lines changed

chapters/05-01-hcup-amadeus-usecase.Rmd

Lines changed: 95 additions & 2 deletions
@@ -4,7 +4,7 @@
44

55
### Integrating HCUP databases with Amadeus Exposure data {.unnumbered}
66

7-
**Date Modified**: April 18, 2025
7+
**Date Modified**: April 29, 2025
88

99
**Author**: Darius M. Bost
1010

@@ -415,9 +415,102 @@ ggplot(smoke_summary, aes(x = factor(has_asthma), y = avg_heavy,
415415
```
416416

417417
::: figure
418-
<img src="images/hcup_amadeus_usecase/asthma_vs_heavy_smoke.png"style="width:100%"/>
418+
<img src="images/hcup_amadeus_usecase/asthma_vs_heavy_smoke.png" style="width:100%"/>
419419
:::
420420

421+
### Bivariate Map Analysis: Asthma and Heavy Smoke Exposure
422+
423+
The code below produces a bivariate map for spatial analysis of asthma prevalence and heavy smoke exposure across ZIP codes in Oregon. Using HCUP hospital discharge data and NOAA's HMS smoke plume data, we calculated ZIP-level asthma rates and paired them with the average number of heavy smoke days over the same time period.
424+
425+
```{r bivariate-map, message=FALSE, warning=FALSE, eval= FALSE}
426+
427+
# Load additional libraries
428+
library(biscale)
429+
library(cowplot)
430+
431+
# Recall we have identified diagnosis columns
432+
head(diag_columns)
433+
434+
# Recall our smoke summary dataframe
435+
head(smoke_summary)
436+
437+
# Summarize asthma rate and heavy smoke exposure by ZIP code
438+
zip_summary <- smoke_summary %>%
439+
group_by(ZIP) %>%
440+
summarize(
441+
asthma_rate = mean(has_asthma, na.rm = TRUE),
442+
avg_heavy_smoke = mean(sum_heavy, na.rm = TRUE)
443+
)
444+
445+
# Recall ZIP code shapefile (ZCTA) for Oregon
446+
head(or)
447+
448+
# Join spatial and summary data
449+
map_data <- left_join(or, zip_summary, by = c("ZCTA5CE10" = "ZIP"))
450+
451+
# Filter out missing values
452+
map_data_clean <- map_data %>%
453+
filter(!is.na(asthma_rate) & !is.na(avg_heavy_smoke))
454+
455+
# Apply bivariate classification
456+
map_data_clean <- bi_class(map_data_clean, x = avg_heavy_smoke,
457+
y = asthma_rate, style = "quantile", dim = 3)
458+
459+
# Create the main map
460+
map <- ggplot() +
461+
theme_void(base_size = 14) +
462+
geom_sf(data = map_data_clean, aes(fill = bi_class), color = "white",
463+
size = 0.1, show.legend = FALSE) +
464+
bi_scale_fill(pal = "GrPink", dim = 3) +
465+
labs(
466+
title = "Asthma Prevalence vs Heavy Smoke Exposure by ZIP Code",
467+
subtitle = paste("Bivariate map showing intersection of health and",
468+
"environmental burden"),
469+
caption = "Source: HCUP-Amadeus & NOAA HMS Smoke Data"
470+
) +
471+
theme(
472+
plot.title = element_text(hjust = 0.5, face = "bold"),
473+
plot.subtitle = element_text(hjust = 0.5),
474+
plot.caption = element_text(size = 10, face = "italic", hjust = 1),
475+
plot.margin = margin(10, 20, 10, 20)
476+
)
477+
478+
# Create the legend
479+
legend <- bi_legend(
480+
pal = "GrPink",
481+
dim = 3,
482+
xlab = "Higher Smoke",
483+
ylab = "Higher Asthma",
484+
size = 10,
485+
flip_axes = FALSE,
486+
rotate_pal = FALSE
487+
)
488+
489+
# Combine map and legend
490+
final_plot <- ggdraw() +
491+
draw_plot(map, 0, 0, 1, 1) +
492+
draw_plot(legend, 0.77, 0.05, 0.2, 0.2)
493+
494+
# Display the final plot
495+
final_plot
496+
```
497+
498+
::: figure
499+
<img src="images/hcup_amadeus_usecase/asthmaVsSmokeBivariate.png" style="width:100%"/>
500+
:::
501+
502+
Each ZIP code is shaded based on the intersection of these two variables using a 3x3 quantile classification. The bivariate color scale in the legend shows increasing smoke exposure along the x-axis (red) and increasing asthma prevalence along the y-axis (blue):
503+
504+
- Dark red areas: High smoke exposure, low asthma prevalence
505+
506+
- Dark blue areas: High asthma prevalence, low smoke exposure
507+
508+
- Dark purple areas: High asthma prevalence and high smoke exposure, indicating areas with compounded health and environmental burdens
509+
510+
- Light gray areas: Low on both dimensions
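
To make the 3x3 quantile classification concrete, here is a minimal base R sketch of the idea: each variable is split into terciles and the two class numbers are pasted into an "x-y" label, with smoke on the x-axis as in the legend described above. The `toy` data frame, its values, and the `tercile()` helper are hypothetical and only illustrate the logic; this is not how `biscale` is implemented internally.

```r
# Hypothetical toy data: six ZIP-level records (values invented for illustration)
toy <- data.frame(
  ZIP = c("97001", "97002", "97003", "97004", "97005", "97006"),
  avg_heavy_smoke = c(1, 2, 3, 4, 5, 6),
  asthma_rate = c(0.06, 0.01, 0.05, 0.02, 0.04, 0.03),
  stringsAsFactors = FALSE
)

# Hypothetical helper: assign each value to a tercile
# (1 = lowest third, 3 = highest third).
# Note: with heavily tied data the breaks can coincide and cut() will error.
tercile <- function(v) {
  breaks <- quantile(v, probs = c(0, 1 / 3, 2 / 3, 1), na.rm = TRUE)
  as.integer(cut(v, breaks = breaks, include.lowest = TRUE))
}

toy$smoke_class  <- tercile(toy$avg_heavy_smoke)  # x dimension
toy$asthma_class <- tercile(toy$asthma_rate)      # y dimension

# Combine into an "x-y" label, e.g. "3-1" = high smoke, low asthma
toy$class_label <- paste(toy$smoke_class, toy$asthma_class, sep = "-")
toy
```

Read this way, a label of "3-1" corresponds to the dark red corner, "1-3" to the dark blue corner, "3-3" to the dark purple corner, and "1-1" to light gray.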
511+
512+
This bivariate map helps identify regions where environmental and health vulnerabilities intersect and can inform targeted public health responses.
513+
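
One way to act on the map is to list the ZIP codes that fall in the highest class on both dimensions. The sketch below assumes a classified table with a `bi_class` column holding "x-y" labels; the `classified` data frame and its values are hypothetical stand-ins for `map_data_clean`.

```r
# Hypothetical classified output: one row per ZIP with an "x-y" class label
classified <- data.frame(
  ZIP = c("97001", "97002", "97003", "97004", "97005"),
  bi_class = c("3-3", "1-1", "3-1", "3-3", "1-3"),
  stringsAsFactors = FALSE
)

# Count ZIP codes in each bivariate class
class_counts <- table(classified$bi_class)
class_counts

# ZIP codes in the top tercile for both smoke exposure and asthma rate
high_high <- classified$ZIP[classified$bi_class == "3-3"]
high_high
```

Because "3-3" is the top tercile on both axes, this filter does not depend on which variable was assigned to x or y.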
421514
### Logistic Regression Analysis
422515

423516
Finally, we fit a logistic regression model to examine the relationship between asthma diagnoses and exposure to different levels of smoke density.