|
4 | 4 |
|
5 | 5 | ### Integrating HCUP databases with Amadeus Exposure data {.unnumbered}
|
6 | 6 |
|
7 |
| -**Date Modified**: April 18, 2025 |
| 7 | +**Date Modified**: April 29, 2025 |
8 | 8 |
|
9 | 9 | **Author**: Darius M. Bost
|
10 | 10 |
|
@@ -415,9 +415,102 @@ ggplot(smoke_summary, aes(x = factor(has_asthma), y = avg_heavy,
|
415 | 415 | ```
|
416 | 416 |
|
417 | 417 | ::: figure
|
418 |
| -<img src="images/hcup_amadeus_usecase/asthma_vs_heavy_smoke.png"style="width:100%"/> |
| 418 | +<img src="images/hcup_amadeus_usecase/asthma_vs_heavy_smoke.png" style="width:100%"/> |
419 | 419 | :::
|
420 | 420 |
|
| 421 | +### Bivariate Map Analysis: Asthma and Heavy Smoke Exposure |
| 422 | + |
| 423 | +The code below produces a bivariate map for spatial analysis of asthma prevalence and heavy smoke exposure across ZIP codes in Oregon. Using HCUP hospital discharge data and NOAA's HMS smoke plume data, we calculate ZIP-level asthma rates and pair them with the average number of heavy smoke days over the same time period. |
| 424 | + |
| 425 | +```{r bivariate-map, message=FALSE, warning=FALSE, eval=FALSE} |
| 426 | +
|
| 427 | +# Load additional libraries |
| 428 | +library(biscale) |
| 429 | +library(cowplot) |
| 430 | +
|
| 431 | +# Recall we have identified diagnosis columns |
| 432 | +head(diag_columns) |
| 433 | +
|
| 434 | +# Recall our smoke summary dataframe |
| 435 | +head(smoke_summary) |
| 436 | +
|
| 437 | +# Summarize asthma rate and heavy smoke exposure by ZIP code |
| 438 | +zip_summary <- smoke_summary %>% |
| 439 | + group_by(ZIP) %>% |
| 440 | + summarize( |
| 441 | + asthma_rate = mean(has_asthma, na.rm = TRUE), |
| 442 | + avg_heavy_smoke = mean(sum_heavy, na.rm = TRUE) |
| 443 | + ) |
| 444 | +
|
| 445 | +# Recall ZIP code shapefile (ZCTA) for Oregon |
| 446 | +head(or) |
| 447 | +
|
| 448 | +# Join spatial and summary data |
| 449 | +map_data <- left_join(or, zip_summary, by = c("ZCTA5CE10" = "ZIP")) |
| 450 | +
|
| 451 | +# Filter out missing values |
| 452 | +map_data_clean <- map_data %>% |
| 453 | + filter(!is.na(asthma_rate) & !is.na(avg_heavy_smoke)) |
| 454 | +
|
| 455 | +# Apply bivariate classification (x = smoke, y = asthma, matching the legend labels) |
| 456 | +map_data_clean <- bi_class(map_data_clean, x = avg_heavy_smoke, |
| 457 | + y = asthma_rate, style = "quantile", dim = 3) |
| 458 | +
|
| 459 | +# Create the main map |
| 460 | +map <- ggplot() + |
| 461 | + theme_void(base_size = 14) + |
| 462 | + geom_sf(data = map_data_clean, aes(fill = bi_class), color = "white", |
| 463 | + size = 0.1, show.legend = FALSE) + |
| 464 | + bi_scale_fill(pal = "GrPink", dim = 3) + |
| 465 | + labs( |
| 466 | + title = "Asthma Prevalence vs Heavy Smoke Exposure by ZIP Code", |
| 467 | + subtitle = paste("Bivariate map showing intersection of health and", |
| 468 | + "environmental burden"), |
| 469 | + caption = "Source: HCUP-Amadeus & NOAA HMS Smoke Data" |
| 470 | + ) + |
| 471 | + theme( |
| 472 | + plot.title = element_text(hjust = 0.5, face = "bold"), |
| 473 | + plot.subtitle = element_text(hjust = 0.5), |
| 474 | + plot.caption = element_text(size = 10, face = "italic", hjust = 1), |
| 475 | + plot.margin = margin(10, 20, 10, 20) |
| 476 | + ) |
| 477 | +
|
| 478 | +# Create the legend |
| 479 | +legend <- bi_legend( |
| 480 | + pal = "GrPink", |
| 481 | + dim = 3, |
| 482 | + xlab = "Higher Smoke", |
| 483 | + ylab = "Higher Asthma", |
| 484 | + size = 10, |
| 485 | + flip_axes = FALSE, |
| 486 | + rotate_pal = FALSE |
| 487 | +) |
| 488 | +
|
| 489 | +# Combine map and legend |
| 490 | +final_plot <- ggdraw() + |
| 491 | + draw_plot(map, 0, 0, 1, 1) + |
| 492 | + draw_plot(legend, 0.77, 0.05, 0.2, 0.2) |
| 493 | +
|
| 494 | +# Display the final plot |
| 495 | +final_plot |
| 496 | +``` |
| 497 | + |
| 498 | +::: figure |
| 499 | +<img src="images/hcup_amadeus_usecase/asthmaVsSmokeBivariate.png" style="width:100%"/> |
| 500 | +::: |
| 501 | + |
| 502 | +Each ZIP code is shaded based on the intersection of these two variables using a 3x3 quantile classification. The bivariate color scale in the legend shows increasing smoke exposure along the x-axis (red) and increasing asthma prevalence along the y-axis (blue): |
| 503 | + |
| 504 | +- **Dark red areas**: high smoke exposure, low asthma prevalence |
| 505 | + |
| 506 | +- **Dark blue areas**: high asthma prevalence, low smoke exposure |
| 507 | + |
| 508 | +- **Dark purple areas**: high asthma and high smoke, indicating areas with compounded health and environmental burdens |
| 509 | + |
| 510 | +- **Light gray areas**: low on both dimensions |
| 511 | + |
| 512 | +This bivariate map helps identify regions where environmental and health vulnerabilities intersect and can inform targeted public health responses. |
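
To make the 3x3 quantile classification more concrete, here is a minimal base-R sketch of how each ZIP code could be assigned a tercile on each variable and how the two terciles combine into a single class label. The `toy` data frame, its values, and the `tercile()` helper are hypothetical and exist only for illustration. `bi_class()` from `biscale` produces labels of a similar "x-y" form, but its break computation may differ from this hand-rolled version.

```r
# Hypothetical ZIP-level values, invented for illustration only
toy <- data.frame(
  zip = sprintf("97%03d", 1:9),
  avg_heavy_smoke = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
  asthma_rate = c(0.09, 0.02, 0.05, 0.01, 0.08, 0.03, 0.07, 0.04, 0.06)
)

# Hypothetical helper: assign each value to a tercile
# (1 = lowest third, 3 = highest third)
tercile <- function(v) {
  breaks <- quantile(v, probs = c(0, 1 / 3, 2 / 3, 1), na.rm = TRUE)
  as.integer(cut(v, breaks = breaks, include.lowest = TRUE))
}

# x is smoke and y is asthma, following the legend orientation
toy$x_class <- tercile(toy$avg_heavy_smoke)
toy$y_class <- tercile(toy$asthma_rate)

# Combine the two terciles into one label such as "3-1"
toy$bi_class_manual <- paste(toy$x_class, toy$y_class, sep = "-")

# Count how many ZIP codes fall into each of the combined classes
table(toy$bi_class_manual)
```

Under this labeling, a ZIP code in class "3-3" is in the top third for both smoke exposure and asthma rate, while "1-1" is in the bottom third for both. Because the breaks are quantiles of the observed data, the classes are relative to the ZIP codes on the map, not to any absolute threshold.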
| 513 | + |
421 | 514 | ### Logistic Regression Analysis
|
422 | 515 |
|
423 | 516 | Finally, we fit a logistic regression model to examine the relationship between asthma diagnoses and exposure to different levels of smoke density.
|
|