DRD_in_Scotland_Visualisations.Rmd

---
output:
  html_document: default
  pdf_document: default
jupyter:
  kernelspec:
    display_name: R
    language: R
    name: ir
  language_info:
    codemirror_mode: r
    file_extension: .r
    mimetype: text/x-r-source
    name: R
    pygments_lexer: r
    version: 4.1.0
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)

library(readxl)
library(reshape2)
library(ggplot2)
library(dplyr)
library(forcats)
library(tidyr)
library(sf)
library(ggpubr)
library(scales)

# default region colours
reg.colours = c("#4D4D4D", "#AEAEAE")

# The colour-blind palette with grey:
myPalette <- c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7", "#000000")

current_year = 2021
previous_year = current_year -1
# breaks for y-axis line graphs
ybreaks = seq(2000,current_year,2)

# read data - download first as read_excel() can't do URLs
xlfile = tempfile(fileext = '.xlsx')
download.file("https://www.nrscotland.gov.uk/files//statistics/drug-related-deaths/21/drug-related-deaths-21-tabs-figs.xlsx", xlfile, mode="wb")
xlfile2 = tempfile(fileext = '.xlsx')
download.file("https://www.nrscotland.gov.uk/files//statistics/drug-related-deaths/2018/drd-comparison-with-other-countries-2016-2018.xlsx", xlfile2, mode="wb")

```

# Drug-Related Deaths in Scotland `r current_year` {.tabset}

Annual data is published on drug-related deaths (DRDs) in Scotland by the [National Records of Scotland](https://www.nrscotland.gov.uk/statistics-and-data/statistics/statistics-by-theme/vital-events/deaths/drug-related-deaths-in-scotland). The 2021 data was published on 28th July 2022.

A collection of visualisations of the [`r current_year` data](https://www.nrscotland.gov.uk/files//statistics/drug-related-deaths/20/drug-related-deaths-21-tabs-figs.xlsx) are presented here for those interested in exploring the trends or patterns in the data. Little to no commentary is provided on what the data show.

Each table from the data are presented separately by one or more charts. Not all tables are represented here mostly because the data has already been presented in the raw data already.

All analyses and code are available on [github](https://github.com/drchriscole/drugdeathsscotland) for others to use. Comments and suggestion welcome.

## Table 2

```{r t2}
# read data
table2.dat = read_excel(xlfile, sheet = '2 - causes', col_names=FALSE)

# extract core data and convert to numbers
dat = table2.dat[22:32,c(1,3:7)]
dat = data.frame(apply(dat, 2, as.numeric))
dat[is.na(dat)] <- 0

# add colnames
colnames(dat) <- c('Year','Drug Abuse','Accidental Poisoning','Suicide','Assault by Drugs','Undetermined')

# factorise the years to turn into a label and melt
dat$Year <- factor(dat$Year)
dat.m = melt(dat)

year.totals = dat.m %>% group_by(Year) %>% summarise(Total = sum(value))

ggplot(dat.m, aes(x=Year, y=value, fill=variable)) +
  labs(title = 'Drug-Related Deaths in Scotland',
       subtitle = "Underlying cause of death",
       x = '',
       y = 'No. Deaths') +
  geom_col() +
  geom_text(aes(x=Year, y=Total+28, label = Total, fill = NULL), data=year.totals, size=2.8) +
  scale_fill_manual(values = myPalette) +
  theme_minimal() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        legend.title = element_blank())

#ggsave('DRD_cause.png', bg = 'white')
```

## Table 3

```{r t3, fig.width=7}
# read data
table.dat = read_excel(xlfile, sheet = '3 - drugs reported', col_names=FALSE)

# extract core data and convert to numbers
dat = table.dat[14:35,c(1,3:18)]
dat = data.frame(apply(dat, 2, as.numeric))

colnames(dat) <- c('Year', 'Any opiate', 'Heroin.morphine', 'Methadone','Buprenorphin','Codeine', 'Dihydrocodeine', 'Anybenzo', 'Presbenzo', 'Diazepam', 'Streetbenzo', 'Etizolam', 'GabaPre', 'Cocaine', 'Ecstasy', 'Amphetamines', 'Alcohol')

# combine the codeine/dihydrocodeine cols and remove
dat = data.frame(dat, Codeines = dat$Codeine + dat$Dihydrocodeine)

# subtract the diazepam data from
# the any prescsribed benzo data to make an "other pres benzo" column
dat = data.frame(dat, "Otherpbenzo"=dat$Presbenzo - dat$Diazepam)

# similarly subtract etz for street benzo to
# give other street benzo
dat = data.frame(dat, "Othersbenzo"=dat$Streetbenzo - dat$Etizolam)

# remove uninterest columns:
#   heroin or OST and any opiate are not very informative
#   Codeine/dihydrocodeine unnecessary now as they've been combined
#   any benzo is superceded by other benzos
# individual cases)
dat = dat[,c(1,3:5,8:20)]

opiates = dat[,names(dat) %in% c('Year','Heroin.morphine','Methadone','Buprenorphin','Diazepam','Etizolam', 'Codeines','Otherpbenzo','Othersbenzo')]
others = dat[,names(dat) %in% c('Year','GabaPre','Cocaine','Ecstasy','Amphetamines','Alcohol')]

top.drugs = dat[,names(dat) %in% c('Year','GabaPre','Cocaine','Heroin.morphine','Methadone','Etizolam')]

# factorise the years to turn into a label and melt
opiates$Year <- factor(opiates$Year)
opiates.m = melt(opiates)

ggplot(opiates.m, aes(x=Year, y=value, group=variable, colour=variable)) +
  labs(title = "Drugs Implicated in Death", 
       subtitle = 'Opiates and benzodiazepines',
       x = '',
       y = "No. of Deaths") +
#  labs(caption = "Data prior to 2008 uses a different definition than the data from 2008.") +
  ylim(c(0,850)) +
  geom_vline(xintercept=which(opiates$Year == '2008'), colour="grey", linetype="dashed") +
  geom_line(size=1.1) +
  scale_color_manual(values = myPalette, labels = c('Heroin or Morphine','Methadone','Buprenorphin','Diazepam','Etizolam','Codeines','Other prescribed \nbenzodiazepines','Other street\nbenzodiazipnes')) +
  scale_x_discrete(breaks = ybreaks) +
  theme_minimal() +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        legend.title = element_blank())

#ggsave("DRD_implicated_drug1.png", bg = 'white')


# factorise the years to turn into a label and melt
others$Year <- factor(others$Year)
others.m = melt(others)

ggplot(others.m, aes(x=Year, y=value, group=variable, colour=variable)) +
  labs(title = "Drugs Implicated in Death", 
       subtitle = 'Others',
       x = '',
       y = "No. of Deaths") +
#  labs(caption = "Data prior to 2008 uses a different definition than the data from 2008.") +
  ylim(c(0,850)) +
  geom_vline(xintercept=which(opiates$Year == '2008'), colour="grey", linetype="dashed") +
  geom_line(size=1.1) +
  scale_color_manual(values = myPalette, labels = c('Gabapentin or\nPregabalin','Cocaine','Ecstasy','Amphetamines','Alcohol')) +
  scale_x_discrete(breaks = ybreaks) +
  theme_minimal() +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        legend.title = element_blank())

# factorise the years to turn into a label and melt
# top.drugs$Year <- factor(top.drugs$Year)
# top.drugs.m = melt(top.drugs)
# 
# ggplot(top.drugs.m , aes(x=Year, y=value, group=variable, colour=variable)) +
#   labs(title = "Drugs Implicated in Death", 
#        #subtitle = 'Others',
#        x = '',
#        y = "No. of Deaths") +
# #  labs(caption = "Data prior to 2008 uses a different definition than the data from 2008.") +
#  # ylim(c(0,850)) +
#   geom_line(size=1.1) +
#   scale_color_manual(values = myPalette, labels = c('Heroin or Morphine','Methadone','Etizolam','Gabapentin or\nPregabalin','Cocaine')) +
#   scale_x_discrete(breaks = ybreaks) +
#   theme_minimal() +
#   theme(panel.grid.major.x = element_blank(),
#         panel.grid.minor.x = element_blank(),
#         legend.title = element_blank())
# 

# ggsave("DRD_implicated_drug3.png", bg = 'white', width = 7, height =4, units = 'in')
```

NB: data prior to 2008 uses a different definition than the data from 2008. See notes in table 3.


## Table 4

Table 4 describes quite a lot of data represented in different ways. Each is presented as a different plot.

**NOTE:** in 2022 the data has been restructured so code needs a bit of an update. Most of the plots are currently missing.

```{r t4a, eval=FALSE}
# read data
table.dat = read_excel(xlfile, sheet = '4 - sex and age', col_names=FALSE)

# extract core data and convert to numbers  there are some
# columns used as fillers. Gah!
dat = table.dat[16:36,c(1,2,4,5,7:13,16:18)]
dat = data.frame(apply(dat, 2, as.numeric))

colnames(dat) <- c('Year','all','male','female','u14','15_24','25_34','35_44','45_54','55_64','o65','lq','median','uq')

ggplot(dat, aes(x=Year, y=median)) +
  labs(title = 'Drug-Related Deaths in Scotland',
          subtitle = "Median age at death", 
          caption = 'Shaded area represents the lower to upper quartile range of ages.',
          x = '',
          y = 'Age') +
  geom_line(size=1.2) +
  ylim(c(0,50)) +
  stat_summary(ymin=dat$lq, ymax=dat$uq, geom="ribbon", alpha=0.2, fill='red') +
  scale_x_continuous(breaks = ybreaks) +
  theme_minimal() +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        legend.title = element_blank())

#ggsave('DRD_median_age.png')
```


Median age of drug-related death is increasing from 30 in 2000 to 43 in 2020.

```{r t4b, eval=FALSE}
# calc percentage m/f deaths
dat = data.frame(dat, fempc = 100*dat$female/dat$all, malpc = 100*dat$male/dat$all)

dat$Year = factor(dat$Year)
dat.sex = dat[,c('Year','fempc','malpc')]
dat.sex.m = melt(dat.sex)

ggplot(dat.sex.m, aes(x=Year, y=value, group=variable, colour=variable)) +
  labs(title = 'Drug-Related Deaths in Scotland',
       subtitle = "Male vs female Deaths",
       x = '',
       y = "Percentage Deaths",
       caption = "Trend line: loess") +
  geom_smooth(method = 'loess', colour = 'grey') +
  geom_line(size=1.2) + 
  ylim(c(10,90)) +
  scale_color_discrete(name='Sex', breaks=c('malpc','fempc'),labels=c('Male','Female')) +
  scale_x_discrete(breaks = ybreaks) +
  theme_minimal() +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        legend.title = element_blank())

#ggsave("mf_plot.png")

```

Males are mostly affected by drug-related deaths. Females were increasing, but may have stablised.

```{r t4c}
## get data from lower table

calc_percent <- function(x) {
  year = x[1]
  x = x[-1]
  total = sum(x)
  pc.x = round(100*x/total,2)
  return(c(year, pc.x))
}

# read data
table.dat = read_excel(xlfile, sheet = '4 - sex and age', col_names=FALSE)

# extract male data and convert to numbers  there are some
# columns used as fillers. Gah!
mdat = table.dat[30:51,c(1,3:7)]
mdat = data.frame(apply(mdat, 2, as.numeric))

# turn into percentages
mdat = data.frame(t(apply(mdat,1,calc_percent)))
colnames(mdat) <- c('Year','u24','25-34','35-44','45-54','o55')

# extract female data and convert to numbers  there are some
# columns used as fillers. Gah!
fdat = table.dat[54:75,c(1,3:7)]
fdat = data.frame(apply(fdat, 2, as.numeric))

# turn into percentages
fdat = data.frame(t(apply(fdat,1,calc_percent)))
colnames(fdat) <- c('Year','u24','25-34','35-44','45-54','o55')

dat = rbind(mdat,fdat)

dat = data.frame(dat, Sex=rep(c('Male','Female'),each=nrow(mdat)))

dat$Year = factor(dat$Year)
dat.m = melt(dat)

ggplot(dat.m, aes(x=Year, y=value, fill=variable)) +
  labs(title = 'Drug-Related Deaths in Scotland',
       subtitle = 'Deaths by Sex and Age',
       x = '',
       y = 'Percentage Deaths') +
  geom_col() +
  facet_grid(Sex ~ .) +
  scale_fill_manual(name='Age', labels=c('under 25','25-34','35-44','45-54', 'over 55'), values = myPalette) +
  scale_x_discrete(breaks = ybreaks) +
  theme_minimal() +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        strip.background = element_rect(fill='grey'),
        legend.title = element_blank())

#ggsave("sex_age.png", bg = 'white')
```

In 2000 deaths were largely in the under 35s. In 2021 they are dominated by the 35 and over. 

## Table 5

```{r t5, fig.width=5.5, fig.height=6.5}
# read data
table.dat = read_excel(xlfile, sheet = '5 - sex age cause', col_names=FALSE)

# extract core data and convert to numbers
dat = table.dat[c(16:20,23:27),c(1,3:5,7)]
#dat = data.frame(apply(dat, 2, as.numeric))
colnames(dat) <- c("Age",'Drug Abuse','Accident','Suicide','Undetermined')
dat = data.frame(dat, Sex = rep(c("Male","Female"), each = 5))
dat.m = melt(dat, id=c('Age','Sex'))
dat.m$value = as.numeric(dat.m$value)
dat.m$Age = factor(dat.m$Age, levels = c('Under 25','25-34','35-44','45-54','55 and over'))

age.totals = dat.m %>% group_by(Age,Sex) %>% summarise(Total = sum(value))


ggplot(dat.m, aes(x=Age, y=value, fill=variable)) +
  ggtitle(sprintf("Drug-Related Deaths in Scotland %d", current_year), subtitle="By Sex, Age and Underlying Cause of Death") +
  ylab('No. Deaths') +
  geom_col() +
  facet_grid(Sex ~ .) +
  scale_fill_manual(values = myPalette) +
  geom_text(aes(x=Age, y=Total+14, label = Total, fill = NULL), data=age.totals, size=2.8) +
  theme_minimal() +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        strip.background = element_rect(fill='grey'),
        legend.title = element_blank())


```

## Table 6

```{r t6, fig.width=6, fig.height=5}

# read data
table6.dat = read_excel(xlfile, sheet = '6 - sex, age and drugs', col_names=FALSE)

# extract core data and convert to numbers
dat = table6.dat[c(24:28,32:36),c(1,3:18)]
colnames(dat) <-c('Age', 'Any opiate', 'Heroin.morphine', 'Methadone','Buprenorphin','Codeine', 'Dihydrocodeine', 'Anybenzo', 'Presbenzo', 'Diazepam', 'Streetbenzo', 'Etizolam', 'GabaPre', 'Cocaine', 'Ecstasy', 'Amphetamines', 'Alcohol')

# combine the codeine/dihydrocodeine cols and remove
dat = data.frame(dat, Codeines = as.numeric(dat$Codeine) + as.numeric(dat$Dihydrocodeine))

# subtract the diazepam data from
# the any prescsribed benzo data to make an "other pres benzo" column
dat = data.frame(dat, "Otherpbenzo"=as.numeric(dat$Presbenzo) - as.numeric(dat$Diazepam))

# similarly subtract etz for street benzo to
# give other street benzo
dat = data.frame(dat, "Othersbenzo"=as.numeric(dat$Streetbenzo) - as.numeric(dat$Etizolam))

# remove uninterest columns:
#   heroin or OST and any opiate are not very informative
#   Codeine/dihydrocodeine unnecessary now as they've been combined
#   any benzo is superceded by other benzos
# individual cases)
dat = dat[,c(1,3:5,10,12:20)]

dat = data.frame(dat, Sex = rep(c("Male","Female"), each = 5))
dat.m = melt(dat, id=c('Age','Sex'))
dat.m$value = as.numeric(dat.m$value)
dat.m$Age = factor(dat.m$Age, levels = c('Under 25','25-34','35-44','45-54','55 and over'))

ggplot(dat.m, aes(x=variable, y=value, fill=Age)) +
  ggtitle("Drug-Related Deaths in Scotland", subtitle="By Sex, Age and Implicated Drug") +
  ylab('No. Deaths') +
  xlab('') +
  geom_col() +
  facet_grid(Sex ~ .) +
  coord_flip() +
  scale_fill_manual(values = myPalette) +
#  c('Other street benzo', 'Other prescribed benzo', 'Codeines', 'Alcohol', 'Amphetamines', 'Ecstasy', 'Cocaine', 'Gapapentin or\nPregabalin', 'Etizolam', 'Diazepam', 'Methadone', 'Heroine or morphine')
  scale_x_discrete(labels = c( 'Heroine or morphine', 'Methadone', 'Buprenorphin', 'Diazepam', 'Etizolam', 'Gapapentin or\nPregabalin', 'Cocaine', 'Ecstasy', 'Amphetamines', 'Alcohol', 'Codeines', 'Other prescribed benzo', 'Other street benzo')) +
#  geom_text(aes(x=Age, y=Total+14, label = Total, fill = NULL), data=age.totals, size=2.8) +
  theme_minimal() +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        strip.background = element_rect(fill='grey'),
        legend.title = element_blank())


```

NB: Totals in this chart will be higher than total number of deaths as deaths can have more than one drug implicated.

## Table 9

```{r t9}
# read data
table8.dat = read_excel(xlfile, sheet = '9 - death rates by age', col_names=FALSE)

# extract core data and convert to numbers
dat = table8.dat[6:27,2:6]
dat = data.frame(apply(dat, 2, as.numeric))

# extract mean for last five years
yr5mean = as.numeric(table8.dat[33,7])

# add labels
names(dat) <- c("15-24","25-34","35-44","45-54","55-64")
dat = cbind(Year=factor(seq(2000,2021)), dat)

# melt data and plot
# note: dashed line is for average deaths 2012-2016 for ages 15-64
dat.m = melt(dat, variable.name = "Age")
ggplot(dat.m, aes(x=Year, y=value, group=Age, colour=Age)) +
  geom_hline(yintercept = yr5mean, colour="grey", linetype="dashed") +
  geom_line(size=1.2) +
  annotate("text", label="5-year Mean", x=2,y=yr5mean+1.5,size=3,colour="grey") +
  scale_x_discrete(breaks=ybreaks) +
  scale_color_manual(values=myPalette) +
  labs(title = sprintf("Drug-Related Deaths in Scotland %d", current_year),
       subtitle = "Rates by Age",
       x = '',
       y = 'Deaths per 100,000 population') +
  theme_minimal() +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank())


#ggsave('DRD_age_rates.png', bg = 'white')
```

Further evidence that DRDs are an age-related issue.

## Regional Changes - Table HB1

```{r tHB1}
# read data
table.dat = read_excel(xlfile, sheet = "HB1 - summary", col_names=FALSE)

# extract core data and convert to numbers
startyr = 2010
dat = table.dat[6:19,1:13]
dat = data.frame(dat[,1],apply(dat[,2:13], 2, as.numeric))
colnames(dat) = c('Region', seq(startyr,current_year,1))
dat[dat$Region == 'Highland 3' , 1] = 'Highland'
dat[dat$Region == 'Greater Glasgow & Clyde 3' , 1] = 'Greater Glasgow & Clyde'

# order regions by the previous year's data
dat$Region = factor(dat$Region, c(dat[order(dat[,ncol(dat)-1]),]$Region))

dat.m = melt(dat)

rplt = ggplot(dat, aes(x=Region, y=!!as.name(current_year))) +
  ggtitle("Drug-Related Deaths in Scotland", subtitle=sprintf("Change between 2010 - %d", current_year)) +
  geom_linerange(aes(ymin=`2010`,ymax=!!as.name(current_year)), colour='grey', size=1.5) +
  geom_point(aes(y=`2010`), size =1.2, colour='orange') +
  geom_point(aes(y=!!as.name(current_year)), size =1.2, colour='orange') +
  geom_point(aes(y=!!as.name(previous_year)), size =1.2, colour='darkgreen') +
  geom_text(aes(y=!!as.name(previous_year), label = ifelse(Region == "Greater Glasgow & Clyde", previous_year, '')), vjust = 1.4, size=2.2, colour="grey30") +
  geom_text(aes(y=`2010`, label = ifelse(Region == "Greater Glasgow & Clyde", '2010', '')), vjust = -0.5, size=2.6, colour="grey30") +
  geom_text(aes(y=!!as.name(current_year), label = ifelse(Region == "Greater Glasgow & Clyde", current_year, '')), vjust = -0.5, size=2.6, colour="grey30") +
  ylab('No. Deaths') +
  xlab('') +
  coord_flip() +
  theme_minimal()

```


```{r tHBmap, fig.width=8}

# shapefile data from: https://spatialdata.gov.scot/geonetwork/srv/eng/catalog.search#/metadata/f12c3826-4b4b-40e6-bf4f-77b9ed01dc14
sg_hb <- st_read('SG_NHS_HealthBoards_2019/SG_NHS_HealthBoards_2019.shp', quiet = TRUE)

sg_hb <- st_simplify(sg_hb, dTolerance = 75)
#st_geometry_type(sg_hb)
#st_crs(sg_hb)
#summary(sg_hb)

# sort regions alphabetically to match the DRD data
sg_hb_ordered = sg_hb[order(sg_hb$HBName),]

# add most recent deaths data 
deaths = dat$`2020`
df = cbind(sg_hb_ordered, deaths = deaths)

tayside = df %>% filter(HBName %in% c("Tayside", "Fife", "Grampian", "Lothian", "Forth Valley"))

mapplt = ggplot(df, aes(fill = deaths)) + 
  geom_sf(size = 0.4, color = "black") + 
  scale_fill_continuous(name = sprintf("Deaths\nin %s", current_year)) +
  coord_sf() + 
  theme_void() +
  theme(legend.position = 'bottom',
        legend.direction = 'horizontal',
        legend.key.size = unit(15, 'points'))

ggarrange(rplt, mapplt, ncol = 2, widths = c(3,1))

ggsave('DRD_region_change.png', bg = 'white')

```

```{r tHBb, fig.width=8}

# remove islands as numbers too small
#dat = dat[dat$`2017` > 5,]

# do percentage change between current year and previous
change.dat = dat %>% 
  mutate(Change = !!as.name(current_year) - !!as.name(previous_year)) %>% 
  mutate(PcChange = round(100*Change/!!as.name(previous_year),1)) %>% 
  select(Region,Change,PcChange)

# remove non-finite data
change.dat[!is.finite(change.dat$PcChange), 'PcChange'] <- NA

# add change data to map
df = cbind(sg_hb_ordered, Change = change.dat$Change, PcChange = change.dat$PcChange)

change.dat$Region = factor(change.dat$Region, levels = change.dat[order(change.dat$Change),]$Region)

plt1 = ggplot(change.dat, aes(x=Region, y=Change, fill="#D55E00")) +
  labs(title = sprintf("Drug-Related Deaths in Scotland %d", current_year), 
       subtitle="Year-on-year change by NHS health board",
       x = '',
       y = "Change in No. Deaths") +
  geom_col() +
  ylim(c(-25,25)) +
  coord_flip() +
  geom_text(aes(label = sprintf("%+d",Change)), hjust = ifelse(change.dat$PcChange > 0, -0.2, 1.1), size=3) +
  theme_minimal() +
  theme(panel.grid.major.y = element_blank(),
        panel.grid.minor.y = element_blank(),
        legend.position = "none")

plt2 = ggplot(df, aes(fill = Change)) + 
  geom_sf(size = 0.4, color = "black") + 
  scale_fill_gradient2(name = "YoY\nChange", 
                       low = muted("blue"),
                       mid = "white",
                       high = muted("red")) +
  coord_sf() + 
  theme_void() +
  theme(legend.position = 'bottom',
        legend.direction = 'horizontal',
        legend.key.size = unit(15, 'points'))

ggarrange(plt1, plt2, ncol = 2, widths = c(3,1))

#ggsave('DRD_YOY_region_change.png', bg = 'white')
```


```{r tHBc, fig.width=8}

plt1 = ggplot(change.dat, aes(x=fct_reorder(Region, PcChange), y=PcChange, fill="#D55E00")) +
  labs(title = sprintf("Drug-Related Deaths in Scotland %d", current_year), 
       subtitle="Year-on-year percentage change by NHS health board",
       x = '',
       y = "Change in Deaths (%)") +
  geom_col() +
  ylim(c(-75,75)) +
  coord_flip() +
  geom_text(aes(label = sprintf("%+.0f%%",PcChange)), hjust = ifelse(change.dat$PcChange > 0, -0.1, 1.1), size=3) +
  theme_minimal() +
  theme(panel.grid.major.y = element_blank(),
        panel.grid.minor.y = element_blank(),
        legend.position = "none")

plt2 = ggplot(df, aes(fill = PcChange)) + 
  geom_sf(size = 0.4, color = "black") + 
  scale_fill_gradient2(name = "YoY\n% Change", 
                       low = muted("blue"),
                       mid = "white",
                       high = muted("red"),
                       limits = c(-40,100)) +
  coord_sf() + 
  theme_void() +
  theme(legend.position = 'bottom',
        legend.direction = 'horizontal',
        legend.key.size = unit(15, 'points'))

ggarrange(plt1, plt2, ncol = 2, widths = c(3,1))

#ggsave('DRD_YOY_pc_region_change.png', bg = 'white')

```


## Council Comparisons 

```{r tC1a, fig.height = 6}

## experiment with small multiples
# read data
table.dat = read_excel(xlfile, sheet = "C1 - summary", col_names=FALSE)

# extract core data and convert to numbers
startyr = 2010
dat = table.dat[6:37,1:13]
colnames(dat) = c('Council',seq(startyr,current_year,1))
dat$Council = c("Aberdeen", "Aberdeenshire", "Angus", "Argyll & Bute", "Edinburgh", "Clack'shire", "Dumfries & Ga.", "Dundee", "E. Ayrshire", "E. Dunb'shire", "E. Lothian", "E. Renf'shire", "Falkirk", "Fife", "Glasgow", "Highland", "Inverclyde", "Midlothian", "Moray", "Na h-Eil. Siar", "N. Ayrshire", "N. Lanarkshire", "Orkney Islands", "Perth & Kinross", "Renfrewshire", "Scot. Borders", "Shetland Islds", "South Ayrshire", "S. Lanarkshire", "Stirling", "W. Dunb'shire", "West Lothian")
dat.m = melt(dat)

ggplot(dat.m, aes(x=variable, y=value, group=Council, colour=ifelse(dat.m$Council == "Dundee", "A","B"))) +
  ggtitle(sprintf("Drug-Related Deaths in Scotland by Council %d-%d",startyr,current_year)) +
  xlab('Year') +
  ylab('No. Deaths') +
  geom_line(size=1.1) +
  scale_x_discrete(breaks=c(startyr, current_year), labels=c("'10","'20")) +
  scale_colour_manual(values=c('orange','grey')) +
  facet_wrap(~Council) +
  theme_minimal() +
  theme(legend.position = "none",
        panel.grid.major.y = element_line(colour = 'lightgrey'),
        panel.grid.major.x = element_blank(),
        panel.grid.minor = element_blank())

#ggsave('coucil_multi.png', bg = 'white')
```


```{r tC4, fig.width=7, fig.height = 6}
# read data
table.dat = read_excel(xlfile, sheet = "C4 - age-stand death rates", col_names=FALSE)

# extract core data and convert to numbers
#dat = table.dat[c(8,10:41),c(1:12,17)]
startyr = 2004
dat = table.dat[c(7:39),c(1:19)]
dat = data.frame(dat[,1],apply(dat[,2:19], 2, as.numeric))
dat[is.na(dat)] <- 0
names(dat) = c('Council',seq(startyr,current_year))

# remove whole-of-scotland data
scot.dat = dat[1,]
dat = dat[-1,]

# order factors by DRD rate
dat$Council = factor(dat$Council, levels=dat[order(dat$`2021`),1])

ggplot(dat, aes(x=Council, y=`2021`, fill=factor(ifelse(dat$Council=="Dundee City", "Y","N")))) +
  geom_col() +
  #ylim(c(0,490)) +
  ggtitle(sprintf("5 Year Drug-Related Death Rate in Scotland 2017-2021"), 
          subtitle = "By council per 100,000 population") +
  ylab("Drug-Related Deaths (per 100,000 population)") +
  labs(caption="Using 2018 population data") +
  geom_hline(aes(yintercept=scot.dat$`2021`), colour="#990000", linetype="dashed") +
  geom_text(aes(label = round(`2021`,1)), hjust=-0.1, size=2.5) +
  scale_fill_manual(values=c('grey','orange'))+
  annotate("text", label=sprintf("Scotland (%.1f)",scot.dat$`2021`), x=4, y=scot.dat$`2021`+1,colour="grey", angle=-90, size=2.5) +
  coord_flip() +
  theme_minimal() +
  theme(panel.grid.major.y = element_blank(),
        panel.grid.minor.y = element_blank(),
        legend.position = "none")

#ggsave("council_5yr_rate.png", bg = 'white')
```


```{r tC3, fig.height=10, fig.width=8}
# read data
table.dat = read_excel(xlfile, sheet = "C3 - drugs implicated", col_names=FALSE)

# extract core data and convert to numbers
dat = table.dat[c(13:44),c(1:18)]
councils = dat[,1]
dat = data.frame(apply(dat[,2:18], 2, as.numeric))
dat = data.frame(councils, dat)
colnames(dat) <-c('Council','Totals', 'Any opiate', 'Heroin.morphine','Methadone','Buprenorphin', 'Codeine', 'Dihydrocodeine', 'Anybenzo', 'Presbenzo', 'Diazepam', 'Streetbenzo', 'Etizolam', 'GabaPre', 'Cocaine', 'Ecstasy', 'Amphetamines', 'Alcohol')

dat$Council = c("Aberdeen", "Abrdshr", "Angus", "Argyll", "Edinburgh", "Clk", "DnG", "Dundee", "E. Ayr", "E. Dun", "E. Loth", "E. Ren", "Falkirk", "Fife", "Glasgow", "Highland", "Inverclyde", "Midlothian", "Moray", "Na h-Eil", "N. Ayr", "N. Lanark", "Orkney", "PnK", "Renfrew", "SBord", "Shetld", "S. Ayr", "S. Lanark", "Stirling", "W. Dun", "West Loth")

dat = dat[dat$Totals > 40,]

list.councils = dat$Council

# combine the codeine/dihydrocodeine cols and remove
dat = data.frame(dat, Codeines = as.numeric(dat$Codeine) + as.numeric(dat$Dihydrocodeine))

# subtract the diazepam data from
# the any prescsribed benzo data to make an "other pres benzo" column
dat = data.frame(dat, "Otherpbenzo"=as.numeric(dat$Presbenzo) - as.numeric(dat$Diazepam))

# similarly subtract etz for street benzo to
# give other street benzo
dat = data.frame(dat, "Othersbenzo"=as.numeric(dat$Streetbenzo) - as.numeric(dat$Etizolam))

# remove uninterest columns:
#   heroin or OST and any opiate are not very informative
#   Codeine/dihydrocodeine unnecessary now as they've been combined
#   any benzo is superceded by other benzos
# individual cases)
dat = dat[,c(1,2,4:6,11,13:21)]

dat.pc = data.frame(dat$Council, 100*dat[,3:15]/dat$Totals)
dat.pc.m = melt(dat.pc)

drug_names = list(
  'Heroin.morphine' = "Heroin or\nMorphine",
  'Methadone' = 'Methadone',
  'Buprenorphin' = 'Buprenorphin',
  'Diazepam' = "Diazepam",
  'Etizolam' = 'Etizolam',
  'GabaPre' = "Gabapentin or\nPregabalin",
  'Cocaine' = 'Cocaine',
  'Ecstasy' = 'Ecstasy',
  'Amphetamines' = 'Amphetamines',
  'Alcohol' = 'Alcohol',
  'Codeines' = 'Codeines',
  'Otherpbenzo' = "Other prescribed\nbenzodiazepines",
  'Othersbenzo' = "Other 'street'\nbenzodiazepines"
)
drug_labeller = function(string) {
  return(drug_names[string])
}


ggplot(dat.pc.m, aes(x=dat.Council, y=value, fill=factor(ifelse(dat.Council=="Dundee", "Y","N")))) +
  ggtitle(sprintf("Drugs Implicated in Death in Scotland %d", current_year), subtitle="For councils with >40 deaths") +
  xlab("Council") +
  ylab("Percentage Deaths") +
  geom_col() +
  geom_text(aes(label=sprintf("%.1f%%",value)), size=3, vjust=-0.5) +
  scale_y_continuous(limits = c(0,100), breaks = c(0, 50, 100)) +
  scale_fill_manual(values=c('grey','orange'))+
  facet_grid(variable ~ ., labeller = labeller(variable=drug_labeller)) +
  theme_minimal() +
  theme(strip.text.y = element_text(angle = 0),
        legend.position = "none")

#ggsave("DRD_council_implicated_drug.png", height = 10, width= 8, units = 'in', dpi=150, bg = 'white')

```

```{r tC5, fig.height=6}
# read data
table.dat = read_excel(xlfile, sheet = "C5 - rates by age-group", col_names=FALSE)

# extract core data and convert to numbers
dat = table.dat[c(11:42),c(1:6)]
councils = dat[,1]
dat = data.frame(apply(dat[,2:6], 2, as.numeric))
dat = data.frame(councils, dat)
colnames(dat) <-c('Council','15-24','25-34','35-44','45-54','55-64')

dat$Council = c("Aberdeen", "Abrdshr", "Angus", "Argyll", "Edinburgh", "Clk", "DnG", "Dundee", "E. Ayr", "E. Dun", "E. Loth", "E. Ren", "Falkirk", "Fife", "Glasgow", "Highland", "Inverclyde", "Midlothian", "Moray", "Na h-Eil", "N. Ayr", "N. Lanark", "Orkney", "PnK", "Renfrew", "SBord", "Shetld", "S. Ayr", "S. Lanark", "Stirling", "W. Dun", "West Loth")

dat = dat[dat$Council %in% list.councils,]
dat.m = melt(dat)

ggplot(dat.m, aes(x=variable, y=value, group=Council, colour=ifelse(Council == "Dundee", "A","B"))) +
  ggtitle("Drug-Related Deaths in Scotland per 100,000 population", subtitle = "By council area (annual average 2017-2021)") +
  ylab("Death Rate") +
  xlab("Age Group") +
  geom_line() +
  geom_point(size=1.6, colour="black") +
  geom_point(size=0.8, colour='white') +
  scale_y_continuous(limits = c(0, 155), breaks = c(0,50,100,150)) +
  scale_colour_manual(values=c('orange','black')) +
  geom_text(aes(label=round(value,0)), vjust = ifelse(dat.m$value < 140, -0.6, 1.4), size=2.4, colour="black") +
  facet_grid(Council ~.) +
  # tried to add the "All ages" average to the plots,
  # got it looks too busy and tricky to format right. Removing for time being
#  geom_hline(aes(yintercept=AllAges), data=all.ages, colour='grey',linetype='dashed') +
#  geom_text(aes(x='55-64',y=AllAges, label=AllAges), data=all.ages, vjust=-0.6,hjust=-3, colour='grey', size=2.4) +
#  geom_text(aes(x='55-64',y=AllAges, label='All ages'), data=all.ages, vjust=0.6,hjust=-2, colour='grey', size=2.4) +
  theme_minimal() +
  theme(strip.text.y = element_text(angle=0),
        legend.position = "none")

#ggsave("DRD_council_age_group.png", height = 6, width= 6, units = 'in', dpi=150, bg = 'white')
```

## Different Definitions

```{r tx}

# read data
tableX.dat = read_excel(xlfile, sheet = "X - diff defs", col_names=FALSE)

# extract core data and convert to numbers
dat.num = tableX.dat[c(44:55),c(1:4)]
colnames(dat.num) = c('Year','NRS','ONS','EMCDDA')
dat.num.l = dat.num %>% gather(Method, Counts, -Year)

ggplot(dat.num.l, aes(x = Year, y = as.numeric(Counts), group = Method, colour = Method)) +
  geom_line() +
  geom_point(data = dat.num.l %>% filter(Year == current_year)) +
  geom_text(data = dat.num.l %>% filter(Year == current_year), aes(label = as.numeric(Counts)), colour = 'grey4', size = 2.4, hjust = -0.2) +
  scale_color_manual(values = myPalette) +
#  scale_x_discrete(breaks = ybreaks) +
  labs(title = 'Drug-Related Deaths in Scotland',
       subtitle = 'Comparison of different national definitions',
       x = '',
       y = 'No. Deaths') +
  theme_minimal() +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        legend.title = element_blank())

#ggsave('DRD_diff_defns.png', bg = 'white')
```


## UK Nations Comparison

*Note:* Not updated since August 2019

```{r uk, eval=FALSE}

# read region std def data spreadsheet
table2.dat = read_excel(xlfile2, sheet = 3, col_names=FALSE)

# extract data and convert to numbers
dat.std = table2.dat[c(26:28,30,31),c(1,3:5)]
names(dat.std) <- c('Country','2016','2017','2018')
dat.std$`2017` = as.numeric(dat.std$`2017`)
dat.std$defn = 'NRS'

# read region ONS def data spreadsheet
table2.dat = read_excel(xlfile2, sheet = 4, col_names=FALSE)

# extract data and convert to numbers
dat.ons = table2.dat[c(26:28,30,31),c(1,3:5)]
names(dat.ons) <- c('Country','2016','2017','2018')
dat.ons$`2016` = as.numeric(dat.ons$`2016`)
dat.ons$defn = 'ONS'

dat = rbind(dat.std, dat.ons)
dat.l = dat %>% gather(year, perm, `2016`:`2018`)

ggplot(dat.l, aes(x = year, y = perm, group=Country, colour=Country, linetype = ifelse(Country == 'UK', 'a', 'b'))) +
  labs(title = 'Drug-Related Deaths in the UK',
       subtitle = 'Comparison of different definitions',
       x = '',
       y = 'Deaths per million population') +
  geom_line() +
#  geom_line(data = dat.l[dat.l$Country == 'UK',], aes(x = year, y = perm), colour = 'black') +
  ylim(c(0,250)) +
  scale_colour_manual(values = myPalette) +
  scale_linetype_manual(values = c( 'dashed', 'solid'), guide = FALSE) +
  facet_grid(. ~ defn) +
  theme_minimal() +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        legend.title = element_blank())


```


## European Comparison

This data is an attempt to compare the Scotland data with the rest of Europe and
the UK. It is an attempt as the UK (i.e. the ONS) and EU (i.e. EMCDDA) define
drug-related deaths slightly differently. The NRS have applied the methodology
used by the EMCDDA to the Scottish data to get comparable figures.

The current Scottish data is also more up-to-date than the EU's data so a 
comparison also is made using data covering the same time periods.

```{r emcdda}

# read data
tableEU.dat = read_excel(xlfile, sheet = "EMCDDA - drug-induced deaths", col_names=FALSE)

# extract core data and convert to numbers
dat = tableEU.dat[c(13:40),c(1,4,6)]
colnames(dat) = c('Country','Number','PerM')

# drop Germany as there's no data
dat = dat %>% filter(Country != 'Germany')
dat$Number = as.numeric(dat$Number)
dat$PerM = as.numeric(dat$PerM)
#dat.num.l = dat.num %>% gather(Method, Counts, -Year)

# get UK data as it's no longer together with the EU data
gb = tableEU.dat[57,c(2,4,6)]
colnames(gb) = c('Country','Number','PerM')
gb$Country = 'GB'
gb$PerM = as.numeric(gb$PerM)

# get Scotland data
scot = tableEU.dat[c(50,52),c(2,4,6)]
colnames(scot) = c('Country','Number','PerM')
scot$Country = c('Scot - likeGB', 'Scot - likeEU')
scot$PerM = as.numeric(scot$PerM)

# join with other data
dat = rbind(dat, gb, scot)

# fix names
dat[dat$Country == 'European Union  2', 'Country'] <- 'EU'

# define colour categories
dat$Category = c(rep('A', 26), 'B', 'A', rep('C', 2))

ggplot(dat, aes(x = fct_reorder(Country,PerM), y = PerM, fill=Category)) + 
  labs(title = sprintf("Drug-Related Deaths in Scotland 2020", current_year),
       subtitle = "Comparison with EU countries (ages 15-64)",
       caption = "Most recent EU data is for 2020",
       y = 'Deaths per million population',
       x = '') +
  geom_col() +
  geom_text(aes(label = PerM), hjust = ifelse(dat$PerM < 4, -0.6, 1.2), colour = ifelse(dat$Category == 'A', 'grey4', 'white'), size = 2) +
  coord_flip() +
  scale_fill_manual(values = c('grey', 'darkred', '#0A4FAF')) +
  theme_minimal() +
  theme(panel.grid.major.y = element_blank(),
        panel.grid.minor.y = element_blank(),
        legend.position = 'NA')

#ggsave('DRD_EMCDDA_comparison.png', bg = 'white')
```