Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use illustration of local and global scores #934

Open
nikosbosse opened this issue Sep 28, 2024 · 1 comment
Open

Use illustration of local and global scores #934

nikosbosse opened this issue Sep 28, 2024 · 1 comment
Labels
documentation Improvements or additions to documentation

Comments

@nikosbosse
Copy link
Contributor

nikosbosse commented Sep 28, 2024

We have this nice illustration:

image

image

It's not used in the manuscript anymore. But it would be nice to use it in a vignette explaining differences between scores.

Related: #933, #929, #930

This was the code to generate the plot:

library(scoringutils)
library(ggplot2)
library(data.table)

quantiles <- seq(0.1, 1, 0.1)
forecast_a <- c(0.3, 0.35, 0.25, 0.04, 0.02, 0.01, 0.01, 0.01, 0.005, 0.005)
forecast_b <- c(0.1, 0.35, 0.05, 0.02, 0.01, 0.01, 0.05, 0.07, 0.2, 0.14)
true_value <- 2

df <- data.table(
  forecaster = rep(c("Forecaster A", "Forecaster B"), each = 10),
  outcome = rep(1:10, 2),
  prob = c(forecast_a, forecast_b),
  true_value = true_value
)

df[, crps := sum((cumsum(prob) - (outcome >= true_value))^2),
   by = c("forecaster")]
df[, log_score := -log(prob[outcome == true_value]),
   by = c("forecaster")]
df[, mean_pred := sum(prob * outcome) / sum(prob),
   by = c("forecaster")]
df[, sd_pred := sqrt(sum((prob * outcome - mean_pred)^2)),
   by = c("forecaster")]
df[, log_score := -log(prob[outcome == true_value]),
   by = c("forecaster")]
df[, dss := ((true_value - mean_pred)^2) / sd_pred + 2 * log(sd_pred),
   by = c("forecaster")]

# sense-check: compute crps using samples
sample_a <- sample(x=1:10, size = 1e5, replace = TRUE, prob = forecast_a)
sample_b <- sample(x=1:10, size = 1e5, replace = TRUE, prob = forecast_b)

crps_a <- scoringutils::crps_sample(2, t(as.matrix(sample_a)))
crps_b <- scoringutils::crps_sample(2, t(as.matrix(sample_b)))

annotation <- df[, .(forecaster, crps, log_score, dss)] |> unique()


ggplot(df, aes(x = factor(outcome), y = prob)) +
  geom_col() +
  geom_text(data = annotation, x = 4, y = 0.3, hjust = "left", size = 3,
            aes(label = paste("CRPS: ", round(crps, 2)))) +
  geom_text(data = annotation,x = 4, y = 0.27, hjust = "left", size = 3,
            aes(label = paste("Log score: ", round(log_score, 2)))) +
  geom_text(data = annotation, x = 4, y = 0.24, hjust = "left", size = 3,
            aes(label = paste("DSS: ", round(dss, 2)))) +
  facet_wrap(~ forecaster) +
  geom_vline(aes(xintercept = 2), linetype = "dashed") +
  theme_scoringutils() +
  labs(y = "Probability assigned", x = "Possible outcomes")

ggsave("inst/manuscript/output/score-locality.png", height = 3, width = 8)

@nikosbosse nikosbosse added the documentation Improvements or additions to documentation label Sep 28, 2024
@nikosbosse nikosbosse added this to the scoringutils-2.x milestone Sep 28, 2024
@seabbs
Copy link
Contributor

seabbs commented Sep 30, 2024

Same thoughts as for #933. This is nice and we should plan to use it at some point .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
Development

No branches or pull requests

2 participants