Description
I routinely run my models with 8 chains and 1e4 iterations per chain. For continuous predictors I then compute the predictive distribution at each predictor value in a sequence of 100–200 values. This is often necessary to represent nonlinear predictions correctly.
I would like to use the point_interval functions to summarise my predictions, but have run into performance issues. Once the number of groups exceeds a certain limit, point_interval crashes R. This is not the case for a summarise call that uses the interval functions and yields exactly the same output.
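To give a sense of scale, here is a rough back-of-the-envelope calculation of the number of prediction rows implied by those settings. The variable names are just for illustration, and it assumes every draw from every chain is kept for every predictor value:

```r
# Illustrative names only; assumes no thinning and one row per draw per predictor value.
n_chains <- 8
n_iter   <- 1e4
n_values <- c(100, 200)          # shortest and longest predictor sequence

n_draws <- n_chains * n_iter     # 80,000 posterior draws in total
n_rows  <- n_draws * n_values    # 8e6 and 1.6e7 rows, before any further grouping variables

format(n_rows, big.mark = ",")
```

Any additional grouping variables multiply both the row count and the number of groups on top of this.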
For example,
    prediction %>%
      group_by(group1, group2, group3) %>%
      median_qi(var1, var2, var3, .width = c(.5, .8, .9))
fails where
    prediction %>%
      group_by(group1, group2, group3) %>%
      summarise(
        across(
          c(var1, var2, var3),
          list(median = median,
               lower_0.5 = ~ qi(.x, .width = .5)[1],
               upper_0.5 = ~ qi(.x, .width = .5)[2],
               lower_0.8 = ~ qi(.x, .width = .8)[1],
               upper_0.8 = ~ qi(.x, .width = .8)[2],
               lower_0.9 = ~ qi(.x, .width = .9)[1],
               upper_0.9 = ~ qi(.x, .width = .9)[2]),
          .names = "{.col}.{.fn}"
        )
      ) %>%
      ungroup() %>%
      rename(var1 = var1.median, var2 = var2.median, var3 = var3.median) %>%
      pivot_longer(cols = contains("lower") | contains("upper")) %>%
      separate(col = name, into = c("name", ".width"), sep = "_(?=[^_]*$)") %>%
      pivot_wider(names_from = name, values_from = value)
doesn't.
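In case it helps with reproducing this, below is a minimal sketch that times the two approaches on simulated data. It is deliberately scaled down (200 groups of 500 draws, two variables, one interval width in the summarise version), so it is not expected to crash R by itself. The simulated `prediction` data frame, the group sizes and the object names are assumptions made for illustration, not my real data; increasing the number of groups and `n_draws` should move it towards the problematic regime.

```r
library(dplyr)
library(tidyr)
library(ggdist)  # provides median_qi() and qi(); tidybayes re-exports both

set.seed(1)

# Hypothetical, scaled-down stand-in for `prediction`:
# 50 x 4 = 200 groups with 500 draws each.
n_draws <- 500
prediction <- expand_grid(
  group1 = seq_len(50),
  group2 = letters[1:4],
  draw   = seq_len(n_draws)
) %>%
  mutate(var1 = rnorm(n()), var2 = rnorm(n()))

# Approach 1: point_interval family
time_point_interval <- system.time(
  res_pi <- prediction %>%
    group_by(group1, group2) %>%
    median_qi(var1, var2, .width = c(.5, .9))
)

# Approach 2: summarise() with qi(), here for the 90% interval only
time_summarise <- system.time(
  res_sm <- prediction %>%
    group_by(group1, group2) %>%
    summarise(
      across(
        c(var1, var2),
        list(median    = median,
             lower_0.9 = ~ qi(.x, .width = .9)[1],
             upper_0.9 = ~ qi(.x, .width = .9)[2]),
        .names = "{.col}.{.fn}"
      ),
      .groups = "drop"
    )
)

# Compare elapsed times (values will differ between machines)
time_point_interval["elapsed"]
time_summarise["elapsed"]
```

The two results can then be compared for equality, for example by filtering `res_pi` to `.width == 0.9` and checking `var1`, `var1.lower` and `var1.upper` against the corresponding columns of `res_sm`.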
I really appreciate the succinctness of the point_interval functions, but find myself having to resort to my own summarise functions. Perhaps you could rethink this family of functions to make them as efficient as summarise?