Support simultaneous stacking and dodging by different variables in geom_col #6324
Comments
Thanks for the report! This request is similar to #2267, which was closed as unplanned.
Thank you! In the meantime (with great help from various AIs) I developed an ad hoc geom. I still think that a position_ function would be more appropriate, since it could accommodate other geoms too (and I don't like the idea of a geom just for positioning), but I wasn't able to make one. Regarding whether to put it in ggplot2 or not, I would advise the former. I was very surprised that this wasn't already possible; it's something one would expect out of the box!

library(ggplot2)
library(dplyr)

GeomStackDodgeCol <- ggproto(
  "GeomStackDodgeCol", GeomRect,
  required_aes = c("x", "y", "fill", "group"),
  default_aes = aes(
    colour = "black",
    linewidth = 0.5,
    linetype = 1,
    alpha = NA
  ),
  setup_data = function(data, params) {
    # Restart the stack for each combination of x value and fill group
    data <- data |>
      group_by(x, fill) |>
      mutate(
        ymin = c(0, head(cumsum(y), -1)),
        ymax = cumsum(y)
      ) |>
      ungroup()
    # Compute dodging offsets from the bar width and padding
    fill_groups <- unique(data$fill)
    n_groups <- length(fill_groups)
    # `%||%` is in base R from 4.4.0; on older versions it comes from rlang
    width <- params$width %||% 0.9     # width of each bar
    padding <- params$padding %||% 0.1 # gap between bars, as a fraction of width
    # Bar centres relative to x: neighbours are width * (1 + padding) apart
    # and the cluster is centred on x (a single group gets offset 0)
    step <- width * (1 + padding)
    positions <- (seq_len(n_groups) - (n_groups + 1) / 2) * step
    # Create rectangle coordinates
    offset <- positions[match(data$fill, fill_groups)]
    data$xmin <- data$x + offset - width / 2
    data$xmax <- data$x + offset + width / 2
    data
  },
  draw_panel = function(data, panel_params, coord, width = 0.9, ...) {
    coords <- coord$transform(data, panel_params)
    grid::rectGrob(
      x = (coords$xmin + coords$xmax) / 2,
      y = (coords$ymin + coords$ymax) / 2,
      width = coords$xmax - coords$xmin,
      height = coords$ymax - coords$ymin,
      default.units = "native",
      just = c("center", "center"),
      gp = grid::gpar(
        col = coords$colour,
        fill = alpha(coords$fill, coords$alpha),
        lwd = coords$linewidth * .pt,
        lty = coords$linetype
      )
    )
  },
  parameters = function(complete = FALSE) {
    c("na.rm", "width", "padding")
  }
)

geom_stackdodge_col <- function(mapping = NULL, data = NULL,
                                position = "identity",
                                width = 0.9,
                                padding = 0.1,
                                na.rm = FALSE,
                                show.legend = NA,
                                inherit.aes = TRUE, ...) {
  layer(
    geom = GeomStackDodgeCol,
    mapping = mapping,
    data = data,
    stat = "identity",
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      na.rm = na.rm,
      width = width,
      padding = padding
    )
  )
}

Of course, this still needs testing. Here's some testing code:

local({
  df <- bind_rows(
    data.frame(
      year = rep(2016, 5),
      protocol = rep("M", 5),
      country = c("A", "B", "C", "D", "E"),
      freq = c(100, 50, 30, 40, 11) # sum is 231
    ),
    data.frame(
      year = rep(2016, 4),
      protocol = rep("L", 4),
      country = c("A", "B", "C", "D"),
      freq = c(23, 60, 200, 100) # sum is 383
    )
  )
  # Add more years
  df <- bind_rows(
    df,
    df |> mutate(year = 2017, freq = sample(freq))
  )
  # Create summary data
  df_sum <- df |>
    summarise(
      label = paste(country, collapse = "\n"),
      freq = sum(freq),
      .by = c(year, protocol)
    )
  ggplot() +
    geom_stackdodge_col(
      data = df,
      aes(x = factor(year), y = freq, group = country,
          fill = protocol),
      width = 0.1, padding = 0.5
    ) +
    # To show that the 2016 bars sum up to the expected values (231 and 383)
    geom_hline(
      yintercept = c(sum(c(100, 50, 30, 40, 11)), sum(c(23, 60, 200, 100)))
    )
})
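The stacking and dodging arithmetic can also be checked outside ggplot2. Below is a minimal base R sketch, not the code from the comment above: `stack_dodge_rects()` is a hypothetical helper, and it assumes that `width` is the width of one bar and `padding` is the gap between neighbouring bars as a fraction of that width.

```r
# Hypothetical helper (illustration only): rectangle bounds for bars that are
# dodged by `dodge` and stacked within each (x, dodge) combination.
stack_dodge_rects <- function(x, y, dodge, width = 0.9, padding = 0.1) {
  dodge <- factor(dodge)
  n <- nlevels(dodge)
  # Centre-to-centre distance between neighbouring bars
  step <- width * (1 + padding)
  # Offsets symmetric around 0; a single group gets offset 0
  offsets <- (seq_len(n) - (n + 1) / 2) * step
  # Running total within each (x, dodge) cell gives the top of each segment
  ymax <- ave(y, x, dodge, FUN = cumsum)
  ymin <- ymax - y
  centre <- x + offsets[as.integer(dodge)]
  data.frame(
    x = x, dodge = dodge,
    xmin = centre - width / 2, xmax = centre + width / 2,
    ymin = ymin, ymax = ymax
  )
}

# The 2016 counts from the testing code, with the year placed at x = 1
rects <- stack_dodge_rects(
  x = rep(1, 9),
  y = c(100, 50, 30, 40, 11, 23, 60, 200, 100),
  dodge = rep(c("M", "L"), times = c(5, 4)),
  width = 0.1, padding = 0.5
)

# Top of each stack: 383 for protocol L and 231 for protocol M
tapply(rects$ymax, rects$dodge, max)
```

The two stack heights match the sums noted in the testing data (231 and 383), and the gap between the two bars is `padding * width`.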
We have for many years now followed the philosophy that only the absolute core features are in ggplot2 itself and other, less commonly used features should go into extension packages. Maybe this would be a good fit for ggforce for example. Also, while I'm of the opinion that everybody should be allowed and empowered to make any visualization they want, I find it difficult to think of a valid use case for this geom. I've never in my life thought "hm, I want to stack and dodge at the same time." This is definitely an obscure corner case, and I feel reasonably confident that any figure you make with this feature can be improved by removing one of the two position adjustments.
Uhm, it's a pretty common scenario in epidemiology! Should I cross-post it to ggforce? Do they also work on position functions, or only on geoms?
I'd like to propose adding support for simultaneous stacking and dodging controlled by different variables in geom_col. Currently, this common visualization need requires workarounds that are both verbose and harder to maintain.
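For comparison, one workaround that needs no new geom is to stack within bars, move the second variable onto the x axis, and put the outer variable in facets. This is only a rough sketch using the 2016 counts from the toy data in this thread. Mapping `fill` to `country` is an assumption about what the stacks should show.

```r
library(ggplot2)

# Toy data: 2016 counts by surveillance protocol and country
df <- data.frame(
  year = 2016,
  protocol = rep(c("M", "L"), times = c(5, 4)),
  country = c("A", "B", "C", "D", "E", "A", "B", "C", "D"),
  freq = c(100, 50, 30, 40, 11, 23, 60, 200, 100)
)

# Stack by country within each protocol bar; protocols sit side by side on
# the x axis, and each year gets its own facet
p <- ggplot(df, aes(x = protocol, y = freq, fill = country)) +
  geom_col(position = "stack") +
  facet_wrap(vars(year))
p
```

This gives side-by-side stacks per year, but the grouping moves out of the position adjustment and into the x axis and facets, which is part of why it can feel verbose.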
Current Limitation
When using geom_col, we can either stack or dodge bars based on a grouping variable, but not both at the same time using different variables. This makes it difficult to create visualizations where we want to:
Here's a reprex with counts from surveillance data stratified by year, country and surveillance protocol
Minimal Reproducible Example
Desired Behavior
Ideally, we would be able to specify both stacking and dodging variables in a single geom_col call, something like:
Use Cases
This functionality would be particularly useful for:
Benefits