do not limit range of brakes in subguide_axis #219

venpopov · 2024-03-29T03:07:17Z

Based on the discussion in #183, I've been trying to plot the density of distributions showing the density on the y axis. The following example works well, but I find it aesthetically displeasing that the breaks get cut-off - the maximum break showed is 0.3, because the max density value is ~0.39. In standard ggplot this is not limited in such a way and you have more control over the breaks. Even if I supply the breaks = seq(0,0.4,0.1), the top break still gets cut-off.

library(ggplot2)
library(ggdist)
library(dplyr)
data.frame(prior = c("normal(0,1)", "normal(0,2)")) %>% 
  parse_dist(prior) %>% 
  ggplot(aes(xdist = .dist_obj, color = prior)) +
  stat_slab(fill = NA, subguide = subguide_outside(title = "density", 
            breaks = seq(0,1,0.1))) +
  ylab(NULL) +
  theme_ggdist() +
  theme(plot.margin = margin(5.5, 5.5, 5.5, 50),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank())

^{Created on 2024-03-29 with reprex v2.1.0}

This is caused by the following line in subguide_axis():

breaks = oob_discard(scale$get_breaks(), limits)

removing this line gives me what I want:

My suggestion is either to get rid of this line, or if you need it for other functionality, to add an argument to subguide_* functions to disable it

mjskay · 2024-03-31T15:26:38Z

This filtering is necessary because breaks are not guaranteed to be within the limits, and otherwise ticks get drawn outside of the geom bounds in many common situations where default breaks functions try to return "nice-looking" limits.

Base ggplot2 scales do the same thing; e.g.:

ggplot() + 
  geom_point(aes(x = 1:5, y = 1:5)) + 
  scale_x_continuous(breaks = -10:10)

To adjust the limits manually, you can set limits on scale_thickness_shared(); e.g.:

library(ggplot2)
library(ggdist)
library(dplyr)
data.frame(prior = c("normal(0,1)", "normal(0,2)")) %>% 
  parse_dist(prior) %>% 
  ggplot(aes(xdist = .dist_obj, color = prior)) +
  stat_slab(fill = NA, subguide = subguide_outside(title = "density", 
            breaks = seq(0,1,0.1))) +
  scale_thickness_shared(limits = c(0,0.4)) +
  ylab(NULL) +
  theme_ggdist() +
  theme(plot.margin = margin(5.5, 5.5, 5.5, 50),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank())

venpopov · 2024-03-31T16:39:05Z

Unfortunately that's not a great solution for the following reasons:

This requires that I know beforehand the right upper limit. I'm building this as part of a package to have a convenient function to visualize our priors. To use this, I would have to separately calculate the maximum value of all the pdfs. While doable, it will add a lot of extra code and computation. This is because this is a hard fixed limit (e.g. limits = c(0,1)):

We actually use facets by default with normalize="group", because the scales of many priors could be vastly different. So a more typical use case would be, where the suggestion you gave doesn't work (because it forces the same scale; with scale_thickness_identity it doesn't work)

data.frame(prior = c("student_t(3,2,0.75)", "normal(0,5)")) %>% 
  parse_dist(prior) %>% 
  ggplot(aes(xdist = .dist_obj)) +
  stat_slab(fill = NA, color = "red",
            subguide = subguide_outside(title = "density"),
            normalize = 'groups',
            breaks = seq(0,1,0.1)) +
  scale_thickness_identity(limits = c(0,0.7)) +
  ylab(NULL) +
  theme_ggdist() +
  theme(plot.margin = margin(5.5, 5.5, 5.5, 50),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank()) +
  facet_wrap(~prior, scales = "free")

These two issues unfortunately prevent me from using the solution you suggested. However, if I manually change the line

breaks = oob_discard(scale$get_breaks(), limits)

to

breaks = scale$get_breaks()

I get a much nicer result, without having to specify anything and just using the defaults:

### with manually disabling the source code as described above
data.frame(prior = c("student_t(3,2,0.75)", "normal(0,5)")) %>% 
  parse_dist(prior) %>% 
  ggplot(aes(xdist = .dist_obj)) +
  stat_slab(fill = NA, color = "red",
            subguide = subguide_outside(title = "density"),
            normalize = 'groups') +
  ylab(NULL) +
  theme_ggdist() +
  theme(plot.margin = margin(5.5, 5.5, 5.5, 50),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank()) +
  facet_wrap(~prior, scales = "free")

This is because scale$get_breaks() produces in this case a vector like [0.1, 0.2, 0.3, 0.4, 0.5], but the upper limit is 0.48, so oob_discard just removes that and leaves a lot of empty space on top of the scale.

If you say disabling this will break other things, maybe you could consider making a user option ot disable it?

The example you give about the default ggplot behavior is actually closer to what I want. Consider this slightly adjusted code:

ggplot() + 
  geom_point(aes(x = 1:5, y = c(1,2,3,4,4.9)))

This does not cut-off the break at 5, because the maximum y value is less than that. But in subguide_axis it does.

It seems to me that scale$get_breaks() already determines a good range, why is it necessary to truncate it further?

mjskay · 2024-04-01T04:53:12Z

I'm building this as part of a package to have a convenient function to visualize our priors.

Cool! This would be a great thing to have, so I definitely want to figure out something that works. Re-opening.

If you say disabling this will break other things, maybe you could consider making a user option ot disable it?

Heh believe me, it will ;). In fact, this line used to be exactly what you're asking me to change it to. The problem is that because many breaks functions may return values outside of the limits in order to make "nice-looking" breaks (and these values can sometimes be surprisingly far from the limits), removing this line means that axis labels of adjacent slabs (e.g. when using a categorical axis) can end up overlapping. It is a brittle solution that breaks the core abstraction of sub-axes with defined limits, potentially leading to unexpected behavior, like axis labels drawn outside the bounds of the chart (not just the geom). So, I don't think this is the right solution.

The example you give about the default ggplot behavior is actually closer to what I want. Consider this slightly adjusted code:

What's happening in this example is not that ggplot is adjusting limits based on the breaks, it's that the default expansion parameter of continuous positional scales happens to be large enough that the break outside of the limits is still drawn. You can see that if you don't expand the axis limits:

ggplot() + 
    geom_point(aes(x = 1:5, y = c(1,2,3,4,4.9))) +
    scale_y_continuous(expand = expansion(0))

Fundamentally, the problem is that limits should not be determined by the breaks --- by the time breaks come into play, limits have been determined much earlier in the pipeline. If you want the scale to show more breaks, you need a way to expand the limits such that those breaks will be visible.

To me this suggests a few possibilities:

something like the expand argument, which would allow expansion of the top limit of the scale by either a fixed offset or a multiple of the scale range.
providing something like the limits argument to scale_ functions, but at a geom-level. It's worth pointing out that because limits in scales_ functions does not need to be fixed (it can be a function of the data that returns appropriate limits), if you want to set limits using your own logic, you should not need to pre-calculate them, you can instead create a function to pass to limits that does what you need.

More generally, I think the right solution might be to allow people to pass their own scale function to slab geometries that allows customization of both limits and expand arguments. Such a scale would then be applied within the groups determined by the normalize parameter. This might be the best solution long-term, though it would require ripping some stuff out and re-organizing it.

mjskay · 2024-04-02T19:12:55Z

Playing with a way to fix this by exposing greater control of sub-scales. Something like this mimics the default position scales by expanding the upper thickness limit by 5%:

ggplot() + 
  stat_slab(
    aes(xdist = dist_normal()), 
    subguide = subguide_inside(title = "density"), 
    subscale = subscale_thickness(expand = expansion(c(0, 0.05)))
  )

Check out the subscale branch if you want to play with it. It's very draft-y at the moment and I'd be happy to iterate on it if it doesn't do what you need.

Note that, along with this I am considering renaming the normalize arg to something like subscale_by, which I think better communicates its intent. If I do that it will probably cause deprecation warnings in your package if you are setting the normalize parameter, so I'll let you know if I decide to do that.

venpopov · 2024-04-02T20:05:25Z

Cool, thanks for working on that! I'll try to take a look in the next day or two. I just finished the derivatives PR for distributional so I should have some time to get back to this

mjskay · 2024-04-05T18:15:01Z

Did a bit more tweaking and I'm happy with this for now, so it is merged to master now. I'm closing this, but don't hesitate to re-open if it's not solving your problem; I'm happy to iterate.

I also decided I am likely to rename normalize; if you want to follow that issue in case it affects you it is here: #223

venpopov · 2024-04-06T14:40:10Z

Can confirm this solves my problem. Thanks!

venpopov changed the title ~~do not limit range of braks in subguide_axis~~ do not limit range of brakes in subguide_axis Mar 29, 2024

mjskay closed this as completed Mar 31, 2024

mjskay reopened this Apr 1, 2024

mjskay mentioned this issue Apr 5, 2024

Rename normalize to subscale_by #223

Open

11 tasks

mjskay closed this as completed Apr 5, 2024

venpopov mentioned this issue Apr 6, 2024

A way to specify defaults for subguide, subsccale and normalize? #225

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

do not limit range of brakes in subguide_axis #219

do not limit range of brakes in subguide_axis #219

venpopov commented Mar 29, 2024

mjskay commented Mar 31, 2024

venpopov commented Mar 31, 2024 •

edited

Loading

mjskay commented Apr 1, 2024

mjskay commented Apr 2, 2024

venpopov commented Apr 2, 2024

mjskay commented Apr 5, 2024

venpopov commented Apr 6, 2024

do not limit range of brakes in subguide_axis #219

do not limit range of brakes in subguide_axis #219

Comments

venpopov commented Mar 29, 2024

mjskay commented Mar 31, 2024

venpopov commented Mar 31, 2024 • edited Loading

mjskay commented Apr 1, 2024

mjskay commented Apr 2, 2024

venpopov commented Apr 2, 2024

mjskay commented Apr 5, 2024

venpopov commented Apr 6, 2024

venpopov commented Mar 31, 2024 •

edited

Loading