Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

do not limit range of brakes in subguide_axis #219

Closed
venpopov opened this issue Mar 29, 2024 · 7 comments
Closed

do not limit range of brakes in subguide_axis #219

venpopov opened this issue Mar 29, 2024 · 7 comments

Comments

@venpopov
Copy link

Based on the discussion in #183, I've been trying to plot the density of distributions showing the density on the y axis. The following example works well, but I find it aesthetically displeasing that the breaks get cut-off - the maximum break showed is 0.3, because the max density value is ~0.39. In standard ggplot this is not limited in such a way and you have more control over the breaks. Even if I supply the breaks = seq(0,0.4,0.1), the top break still gets cut-off.

library(ggplot2)
library(ggdist)
library(dplyr)
data.frame(prior = c("normal(0,1)", "normal(0,2)")) %>% 
  parse_dist(prior) %>% 
  ggplot(aes(xdist = .dist_obj, color = prior)) +
  stat_slab(fill = NA, subguide = subguide_outside(title = "density", 
            breaks = seq(0,1,0.1))) +
  ylab(NULL) +
  theme_ggdist() +
  theme(plot.margin = margin(5.5, 5.5, 5.5, 50),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank())

image

Created on 2024-03-29 with reprex v2.1.0

This is caused by the following line in subguide_axis():

breaks = oob_discard(scale$get_breaks(), limits)

removing this line gives me what I want:

image

My suggestion is either to get rid of this line, or if you need it for other functionality, to add an argument to subguide_* functions to disable it

@venpopov venpopov changed the title do not limit range of braks in subguide_axis do not limit range of brakes in subguide_axis Mar 29, 2024
@mjskay
Copy link
Owner

mjskay commented Mar 31, 2024

This filtering is necessary because breaks are not guaranteed to be within the limits, and otherwise ticks get drawn outside of the geom bounds in many common situations where default breaks functions try to return "nice-looking" limits.

Base ggplot2 scales do the same thing; e.g.:

ggplot() + 
  geom_point(aes(x = 1:5, y = 1:5)) + 
  scale_x_continuous(breaks = -10:10)

image

To adjust the limits manually, you can set limits on scale_thickness_shared(); e.g.:

library(ggplot2)
library(ggdist)
library(dplyr)
data.frame(prior = c("normal(0,1)", "normal(0,2)")) %>% 
  parse_dist(prior) %>% 
  ggplot(aes(xdist = .dist_obj, color = prior)) +
  stat_slab(fill = NA, subguide = subguide_outside(title = "density", 
            breaks = seq(0,1,0.1))) +
  scale_thickness_shared(limits = c(0,0.4)) +
  ylab(NULL) +
  theme_ggdist() +
  theme(plot.margin = margin(5.5, 5.5, 5.5, 50),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank())

image

@mjskay mjskay closed this as completed Mar 31, 2024
@venpopov
Copy link
Author

venpopov commented Mar 31, 2024

Unfortunately that's not a great solution for the following reasons:

  1. This requires that I know beforehand the right upper limit. I'm building this as part of a package to have a convenient function to visualize our priors. To use this, I would have to separately calculate the maximum value of all the pdfs. While doable, it will add a lot of extra code and computation. This is because this is a hard fixed limit (e.g. limits = c(0,1)):

image

  1. We actually use facets by default with normalize="group", because the scales of many priors could be vastly different. So a more typical use case would be, where the suggestion you gave doesn't work (because it forces the same scale; with scale_thickness_identity it doesn't work)
data.frame(prior = c("student_t(3,2,0.75)", "normal(0,5)")) %>% 
  parse_dist(prior) %>% 
  ggplot(aes(xdist = .dist_obj)) +
  stat_slab(fill = NA, color = "red",
            subguide = subguide_outside(title = "density"),
            normalize = 'groups',
            breaks = seq(0,1,0.1)) +
  scale_thickness_identity(limits = c(0,0.7)) +
  ylab(NULL) +
  theme_ggdist() +
  theme(plot.margin = margin(5.5, 5.5, 5.5, 50),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank()) +
  facet_wrap(~prior, scales = "free")

image

These two issues unfortunately prevent me from using the solution you suggested. However, if I manually change the line

breaks = oob_discard(scale$get_breaks(), limits)

to

breaks = scale$get_breaks()

I get a much nicer result, without having to specify anything and just using the defaults:

### with manually disabling the source code as described above
data.frame(prior = c("student_t(3,2,0.75)", "normal(0,5)")) %>% 
  parse_dist(prior) %>% 
  ggplot(aes(xdist = .dist_obj)) +
  stat_slab(fill = NA, color = "red",
            subguide = subguide_outside(title = "density"),
            normalize = 'groups') +
  ylab(NULL) +
  theme_ggdist() +
  theme(plot.margin = margin(5.5, 5.5, 5.5, 50),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank()) +
  facet_wrap(~prior, scales = "free")

image

This is because scale$get_breaks() produces in this case a vector like [0.1, 0.2, 0.3, 0.4, 0.5], but the upper limit is 0.48, so oob_discard just removes that and leaves a lot of empty space on top of the scale.

If you say disabling this will break other things, maybe you could consider making a user option ot disable it?


The example you give about the default ggplot behavior is actually closer to what I want. Consider this slightly adjusted code:

ggplot() + 
  geom_point(aes(x = 1:5, y = c(1,2,3,4,4.9)))

image

This does not cut-off the break at 5, because the maximum y value is less than that. But in subguide_axis it does.

It seems to me that scale$get_breaks() already determines a good range, why is it necessary to truncate it further?

@mjskay
Copy link
Owner

mjskay commented Apr 1, 2024

I'm building this as part of a package to have a convenient function to visualize our priors.

Cool! This would be a great thing to have, so I definitely want to figure out something that works. Re-opening.

If you say disabling this will break other things, maybe you could consider making a user option ot disable it?

Heh believe me, it will ;). In fact, this line used to be exactly what you're asking me to change it to. The problem is that because many breaks functions may return values outside of the limits in order to make "nice-looking" breaks (and these values can sometimes be surprisingly far from the limits), removing this line means that axis labels of adjacent slabs (e.g. when using a categorical axis) can end up overlapping. It is a brittle solution that breaks the core abstraction of sub-axes with defined limits, potentially leading to unexpected behavior, like axis labels drawn outside the bounds of the chart (not just the geom). So, I don't think this is the right solution.

The example you give about the default ggplot behavior is actually closer to what I want. Consider this slightly adjusted code:

What's happening in this example is not that ggplot is adjusting limits based on the breaks, it's that the default expansion parameter of continuous positional scales happens to be large enough that the break outside of the limits is still drawn. You can see that if you don't expand the axis limits:

ggplot() + 
    geom_point(aes(x = 1:5, y = c(1,2,3,4,4.9))) +
    scale_y_continuous(expand = expansion(0))

image

Fundamentally, the problem is that limits should not be determined by the breaks --- by the time breaks come into play, limits have been determined much earlier in the pipeline. If you want the scale to show more breaks, you need a way to expand the limits such that those breaks will be visible.

To me this suggests a few possibilities:

  • something like the expand argument, which would allow expansion of the top limit of the scale by either a fixed offset or a multiple of the scale range.
  • providing something like the limits argument to scale_ functions, but at a geom-level. It's worth pointing out that because limits in scales_ functions does not need to be fixed (it can be a function of the data that returns appropriate limits), if you want to set limits using your own logic, you should not need to pre-calculate them, you can instead create a function to pass to limits that does what you need.

More generally, I think the right solution might be to allow people to pass their own scale function to slab geometries that allows customization of both limits and expand arguments. Such a scale would then be applied within the groups determined by the normalize parameter. This might be the best solution long-term, though it would require ripping some stuff out and re-organizing it.

@mjskay mjskay reopened this Apr 1, 2024
@mjskay
Copy link
Owner

mjskay commented Apr 2, 2024

Playing with a way to fix this by exposing greater control of sub-scales. Something like this mimics the default position scales by expanding the upper thickness limit by 5%:

ggplot() + 
  stat_slab(
    aes(xdist = dist_normal()), 
    subguide = subguide_inside(title = "density"), 
    subscale = subscale_thickness(expand = expansion(c(0, 0.05)))
  )

image

Check out the subscale branch if you want to play with it. It's very draft-y at the moment and I'd be happy to iterate on it if it doesn't do what you need.

Note that, along with this I am considering renaming the normalize arg to something like subscale_by, which I think better communicates its intent. If I do that it will probably cause deprecation warnings in your package if you are setting the normalize parameter, so I'll let you know if I decide to do that.

@venpopov
Copy link
Author

venpopov commented Apr 2, 2024

Cool, thanks for working on that! I'll try to take a look in the next day or two. I just finished the derivatives PR for distributional so I should have some time to get back to this

@mjskay
Copy link
Owner

mjskay commented Apr 5, 2024

Did a bit more tweaking and I'm happy with this for now, so it is merged to master now. I'm closing this, but don't hesitate to re-open if it's not solving your problem; I'm happy to iterate.

I also decided I am likely to rename normalize; if you want to follow that issue in case it affects you it is here: #223

@mjskay mjskay closed this as completed Apr 5, 2024
@venpopov
Copy link
Author

venpopov commented Apr 6, 2024

Can confirm this solves my problem. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants