Dotplot: Number of dots for axis breaks #182

Closed
fweber144 opened this issue May 25, 2023 · 8 comments

@fweber144

In a (one-sided) dotplot, is it possible to show the number of dots on the "density" axis (y-axis by default)?

This might already be supported, but I couldn't find a way to achieve this (and I also couldn't find an older GitHub issue where this was mentioned).

Example (from ?geom_dots):

library(dplyr)
library(ggplot2)
library(ggdist)  # provides geom_dots()
data(RankCorr_u_tau, package = "ggdist")
RankCorr_u_tau %>%
  ggplot(aes(x = u_tau)) +
  geom_dots()

gives a plot in which the y-axis has an upper limit of 1. I know that there's the height aesthetic, but currently I don't see how fixing it to some specific value (possibly in combination with the scale aesthetic) could be used to show the number of dots on the y-axis.

@mjskay
Owner

mjskay commented May 25, 2023

Ah yeah, so the short answer is "not with geom_dots b/c of limitations in ggplot2" and the long answer is "yes, if you do a bunch of stuff manually".

The problem with this is that in order to set the y axis scale to match the dot count, you need to know the binwidth, but in order to know the binwidth, you need to know the physical plot dimensions (to pick a binwidth so that the tallest stack fits inside the plot). However, y axis scales are set before plot dimensions are known in ggplot, and it is not possible to change them after plot dimensions are known (this happens much later in the pipeline). So, it cannot be done within a single geom like geom_dots, and this is a fundamental limitation of dotplots in ggplot2 (there are old Stack Overflow questions about this with ggplot2::geom_dotplot, for example, which has the same problem).
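To make the dependency concrete, here is a minimal sketch using the two helper functions ggdist exports, find_dotplot_binwidth and bin_dots. The two maxheight values are hypothetical stand-ins for a short and a tall plot (expressed in x-axis data units), and the sketch assumes the data frame returned by bin_dots has a bin column identifying each dot's bin, as its documentation describes. The same data gives a different binwidth, and therefore a different tallest-stack count, depending on the available height:

```r
library(ggdist)

set.seed(1234)
x = rnorm(100)

# hypothetical available heights, in the same units as x
bw_short = find_dotplot_binwidth(x, maxheight = 1, heightratio = 1)
bw_tall = find_dotplot_binwidth(x, maxheight = 4, heightratio = 1)

# number of dots in the tallest stack for a given binwidth
# (assumes bin_dots() returns a `bin` column)
tallest_stack = function(bw) {
  bins = bin_dots(x = x, y = 0, binwidth = bw, heightratio = 1)
  max(table(bins$bin))
}

counts = c(short = tallest_stack(bw_short), tall = tallest_stack(bw_tall))
print(c(bw_short = bw_short, bw_tall = bw_tall))
print(counts)
```

Since the upper end of a count axis would have to be the tallest-stack count, it cannot be fixed until the height is known.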

That said, I don't know what your use case is, but if you're willing to do some stuff manually, it can theoretically be done. {ggdist} exposes a few functions, find_dotplot_binwidth and bin_dots, for building dotplots manually without a geom. In theory you can use these to pick a binwidth and perform the binning, then use that output to construct a chart where you enforce a fixed ratio between the scale of the x and y axes using coord_fixed (which you need to solve this problem). Something like this:

library(dplyr)
library(ggplot2)
library(ggdist)

set.seed(1234)
x = rnorm(100)

# determine the binwidth
# you could also skip this step and manually specify a binwidth... maxheight
# here is the max height of the chart assuming y units and x units are square,
# and is intended to get a chart with around a 3/2 aspect ratio
binwidth = find_dotplot_binwidth(x, maxheight = 2/3*diff(range(x)), heightratio = 1)

# bin the dots
bin_df = bin_dots(x = x, y = 0, binwidth = binwidth, heightratio = 1)

bin_df %>%
  ggplot(aes(x, y/binwidth)) +
  geom_point() +
  coord_fixed(ratio = binwidth)
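As a possible extension of the snippet above (a sketch, not something from the ggdist docs), the y axis can be given integer breaks so that it reads directly as a dot count. This assumes bin_dots returns a bin column identifying each dot's bin; max_count and the axis title are names introduced here for illustration:

```r
library(ggdist)
library(ggplot2)

set.seed(1234)
x = rnorm(100)
binwidth = find_dotplot_binwidth(x, maxheight = 2/3*diff(range(x)), heightratio = 1)
bin_df = bin_dots(x = x, y = 0, binwidth = binwidth, heightratio = 1)

# dots in the tallest stack (assumes a `bin` column in the bin_dots() output)
max_count = max(table(bin_df$bin))

p = ggplot(bin_df, aes(x, y / binwidth)) +
  geom_point() +
  # one break per dot, starting the axis at zero
  scale_y_continuous(name = "number of dots", breaks = 0:max_count) +
  expand_limits(y = 0) +
  coord_fixed(ratio = binwidth)

print(p)
```

Because coord_fixed ties one y unit to one binwidth of x, the integer breaks line up with whole dots whatever the size of the plotting window.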


Now if we want the dots to actually be ellipses with a desired width/height in data space, we could use ggforce::geom_ellipse; something like:

bin_df %>%
  ggplot(aes(x0 = x, y0 = y / binwidth, a = binwidth/2, b = 1/2, angle = 0)) +
  ggforce::geom_ellipse() +
  coord_fixed(ratio = binwidth)


@fweber144
Author

Thanks! For my current use case, I might be able to get along with the [0, 1] scaling of the y-axis, but for the future, I'll think about adopting your customized suggestion. Would it make sense to add that customized suggestion to ggdist (probably as some new function, with the remark that this new function is very limited compared to geom_dots(), but that it achieves the number of dots on the y-axis)?

@fweber144
Author

Since such a new function would create a ggplot object, I guess it would not be a geom_ or stat_ function, but something more in the spirit of ggplot2::qplot().

@mjskay
Owner

mjskay commented May 26, 2023

> Would it make sense to add that customized suggestion to ggdist (probably as some new function, with the remark that this new function is very limited compared to geom_dots(), but that it achieves the number of dots on the y-axis)?

Hmm, to be honest I'd want a good use case before doing that --- I don't see a big need for the y axis labels on dotplots myself, and the version with y axis labels is just a lot less flexible and a lot more brittle (requires fixed coords, doesn't easily support conditioning on other aesthetics, mapping other things to the y axis to stack multiple dotplots, etc etc). I generally prefer providing building blocks for charts rather than full-on charts as output since the latter tend to be less customizable (or, to achieve a lot of customizability, lead to recreating a lot of ggplot functionality). So at the moment I think the most sensible approach is to update the example in the docs for find_dotplot_binwidth() to show how to do this with the pieces that are already there (the example there is already pretty close).

@mjskay
Owner

mjskay commented May 26, 2023

In fact, now that I think about it a better solution here would be to provide layers that can be added to a chart that provide labels for the thickness positional subscale used by slabs and dots geoms... that actually seems like a potentially viable solution that solves the problem more cleanly and would allow use with geom_dots. Plus it would handle the case of multiple slabs/dots stacked in rows or columns.

@fweber144
Author

> > Would it make sense to add that customized suggestion to ggdist (probably as some new function, with the remark that this new function is very limited compared to geom_dots(), but that it achieves the number of dots on the y-axis)?
>
> Hmm, to be honest I'd want a good use case before doing that --- I don't see a big need for the y axis labels on dotplots myself, and the version with y axis labels is just a lot less flexible and a lot more brittle (requires fixed coords, doesn't easily support conditioning on other aesthetics, mapping other things to the y axis to stack multiple dotplots, etc etc). I generally prefer providing building blocks for charts rather than full-on charts as output since the latter tend to be less customizable (or, to achieve a lot of customizability, lead to recreating a lot of ggplot functionality). So at the moment I think the most sensible approach is to update the example in the docs for find_dotplot_binwidth() to show how to do this with the pieces that are already there (the example there is already pretty close).

Yes, I totally understand.

@fweber144
Author

> In fact, now that I think about it a better solution here would be to provide layers that can be added to a chart that provide labels for the thickness positional subscale used by slabs and dots geoms... that actually seems like a potentially viable solution that solves the problem more cleanly and would allow use with geom_dots. Plus it would handle the case of multiple slabs/dots stacked in rows or columns.

Ok, I can't follow you here, but if you say this would be a more general solution, that sounds good.

@mjskay
Owner

mjskay commented Jan 12, 2024

There is now a prototype implementation of a solution to this on the subguide branch. See this comment: #183 (comment)

Since that issue now supersedes this one, I am closing this. Follow #183 for updates.
