Bubble sizes out of order, when using a formula for the color/name attributes #2346

Open
JElchison opened this issue Apr 6, 2024 · 2 comments

Comments


JElchison commented Apr 6, 2024

Hi folks, I'm seeing a strange effect on scatter plot marker sizes, when using a formula for the color (and name) attributes.

(edited to add more test cases)

library(plotly)
#> Loading required package: ggplot2
#> 
#> Attaching package: 'plotly'
#> The following object is masked from 'package:ggplot2':
#> 
#>     last_plot
#> The following object is masked from 'package:stats':
#> 
#>     filter
#> The following object is masked from 'package:graphics':
#> 
#>     layout

df <- data.frame(x = c(1, 2, 3, 4, 5),
                 y = c(1, 2, 3, 4, 5),
                 z = c(1, 2, 3, 4, 5))

# Expected output: marker sizes in order
plot_ly(df,
        x = ~x,
        y = ~y,
        type = "scatter",
        mode = "markers",
        marker = list(size = ~z,
                      sizeref = 0.1),
        color = ~z < 2,
        colors = c(I("green"), I("red")),
        text = ~paste0("z: ", z))

# df has correct data
df
#>   x y z
#> 1 1 1 1
#> 2 2 2 2
#> 3 3 3 3
#> 4 4 4 4
#> 5 5 5 5

# Buggy output: changing "color" formula threshold puts marker sizes out of order
plot_ly(df,
        x = ~x,
        y = ~y,
        type = "scatter",
        mode = "markers",
        marker = list(size = ~z,
                      sizeref = 0.1),
        color = ~z < 3,
        colors = c(I("green"), I("red")),
        text = ~paste0("z: ", z))

# df still has correct data
df
#>   x y z
#> 1 1 1 1
#> 2 2 2 2
#> 3 3 3 3
#> 4 4 4 4
#> 5 5 5 5

# Buggy output: static vector also puts marker sizes out of order
plot_ly(df,
        x = ~x,
        y = ~y,
        type = "scatter",
        mode = "markers",
        marker = list(size = ~z,
                      sizeref = 0.1),
        color = c(TRUE, TRUE, FALSE, FALSE, FALSE),
        colors = c(I("green"), I("red")),
        text = ~paste0("z: ", z))

# Buggy output: use ifelse with (TRUE, FALSE)
plot_ly(df,
        x = ~x,
        y = ~y,
        type = "scatter",
        mode = "markers",
        marker = list(size = ~z,
                      sizeref = 0.1),
        color = ~ifelse(z < 3, TRUE, FALSE),
        colors = c(I("green"), I("red")),
        text = ~paste0("z: ", z))

# Possible workaround: use ifelse with (1, 0)
plot_ly(df,
        x = ~x,
        y = ~y,
        type = "scatter",
        mode = "markers",
        marker = list(size = ~z,
                      sizeref = 0.1),
        color = ~ifelse(z < 3, 1, 0),
        colors = c(I("green"), I("red")),
        text = ~paste0("z: ", z))

# Buggy again when adding "name" field with formula
plot_ly(df,
        x = ~x,
        y = ~y,
        type = "scatter",
        mode = "markers",
        marker = list(size = ~z,
                      sizeref = 0.1),
        color = ~ifelse(z < 3, 1, 0),
        colors = c(I("green"), I("red")),
        name = ~ifelse(z < 3, "Red", "Green"),
        text = ~paste0("z: ", z))

Created on 2024-04-06 with reprex v2.1.0

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.3 (2024-02-29)
#>  os       Ubuntu 22.04.4 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/Indiana/Vevay
#>  date     2024-04-06
#>  pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  callr         3.7.5   2024-02-19 [1] CRAN (R 4.3.2)
#>  cli           3.6.2   2023-12-11 [1] CRAN (R 4.3.2)
#>  colorspace    2.1-0   2023-01-23 [1] CRAN (R 4.3.0)
#>  crosstalk     1.2.1   2023-11-23 [1] CRAN (R 4.3.2)
#>  curl          5.2.0   2023-12-08 [1] CRAN (R 4.3.2)
#>  data.table    1.15.0  2024-01-30 [1] CRAN (R 4.3.2)
#>  digest        0.6.34  2024-01-11 [1] CRAN (R 4.3.2)
#>  dplyr         1.1.4   2023-11-17 [1] CRAN (R 4.3.2)
#>  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.3.0)
#>  evaluate      0.23    2023-11-01 [1] CRAN (R 4.3.2)
#>  fansi         1.0.6   2023-12-08 [1] CRAN (R 4.3.2)
#>  farver        2.1.1   2022-07-06 [1] CRAN (R 4.3.0)
#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.0)
#>  fs            1.6.3   2023-07-20 [1] CRAN (R 4.3.2)
#>  generics      0.1.3   2022-07-05 [1] CRAN (R 4.3.0)
#>  ggplot2     * 3.5.0   2024-02-23 [1] CRAN (R 4.3.2)
#>  glue          1.7.0   2024-01-09 [1] CRAN (R 4.3.2)
#>  gtable        0.3.4   2023-08-21 [1] CRAN (R 4.3.2)
#>  highr         0.10    2022-12-22 [1] CRAN (R 4.3.0)
#>  htmltools     0.5.7   2023-11-03 [1] CRAN (R 4.3.2)
#>  htmlwidgets   1.6.4   2023-12-06 [1] CRAN (R 4.3.2)
#>  httr          1.4.7   2023-08-15 [1] CRAN (R 4.3.2)
#>  jsonlite      1.8.8   2023-12-04 [1] CRAN (R 4.3.2)
#>  knitr         1.45    2023-10-30 [1] CRAN (R 4.3.2)
#>  lazyeval      0.2.2   2019-03-15 [1] CRAN (R 4.3.0)
#>  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.2)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.0)
#>  munsell       0.5.0   2018-06-12 [1] CRAN (R 4.3.0)
#>  pillar        1.9.0   2023-03-22 [1] CRAN (R 4.3.0)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.3.0)
#>  plotly      * 4.10.4  2024-01-13 [1] CRAN (R 4.3.2)
#>  processx      3.8.3   2023-12-10 [1] CRAN (R 4.3.2)
#>  ps            1.7.6   2024-01-18 [1] CRAN (R 4.3.2)
#>  purrr         1.0.2   2023-08-10 [1] CRAN (R 4.3.2)
#>  R.cache       0.16.0  2022-07-21 [1] CRAN (R 4.3.1)
#>  R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.3.1)
#>  R.oo          1.26.0  2024-01-24 [1] CRAN (R 4.3.2)
#>  R.utils       2.12.3  2023-11-18 [1] CRAN (R 4.3.2)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.0)
#>  reprex        2.1.0   2024-01-11 [1] CRAN (R 4.3.3)
#>  rlang         1.1.3   2024-01-10 [1] CRAN (R 4.3.2)
#>  rmarkdown     2.25    2023-09-18 [1] CRAN (R 4.3.2)
#>  rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.3.2)
#>  scales        1.3.0   2023-11-28 [1] CRAN (R 4.3.2)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.0)
#>  styler        1.10.2  2023-08-29 [1] CRAN (R 4.3.1)
#>  tibble        3.2.1   2023-03-20 [1] CRAN (R 4.3.0)
#>  tidyr         1.3.1   2024-01-24 [1] CRAN (R 4.3.2)
#>  tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.3.0)
#>  utf8          1.2.4   2023-10-22 [1] CRAN (R 4.3.2)
#>  vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.3.2)
#>  viridisLite   0.4.2   2023-05-02 [1] CRAN (R 4.3.0)
#>  webshot       0.5.5   2023-06-26 [1] CRAN (R 4.3.2)
#>  withr         3.0.0   2024-01-16 [1] CRAN (R 4.3.2)
#>  xfun          0.42    2024-02-08 [1] CRAN (R 4.3.2)
#>  xml2          1.3.6   2023-12-04 [1] CRAN (R 4.3.2)
#>  yaml          2.3.8   2023-12-11 [1] CRAN (R 4.3.2)
#> 
#>  [1] /home/jonathan/R/x86_64-pc-linux-gnu-library/4.3
#>  [2] /usr/local/lib/R/site-library
#>  [3] /usr/lib/R/site-library
#>  [4] /usr/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

Bad color formulas are:

  • z < 3
  • z < 4
  • z < 5

The results from z < 2 may also be incorrect ... they're just indiscernible given my example.

This behavior causes significant skewing/confusion for bubble plots of any size.

Thanks for reading!
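
For anyone triaging this, one way to see what is actually sent to plotly.js is to build the plot and print, per trace, the x values next to the marker sizes. This is only a diagnostic sketch: it assumes the built widget keeps its traces in `x$data` (as `plotly_build()` does in plotly 4.10.4), and it makes no claim about the root cause.

```r
library(plotly)

df <- data.frame(x = 1:5, y = 1:5, z = 1:5)

# Same mapping as the "buggy" case in the report: size nested in `marker`.
p <- plot_ly(df,
             x = ~x,
             y = ~y,
             type = "scatter",
             mode = "markers",
             marker = list(size = ~z,
                           sizeref = 0.1),
             color = ~z < 3,
             colors = c(I("green"), I("red")))

# plotly_build() evaluates the formulas and returns the widget;
# built$x$data is the list of traces that will be rendered.
built <- plotly_build(p)

for (trace in built$x$data) {
  cat("trace:", trace$name, "\n")
  cat("  x:          ", format(trace$x), "\n")
  cat("  marker.size:", format(trace$marker$size), "\n")
}
```

If the sizes printed for a trace do not line up with that trace's x values, the mismatch is introduced on the R side, before rendering.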

JElchison changed the title from "Bubble sizes out of order, when using a formula for the color attribute" to "Bubble sizes out of order, when using a formula for the color/name attributes" on Apr 6, 2024

bklingen commented Apr 8, 2024

I don't think you should map variables inside the marker list. Declare the size mapping outside it, as a top-level argument (see below), and you get the correct result:

plot_ly(df,
        x = ~x,
        y = ~y,
        type = "scatter",
        mode = "markers",
        marker = list(sizeref = 0.1),
        size = ~z,
        color = ~z < 3,
        colors = c(I("green"), I("red")),
        text = ~paste0("z: ", z))
[image attachment]

@JElchison

Hi @bklingen, thanks for your help!

png mismatch

As an aside, I don't think your png matches your code, because (given your color threshold of z < 3) the success case should show 2 reds on the small side, not 3. However, that's irrelevant to your tip.

Your workaround, with new warnings

But beyond that, it does look like setting the size attribute instead of marker.size successfully works around my issue.

Did you notice that your workaround causes these warnings?

Warning messages:
1: `line.width` does not currently support multiple values. 
2: `line.width` does not currently support multiple values. 

I'm not sure what to make of those, but there are some related SO topics, such as: https://stackoverflow.com/questions/52692760/spurious-warning-when-mapping-marker-size-in-plotly-r
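
If the warnings turn out to be harmless, one option is to silence only that message while leaving every other warning visible. The sketch below uses base R condition handling; `muffle_matching()` is a hypothetical helper name, not part of plotly, and whether this warning is safe to ignore is an assumption.

```r
# Hypothetical helper: evaluate `expr`, muffling only warnings whose
# message contains the literal string `pattern`.
muffle_matching <- function(expr, pattern) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl(pattern, conditionMessage(w), fixed = TRUE)) {
        invokeRestart("muffleWarning")
      }
      # Non-matching warnings fall through and are reported as usual.
    }
  )
}

# Stand-in for the plotly call, so the helper can be tried without plotly.
noisy <- function() {
  warning("`line.width` does not currently support multiple values.")
  "done"
}

result <- muffle_matching(noisy(), "line.width")
print(result)
```

The warnings appear when the plot is built or printed, so with a real plot object `p` the call to wrap would be something like `muffle_matching(print(p), "line.width")`.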

Intended behavior?

More critically, though, I do question whether this outcome is the intended behavior. Do you have any supporting documentation you could link to, showing why I should use size instead of marker.size?

Here's what I could find:

  1. https://plotly.com/r/reference/scatter/, which states that:

Bubble charts are achieved by setting marker.size and/or marker.color to numerical arrays.

According to the same documentation page, neither size nor color is a parent-level attribute. colors isn't mentioned anywhere (and it produces different behavior from colorscale).

Further, this doesn't seem to line up with the examples at...

  2. https://plotly.com/r/bubble-charts/, where 2 of the 7 examples use size instead of marker.size

Strangely, 7 of 7 examples there use color instead of marker.color. Further, all examples use the undocumented colors (top level) attribute.
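
One point that may help reconcile the two pages: the reference page lists plotly.js trace attributes, whereas color, colors, size and sizes are arguments of the R function plot_ly() itself and are described in its help page (?plot_ly). A quick check against the installed package, assuming plotly is available:

```r
# List the formal arguments of plot_ly() as defined by the installed R package.
args <- names(formals(plotly::plot_ly))
print(args)

# Check which of the arguments discussed in this thread are in the signature.
discussed <- c("color", "colors", "size", "sizes", "name")
print(setNames(discussed %in% args, discussed))
```

This only shows that the arguments exist at the R level; it does not settle whether nesting a formula in marker.size alongside them is meant to work.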

Summary -- A bug in code or documentation?

I'm (very) happy to use your workaround, but because the documented code doesn't produce the documented behavior, it seems like either:

  • what I originally reported is a bug in the code
  • the documentation should be updated to be consistent with the code

Any additional thoughts? Thanks!

Workaround in action

Finally, for posterity, here's the functioning workaround, but with 2 warnings:

library(plotly)
#> Loading required package: ggplot2
#> 
#> Attaching package: 'plotly'
#> The following object is masked from 'package:ggplot2':
#> 
#>     last_plot
#> The following object is masked from 'package:stats':
#> 
#>     filter
#> The following object is masked from 'package:graphics':
#> 
#>     layout

df <- data.frame(x = c(1, 2, 3, 4, 5),
                 y = c(1, 2, 3, 4, 5),
                 z = c(1, 2, 3, 4, 5))

plot_ly(df,
        x = ~x,
        y = ~y,
        type = "scatter",
        mode = "markers",
        marker = list(sizeref = 0.1),
        # `size` is undocumented at this level.  https://plotly.com/r/reference/scatter/ shows `marker.size`
        size = ~z,
        # `color` is undocumented at this level.  https://plotly.com/r/reference/scatter/ shows `marker.color`
        color = ~z < 3,
        # `colors` is undocumented
        colors = c(I("green"), I("red")),
        name = ~ifelse(z < 3, "Red", "Green"),
        text = ~paste0("z: ", z))
#> Warning: `line.width` does not currently support multiple values.

#> Warning: `line.width` does not currently support multiple values.

Created on 2024-04-09 with reprex v2.1.0
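
For comparison, here is a sketch of another way to sidestep the grouping logic entirely: subset the data by hand and add one trace per group, passing plain vectors inside marker. The `groups` list and its field names are made up for illustration, and I have not checked whether this also avoids the `line.width` warnings.

```r
library(plotly)

df <- data.frame(x = 1:5, y = 1:5, z = 1:5)

# Hypothetical grouping table: one entry per trace to draw.
groups <- list(
  list(keep = df$z < 3,  name = "Red",   colour = "red"),
  list(keep = df$z >= 3, name = "Green", colour = "green")
)

p <- plot_ly()
for (g in groups) {
  sub <- df[g$keep, ]
  p <- add_markers(
    p,
    data = sub,
    x = ~x,
    y = ~y,
    # Plain vectors and constants, already subset to this trace's rows.
    marker = list(size = sub$z,
                  sizeref = 0.1,
                  color = g$colour),
    name = g$name,
    text = paste0("z: ", sub$z)
  )
}
p
```

Because each trace gets its own rows and its own size vector, there is no step where sizes could be matched to the wrong points, at the cost of more verbose code.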

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.3 (2024-02-29)
#>  os       Ubuntu 22.04.4 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/Indiana/Vevay
#>  date     2024-04-09
#>  pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  callr         3.7.5   2024-02-19 [1] CRAN (R 4.3.2)
#>  cli           3.6.2   2023-12-11 [1] CRAN (R 4.3.2)
#>  colorspace    2.1-0   2023-01-23 [1] CRAN (R 4.3.0)
#>  crosstalk     1.2.1   2023-11-23 [1] CRAN (R 4.3.2)
#>  curl          5.2.0   2023-12-08 [1] CRAN (R 4.3.2)
#>  data.table    1.15.0  2024-01-30 [1] CRAN (R 4.3.2)
#>  digest        0.6.34  2024-01-11 [1] CRAN (R 4.3.2)
#>  dplyr         1.1.4   2023-11-17 [1] CRAN (R 4.3.2)
#>  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.3.0)
#>  evaluate      0.23    2023-11-01 [1] CRAN (R 4.3.2)
#>  fansi         1.0.6   2023-12-08 [1] CRAN (R 4.3.2)
#>  farver        2.1.1   2022-07-06 [1] CRAN (R 4.3.0)
#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.0)
#>  fs            1.6.3   2023-07-20 [1] CRAN (R 4.3.2)
#>  generics      0.1.3   2022-07-05 [1] CRAN (R 4.3.0)
#>  ggplot2     * 3.5.0   2024-02-23 [1] CRAN (R 4.3.2)
#>  glue          1.7.0   2024-01-09 [1] CRAN (R 4.3.2)
#>  gtable        0.3.4   2023-08-21 [1] CRAN (R 4.3.2)
#>  highr         0.10    2022-12-22 [1] CRAN (R 4.3.0)
#>  htmltools     0.5.7   2023-11-03 [1] CRAN (R 4.3.2)
#>  htmlwidgets   1.6.4   2023-12-06 [1] CRAN (R 4.3.2)
#>  httr          1.4.7   2023-08-15 [1] CRAN (R 4.3.2)
#>  jsonlite      1.8.8   2023-12-04 [1] CRAN (R 4.3.2)
#>  knitr         1.45    2023-10-30 [1] CRAN (R 4.3.2)
#>  lazyeval      0.2.2   2019-03-15 [1] CRAN (R 4.3.0)
#>  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.2)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.0)
#>  munsell       0.5.0   2018-06-12 [1] CRAN (R 4.3.0)
#>  pillar        1.9.0   2023-03-22 [1] CRAN (R 4.3.0)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.3.0)
#>  plotly      * 4.10.4  2024-01-13 [1] CRAN (R 4.3.3)
#>  processx      3.8.3   2023-12-10 [1] CRAN (R 4.3.2)
#>  ps            1.7.6   2024-01-18 [1] CRAN (R 4.3.2)
#>  purrr         1.0.2   2023-08-10 [1] CRAN (R 4.3.2)
#>  R.cache       0.16.0  2022-07-21 [1] CRAN (R 4.3.1)
#>  R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.3.1)
#>  R.oo          1.26.0  2024-01-24 [1] CRAN (R 4.3.2)
#>  R.utils       2.12.3  2023-11-18 [1] CRAN (R 4.3.2)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.0)
#>  reprex        2.1.0   2024-01-11 [1] CRAN (R 4.3.3)
#>  rlang         1.1.3   2024-01-10 [1] CRAN (R 4.3.2)
#>  rmarkdown     2.25    2023-09-18 [1] CRAN (R 4.3.2)
#>  rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.3.2)
#>  scales        1.3.0   2023-11-28 [1] CRAN (R 4.3.2)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.0)
#>  styler        1.10.2  2023-08-29 [1] CRAN (R 4.3.1)
#>  tibble        3.2.1   2023-03-20 [1] CRAN (R 4.3.0)
#>  tidyr         1.3.1   2024-01-24 [1] CRAN (R 4.3.2)
#>  tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.3.0)
#>  utf8          1.2.4   2023-10-22 [1] CRAN (R 4.3.2)
#>  vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.3.2)
#>  viridisLite   0.4.2   2023-05-02 [1] CRAN (R 4.3.0)
#>  webshot       0.5.5   2023-06-26 [1] CRAN (R 4.3.2)
#>  withr         3.0.0   2024-01-16 [1] CRAN (R 4.3.2)
#>  xfun          0.42    2024-02-08 [1] CRAN (R 4.3.2)
#>  xml2          1.3.6   2023-12-04 [1] CRAN (R 4.3.2)
#>  yaml          2.3.8   2023-12-11 [1] CRAN (R 4.3.2)
#> 
#>  [1] /home/jonathan/R/x86_64-pc-linux-gnu-library/4.3
#>  [2] /usr/local/lib/R/site-library
#>  [3] /usr/lib/R/site-library
#>  [4] /usr/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────
