Skater crashing R #43

Open
eestefaniasalazar opened this issue Jul 8, 2023 · 11 comments

Comments

@eestefaniasalazar

eestefaniasalazar commented Jul 8, 2023

I updated to the latest version (0.10.4) because the "maxp_greedy" function was crashing R (see issue #39), but it seems the bug wasn't fixed for the "skater" function.

@ashirwad

ashirwad commented Dec 8, 2023

Hey @lixun910, is there an ETA on this?

@lixun910
Member

lixun910 commented Dec 8, 2023

Which OS are you using? skater seems to work fine on my macOS machine. Thanks!

@ashirwad

ashirwad commented Dec 8, 2023

I am using RStudio Server on Ubuntu 20.04. Interestingly, things work as expected when I reduce the number of rows in the data! Is there a limit on how much data the rgeoda::skater function can handle? I have ~1800 observations in total.

@lixun910
Member

lixun910 commented Dec 9, 2023

There is no limit on the data size. I think something else may be causing the crash, such as invalid values or the connectivity structure. Could you share your data and steps so that I can replicate it? Thanks!

@ashirwad

@lixun910, here's a reprex:

# tigris version 2.0.1
# rgeoda version 0.0.10.4
# dplyr version 1.1.1
# sf version 1.0.12
set.seed(100)
ca_zctas <- tigris::zctas(year = 2010, state = "CA") |>
  dplyr::mutate(value = rexp(dplyr::n()))
ca_queen_w <- rgeoda::queen_weights(ca_zctas)
ca_zcta_clusters <- rgeoda::skater(
  5, ca_queen_w, dplyr::select(ca_zctas, value)
)
ca_zcta_clusters

Running the above code block causes R to crash! Also, here's the session info:

─ Session info ───────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.2.3 (2023-03-15)
 os       Ubuntu 20.04.5 LTS
 system   x86_64, linux-gnu
 ui       RStudio
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       America/Chicago
 date     2023-12-11
 rstudio  2023.03.0+386 Cherry Blossom (server)
 pandoc   3.1.2 @ /usr/bin/ (via rmarkdown)

@ashirwad

@lixun910, when do you anticipate this will get fixed? Just curious!

@lixun910
Member

Thanks for checking @ashirwad! I checked your data and noticed that the connectivity of the queen weights is incomplete, since there are many islands in this dataset. We should give a warning instead of a hard crash. In the meantime, you can try using e.g. KNN weights with SKATER. I will fix this hard crash in the next release and will keep you updated.
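
For readers following along: SKATER works by pruning a spanning tree of the neighbor graph, and a spanning tree only exists when that graph is in one connected piece. The base-R sketch below counts connected components in a neighbor list, to illustrate what "incomplete connectivity" means here. `count_components` and the toy data are hypothetical names made up for this illustration; this is not rgeoda code and not necessarily how rgeoda checks connectivity internally.

```r
# Count connected components of an undirected neighbor graph.
# `neighbors` is a list: neighbors[[i]] holds the indices adjacent to i.
# (Hypothetical helper for illustration; not part of rgeoda.)
count_components <- function(neighbors) {
  n <- length(neighbors)
  component <- integer(n)  # 0 means "not visited yet"
  current <- 0L
  for (start in seq_len(n)) {
    if (component[start] != 0L) next
    # Start a new component and explore it breadth-first.
    current <- current + 1L
    component[start] <- current
    queue <- start
    while (length(queue) > 0) {
      node <- queue[1]
      queue <- queue[-1]
      for (nb in neighbors[[node]]) {
        if (component[nb] == 0L) {
          component[nb] <- current
          queue <- c(queue, nb)
        }
      }
    }
  }
  current
}

# Toy data: 1-2-3 form a chain, 4 and 5 are two "islands" that touch
# only each other, and 6 has no neighbors at all.
toy <- list(2L, c(1L, 3L), 2L, 5L, 4L, integer(0))
count_components(toy)  # returns 3
```

Any result above 1 means no single spanning tree can cover all observations. Note that observations 4 and 5 each have a neighbor, so a check that only looks for observations with zero neighbors would miss that group.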

@ashirwad

Thanks, @lixun910, for the advice! I will try using KNN weights.
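
A minimal sketch of the KNN workaround, assuming the sf and rgeoda packages are installed. It uses the Guerry sample shapefile that ships with rgeoda so that it runs without downloading anything; `skater_knn` is a hypothetical wrapper written for this example, and `k = 6` is an arbitrary choice, not a recommendation from the maintainers.

```r
library(sf)
library(rgeoda)

# Hypothetical wrapper (not part of rgeoda): run SKATER on KNN weights,
# and report whether queen contiguity would have produced isolates.
skater_knn <- function(polygons, vars, n_clusters, k = 6) {
  queen_w <- rgeoda::queen_weights(polygons)
  if (rgeoda::has_isolates(queen_w)) {
    message("Queen weights contain isolates; using KNN weights instead.")
  }
  # Every observation gets exactly k nearest neighbors.
  knn_w <- rgeoda::knn_weights(polygons, k)
  # Pass only the selected variables, without the geometry column.
  rgeoda::skater(n_clusters, knn_w, sf::st_drop_geometry(polygons)[vars])
}

# Sample data bundled with rgeoda (assumed here in place of the ZCTA data).
guerry_path <- system.file("extdata", "Guerry.shp", package = "rgeoda")
guerry <- sf::st_read(guerry_path, quiet = TRUE)

result <- skater_knn(guerry, "Crm_prs", n_clusters = 5, k = 6)
table(result$Clusters)
```

KNN weights remove isolates because every observation has k neighbors, but they do not by themselves guarantee that the whole graph is connected, so it may still be worth checking the result on data with distant island groups.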

@ashirwad

@lixun910, is there a rule of thumb for selecting the value for k in KNN weights, or is it arbitrary?

@lixun910
Member

lixun910 commented Dec 14, 2023 via email

@ashirwad

@lixun910, thanks for the ideas!
