Add info from whoscored #394

Open
jesbrz opened this issue Aug 23, 2024 · 2 comments
Labels: enhancement, for future consideration

jesbrz commented Aug 23, 2024

I would like to know if it is possible to add the functions of the website https://www.whoscored.com to worldfootballR. It has interesting information that could be very useful for analysis.

Regards.

tonyelhabr (Collaborator) commented

WhoScored has good data, I agree. However, I just don't see it as very practical to support. WhoScored loads webpage data on the client side, which means we'd probably need to use something like Selenium to get the data. We've avoided using Selenium in this package for at least two reasons:

  1. to simplify dependencies (both package and OS)
  2. to prevent having "frail" code
     • For example, Selenium with R is fairly prone to leaving open connections, which can lead to mysterious OOM errors. This can be avoided with smart error handling, but that puts a lot more responsibility on the package developers to write really robust code. @JaseZiv and I strive to do this, but we're also not spending enough time on package development to guarantee this. (Just look at the package source, and you can see lots of ugly code 😅 !)
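
To make the "smart error handling" point concrete, here is a minimal sketch of the usual cleanup pattern. It is written in Python rather than R, and it is not code from this package. The helper names `managed_driver` and `fetch_rendered_html` are hypothetical, and `make_driver` is assumed to be any callable that returns an object with Selenium's `get`, `page_source`, and `quit` members (for example `selenium.webdriver.Firefox`, if Selenium is installed).

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(make_driver):
    """Yield a browser driver and always quit it, even if scraping fails."""
    # make_driver is a hypothetical factory, e.g. selenium.webdriver.Firefox
    driver = make_driver()
    try:
        yield driver
    finally:
        # Runs on success and on error, so no browser session is left open.
        driver.quit()


def fetch_rendered_html(url, make_driver):
    """Load a page in the browser and return its source as a string.

    Assumption: the content of interest is present once the page has loaded;
    a real scraper would likely also need an explicit wait.
    """
    with managed_driver(make_driver) as driver:
        driver.get(url)
        return driver.page_source
```

In R, the equivalent idea would be registering the cleanup with `on.exit()` inside the scraping function. Either way, every function that opens a browser has to get this right, which is the extra maintenance burden described above.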

If not Selenium, other options are:

I haven't explored these. These may indeed make scraping easy.

tonyelhabr added the "enhancement" and "for future consideration" labels Sep 1, 2024

JaseZiv (Owner) commented Sep 2, 2024

Totally echo Tony's statements... have purposefully stayed away from this site due to the somewhat flimsy nature of browser automation scraping.

Happy to leave it for future consideration, but wouldn't imagine this is something we address in the near future unfortunately.
