Yandex Music has a captcha #2071

Closed
4 tasks done
00-kat opened this issue Apr 9, 2024 · 15 comments · Fixed by #2068
Labels
false positive: A site is responding with false positives

Comments

@00-kat

00-kat commented Apr 9, 2024

Checklist

  • I'm reporting a website that is returning false positive results
  • I've checked for similar site support requests including closed ones
  • I've checked for pull requests attempting to fix this false positive
  • I'm only reporting one site (create a separate issue for each site)

Description

Here's a random username that can't possibly exist: ecfhlmiuewfimcuhem.

Here's the username from data.json: ya.playlist

When I visit either, I get a captcha (note: JS is disabled in my browser):
[image: screenshot of the captcha page]

Unless Sherlock uses Selenium/Pyppeteer, which I highly doubt (it's not in requirements.txt), this captcha isn't really avoidable (I think). Maybe it even shows up with JS enabled, which I didn't check.

I'm not opening a PR removing YandexMusic because it could be an issue that only happens for me, or maybe it's possible to bypass this captcha.

@00-kat added the "false positive" label Apr 9, 2024
@ppfeister
Member

ppfeister commented Apr 9, 2024

@cd-CreepArghhh Can you share the raw html used for that page? I'll likely be able to add it to #2068

It won't bypass the captcha until circumvention is added, but it would avoid false-positive hits caused by the captcha when it's presented.
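
As a rough illustration of what matching on the captcha page could look like, here is a minimal sketch. It is not Sherlock's actual code: the `Outcome` enum, the `classify` function, and both marker strings are made up for the example, and the real markers would have to come from the raw HTML of the 404 page and the captcha page.

```python
from enum import Enum


class Outcome(Enum):
    CLAIMED = "claimed"
    NOT_FOUND = "not found"
    BLOCKED = "blocked"  # captcha or a similar interstitial page


# Hypothetical markers (assumptions for this sketch only).
NOT_FOUND_MARKERS = ("<title>404</title>",)
CAPTCHA_MARKERS = ('class="captcha-form"',)


def classify(body: str,
             not_found_markers=NOT_FOUND_MARKERS,
             captcha_markers=CAPTCHA_MARKERS) -> Outcome:
    """Classify a profile-page response by substrings found in its body."""
    # Check the captcha first: a captcha page says nothing about the account.
    if any(marker in body for marker in captcha_markers):
        return Outcome.BLOCKED
    if any(marker in body for marker in not_found_markers):
        return Outcome.NOT_FOUND
    # Neither marker matched, so assume a real profile page was returned.
    return Outcome.CLAIMED
```

Treating the captcha as its own outcome, rather than as "claimed", is what keeps it from showing up as a hit.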

@00-kat
Author

00-kat commented Apr 9, 2024

Huh, interestingly there's no captcha now (so it's not a JS issue) but there's a 404 page and a profile. Maybe I'll run Sherlock a couple times then try again.

@ppfeister
Member

If you do end up hitting it again, drop a ping.

Testing Yandex is a PITA on my end, having to use VPNs and such, and even when I do, it apparently trusts me implicitly and refuses to rate limit or captcha me.

@ppfeister
Member

(if the captcha page returns a status code other than 200, we can also use that as a simpler resolution)

@00-kat
Author

00-kat commented Apr 9, 2024

Okay, found out that spamming them with requests gets you a captcha fast. Running Sherlock 4 times resulted in one captcha, and my browser got 2 in 6 requests.

You're going to have to run the HTML through some prettifier though (I don't know any) since it's all on one line.
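
For reading a one-line page like this, a small indenter can be sketched with Python's standard `html.parser` module. This is only a rough sketch for inspection, not a faithful formatter: the `Prettifier` class and `prettify` function are names made up here, character references come out decoded, and whitespace-only text is dropped.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not deepen the indent.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class Prettifier(HTMLParser):
    """Collects one indented output line per tag, comment, or text run."""

    def __init__(self, indent: str = "  "):
        super().__init__()
        self.indent = indent
        self.depth = 0
        self.lines = []

    def _emit(self, text: str) -> None:
        self.lines.append(self.indent * self.depth + text)

    def handle_decl(self, decl):
        self._emit(f"<!{decl}>")

    def handle_comment(self, data):
        self._emit(f"<!--{data}-->")

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the tag exactly as written in the source.
        self._emit(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: print it, keep the depth unchanged.
        self._emit(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return  # stray closing tag for a void element: ignore it
        self.depth = max(self.depth - 1, 0)
        self._emit(f"</{tag}>")

    def handle_data(self, data):
        if data.strip():
            self._emit(data.strip())


def prettify(html: str) -> str:
    parser = Prettifier()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "\n".join(parser.lines)
```

Calling `print(prettify(text))` on the contents of the attached file should then give one tag or text run per line.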

Note: GitHub won't let me upload .html files, so rename the .txt to .html, thanks.

Oops, Captcha!.txt
Oops, Captcha!_files.zip

I'll spam a few requests with python now to check the status code.

Edit: the captcha page (some long URL with a hash or Base64 string in it) returns 200. I'll see what I get when redirected from the profile page (probably 200, so don't wait for me to finish).

@00-kat
Author

00-kat commented Apr 9, 2024

Finished. Out of 100 requests, the first request was a 404 (i.e. no captcha), then the rest were all 200s (thus captcha). No 302s either, I think, though requests follows redirects by default, so an intermediate 302 wouldn't show up in the final status code. Either way, the status code isn't going to be of any use.
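
A check like this can be sketched with only the standard library. The version below is an assumption about how such a test might be written, not the script that was actually run: `NoRedirect`, `fetch_status`, and `tally` are names made up for the example. Redirects are deliberately not followed, so a 302 would be counted instead of being resolved silently.

```python
import urllib.error
import urllib.request
from collections import Counter


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow any redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes urllib raise HTTPError for the 3xx response.
        return None


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a single GET request."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 3xx, 4xx and 5xx responses arrive here


def tally(url: str, attempts: int, fetch=fetch_status) -> Counter:
    """Request the same URL repeatedly and count the status codes seen."""
    return Counter(fetch(url) for _ in range(attempts))


# Example (performs real network requests, so it is left commented out):
# print(tally("https://music.yandex/users/ecfhlmiuewfimcuhem/playlists", 100))
```

The `fetch` parameter exists so the counting logic can be exercised with a stand-in function instead of hitting the site.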

@ppfeister
Member

Gonna push a hopeful fix. If you want to be added as a co-author, you can drop your GitHub no-reply email (or another GitHub email) and a name here, or link to somewhere that has them.

Otherwise I'll push as a single committer.

@00-kat
Author

00-kat commented Apr 9, 2024

Just push as single committer

@ppfeister
Member

Done. It doesn't seem to have broken anything on my end. Can you pull and validate all 3 cases as well?

(captcha, valid, not valid)

@ppfeister
Member

ppfeister commented Apr 9, 2024

Just realized I forgot a case: 'not valid in country'. Will add that now. Shouldn't make a difference for the captcha tests.

Edit: that's actually accounted for by the 404 msg I added, so we're good.

@00-kat
Copy link
Author

00-kat commented Apr 9, 2024

I don't think it worked, since there's still a false-positive. By the way, I'm pretty sure I'm still in the blacklist or whatever Yandex Music has going on, so it will be a while before I can test the other two cases.

$ git clone https://github.com/ppfeister/sherlock.git  # hope I cloned the right repo...
$ cd sherlock
$ python sherlock ecfhlmiuewfimcuhem --site YandexMusic
[*] Checking username ecfhlmiuewfimcuhem on:
[+] YandexMusic: https://music.yandex/users/ecfhlmiuewfimcuhem/playlists

[*] Search completed with 1 results

@ppfeister
Member

Hm... let me re-evaluate and get back to you.

@ppfeister
Member

ppfeister commented Apr 9, 2024

@cd-CreepArghhh Just got back

Noticed that you didn't run with the --local flag. Without it, Sherlock pulls data.json from the upstream repo by default instead of using our locally patched copy. Can you test one more time with that flag? (This won't be necessary if the patch gets merged upstream.)
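
The difference between the two modes can be pictured roughly as follows. This is a simplified sketch under assumptions, not Sherlock's real loading code: `load_site_data` and all of its parameters are hypothetical names, and the remote URL is passed in by the caller rather than assumed.

```python
import json
import urllib.request
from pathlib import Path


def load_site_data(local_path: str, remote_url: str,
                   use_local: bool = False, timeout: float = 10.0) -> dict:
    """Load the site manifest from disk or from a remote URL.

    use_local=True plays the role of the --local flag in this sketch.
    """
    if use_local:
        # Patched copy in the working tree: local edits are picked up.
        return json.loads(Path(local_path).read_text(encoding="utf-8"))
    # Default path: fetch the published manifest, ignoring local edits.
    with urllib.request.urlopen(remote_url, timeout=timeout) as response:
        return json.load(response)
```

With the default path, a patch that exists only in a local checkout is never consulted, which would explain seeing the old false positive.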

When using that flag on my end, it seems to give the expected result for each of the four cases (not valid, valid, captcha, geoblock).

(that flag messes with me quite a bit.....)

Edit: you do not need to re-pull unless it's been deleted

@00-kat
Author

00-kat commented Apr 9, 2024

Yay, it works! ecfhlmiuewfimcuhem doesn't show up, ya.playlist does, and I didn't get any false positives even after spamming the command 30+ times. I didn't realise that it grabbed a data.json from GitHub instead of the local one by default (probably so you don't need to git pull as often).

Also, I'm not sure what the geoblock case is so I can't really test that. (I assume I could try running it through a bunch of tor nodes until I hit it, but I don't have time for that right now).

@ppfeister
Member

ppfeister commented Apr 9, 2024

I get geoblocked here in the USA, so it was an easy test for me to run, lol

I'll go ahead and link your issue to that PR so it gets closed if and when it (hopefully) gets merged.
