[crawler] javlib data scraping error (unable to bypass anti-scraping measures) #16
Comments
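This is a known issue, caused mainly by the site's anti-scraping measures. javlib and javdb both sit behind Cloudflare protection. The current workaround is to use the cloudscraper module to get past it, but the open-source version of cloudscraper only provides a basic bypass and cannot fully defeat the protection (the Cloudflare version 2 challenge), which is why this error occurs.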
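To make the failure mode concrete, here is a minimal sketch of fetching a page through cloudscraper and checking whether the response is still a Cloudflare challenge page. The helper names (`looks_like_cloudflare_challenge`, `fetch_with_cloudscraper`) and the marker strings are assumptions for illustration, not this project's actual crawler code; the markers are a heuristic based on commonly seen challenge pages and may not match every variant.

```python
from typing import Mapping

# Heuristic body markers (assumption): strings commonly seen in challenge pages.
CHALLENGE_MARKERS = ("just a moment", "cf-chl", "challenge-platform")


def looks_like_cloudflare_challenge(
    status_code: int, headers: Mapping[str, str], body: str
) -> bool:
    """Guess whether a response is a Cloudflare challenge rather than real content."""
    # Challenge pages are usually served with 403 or 503 (heuristic, not a guarantee).
    if status_code not in (403, 503):
        return False
    # Look up the Server header case-insensitively so plain dicts work too.
    server = ""
    for name, value in headers.items():
        if name.lower() == "server":
            server = value.lower()
    if "cloudflare" not in server:
        return False
    lowered = body.lower()
    return any(marker in lowered for marker in CHALLENGE_MARKERS)


def fetch_with_cloudscraper(url: str, timeout: float = 15.0) -> str:
    """Fetch a page with the open-source cloudscraper; raise if still challenged."""
    import cloudscraper  # third-party: pip install cloudscraper

    scraper = cloudscraper.create_scraper()  # behaves like a requests.Session
    response = scraper.get(url, timeout=timeout)
    if looks_like_cloudflare_challenge(
        response.status_code, response.headers, response.text
    ):
        raise RuntimeError("Cloudflare challenge was not bypassed for " + url)
    response.raise_for_status()
    return response.text
```

With a check like this the crawler can report "blocked by Cloudflare" explicitly instead of failing later while parsing a challenge page as if it were movie data.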
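Thanks. I noticed, though, that while the crawler is hitting these errors, visiting the site in a browser (Chrome/Edge) shows no captcha challenge. Would it be possible to scrape with a browser user agent instead?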
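The open-source version of cloudscraper already ships with many built-in user agents, yet it still cannot fully pass the verification. So simply setting a browser user agent does not solve the problem.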
You can use playwright to emulate a real browser and get around it.
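A rough sketch of what the playwright approach could look like: load the page in a real Chromium instance, then poll until the challenge title disappears. The function names, the "Just a moment" title check, and the polling interval are assumptions for illustration; whether a headless browser actually passes javlib's challenge has not been verified here.

```python
import re


def extract_title(html: str) -> str:
    """Return the text of the first <title> element, or '' if there is none."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else ""


def challenge_cleared(html: str, challenge_title: str = "Just a moment") -> bool:
    """Heuristic (assumption): the challenge page is identified by its title."""
    return challenge_title.lower() not in extract_title(html).lower()


def fetch_html_with_playwright(url: str, max_checks: int = 15, headless: bool = True) -> str:
    """Open the URL in Chromium and wait for the challenge page to go away."""
    # Third-party: pip install playwright && playwright install chromium
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=headless)
        try:
            page = browser.new_page()
            page.goto(url, wait_until="domcontentloaded")
            for _ in range(max_checks):
                html = page.content()
                if challenge_cleared(html):
                    return html
                page.wait_for_timeout(1000)  # give the challenge script 1 s more
            raise TimeoutError("Cloudflare challenge did not clear for " + url)
        finally:
            browser.close()
```

If the headless run keeps timing out, calling it with `headless=False` is an easy way to see what the page is actually showing.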
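The Cloudflare version 2 challenge can be handled with cloudscraper's paid 3rd Party Captcha Solvers. I have used 2captcha and thought about porting it over at the time, but it seemed like quite a hassle, so I dropped the idea. For now this issue looks unsolvable to me; javlib's defenses are turned up rather high. Reference: https://pypi.org/project/cloudscraper/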
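For reference, wiring a third-party solver into cloudscraper could look roughly like the sketch below, based on the `captcha` option described in the cloudscraper README. The helper names and the `TWOCAPTCHA_API_KEY` environment variable are hypothetical, a paid 2captcha account is required, and it is not confirmed that this gets past javlib's current protection.

```python
import os
from typing import Dict, Optional


def build_captcha_config(api_key: Optional[str] = None) -> Dict[str, str]:
    """Build the captcha settings dict passed to cloudscraper.create_scraper()."""
    # TWOCAPTCHA_API_KEY is a hypothetical variable name chosen for this sketch.
    key = api_key or os.environ.get("TWOCAPTCHA_API_KEY", "")
    if not key:
        raise ValueError("a 2captcha API key is required")
    return {"provider": "2captcha", "api_key": key}


def create_solver_scraper(api_key: Optional[str] = None):
    """Create a cloudscraper session that delegates captchas to 2captcha."""
    import cloudscraper  # third-party: pip install cloudscraper

    return cloudscraper.create_scraper(captcha=build_captcha_config(api_key))
```

Keeping the key in the environment rather than in the config file avoids committing a paid credential to the repository.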
On release 0.5, scraping javlib reports this error for every movie:
Using a proxy?
No, I am located overseas.