tb0hdan/collyresponsible

colly-responsible

Responsible crawling with Colly. For a better Internet.

Based on lessons learned from writing Idun and subsequently getting banned by half of the website operators.

Supported limits

  • HTTP status code 429
  • rel="nofollow" on links (HREF REL NOFOLLOW)
  • robots.txt
  • actual delay between requests
  • URL tests (e.g. extension, domain, etc.)
  • Max run time
