croqaz/awesome-scrapy

Scrapy Awesomeness

Scrapy is a free application framework for crawling websites and extracting structured data, which can be used for a wide range of applications such as data mining, information processing, or historical archival.

The Scrapy documentation can be found at scrapy.readthedocs.io.

To understand what the component categories below mean (spiders, item pipelines, schedulers, downloader middlewares, extensions), take a look at the Scrapy Architecture Overview.

Spiders

Item pipelines

Schedulers

Downloader middlewares

Extensions

Helpful libraries

Frameworks

  • AutoExtract - AI-enabled automatic data extraction. E-commerce and article extraction at scale.
  • Portia - Visually scrape websites without any programming knowledge. Annotate a web page to identify the data you wish to extract, and Portia will work out from these annotations how to scrape data from similar pages.
  • Scrapely - Library for extracting structured data from HTML pages. Given some example web pages and the data to be extracted, Scrapely constructs a parser for all similar pages.

External tools

Related Lists

Scrapy examples


License

Released under the Unlicense, i.e. dedicated to the public domain. See UNLICENSE for details.
