Article scraper

A simple, quick Python script that scrapes a blog post, article, news story, or similar web page and retrieves its contents as raw text. The output is printed to stdout.

Usage:

python3 web_scraper.py https://articleURL

Dependencies:

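Newspaper3k -> install with pip (pip3 install newspaper3k). Newspaper3k already "bundles" the necessary tools such as NLTK and BS4 (BeautifulSoup 4), so nothing else needs to be installed separately.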
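
Example:

The script's source is not reproduced in this README, so the following is only a minimal sketch of how a web_scraper.py like this could be written, assuming it relies on Newspaper3k's Article class (download, parse, then read the text attribute). The function name scrape_article and the article_cls parameter are hypothetical and exist only for illustration; article_cls lets a stand-in class be substituted so the logic can be exercised without network access.

```python
import sys


def scrape_article(url, article_cls=None):
    """Return the raw text of the article at `url`.

    `article_cls` is a hypothetical hook for illustration: by default
    Newspaper3k's Article class is used.
    """
    if article_cls is None:
        # Imported lazily so the module can be loaded without Newspaper3k.
        from newspaper import Article as article_cls

    article = article_cls(url)
    article.download()  # fetch the page HTML
    article.parse()     # extract the article body from the HTML
    return article.text


# Assumed command-line behavior: run only when exactly one URL is given.
if __name__ == "__main__" and len(sys.argv) == 2:
    print(scrape_article(sys.argv[1]))
```

Because the text goes to stdout, it can be redirected to a file in the usual way, for example: python3 web_scraper.py https://articleURL > article.txt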