This program fetches stories from Hacker News (HN) and displays only those with more than 200 votes, making it easy to find trending content.
- Scrapes stories from the Hacker News homepage.
- Filters stories by vote count (only shows stories with more than 200 votes).
- Combines stories from the first two pages of Hacker News.
- Displays each story's title, vote count, and URL.
- Python 3.x
- Requests library
- BeautifulSoup4 library
You can install the required libraries using pip:
pip3 install requests beautifulsoup4
- Clone the repository:
git clone https://github.com/cainepavl/DataScraping.git
- Navigate to the project directory:
cd DataScraping
To run the program, use the following command:
python3 news.py
- Clear Screen: The program clears the console to provide a cleaner output.
- Fetching Stories: It uses the requests library to fetch HTML from Hacker News and BeautifulSoup to parse the HTML.
- Extracting Links and Votes: It extracts story titles, URLs, and vote counts from the HTML.
- Filtering Stories: Only stories with more than 200 votes are included in the final output.
- Displaying Results: Finally, it prints the filtered stories, including their title, vote count, and URL.
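The steps above can be sketched roughly as follows. This is a minimal illustration, not necessarily the code in news.py: the function names (`extract_stories`, `main`), the page URLs, and the CSS selectors (`.titleline > a`, `.subtext`, `.score`) are assumptions based on the Hacker News markup at the time of writing and may need adjusting if the site changes.

```python
import os

import requests
from bs4 import BeautifulSoup

# Assumed URLs for the first two pages of Hacker News.
PAGE_URLS = [
    "https://news.ycombinator.com/news",
    "https://news.ycombinator.com/news?p=2",
]


def extract_stories(html, min_votes=200):
    """Return the stories found in `html` that have more than `min_votes` votes."""
    soup = BeautifulSoup(html, "html.parser")
    # Assumed selectors: the title link and the "subtext" row under each story.
    links = soup.select(".titleline > a")
    subtexts = soup.select(".subtext")

    stories = []
    for link, subtext in zip(links, subtexts):
        score = subtext.select_one(".score")
        if score is None:
            # Some rows (for example job postings) carry no vote count.
            continue
        votes = int(score.get_text().split()[0])  # "250 points" -> 250
        if votes > min_votes:
            stories.append(
                {"title": link.get_text(), "link": link.get("href"), "votes": votes}
            )
    return stories


def main():
    # Clear the console for cleaner output.
    os.system("cls" if os.name == "nt" else "clear")

    # Fetch each page and combine the filtered stories.
    stories = []
    for url in PAGE_URLS:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        stories.extend(extract_stories(response.text))

    for story in stories:
        print(f"{story['title']} ({story['votes']} votes)\n  {story['link']}")


if __name__ == "__main__":
    main()
```

Keeping the parsing and filtering in a function that takes an HTML string makes it possible to check the logic against a saved page without hitting the network.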
This project is licensed under the MIT License - see the LICENSE file for details.
- Requests: A simple and elegant HTTP library for Python, which makes sending HTTP/1.1 requests easy.
- BeautifulSoup: A library for parsing HTML and XML documents, making it easier to extract data from web pages.
- Hacker News: The source of the stories and votes, providing a platform for sharing and discussing tech news.
- ZTM Mastery: For the lesson teaching this project and providing the base code.
If you have any questions, feel free to contact me at [email protected]