This Python project scrapes web pages from https://books.toscrape.com/ and saves book information in CSV files, one per category.
This project uses:
- Beautiful Soup: a Python library for pulling data out of HTML and XML files (https://pypi.org/project/beautifulsoup4/)
- Pandas: an open source data analysis and manipulation tool, built on top of the Python programming language (https://pandas.pydata.org/)
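As a rough illustration of how these two libraries can work together, here is a minimal sketch that parses a hand-written HTML fragment with Beautiful Soup and turns the result into CSV text with Pandas. The fragment, the CSS selectors, the column names, and the `parse_books` helper are assumptions made for this example; they are not taken from the project's scripts, and the real site's markup may differ.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hand-written fragment for illustration; the real page markup is assumed to be similar.
SAMPLE_HTML = """
<article class="product_pod">
  <h3><a href="catalogue/book-one_1/index.html" title="Book One">Book One</a></h3>
  <p class="price_color">£10.00</p>
</article>
<article class="product_pod">
  <h3><a href="catalogue/book-two_2/index.html" title="Book Two">Book Two</a></h3>
  <p class="price_color">£20.50</p>
</article>
"""


def parse_books(html):
    """Return one dict per book found in the HTML (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for article in soup.select("article.product_pod"):
        link = article.h3.a
        price = article.select_one("p.price_color").get_text(strip=True)
        rows.append({"title": link["title"], "url": link["href"], "price": price})
    return rows


books = pd.DataFrame(parse_books(SAMPLE_HTML))
# With no path argument, to_csv returns the CSV content as a string.
csv_text = books.to_csv(index=False)
```

Passing a file path to `to_csv` instead would write the same content to disk.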
Objective:
- Extract information from the entire site library
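Covering the entire site usually means following "next" links from one listing page to the following one. The sketch below shows one possible way to do that using only the standard library. The `iter_page_urls` function, the `FAKE_NEXT_LINKS` table, and the page URLs are hypothetical stand-ins for real HTTP requests and HTML parsing; they are not part of the project.

```python
from urllib.parse import urljoin


def iter_page_urls(start_url, get_next_href):
    """Yield each page URL, following "next" links until none is left."""
    url = start_url
    seen = set()  # guards against a link loop
    while url and url not in seen:
        seen.add(url)
        yield url
        href = get_next_href(url)  # relative link to the next page, or None
        url = urljoin(url, href) if href else None


# Stand-in for fetching a page and reading its "next" link (assumed URLs).
FAKE_NEXT_LINKS = {
    "https://books.toscrape.com/catalogue/page-1.html": "page-2.html",
    "https://books.toscrape.com/catalogue/page-2.html": "page-3.html",
    "https://books.toscrape.com/catalogue/page-3.html": None,
}

pages = list(
    iter_page_urls(
        "https://books.toscrape.com/catalogue/page-1.html",
        FAKE_NEXT_LINKS.get,
    )
)
```

`urljoin` resolves each relative link against the page it was found on, so the same loop works whether the link is `page-2.html` or a path containing `../`.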
Results:
- A CSV file of all the site's book data
- A folder containing the cover images of all the extracted books
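Saving cover images requires a file name for each book. One possible approach, sketched below, derives a file-system-safe name from the book title and keeps the extension of the image URL. The `cover_path` function, the `images` folder name, and the sample URL are assumptions for this example, not the project's actual naming scheme.

```python
import re
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def cover_path(title, image_url, folder="images"):
    """Build a local path for a cover image (hypothetical helper)."""
    # Replace every run of non-alphanumeric characters with a single dash.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # Reuse the extension from the URL path, falling back to .jpg.
    suffix = PurePosixPath(urlparse(image_url).path).suffix or ".jpg"
    return Path(folder) / f"{slug}{suffix}"


path = cover_path(
    "A Light in the Attic",
    "https://books.toscrape.com/media/cache/example.jpg",  # assumed URL
)
```

Two books with the same title would map to the same file here, so a real script may want to add a unique identifier to the name.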
Prerequisite:
- Install Python 3.11
To launch the program:
- Download the project: https://github.com/saadsabir/PythonWebScraping
- Open a terminal and create a virtual environment: $ python -m venv env
- Activate the virtual environment: $ source env/bin/activate
- Install the libraries from the requirements file: $ pip install -r requirements.txt
- Run a script: $ python script_name.py
By Saad SABIR