Collects posts/pages from a CSV list of WordPress URLs, spins them, then prepares them in a JSON file.
This set of scripts is specifically designed to run on:
- Python 3
- Windows 10 (although it should work on Vista, 7 and 8)
- macOS Monterey
- Install Python for Windows
- From the project root, run python setup.py
- Add appropriate values to the .env file
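Before running the scrapers, it can help to confirm that no value in the .env file was left empty. The following is a minimal sketch of such a check, assuming the file uses plain KEY=VALUE lines; the key names EXAMPLE_API_KEY and EXAMPLE_API_URL are hypothetical placeholders, not the keys this project actually expects.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values):
    """Return the keys whose value is still empty, sorted by name."""
    return sorted(key for key, value in values.items() if not value)


# Hypothetical key names, used only for illustration.
sample = 'EXAMPLE_API_KEY=\nEXAMPLE_API_URL="https://example.com/api"\n'
print(missing_keys(parse_env(sample)))
```

To check a real file, pass its contents instead of the sample, for example `parse_env(open(".env", encoding="utf-8").read())`.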
This is done in 3 parts...
- Compile a list of the URLs of all articles or pages you want to pull content from
- Add a CSV file with the list of all URLs to the ./sources folder
- Using terminal, bash, PowerShell or similar, navigate to ./scrapers
- Run python scrape-press.py
- Wait for the script to finish compiling the JSON file to the ./data folder
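The layout of the CSV file is not specified here. As a rough sketch, assuming one URL per row in the first column with an optional header row, the list could be checked before scraping like this. The function name read_urls and the sample URLs are illustrative only and are not part of the project's scripts.

```python
import csv
import io
from urllib.parse import urlparse


def read_urls(csv_file):
    """Return the unique http(s) URLs found in the first column, in order."""
    urls = []
    seen = set()
    for row in csv.reader(csv_file):
        if not row:
            continue  # skip blank lines
        candidate = row[0].strip()
        parsed = urlparse(candidate)
        # Skip header rows and anything that is not an absolute http(s) URL.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            continue
        if candidate not in seen:
            seen.add(candidate)
            urls.append(candidate)
    return urls


# In-memory sample standing in for a file in ./sources (assumed layout).
sample = io.StringIO(
    "url\n"
    "https://example.com/first-post/\n"
    "https://example.com/about/\n"
    "https://example.com/first-post/\n"
)
print(read_urls(sample))
```

For a real file, open it with `open(path, newline="", encoding="utf-8")` and pass the file object to the function.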
- Install a processor / importer on your blogging platform (if you're using WordPress, WP All Import is brilliant)
- Upload the ./data/____.json file to the importer
- Map the appropriate fields
- Run your importer
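When mapping fields in the importer, it can be useful to know which field names the JSON file contains. The sketch below counts field names across records, assuming the file holds a top-level array of objects; the field names title and content in the sample are hypothetical, and the real output may be structured differently.

```python
import json
from collections import Counter


def field_counts(records):
    """Count how many records contain each field name."""
    counts = Counter()
    for record in records:
        counts.update(record.keys())
    return dict(counts)


# Hypothetical sample; the real file's fields may differ.
sample = json.loads('[{"title": "A", "content": "x"}, {"title": "B"}]')
print(field_counts(sample))
```

A field that appears in fewer records than the others is a hint that some posts are missing that value and may need a default in the importer's mapping.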