WordPress Content Scraper

Collects posts and pages from a CSV list of WordPress URLs, spins them, then compiles them into a JSON file.

Requirements

This set of scripts is specifically designed to run on:

Python 3
Windows 10 (although it should work on Vista, 7 and 8)
macOS Monterey

Setup

Install Python 3 for your operating system
From the project root, run python setup.py
Add appropriate values to the .env file
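
This README does not list which keys the .env file expects, so the sketch below only illustrates the usual KEY=VALUE layout and one way to read it with the standard library. The key names SPINNER_API_KEY and OUTPUT_NAME are hypothetical placeholders, not values taken from this project.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Hypothetical .env contents, for illustration only.
example_env = """
# credentials for the spinning service (assumed key name)
SPINNER_API_KEY="replace-me"
OUTPUT_NAME=articles
"""

settings = parse_env(example_env)
print(settings["OUTPUT_NAME"])
```

Check the .env file created by setup.py for the real key names before filling in values.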

Running the "application"

This is done in three parts:

1. Download the articles

Compile a list of the URLs of all articles or pages you want to pull content from
Add a CSV file containing that list of URLs to the ./sources folder
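
The exact CSV layout the scraper expects is not documented here. As a rough sketch, assuming one URL per row in the first column, a source file could be checked like this before it goes into ./sources. The function name read_urls is made up for this example and is not part of the project.

```python
import csv
import io
from urllib.parse import urlparse


def read_urls(csv_file):
    """Return unique http(s) URLs from the first column, in file order."""
    urls = []
    seen = set()
    for row in csv.reader(csv_file):
        if not row:
            continue  # skip blank lines
        candidate = row[0].strip()
        parsed = urlparse(candidate)
        is_web_url = parsed.scheme in ("http", "https") and bool(parsed.netloc)
        if is_web_url and candidate not in seen:
            seen.add(candidate)
            urls.append(candidate)
    return urls


# In-memory stand-in for a file in ./sources (assumed layout).
sample = io.StringIO(
    "url\n"
    "https://example.com/first-post/\n"
    "https://example.com/about/\n"
    "https://example.com/first-post/\n"
)
print(read_urls(sample))
```

A header row such as "url" is dropped because it is not a valid http(s) address, and duplicate rows are kept only once.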

2. Spin and compile the articles

Using a terminal (bash, PowerShell or similar), navigate to ./scrapers
Run python scrape-press.py
Wait for the script to finish compiling the JSON file to the ./data folder
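
For orientation, the final "compile to JSON" step can be pictured roughly as follows. This is a minimal sketch, not the code in scrape-press.py: the record fields (url, title, content), the file name articles.json and the function compile_articles are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def compile_articles(articles, out_path):
    """Write a list of article dicts to out_path as a UTF-8 JSON array."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", encoding="utf-8") as handle:
        json.dump(articles, handle, ensure_ascii=False, indent=2)
    return out_path


# Hypothetical record shape; the real script may use different fields.
articles = [
    {
        "url": "https://example.com/first-post/",
        "title": "First post",
        "content": "<p>Hello</p>",
    },
]

# A temporary folder stands in for the project's ./data folder.
with tempfile.TemporaryDirectory() as tmp:
    written = compile_articles(articles, Path(tmp) / "data" / "articles.json")
    reloaded = json.loads(written.read_text(encoding="utf-8"))
    print(reloaded[0]["title"])
```

Writing with ensure_ascii=False keeps non-ASCII characters from the articles readable in the output file.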

3. Import to your blog

Install a processor / importer on your blogging platform (if you're using WordPress, WP All Import is brilliant)
Upload the ./data/____.json file to the importer
Map the appropriate fields
Run your importer
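
Before uploading, it can help to confirm that every record has the fields you plan to map. The sketch below assumes the JSON file is an array of objects with url, title and content keys. Those names are hypothetical, so adjust REQUIRED_FIELDS to match the actual file.

```python
import json

# Assumed field names; change these to match the generated JSON.
REQUIRED_FIELDS = ("url", "title", "content")


def find_incomplete(records, required=REQUIRED_FIELDS):
    """Map record index to the list of required fields that are missing or empty."""
    problems = {}
    for index, record in enumerate(records):
        missing = [name for name in required if not record.get(name)]
        if missing:
            problems[index] = missing
    return problems


# Inline sample standing in for the contents of the file in ./data.
records = json.loads(
    '[{"url": "https://example.com/a/", "title": "A", "content": "<p>a</p>"},'
    ' {"url": "https://example.com/b/", "title": ""}]'
)
print(find_incomplete(records))
```

An empty result means every record carries all of the listed fields.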

