index.js contains the main content of the Cloudflare Workers script.
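To give a rough idea of the shape such a script takes, here is a minimal sketch of a Worker that reads a JSON body and fetches a page. This is illustrative only; the handler name and fields are my assumptions, not the actual contents of index.js:

addEventListener('fetch', (event) => {
  event.respondWith(handleRequest(event.request));
});

async function handleRequest(request) {
  // Illustrative only: read the JSON values the client sends
  const body = await request.json();

  // Fetch the target page and return its HTML
  const upstream = await fetch(body.url);
  const html = await upstream.text();

  return new Response(html, {
    headers: { 'Content-Type': 'text/html' },
  });
}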
To run the Cloudflare Workers script you need to create a Cloudflare account with Workers enabled.
Then you will have to subscribe to the Workers paid plan, which costs about $5 a month (this unlocks the extra CPU time needed for scraping).
After getting the paid plan, you will have to install a CLI tool to deploy your Cloudflare Workers script. In this case we are going to use wrangler, which lets us generate, configure, build, preview, and publish Workers scripts. You can install wrangler with either npm or yarn:
npm install -g @cloudflare/wrangler
yarn global add @cloudflare/wrangler
To verify it installed successfully, run:
wrangler --version
Next, we will have to log in to our Cloudflare account through wrangler so it can obtain an API token to manage our Cloudflare Worker.
Type the command:
wrangler login
You should get an option to log in via your browser; after logging in you will be asked to authorize the API token for wrangler.
Now you can git clone my repo and run the script in development mode, which will give you logs in your terminal:
wrangler dev
There are more wrangler features, which you can find out about below:
- Wrangler GitHub repo
- Wrangler further documentation with examples
To test the Cloudflare Workers script, I suggest using something like Postman, an easy-to-use API development tool that lets you send requests for testing purposes.
We will start by creating a request by clicking the + symbol.
Then we will set up the request by entering the URL http://127.0.0.1:8787 (this is the localhost URL our dev mode listens on). We will also add the header Content-Type: application/json so that our script can process the values we send in the JSON body.
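If you prefer the command line over Postman, the same request can be sent with curl. This is only a sketch: it assumes the Worker reads a POST body, and the JSON field names follow the list below.

curl -X POST http://127.0.0.1:8787 \
  -H "Content-Type: application/json" \
  -d '{"mode": "parse", "url": "https://example.com"}'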
Then, in the body section, there are three JSON fields whose values combine into the four types of request you can make (example bodies follow the list):
- Mode: scrape will use our presets for supported sites; parse will just grab the site and output it as plain HTML.
- Site: ebay will just scrape eBay search titles, prices, and item links; ebay_extend will do the same but also get each item's condition, the seller's name, and the seller's profile link; amazon will just scrape Amazon search titles and prices.
- Url: you can put in any URL and the script will return its plain HTML.
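Putting the fields together, a preset scrape request body might look like this. The lowercase key names, and the assumption that scrape mode also takes a url for the search page, are mine and not confirmed from the script:

{
  "mode": "scrape",
  "site": "ebay",
  "url": "https://www.ebay.com/sch/i.html?_nkw=mechanical+keyboard"
}

Switching site to ebay_extend or amazon selects the other presets; a parse request needs only mode and url, as in the curl example above.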