This guide explains how to bypass CAPTCHAs using Playwright so that your web scraping tasks run without interruptions. It covers:
- What Are CAPTCHAs and Can You Bypass Them?
- Playwright Bypass CAPTCHA: Step-By-Step Tutorial
- What If the Playwright CAPTCHA Solver Solution Does Not Work?
A CAPTCHA, short for “Completely Automated Public Turing test to tell Computers and Humans Apart,” is a test used to distinguish between human users and automated bots. Humans can typically solve one easily, while machines are supposed to find it challenging.
Google reCAPTCHA, hCaptcha, and BotDetect are some of the most popular CAPTCHA providers. These usually support one or more of the CAPTCHA types below:
- Image-based challenges: Users must identify specific objects in a grid of images.
- Text-based challenges: Users are required to type a sequence of distorted letters and numbers.
- Audio-based challenges: Users are asked to type the words they hear.
- Puzzle challenges: Users must solve a simple puzzle, such as sliding a piece into place.
CAPTCHAs can be part of a particular user flow, such as the final step of submitting a form:
In these cases, the CAPTCHA is always displayed and cannot be avoided by bots. However, you can integrate your software with CAPTCHA-solving libraries that automate the challenge, or with services that rely on human operators to solve it in real time.
CAPTCHAs are also commonly used as part of broader anti-bot solutions, such as web application firewalls:
These systems dynamically display a CAPTCHA when they suspect the user may be a bot. In these instances, CAPTCHAs can be bypassed by making your bot mimic human behavior.
An efficient way to avoid CAPTCHAs is to simulate human behavior in an automated script while using a human-like browser fingerprint. Playwright, a leading browser automation library, is one of the best tools for this purpose.
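To make "simulating human behavior" concrete, here is a minimal sketch of one common ingredient: moving the mouse along a slightly noisy path with irregular pauses instead of jumping straight to the target. The helper names (`randomBetween`, `sleep`, `mousePath`, `humanMove`) and the step, jitter, and delay values are assumptions chosen for illustration; only `page.mouse.move(x, y)` comes from the Playwright API. Treat it as a starting point, not as a guarantee that a given anti-bot system will be satisfied.

```javascript
// Random number in [min, max), used for jitter and pause lengths.
function randomBetween(min, max) {
  return min + Math.random() * (max - min)
}

// Resolve after `ms` milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms))
}

// Build `steps` points leading from `from` to `to`. Intermediate points are
// nudged by up to `jitter` pixels; the final point is exactly `to`.
function mousePath(from, to, steps = 20, jitter = 3) {
  const points = []
  for (let i = 1; i < steps; i++) {
    const t = i / steps
    points.push({
      x: from.x + (to.x - from.x) * t + randomBetween(-jitter, jitter),
      y: from.y + (to.y - from.y) * t + randomBetween(-jitter, jitter),
    })
  }
  points.push({ x: to.x, y: to.y })
  return points
}

// Move the pointer along the path with a short random pause between steps.
// `page` is expected to be a Playwright Page object.
async function humanMove(page, from, to) {
  for (const point of mousePath(from, to)) {
    await page.mouse.move(point.x, point.y)
    await sleep(randomBetween(5, 25))
  }
}

// Possible usage inside a Playwright script (coordinates are made up):
//   await humanMove(page, { x: 10, y: 10 }, { x: 320, y: 240 })
//   await page.mouse.click(320, 240)
```

Keeping the path generation separate from the Playwright call makes it easy to swap in a different curve or timing model later.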
The tutorial section of this guide explains how to implement Playwright CAPTCHA bypassing logic.
Skip this step if you already have a Playwright web scraping or testing script. Otherwise, create a folder for your Playwright CAPTCHA solver project and enter it in the terminal:
mkdir playwright_demo
cd playwright_demo
Initialize a new Node.js project inside it:
npm init -y
Open the project’s folder in your preferred JavaScript IDE and add a new script.js file.
Do not forget to open package.json and mark your project as a module by adding "type": "module". Node.js needs this setting to accept the import statements used in script.js.
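As a rough illustration, the relevant part of package.json could look like the fragment below. The name and version values are assumptions based on typical `npm init -y` defaults, and the other generated fields are omitted. Depending on your npm version, a "type" field may already be present with a different value, in which case you would change it rather than add it.

```json
{
  "name": "playwright_demo",
  "version": "1.0.0",
  "type": "module"
}
```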
Playwright does not support plugins natively, but the community has filled that gap with Playwright Extra.
Add playwright and playwright-extra to your project’s dependencies:
npm i playwright playwright-extra
It is time to initialize your script so that Playwright can deal with CAPTCHA challenges. To import the browser you want to control from playwright-extra, add this line to script.js:
import { chromium } from "playwright-extra"
Next, define an async function in which to perform the human-like interactions using the Playwright API:
(async () => {
  // set up the browser and launch it
  const browser = await chromium.launch()
  // open a new blank page
  const page = await browser.newPage()
  // browser automation logic...
  // close the browser and release its resources
  await browser.close()
})()
This launches a new Chromium instance and opens a new page before closing the browser.
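One thing to keep in mind with this pattern is that if the automation logic throws, the call to `browser.close()` is never reached and the browser process may be left running. A possible way to guard against that is a small wrapper built on `try`/`finally`. The `withPage` helper below is a hypothetical name introduced for this sketch, not part of Playwright or Playwright Extra; it only relies on the `launch()`, `newPage()`, and `close()` calls already used above.

```javascript
// `browserType` is any object with a Playwright-style launch() method,
// for example the `chromium` export of playwright-extra.
async function withPage(browserType, task) {
  const browser = await browserType.launch()
  try {
    const page = await browser.newPage()
    // run the caller's automation logic and hand back its result
    return await task(page)
  } finally {
    // runs on success and on failure, so the browser is always closed
    await browser.close()
  }
}

// Possible usage in script.js, assuming the import shown earlier:
//   await withPage(chromium, async (page) => {
//     // browser automation logic...
//   })
```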
The target site will be bot.sannysoft.com, a special web page that runs some tests in the browser to find out whether the user is a human or a bot. If you visit this page in your regular browser, you should see that all the tests pass.
Connect to the target page using the goto() method:
await page.goto("https://bot.sannysoft.com/")
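When scraping sites other than this test page, it can help to look at what `goto()` returns, since the HTTP status often hints at whether the request was blocked. The sketch below wraps the call in a hypothetical `visit` helper. Treating 403 and 429 as "probably blocked" is an assumption made for illustration, as anti-bot systems are free to respond in other ways; the `waitUntil` and `timeout` options and `response.status()` are part of the Playwright API.

```javascript
// Status codes treated as a likely block. This list is an assumption for
// the sketch, not something defined by Playwright.
const BLOCK_STATUSES = new Set([403, 429])

// `page` is expected to be a Playwright Page object.
async function visit(page, url) {
  const response = await page.goto(url, {
    waitUntil: "domcontentloaded", // resolve once the DOM has been parsed
    timeout: 30000, // give up after 30 seconds
  })
  // goto() can resolve to null, for example when navigating to about:blank
  const status = response ? response.status() : null
  return { status, blocked: status !== null && BLOCK_STATUSES.has(status) }
}

// Possible usage:
//   const { status, blocked } = await visit(page, "https://bot.sannysoft.com/")
//   if (blocked) console.warn(`Request looks blocked (HTTP ${status})`)
```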
Now, take a screenshot of the entire page to see the results of the anti-bot tests:
await page.screenshot({ path: "results.png", fullPage: true })
Put it all together, and you will get the following script.js file:
import { chromium } from "playwright-extra"

(async () => {
  // set up the browser and launch it
  const browser = await chromium.launch()
  // open a new blank page
  const page = await browser.newPage()
  // navigate to the target page
  await page.goto("https://bot.sannysoft.com/")
  // take a screenshot of the entire page
  await page.screenshot({
    path: "results.png",
    fullPage: true
  })
  // close the browser and release its resources
  await browser.close()
})()
Execute the above code with the command below:
node script.js
The script will open a Chromium instance in headless mode, visit the desired page, take a screenshot, and then close the browser. If you open the results.png file that appears in the project root folder once the script finishes, you will see:
Vanilla Playwright in headless mode fails several tests. To fix that, use the Stealth plugin.
Playwright Stealth is a plugin for playwright-extra that helps prevent bot detection. It overrides several browser configurations so that the instance appears natural, as if it were not being controlled by Playwright.
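To get a feel for the kind of configuration involved, the sketch below reads a few browser properties that detection pages commonly inspect and lists the ones that look automated. The helper names (`readAutomationSignals`, `suspiciousSignals`) and the choice of checks are assumptions made for illustration. They are not the plugin's internals, and real anti-bot systems look at many more signals.

```javascript
// Collect a few properties from inside the page.
// `page` is expected to be a Playwright Page object.
async function readAutomationSignals(page) {
  // this callback runs in the browser page, not in Node.js
  return page.evaluate(() => ({
    webdriver: navigator.webdriver,
    userAgent: navigator.userAgent,
    pluginCount: navigator.plugins.length,
    languages: Array.from(navigator.languages),
  }))
}

// Return a list of human-readable reasons why the signals look automated.
function suspiciousSignals(signals) {
  const reasons = []
  if (signals.webdriver === true) reasons.push("navigator.webdriver is true")
  if (/HeadlessChrome/.test(signals.userAgent)) reasons.push("user agent contains HeadlessChrome")
  if (signals.pluginCount === 0) reasons.push("no browser plugins reported")
  if (signals.languages.length === 0) reasons.push("no languages reported")
  return reasons
}

// Possible usage after page.goto(...):
//   const reasons = suspiciousSignals(await readAutomationSignals(page))
//   console.log(reasons.length ? reasons : "no obvious automation signals")
```

Running this with and without the Stealth plugin is one way to see which properties it changes in your setup.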
The Stealth plugin was originally developed for Puppeteer Extra, but it also works for Playwright Extra. Install it with npm:
npm i puppeteer-extra-plugin-stealth
Next, import the Stealth plugin in your script.js file with this line:
import StealthPlugin from "puppeteer-extra-plugin-stealth"
To implement the Playwright CAPTCHA bypass logic, simply register the Stealth plugin in playwright-extra through the use() method:
chromium.use(StealthPlugin())
The browser controlled by Playwright should now look much more like a regular browser operated by a human user.
Here is what your script.js file should currently look like:
import { chromium } from "playwright-extra"
import StealthPlugin from "puppeteer-extra-plugin-stealth"

(async () => {
  // register the Stealth plugin
  chromium.use(StealthPlugin())
  // set up the browser and launch it
  const browser = await chromium.launch()
  // open a new blank page
  const page = await browser.newPage()
  // navigate to the target page
  await page.goto("https://bot.sannysoft.com/")
  // take a screenshot of the entire page
  await page.screenshot({
    path: "results.png",
    fullPage: true
  })
  // close the browser and release its resources
  await browser.close()
})()
Launch the script again:
node script.js
Open results.png again, and you will now see that all bot-detection tests have been passed:
Browser settings are not the only thing anti-bot tools look at. IP reputation is another key factor, and a free library such as the Stealth plugin does nothing to improve it.
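The IP address a site sees can be changed by routing the browser through a proxy, which Playwright supports via the `proxy` launch option (`server`, `username`, `password`). The sketch below converts a single proxy URL into that option. The `proxyLaunchOptions` helper, the example URL, and the `PROXY_URL` environment variable are hypothetical, and whether this helps depends entirely on the reputation of the proxy's own IP addresses.

```javascript
// Turn a URL such as "http://user:pass@proxy.example.com:8080" into
// Playwright launch options. Credentials are optional.
function proxyLaunchOptions(proxyUrl) {
  const url = new URL(proxyUrl)
  const proxy = { server: `${url.protocol}//${url.host}` }
  // URL keeps credentials percent-encoded, so decode them before use
  if (url.username) proxy.username = decodeURIComponent(url.username)
  if (url.password) proxy.password = decodeURIComponent(url.password)
  return { proxy }
}

// Possible usage in script.js (PROXY_URL is a made-up variable name):
//   const browser = await chromium.launch(proxyLaunchOptions(process.env.PROXY_URL))
```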
For simple CAPTCHAs that require only a single click, you can use the puppeteer-extra-plugin-recaptcha plugin. However, when dealing with more complex tools like Cloudflare, you need something more powerful.
If you are looking for a real Playwright CAPTCHA solver, try Bright Data web scraping solutions. These provide superior unlocking capabilities with a dedicated CAPTCHA-solving feature to automatically handle reCAPTCHA, hCaptcha, Cloudflare Turnstile, AWS WAF Captcha, and many others.
Register now and start your free trial today.