rebrowser-puppeteer ironically fails to load rebrowser's own bot test page with CloudFlare error 1010 #75

Open
webhype opened this issue Dec 15, 2024 · 3 comments

webhype commented Dec 15, 2024

When launching with puppeteer-extra as per the instructions, and using "real" Chrome for Mac OS 131.0.6778.140 (Official Build) (arm64), https://bot-detector.rebrowser.net/ fails to load with the dreaded CloudFlare error 1010 (as does any CF-protected site such as https://bot.sannysoft.com/ ).

import chalk from "chalk";
import util from "util";
const setTimeoutPromise = util.promisify(setTimeout);

import {addExtra} from "puppeteer-extra";
import StealthPlugin from "puppeteer-extra-plugin-stealth";

import rebrowserPuppeteer from "rebrowser-puppeteer-core";
const Puppeteer = addExtra(rebrowserPuppeteer as any);

const urls = [
	"https://ipleak.net/",
	"https://bot-detector.rebrowser.net/",
	"https://bot.sannysoft.com/",
	"https://www.browserscan.net/bot-detection/",
	"https://antcpt.com/eng/information/demo-form/recaptcha-3-test-score.html",
];
const userAgent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36";

Puppeteer.use(StealthPlugin());

process.on("uncaughtException", (err) => {
	console.log("ERROR: UNCAUGHT EXCEPTION:", err);
});

process.on("unhandledRejection", async (err, promise) => {
	console.log("ERROR: UNHANDLED REJECTION:", err);
});

async function main() {
	let browser;
	try {
	
		const args = [
			"--disable-web-security",
			"--no-sandbox",
			"--disable-blink-features=AutomationControlled",
			`--user-agent="${userAgent}"`,
		];
		browser = await Puppeteer.launch({
			headless: false,
			args,
			ignoreDefaultArgs: [
				"--enable-automation",
				"--disable-popup-blocking",
			],
			executablePath: "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
			userDataDir: `/tmp/ppt-${Date.now()}`,	// fresh profile for each run
		});

		let page, pageOfInterest;
		for (const url of urls) {
			console.log(`${url} …`);
			page = await browser.newPage();
			if (url.includes("rebrowser")) {
				// Capture the page only after it has been created for this URL
				pageOfInterest = page;
			}
			await page.setViewport(null);
			page.setDefaultNavigationTimeout(60 * 1000);
			await page.goto(url,
				{waitUntil: "domcontentloaded", timeout: 60 * 1000}
			);
		}
		await setTimeoutPromise(90 * 1000);

		// Print content of "page of interest", and all cookies

		let [content, cookies] = await Promise.all([pageOfInterest!.content(), browser.cookies()]);
		console.log({content, cookies});

	} catch (err: any) {
		throw err;
	} finally {
		try {
			await browser?.close();
		} catch (err: any) {
			console.log("ERROR: CAUGHT EXCEPTION:", err);
		}
	}
}

(() => main())();

Note that https://www.browserscan.net/bot-detection/ shows "green" on all tests, so that's great!! 🫶🏻 However, Google reCaptcha V3 Enterprise is still hit and miss.

webhype (Author) commented Dec 15, 2024

Update: The CF 1010 error happens also when I take out all that puppeteer-extra and puppeteer-extra-plugin-stealth stuff, so that isn't the "root cause". Apparently these modules don't do anything that rebrowser-puppeteer doesn't already do. When using "naked" rebrowser-puppeteer I get 100% green passes at https://www.browserscan.net/bot-detection .

It's just odd that the same production stable Chrome, when launched under rebrowser-playwright, doesn't have the CloudFlare 1010 error.

nwebson (Contributor) commented Dec 16, 2024

The CF error is a really interesting one... I don't see anything wrong in your code. Maybe try commenting out the user agent switch?
Does exactly the same code work fine with the original puppeteer?
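
One way to follow up on the user agent suggestion is to look at the string Chrome actually reports and check it for stray characters. The sketch below is only an illustration: `describeUserAgentProblems` is a hypothetical helper, not part of puppeteer or rebrowser, and the checks it runs are assumptions about what a mangled override might look like.

```typescript
// Hypothetical diagnostic helper (not a puppeteer or rebrowser API).
// Returns a list of human-readable problems found in a user agent string.
function describeUserAgentProblems(ua: string): string[] {
	const problems: string[] = [];
	if (ua.length === 0) {
		problems.push("is empty");
	}
	if (/["']/.test(ua)) {
		// Real Chrome user agents do not contain quote characters
		problems.push("contains literal quote characters");
	}
	if (ua !== ua.trim()) {
		problems.push("has leading or trailing whitespace");
	}
	return problems;
}
```

In the script above it could be called on the result of `await browser.userAgent()` right after launch, logging the returned list before any page is opened.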

davyzhang commented Dec 21, 2024

I can confirm that this problem is due to the user agent. Remove this line and the code works like a charm.
It successfully bypassed the Cloudflare human verification in my case.

However, the other args are needed; otherwise the naked version of rebrowser won't work:

const args = [
    "--disable-web-security",
    "--no-sandbox",
    "--disable-blink-features=AutomationControlled",
    // `--user-agent="${userAgent}"`, //comment this line
];
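
A possible explanation, not confirmed in this thread: the entries in `args` are handed to Chrome as individual arguments without a shell, so the double quotes in `--user-agent="${userAgent}"` may end up as part of the user agent string itself. The sketch below assumes that is the case and builds the same args with an optional, unquoted override. `buildChromeArgs` is a hypothetical helper, and whether the unquoted form avoids error 1010 has not been tested here.

```typescript
// Hypothetical helper (not a puppeteer or rebrowser API): returns the launch
// args used in this thread, adding a user agent override only when requested.
function buildChromeArgs(userAgent?: string): string[] {
	const args = [
		"--disable-web-security",
		"--no-sandbox",
		"--disable-blink-features=AutomationControlled",
	];
	if (userAgent !== undefined) {
		// No surrounding quotes: each array element is already a single argument,
		// so spaces inside the value do not need shell-style quoting.
		args.push(`--user-agent=${userAgent}`);
	}
	return args;
}
```

Calling `buildChromeArgs()` gives the three args from the working configuration above, while `buildChromeArgs(userAgent)` appends the override for anyone who wants to retest with a custom user agent.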
