Can Zendriver Work in Headless Mode with Cloudflare? #35

afkarxyz · 2025-01-11T16:13:01Z

Is it possible to run Zendriver in headless mode for websites with Cloudflare? I've tried headless, but it failed. If I run it without headless, it works fine. Please help me fix my code if headless mode is indeed possible. Thank you.

import asyncio
import zendriver as zd
import re
import random

SPOTIFY_URLS = [
    "https://open.spotify.com/track/2plbrEY59IikOBgBGLjaoe",
    "https://open.spotify.com/track/4wJ5Qq0jBN4ajy7ouZIV1c",
    "https://open.spotify.com/track/6dOtVTDdiauQNBQEDOtlAB",
    "https://open.spotify.com/track/7uoFMmxln0GPXQ0AcCBXRq",
    "https://open.spotify.com/track/2HRqTpkrJO5ggZyyK6NPWz"
]

async def wait_for_element(page, selector, timeout=30):  # zendriver's wait_for timeout is in seconds, not milliseconds
    try:
        element = await page.wait_for(selector, timeout=timeout)
        return element
    except asyncio.TimeoutError:
        raise Exception(f"Timeout waiting for element: {selector}")
    except Exception as e:
        raise Exception(f"Error finding element {selector}: {str(e)}")

async def wait_for_token(page, max_attempts=10, check_interval=0.5):
    for _ in range(max_attempts):
        requests = await page.evaluate("window.requests")
        for req in requests:
            if "api.spotifydown.com/download" in req['url']:
                token_match = re.search(r'token=(.+)$', req['url'])
                if token_match:
                    return token_match.group(1)
        await asyncio.sleep(check_interval)
    raise Exception("Token not found within timeout period")

async def fetch_token(url, delay=5):
    browser = await zd.start(headless=False)
    try:
        page = await browser.get("https://spotifydown.com/en")
        
        await page.evaluate("""
            window.requests = [];
            const originalFetch = window.fetch;
            window.fetch = function() {
                return new Promise((resolve, reject) => {
                    originalFetch.apply(this, arguments)
                        .then(response => {
                            window.requests.push({
                                url: response.url,
                                status: response.status,
                                headers: Object.fromEntries(response.headers.entries())
                            });
                            resolve(response);
                        })
                        .catch(reject);
                });
            };
        """)
        
        await asyncio.sleep(delay)
        
        print("Finding input element...")
        input_element = await wait_for_element(page, ".searchInput")
        await input_element.send_keys(url)
        
        print("Clicking submit button...")
        submit_button = await wait_for_element(page, "button.flex.justify-center.items-center.bg-button")
        await submit_button.click()
        
        print("Clicking download button...")
        download_selector = "div.flex.items-center.justify-end button.w-24.sm\\:w-32.mt-2.p-2.cursor-pointer.bg-button.rounded-full.text-gray-100.hover\\:bg-button-active"
        download_button = await wait_for_element(page, download_selector)
        await download_button.click()
        
        print("Waiting for token...")
        token = await wait_for_token(page)
        return token
                
    finally:
        await browser.stop()

async def main():
    try:
        url = random.choice(SPOTIFY_URLS)
        print(f"Using URL: {url}")
        
        token = await fetch_token(url)
        print(f"Token retrieved: {token}")
        return token
        
    except Exception as e:
        print(f"Error: {str(e)}")
        return None

if __name__ == "__main__":
    token = asyncio.run(main())
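
A side note on the token parsing in `wait_for_token` above: the pattern `token=(.+)$` captures everything up to the end of the URL, so it would also swallow any query parameter that follows `token`. A minimal sketch of a stricter alternative using only the standard library is shown below. The helper name `extract_token` and the example URL in the usage note are hypothetical, and unlike the regex, `parse_qs` percent-decodes the value.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_token(request_url: str,
                  marker: str = "api.spotifydown.com/download") -> Optional[str]:
    """Return the `token` query parameter of a matching request URL, else None."""
    # Same substring check as the original loop: ignore unrelated requests.
    if marker not in request_url:
        return None
    # parse_qs maps each query key to a list of (percent-decoded) values.
    values = parse_qs(urlparse(request_url).query).get("token")
    return values[0] if values else None
```

Inside the polling loop this would replace the `re.search` call, for example `token = extract_token(req['url'])` followed by `if token: return token`.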

ZenulAbidin · 2025-01-15T04:50:51Z

I use zendriver on many docker hosts to scrape a lot of pages from a website with cloudflare protection, so yes, it should stay undetected in headless.

afkarxyz · 2025-01-15T07:21:59Z

Sorry my question is very basic. Does the docker still need chrome installed even though it is headless?
I don't know what the concept of docker is, is it like a virtual machine?

ZenulAbidin · 2025-01-15T07:28:08Z

Headless just means Chrome without a GUI, so display libraries like X11 or Wayland don't have to be installed. Chrome itself still has to be installed in the container, just with fewer dependencies, and you'd use zendriver to control the headless Chrome in any case. Docker is roughly like a VM hypervisor, except that the 'VMs' (actually containers) use much less RAM, CPU, and disk than a virtual machine would.
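
As a rough illustration of the point above, the same script can run with a window on a desktop and headless inside a container by deriving the `headless` argument of `zd.start` from the environment. This is only a sketch: the variable name `ZENDRIVER_HEADLESS` and both helper names are made up for this example and are not part of zendriver. zendriver is imported inside the coroutine so the helper can be used without it installed.

```python
import os


def headless_from_env(env=os.environ, var: str = "ZENDRIVER_HEADLESS") -> bool:
    """Default to headless; a value of 0/false/no/off switches the GUI back on."""
    # `var` is a hypothetical variable name chosen for this sketch.
    return env.get(var, "1").strip().lower() not in {"0", "false", "no", "off"}


async def fetch_html(url: str) -> str:
    """Open `url` in Chrome via zendriver and return the page HTML."""
    import zendriver as zd  # imported lazily; Chrome must be installed on the host

    browser = await zd.start(headless=headless_from_env())
    try:
        page = await browser.get(url)
        return await page.get_content()
    finally:
        # Always shut Chrome down, even if navigation fails.
        await browser.stop()
```

To try it, run `asyncio.run(fetch_html("https://example.com"))` from a script, setting `ZENDRIVER_HEADLESS=0` on a machine with a display to watch the browser.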

fbtariq · 2025-01-21T21:32:51Z

Could you share the Dockerfile used to run zendriver in docker in headless mode?

stephanlensky · 2025-01-22T18:54:12Z

I have an example repository here zendriver-docker which shows how to use Docker & Zendriver with both headful (Wayland) and headless Chrome (just add headless=True to the example code, it should work just the same).

There is a lot of added complexity in the Docker image which is required in order to run Chrome in headful mode, though I find it quite helpful since it allows you to VNC into the container and actually interact with the running browser.

If you only want to run in headless mode, the image could likely be substantially simplified. I'd be happy to accept a PR in that repo to add a Dockerfile for a simplified headless image if anyone is interested 🙂

chompie · 2025-02-18T16:27:35Z

An image you could (also) use for AWS Lambda would be great, e.g. with a default config that just does JavaScript rendering and returns the page content, the resulting URL (after possible redirects), and an HTTP status code.
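
For what it's worth, the "render and return content, final URL, and status" part could be sketched roughly as follows. This is not an existing zendriver feature or a Lambda handler: `RenderResult`, `render`, and `to_json` are hypothetical names, and the status code is read from the browser's Navigation Timing entry (`responseStatus`), which is an assumption that only holds on recent Chrome versions and yields `None` otherwise.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class RenderResult:
    content: str            # HTML after JavaScript has run
    final_url: str          # URL after any redirects
    status: Optional[int]   # HTTP status of the document, if the browser exposes it


async def render(url: str) -> RenderResult:
    import zendriver as zd  # imported lazily; needs Chrome on the host

    browser = await zd.start(headless=True)
    try:
        page = await browser.get(url)
        content = await page.get_content()
        final_url = await page.evaluate("window.location.href")
        # Assumption: Navigation Timing exposes responseStatus in this Chrome build.
        status = await page.evaluate(
            "(performance.getEntriesByType('navigation')[0] || {}).responseStatus || null"
        )
        return RenderResult(content, str(final_url),
                            status if isinstance(status, int) else None)
    finally:
        await browser.stop()


def to_json(result: RenderResult) -> str:
    """Serialize the result, e.g. as the body of an HTTP or function response."""
    return json.dumps(asdict(result))
```

A wrapper for a specific platform would call `asyncio.run(render(url))` and return `to_json(...)` in whatever response shape that platform expects.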

github-actions · 2025-03-23T02:09:43Z

This issue has been marked stale because it has been open for 30 days with no activity. If there is no activity within 7 days, it will be automatically closed.

github-actions · 2025-03-30T02:11:07Z

This issue was automatically closed because it has been inactive for 7 days since being marked as stale.

stephanlensky added the question label Jan 22, 2025

github-actions bot added the stale label Mar 23, 2025

github-actions bot closed this as not planned Mar 30, 2025
