How to grab the current page's URL? #85

Closed
steve1316 opened this issue Feb 27, 2025 · 4 comments
Labels
question Further information is requested

Comments

@steve1316

Description

  • Pretty much the title.
import zendriver as zd

async def main():
    browser = await zd.start(headless=True)
    page = await browser.get("https://www.browserscan.net/bot-detection")
    # Hypothetical call, shown only to illustrate the question:
    # current_url = await page.get_url()

The above is just an example but there may be use cases during web scraping where you need to peek at the page's current URL for certain information. Is there a get_url() or something equivalent for zendriver?

@steve1316
Author

steve1316 commented Feb 28, 2025

import zendriver as zd

async def main():
    browser = await zd.start(headless=True)
    page = await browser.get("https://www.browserscan.net/bot-detection")
    print(await page.evaluate("window.location.href", await_promise=True))

Output:
https://www.browserscan.net/bot-detection

I guess using evaluate to read window.location.href is an option, but I wonder whether it makes zendriver easier to detect.

@Avejack

Avejack commented Mar 12, 2025

You get the current URL by doing:

current_url = page.target.url

stephanlensky added the question label Apr 21, 2025
@stephanlensky
Owner

@Avejack is correct, I'll mark this as completed for now.

BTW, I don't think using page.evaluate like that should hurt undetectability, but page.target.url is still much easier.

@Sejmou

Sejmou commented May 28, 2025

Should page.evaluate("window.location.href", await_promise=True) always produce the same output as page.target.url?

I encountered a case where it didn't. The browser was redirected because a verification code was requested. page.target.url still returned the previous URL the browser had visited, while page.evaluate(...) returned the actual URL, which was also the one shown in the browser's address bar.
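
For anyone hitting the same mismatch, one possible workaround is to ask the live document first and use page.target.url only as a fallback. This is a minimal sketch, not part of zendriver: the helper name get_current_url is hypothetical, and it relies only on the two things already mentioned in this thread, page.evaluate(...) and page.target.url. Whether page.target.url lags after every redirect is not established here, only that it did in the case described above.

```python
import asyncio
from typing import Any


async def get_current_url(page: Any) -> str:
    """Hypothetical helper: prefer the URL reported by the live document.

    `page` is assumed to be a zendriver tab (anything exposing an async
    `evaluate(expression)` method and a `target.url` attribute works).
    """
    try:
        # Ask the document itself, so a redirect that already happened is visible.
        href = await page.evaluate("window.location.href")
        if isinstance(href, str) and href:
            return href
    except Exception:
        # Evaluation may fail (for example mid-navigation); use the fallback below.
        pass
    # Fallback: the URL stored on the target info, which may be stale.
    return page.target.url
```

With a real tab this would be called as `current_url = await get_current_url(page)` right after `page = await browser.get(...)`.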
