Support Downloads (edited title) #161
Comments
I'm guessing this is triggering a file download, which Chrome will sometimes redirect to load in the browser instead. I got halfway through the downloads PR, and then the dev who was interested in implementing it kind of disappeared.
I noticed that I previously found a similar behavior with puppeteer, and that this puppeteer issue indicated it had
I don't see any confirmation in the thread, but it sounds to me like this is because headless Chrome triggers downloads when it encounters PDFs. That makes sense, because headless Chrome has no "plugins" installed, and the plugins are what know how to render PDFs. Like I said, I need to finish (or get someone's help?!?! hint, hint) the PR I linked to above. I've been wrapped up in some things for the new Hero project, so I haven't been able to get to this.
Existing PR in SecretAgent
There's a PR that was mostly completed against SecretAgent. It can be mostly applied to the Agent repo. HOWEVER.. I came away thinking that the best approach for this was actually to allow Downloads to behave like normal resources.
Request Interception
I think to achieve this, we might want "request interception" with an ability to "stream" the response body as it becomes available.
When I was scraping a user's requestUrl with playwright, I relied on playwright to tell me whether the document was a PDF, because some URLs that do not end with the suffix '.pdf' ARE in fact PDFs and, in even rarer cases, some URLs that end with '.pdf' are actually text/html.
That is, I looked at the 'content-type' header of the playwright page.goto() response and made sure it was 'text/html' before doing anything further with the visited document.
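For anyone following along, the content-type check described above might look roughly like the sketch below. The names `ResponseLike`, `classifyContentType`, and `classifyResponse` are hypothetical helpers invented for illustration; the only thing assumed about the response is a `headers()` method that returns lower-cased header names, which is how Playwright's `Response` behaves.

```typescript
// Minimal shape of the response this sketch relies on (an assumption for
// illustration). Playwright's Response exposes a compatible headers() method.
interface ResponseLike {
  headers(): Record<string, string>;
}

type DocumentKind = 'html' | 'pdf' | 'other';

// Hypothetical helper: classify a document by its Content-Type header value,
// ignoring parameters such as "; charset=utf-8" and letter case.
function classifyContentType(contentType: string | undefined): DocumentKind {
  const [first = ''] = (contentType ?? '').split(';');
  const mimeType = first.trim().toLowerCase();
  if (mimeType === 'text/html') return 'html';
  if (mimeType === 'application/pdf') return 'pdf';
  return 'other';
}

// Hypothetical helper: page.goto() may resolve to null, so treat a missing
// response as 'other' rather than throwing.
function classifyResponse(response: ResponseLike | null): DocumentKind {
  if (response === null) return 'other';
  return classifyContentType(response.headers()['content-type']);
}
```

With Playwright this would be called as `classifyResponse(await page.goto(url))`, continuing only when the result is `'html'`. The URL suffix is deliberately never consulted, since it is not a reliable signal.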
But when I use the agent.goto function to visit any PDF in secret-agent, I get something like the following exception: