Images requiring a Referer header are not fetched #172
Comments
Do you happen to have an example command for a page whose images have hotlink prevention? There may be something we can do about it, if we fetch them like the browser would, by respecting the Referrer Policy.

As for content generated with JavaScript, percollate does not run the original webpage in Puppeteer (Chromium), so you must fetch the page externally. For example, monolith suggests using:

    chromium --headless --incognito --dump-dom https://github.com | monolith - -I -b https://github.com -o github.html

You could do something similar with Percollate:

    chromium --headless --incognito --dump-dom https://github.com | percollate --url https://github.com -
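To make "fetch them like the browser would" concrete, here is a minimal sketch of how a `Referer` value could be derived from the article URL under a given referrer policy. The function names `stripForReferrer` and `computeReferer` are hypothetical and are not part of percollate. The sketch also simplifies the Referrer Policy rules: it treats only `https:` URLs as trustworthy and ignores the length limit on referrer URLs.

```javascript
// Hypothetical helpers, not percollate's actual code.
// Remove credentials and fragment; optionally reduce the URL to its origin.
function stripForReferrer(url, originOnly) {
  const u = new URL(url);
  u.username = '';
  u.password = '';
  u.hash = '';
  if (originOnly) {
    u.pathname = '/';
    u.search = '';
  }
  return u.href;
}

// Returns the Referer value to send, or null when none should be sent.
// Simplification (assumption): "trustworthy" is approximated as https: only.
function computeReferer(pageUrl, targetUrl, policy = 'strict-origin-when-cross-origin') {
  const page = new URL(pageUrl);
  const target = new URL(targetUrl);
  const sameOrigin = page.origin === target.origin;
  const downgrade = page.protocol === 'https:' && target.protocol !== 'https:';
  const full = stripForReferrer(pageUrl, false);
  const origin = stripForReferrer(pageUrl, true);

  switch (policy) {
    case 'no-referrer':
      return null;
    case 'origin':
      return origin;
    case 'unsafe-url':
      return full;
    case 'same-origin':
      return sameOrigin ? full : null;
    case 'strict-origin':
      return downgrade ? null : origin;
    case 'origin-when-cross-origin':
      return sameOrigin ? full : origin;
    case 'no-referrer-when-downgrade':
      return downgrade ? null : full;
    case 'strict-origin-when-cross-origin':
    default:
      if (sameOrigin) return full;
      return downgrade ? null : origin;
  }
}
```

Under the default policy, a cross-origin image request would carry only the page's origin, which is often enough to satisfy a hotlink check, although some sites may compare against the full page URL.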
Example of missing images1:
Perhaps percollate could support adding custom request headers. Example of missing images2:
I can see the content, but the images are still missing.
Thanks for the test case. It seems that PDF generation has the same issue with images, because of the lack of As for the chromium --headless --incognito --dump-dom https://github.com | percollate --url https://github.com - Notice the |
* Set 'referrer' and 'referrerPolicy' when fetching inline images, re: #172
* Also send referrer when fetching images for EPUB.
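For readers wondering what setting these options might look like, here is a minimal sketch and not the actual commit. The option names `referrer` and `referrerPolicy` come from the commit message above; the helpers `imageFetchOptions` and `fetchImage`, the default policy, and the injected `fetchImpl` parameter are assumptions made for illustration. Whether a given fetch implementation honors these options for cross-origin requests is also an assumption worth checking.

```javascript
// Hypothetical helpers, not percollate's actual code.
// Build fetch options so an image request names the article URL as its referrer.
function imageFetchOptions(pageUrl, referrerPolicy = 'strict-origin-when-cross-origin') {
  return {
    referrer: new URL(pageUrl).href, // normalizes and validates the URL
    referrerPolicy
  };
}

// fetchImpl is passed in so the sketch does not assume a particular fetch library.
async function fetchImage(imageUrl, pageUrl, fetchImpl) {
  const response = await fetchImpl(imageUrl, imageFetchOptions(pageUrl));
  if (!response.ok) {
    // Hotlink protection typically surfaces here as HTTP 403.
    throw new Error(`Failed to fetch ${imageUrl}: HTTP ${response.status}`);
  }
  return response;
}
```

Passing the fetch function as a parameter keeps the helper easy to exercise with a stub, for example to confirm that the referrer options reach the request.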
Released a fix as part of |
Environment
node --version: v20.12.2
npm --version: 10.7.0
percollate --version: v4.1.1
Description
Hello!
When using percollate epub to generate EPUB files, I sometimes notice missing images or text. I found that some websites require the Referer header to be set on image requests to prevent hotlinking; without it, the image download fails with a 403 error. The missing text is due to content being generated dynamically by JavaScript. Do you have any good solutions for these two situations? Thank you.