Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[networking] Add response extensions for impersonate info #9756

Merged
merged 11 commits into from May 4, 2024

Conversation

bashonly
Copy link
Member

@bashonly bashonly commented Apr 21, 2024

this can be useful during extractor error handling, e.g. in the crunchyroll extractor, where we want to expect the 403 error and a offer a cloudflare bypass hint if impersonation was not used, or else raise w/ a bug report message if impersonation was used.

Template

Before submitting a pull request make sure you have:

In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check all of the following options that apply:

  • I am the original author of this code and I am willing to release it under Unlicense

What is the purpose of your pull request?

  • Core bug fix/improvement

@bashonly bashonly added enhancement New feature or request networking core networking related labels Apr 21, 2024
@bashonly bashonly requested a review from coletdjnz April 21, 2024 18:17
@pukkandan
Copy link
Member

Pls explain motivation for 3b48105. I liked the first impl much better

@bashonly
Copy link
Member Author

@pukkandan

hypothetically, the value of impersonate can be None:

class ImpersonateRequestHandler(RequestHandler, ABC):
"""
Base class for request handlers that support browser impersonation.
This provides a method for checking the validity of the impersonate extension,
which can be used in _check_extensions.
Impersonate targets consist of a client, version, os and os_ver.
See the ImpersonateTarget class for more details.
The following may be defined:
- `_SUPPORTED_IMPERSONATE_TARGET_MAP`: a dict mapping supported targets to custom object.
Any Request with an impersonate target not in this list will raise an UnsupportedRequest.
Set to None to disable this check.
Note: Entries are in order of preference
Parameters:
@param impersonate: the default impersonate target to use for requests.
Set to None to disable impersonation.
"""

so if a response is an instance of ImpersonateResponse, then that is completely meaningless on its own. In order to determine if impersonation was used for a request, the extractor would also have to do check truthiness of the impersonate attribute, with the code looking like this:

from ..networking.impersonate import ImpersonateResponse

try:
    ...
except ExtractorError as e:
    if isinstance(e.cause, HTTPError) and e.cause.status == 403:
        if isinstance(e.cause.response, ImpersonateResponse) and e.cause.response.impersonate is not None:
            target = str(e.cause.response.impersonate)
            # handling for impersonation
        # handling for no impersonation

vs. with new impl:

try:
    ...
except ExtractorError as e:
    if isinstance(e.cause, HTTPError) and e.cause.status == 403:
        if target := e.cause.response.extras.get('impersonate'):
            # handling for impersonation
        # handling for no impersonation

@pukkandan
Copy link
Member

pukkandan commented Apr 23, 2024

Why not target := getattr(e.cause.response, 'impersonate', None)?

Alternatively, we could put an impersonate = None in normal Response if we wanna be able to check e.cause.response.impersonate safely

Authored by: bashonly
@bashonly
Copy link
Member Author

Why not target := getattr(e.cause.response, 'impersonate', None)?

That should suffice, yeah. I've reverted to the orig impl (but kept the corrected type annotations)

Alternatively, we could put an impersonate = None in normal Response if we wanna be able to check e.cause.response.impersonate safely

IMO the base Response shouldn't concern itself with the extensions of its subclasses

Authored by: bashonly
Authored by: bashonly
Authored by: bashonly
@bashonly bashonly changed the title [networking] Add ImpersonateResponse [networking] Add response extensions for impersonate info Apr 23, 2024
yt_dlp/networking/common.py Outdated Show resolved Hide resolved
Authored by: bashonly
Authored by: bashonly
@bashonly bashonly merged commit bec9a59 into yt-dlp:master May 4, 2024
15 checks passed
@bashonly bashonly deleted the feat/impersonate-response branch May 10, 2024 19:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request networking core networking related
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants