New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 9701: NEXT_DATA field video extraction for bbc US website #9705
base: master
Are you sure you want to change the base?
Conversation
Some bbc articles with embedded video have the data for them within a json structure tagged with NEXT_DATA. Add a parser for this case. Links tested: https://www.bbc.com/news/uk-68546268 https://www.bbc.com/news/world-middle-east-68778149 https://www.bbc.com/reel/video/p07c6sb6/how-positive-thinking-is-harming-your-happiness
Transcript of successful
|
Looks like in the failing test, BBCIE and BBCCoUkIE match the same URL. We can modify the suitable() function to avoid the duplicate. |
We do not need the 'sound' matcher in the BBCCoUkIE. This creates a duplicate match that fails the duplicate test, and anyway this is a skipped test in BBCIE at the moment. If we want that change as part of a BBC improvement for UK links with 'sound', that can be included in a subsequent PR. |
Quite right. The Sounds pages have the
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work from kylegustavo!
This reverts commit b0593ec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls don't mark comments as resolved. The reviewer will decide whether the conversation is resolved
Co-authored-by: dirkf <[email protected]>
Everything LGTM now. Pls make sure to test all the code branches one last time |
Checking in, what's the process of getting this change merged now? Do I need approval from @bashonly, and who has the write privileges to squash and merge this change? |
Hello, just checking in again. @pukkandan do you have write privileges? |
@kylegustavo please be patient, I would like to give this a final review, but have not found the time yet. I will do so as soon as I can. Also, pukkandan is the owner of the repo |
Ok, thanks @bashonly, take your time. Just checking to see that this review is in the pipeline and not going stale. |
IMPORTANT: PRs without the template will be CLOSED
NEXT_DATA field video extraction for BBC
Some bbc articles with an embedded video have the data for them within a json structure tagged with NEXT_DATA.
Some other websites also use this kind of structure, so we can utilize the existing helper function _search_nextjs_data
to get this information. For BBC, this is used if the article was accessed in the US. Add a parser for this case.
Also patching an old test case's title to still be correct.
Fixes #9701
Template
Before submitting a pull request make sure you have:
In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check all of the following options that apply:
What is the purpose of your pull request?