-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add management command for deleting duplicate captions from YouTube #2409
base: master
Are you sure you want to change the base?
Conversation
if not items: | ||
continue | ||
|
||
newest = items[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should get the the newest item that matches either of the two concerned names.
Maybe it's because the setting (or the sample video itself) but we get an auto-translation in Russian
as well. in my testing i got the the newest item as below. Since, it did not match any of the expected values in the if-else
that followed, it did not work for me.
/Gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a management command for deleting duplicate captions from YouTube videos. The command iterates through video files associated with specified websites, identifies duplicate caption tracks (specifically 'ocw_studio_upload'), and removes them. The test suite covers scenarios where the newest track is either 'ocw_studio_upload' or 'CC (English)'.
Summary of Findings
- Error Handling: The
except HttpError
block in the management command could benefit from more specific error handling or logging to provide better insights into the nature of the errors encountered during YouTube API interactions. - Clarity of Variable Names: Consider renaming variables like
vf
to something more descriptive likevideo_file
to improve code readability. - Redundancy in Test Assertions: In the test suite, there are multiple assertions related to API calls. Consider consolidating these assertions to improve test readability and maintainability.
Assessment
The pull request introduces a management command to delete duplicate captions from YouTube videos, which addresses a specific issue related to legacy caption tracks. The code appears well-structured and includes a test suite to verify the command's functionality. Overall, the changes seem reasonable and necessary. I've provided some feedback to further improve the code quality and clarity. Please address these comments before merging, and ensure that others review and approve this code as well.
except HttpError: | ||
log.exception("Error processing video %s", video_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider logging the specific HttpError
that occurred, or handling different types of HttpError
differently. This could provide more context for debugging and monitoring. For example, a 404 error might indicate a video was deleted, while a quota error might require a different response.
except HttpError: | |
log.exception("Error processing video %s", video_id) | |
except HttpError as e: | |
log.exception("Error processing video %s: %s", video_id, e) |
mock_youtube_api.client.captions.return_value.list.assert_called_with( | ||
part="snippet", videoId="dummy_youtube_id" | ||
) | ||
mock_youtube_api.client.captions.return_value.download.assert_called_once() | ||
|
||
mock_youtube_api.client.captions.return_value.delete.assert_called_once_with( | ||
id="caption_id_legacy" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider consolidating the assertions into a single block or using assert_has_calls
to verify the sequence of calls. This can improve test readability and maintainability.
calls = [
mocker.call.client.captions.return_value.list(part="snippet", videoId="dummy_youtube_id"),
mocker.call.client.captions.return_value.download(),
mocker.call.client.captions.return_value.delete(id="caption_id_legacy")
]
mock_youtube_api.assert_has_calls(calls)
What are the relevant tickets?
Final part of https://github.com/mitodl/hq/issues/1770.
Description (What does it do?)
This PR adds a management command for deleting the legacy (duplicate) captions track from YouTube videos.
How can this be tested?
First, use RC credentials to upload a sample video to YouTube. Then, open Django shell by running
docker compose exec web ./manage.py shell
, and execute something like the following, optionally changing the captions content and/or order of caption track uploads.Then, run
upload_captions(<youtube video id>)
. There should now be two captions tracks for the YouTube video, allowing for testing the management command. Next, rundocker compose exec web ./manage.py delete_duplicate_captions_youtube --filter <website short id>
to test the management command, and verify that the legacy captions track has been deleted on YouTube.