Feat: support video chapters #388

Merged: 1 commit, Mar 24, 2025
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added

- Add support for video chapters (#170)

## [3.3.0] - 2024-12-05

### Changed
10 changes: 10 additions & 0 deletions scraper/src/youtube2zim/schemas.py
@@ -26,6 +26,14 @@
    name: str


class Chapter(CamelModel):
    """Class to serialize data about YouTube Video chapter"""

    start_time: float | int
    end_time: float | int
    title: str


class Subtitles(CamelModel):
    """Class to serialize data about a list of YouTube video subtitles."""

@@ -44,6 +52,8 @@
    thumbnail_path: str | None = None
    subtitle_path: str | None = None
    subtitle_list: list[Subtitle]
    chapters_path: str | None = None
    chapter_list: list[Chapter]
    duration: str
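
A note on the schema change: assuming `CamelModel` serializes snake_case fields as camelCase keys (its name suggests this, but the diff does not show it), a chapter entry might reach the UI shaped roughly as in this stdlib-only sketch. `to_camel`, `ChapterSketch`, and `to_json_dict` are hypothetical names used for illustration, not youtube2zim APIs.

```python
from dataclasses import asdict, dataclass


def to_camel(name: str) -> str:
    """Convert a snake_case name to camelCase, e.g. start_time -> startTime."""
    head, *tail = name.split("_")
    return head + "".join(part.capitalize() for part in tail)


@dataclass
class ChapterSketch:
    # Hypothetical stand-in for the PR's Chapter model (same three fields).
    # The PR types the two times as `float | int`; plain float is used here.
    start_time: float
    end_time: float
    title: str

    def to_json_dict(self) -> dict:
        """Return the fields keyed in camelCase (assumed CamelModel behaviour)."""
        return {to_camel(key): value for key, value in asdict(self).items()}


print(ChapterSketch(0, 65.5, "Intro").to_json_dict())
# {'startTime': 0, 'endTime': 65.5, 'title': 'Intro'}
```

If the real model behaves differently, only `to_camel` would need to change; the field names themselves come straight from the diff.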


82 changes: 80 additions & 2 deletions scraper/src/youtube2zim/scraper.py
@@ -58,6 +58,7 @@
from youtube2zim.schemas import (
    Author,
    Channel,
    Chapter,
    Config,
    HomePlaylists,
    Playlist,
@@ -215,6 +216,10 @@
def subtitles_cache_dir(self):
    return self.cache_dir.joinpath("subtitles")

@property
def chapters_cache_dir(self):
    return self.cache_dir.joinpath("chapters")

@property
def videos_dir(self):
    return self.build_dir.joinpath("videos")
@@ -455,6 +460,7 @@
# cache folder to store youtube-api results
self.cache_dir.mkdir(exist_ok=True)
self.subtitles_cache_dir.mkdir(exist_ok=True)
self.chapters_cache_dir.mkdir(exist_ok=True)

# make videos placeholder
self.videos_dir.mkdir(exist_ok=True)
@@ -805,14 +811,72 @@
self.upload_to_cache(s3_key, thumbnail_path, preset.VERSION)
return True

def add_chapters_to_zim(self, video_id: str):
    """add chapters file to zim file"""

    chapters_file = self.videos_dir.joinpath(video_id, "chapters.vtt")
    if chapters_file.exists():
        self.add_file_to_zim(
            f"videos/{video_id}/{chapters_file.name}",
            chapters_file,
            callback=(delete_callback, chapters_file),
        )

def generate_chapters_vtt(self, video_id):
    """generate the chapters file of a video if chapters available"""

    metadata_file = self.videos_dir.joinpath(video_id, "video.info.json")
    if metadata_file.exists():
        with open(metadata_file, encoding="utf-8") as f:
            metadata = json.load(f)
            chapters = metadata.get("chapters", [])

        if not chapters:
            logger.info(f"No chapters found for {video_id}")
            return

        logger.info(f"Found {len(chapters)} chapters for {video_id}")

        save_json(
            self.chapters_cache_dir,
            video_id,
            {"chapters": chapters},
        )

        chapters_file = self.videos_dir.joinpath(video_id, "chapters.vtt")
        with chapters_file.open("w", encoding="utf8") as chapter_f:
            chapter_f.write("WEBVTT\n\n")
            for chapter in chapters:
                start = chapter["start_time"]
                end = chapter["end_time"]
                title = chapter["title"]

                start_time = (
                    f"{int(start//3600):02}:"
                    f"{int((start%3600)//60):02}:"
                    f"{int(start%60):02}."
                    f"{int((start%1)*1000):03}"
                )
                end_time = (
                    f"{int(end//3600):02}:"
                    f"{int((end%3600)//60):02}:"
                    f"{int(end%60):02}."
                    f"{int((end%1)*1000):03}"
                )

                chapter_f.write(f"{start_time} --> {end_time}\n")
                chapter_f.write(f"{title}\n\n")
        logger.info(f"Chapters file saved for {video_id}")
        self.add_chapters_to_zim(video_id)
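
The timestamp arithmetic in `generate_chapters_vtt` could be factored into a small helper. The sketch below is one possible variant, not the merged code: it converts to integer milliseconds before splitting, because truncating `(start % 1) * 1000` with `int()` can drop a millisecond for some float inputs (1.001 would format as `.000`). `format_vtt_timestamp` and `chapters_to_vtt` are hypothetical names, and the chapter dicts are assumed to carry the `start_time`, `end_time`, and `title` keys read above.

```python
def format_vtt_timestamp(seconds: float) -> str:
    """Format a non-negative number of seconds as a WebVTT HH:MM:SS.mmm timestamp."""
    # Work in integer milliseconds so float truncation cannot lose a millisecond.
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02}:{minutes:02}:{secs:02}.{millis:03}"


def chapters_to_vtt(chapters: list[dict]) -> str:
    """Render chapters as WebVTT text: a header, then one cue per chapter."""
    lines = ["WEBVTT", ""]
    for chapter in chapters:
        start = format_vtt_timestamp(chapter["start_time"])
        end = format_vtt_timestamp(chapter["end_time"])
        # Each cue is a timing line, the title, and a blank separator line.
        lines += [f"{start} --> {end}", chapter["title"], ""]
    return "\n".join(lines) + "\n"


print(format_vtt_timestamp(3661.25))  # 01:01:01.250
```

Returning the text instead of writing it directly keeps the formatting testable without touching the filesystem; the caller would still write it to `chapters.vtt`.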


def fetch_video_subtitles_list(self, video_id: str) -> Subtitles:
    """fetch list of subtitles for a video"""

    video_dir = self.videos_dir.joinpath(video_id)
    languages = [
        x.stem.split(".")[1]
        for x in video_dir.iterdir()
-       if x.is_file() and x.name.endswith(".vtt")
+       if x.is_file() and x.name.endswith(".vtt") and x.name != "chapters.vtt"
    ]

    def to_subtitle_object(lang) -> Subtitle:
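
The extra `x.name != "chapters.vtt"` condition matters because the language code is taken from the second dot-separated part of the file stem, and `chapters.vtt` has no such part. A minimal sketch of the same filter on plain file names follows; `subtitle_languages` is a hypothetical helper, and the `video.<lang>.vtt` naming is an assumption inferred from `video.info.json`.

```python
from pathlib import Path


def subtitle_languages(filenames: list[str]) -> list[str]:
    """Extract language codes from names like 'video.en.vtt', skipping chapters.vtt."""
    return [
        Path(name).stem.split(".")[1]
        for name in filenames
        if name.endswith(".vtt") and name != "chapters.vtt"
    ]


names = ["video.en.vtt", "video.fr.vtt", "chapters.vtt", "video.mp4"]
print(subtitle_languages(names))  # ['en', 'fr']

# Without the exclusion, the stem "chapters" splits into a single part,
# so indexing [1] raises IndexError.
try:
    Path("chapters.vtt").stem.split(".")[1]
except IndexError:
    print("chapters.vtt has no language part")
```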
@@ -855,7 +919,9 @@
    """download subtitles for a video"""

    options_copy = options.copy()
-   options_copy.update({"skip_download": True, "writethumbnail": False})
+   options_copy.update(
+       {"skip_download": True, "writethumbnail": False, "writeinfojson": True}
+   )
    try:
        with yt_dlp.YoutubeDL(options_copy) as ydl:
            ydl.download([video_id])
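
With `writeinfojson` enabled, yt-dlp writes the video metadata to a JSON file next to the download, which is where `generate_chapters_vtt` later looks for a `chapters` key. The sketch below shows that read step in isolation, under the assumption that the key may be missing or null for videos without chapters; `load_chapters` is a hypothetical helper and the sample data is made up.

```python
import json
import tempfile
from pathlib import Path


def load_chapters(info_json: Path) -> list[dict]:
    """Return the chapter list from a yt-dlp info JSON file, or [] if unavailable."""
    if not info_json.exists():
        return []
    with info_json.open(encoding="utf-8") as fh:
        metadata = json.load(fh)
    # Treat a missing key and an explicit null the same way.
    return metadata.get("chapters") or []


# Usage with a hand-written stand-in for video.info.json (made-up data).
with tempfile.TemporaryDirectory() as tmp:
    info_path = Path(tmp, "video.info.json")
    sample = {"chapters": [{"start_time": 0, "end_time": 65.5, "title": "Intro"}]}
    info_path.write_text(json.dumps(sample), encoding="utf-8")
    demo_chapters = load_chapters(info_path)
    print(demo_chapters[0]["title"])  # Intro
```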
@@ -883,6 +949,7 @@
    video_id, options
):
    self.download_subtitles(video_id, options)
    self.generate_chapters_vtt(video_id)
    succeeded.append(video_id)
else:
    failed.append(video_id)
@@ -1010,6 +1077,12 @@
        return []
    return subtitles_list["subtitles"]

def get_chapters(video_id) -> list[Chapter]:
    chapters_list = load_json(self.chapters_cache_dir, video_id)
    if chapters_list is None:
        return []
    return chapters_list["chapters"]

def get_videos_list(playlist):
    videos = load_mandatory_json(
        self.cache_dir, f"playlist_{playlist.playlist_id}_videos"
@@ -1025,6 +1098,7 @@
author = videos_channels[video_id]
subtitles_list = get_subtitles(video_id)
channel_data = get_channel_json(author["channelId"])
chapters_list = get_chapters(video_id)

return Video(
    id=video_id,
@@ -1043,6 +1117,10 @@
    thumbnail_path=get_thumbnail_path(video_id),
    subtitle_path=f"videos/{video_id}" if len(subtitles_list) > 0 else None,
    subtitle_list=subtitles_list,
    chapters_path=(
        f"videos/{video_id}" if len(chapters_list) > 0 else None
    ),
    chapter_list=chapters_list,
    duration=videos_channels[video_id]["duration"],
)

18 changes: 16 additions & 2 deletions zimui/src/assets/vjs-youtube.css
@@ -55,7 +55,6 @@
  border-radius: 8px;
}

-
.video-js.vjs-fluid,
.video-js.vjs-16-9,
.video-js.vjs-4-3,
@@ -88,4 +87,19 @@ video.vjs-tech {
  height: 100% !important;
  max-height: 100vh;
  object-fit: contain;
-}
+}

.custom-marker {
  position: absolute;
  bottom: 0;
  width: 5px;
  height: 100%;
  background-color: #aaa;
  cursor: pointer;
}

.vjs-time-tooltip {
  transform: translateY(-80%) !important;
  line-height: 1.5;
  padding: 3px 6px !important;
}