yt-dlp scraping of comments and replies no longer works; what is the fixed code now? #9833

heyeanne34 · 2024-04-30T20:03:44Z

DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE

I understand that I will be blocked if I intentionally remove or skip any mandatory* field

Checklist

I'm asking a question and not reporting a bug or requesting a feature
I've looked through the README
I've verified that I have updated yt-dlp to nightly or master (update instructions)
I've searched known issues and the bugtracker for similar questions including closed ones. DO NOT post duplicates
I've read the guidelines for opening an issue

Please make sure the question is worded well enough to be understood

import json
import yt_dlp
import os
import re

def sanitize_filename(name, max_length=100):
    """
    Sanitize the filename by removing disallowed characters and truncating it if it exceeds max_length.
    """
    # Remove characters that Windows does not allow in filenames (including the backslash)
    name = re.sub(r'[\\/*?:"<>|]', '', name)
    name = name.replace(' ', '_')
    if len(name) > max_length:
        return name[:max_length]
    return name

def download_comments_info_json(url: str, top_comments: str = 'all', replies: str = 'all'):
    """
    Fetches comments (and replies, if specified) for the given video URL.
    """
    max_comments = ['all'] if top_comments == 'all' else [top_comments, replies]
    ydl_opts = {
        'getcomments': True,
        'skip_download': True,
        'writesubtitles': True,
        'writeautomaticsub': True,
        'extractor_args': {
            'youtube': {
                # Extractor argument values must be lists of strings;
                # a bare 'top' string is not recognized as the sort mode
                'max_comments': max_comments,
                'comment_sort': ['top'],
            }
        },
    }

    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
        info = ydl.extract_info(url, download=False)
        # 'comments' may be missing or None when nothing was extracted
        comments_info = {'comments': info.get('comments') or []}
        video_title = info.get('title', 'video')
        return comments_info, video_title

def main(urls_file, base_directory):
    """
    Reads video URLs from a file, downloads comments and replies for each, and saves them as JSON files named after the video titles.
    """
    # Ensure the base directory exists
    os.makedirs(base_directory, exist_ok=True)

    with open(urls_file, 'r') as file:
        urls = file.readlines()

    for url in urls:
        url = url.strip()
        if not url:
            continue  # skip blank lines in the URL list
        comments_info, video_title = download_comments_info_json(url)

        # Sanitize and prepare the filename
        filename = sanitize_filename(video_title) + ".json"
        output_path = os.path.join(base_directory, filename)

        # Save comments information to a JSON file
        with open(output_path, 'wt', encoding='utf-8') as fh:
            json.dump(comments_info, fh, indent=2)
        print(f"Comments and replies saved successfully as {output_path}")

# Paths (raw strings so the backslashes are not treated as escape sequences)
urls_file = r'C:\Users\v1\Documents\sample.txt'
base_directory = r'C:\Users\v1\Documents\saampleout'

# Execute the main function
main(urls_file, base_directory)
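
One detail worth checking in a script like this: when options are built by hand for the Python API, each value under `extractor_args` is expected to be a list of strings. A bare string such as `'top'` would be iterated character by character, which may explain a log line like "Sorting comments by newest first" when a top sort was requested. The sketch below shows one way to normalize such values; the helper names `as_arg_list` and `build_youtube_args` are hypothetical and not part of yt-dlp.

```python
def as_arg_list(value):
    """Normalize one extractor-argument value to a list of strings."""
    if isinstance(value, str):
        # Wrap a bare string so it is not split into single characters
        return [value]
    return [str(v) for v in value]


def build_youtube_args(max_comments='all', comment_sort='top'):
    """Build the 'extractor_args' mapping for the youtube extractor."""
    return {
        'youtube': {
            'max_comments': as_arg_list(max_comments),
            'comment_sort': as_arg_list(comment_sort),
        }
    }


if __name__ == '__main__':
    print(build_youtube_args())
```

The result can be passed as `ydl_opts['extractor_args']`. This only addresses the sort order; it is not a fix for the empty comment list reported in this issue.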

This is the result:

C:\Users\v1\PycharmProjects\pythonProject.venv\Scripts\python.exe C:\Users\v1\PycharmProjects\pythonProject\comme.py
[youtube] Extracting URL: https://www.youtube.com/watch?v=YNechg4MqXI
[youtube] YNechg4MqXI: Downloading webpage
[youtube] YNechg4MqXI: Downloading ios player API JSON
[youtube] YNechg4MqXI: Downloading android player API JSON
WARNING: [youtube] Skipping player responses from android clients (got player responses for video "aQvGIIdgFDM" instead of "YNechg4MqXI")
[youtube] YNechg4MqXI: Downloading player 7ee5b648
[youtube] YNechg4MqXI: Downloading m3u8 information
[info] YNechg4MqXI: Downloading subtitles: en-uYU-mmqFLq8
[youtube] Downloading comment section API JSON
[youtube] Downloading ~211 comments
[youtube] Sorting comments by newest first
[youtube] Downloading comment API JSON page 1 (0/~211)
[youtube] Downloading comment API JSON page 2 (0/~211)
[youtube] Downloading comment API JSON page 3 (0/~211)
[youtube] Downloading comment API JSON page 4 (0/~211)
[youtube] Downloading comment API JSON page 5 (0/~211)
[youtube] Downloading comment API JSON page 6 (0/~211)
[youtube] Downloading comment API JSON page 7 (0/~211)
[youtube] Extracted 0 comments
Comments and replies saved successfully as C:\Users\v1\Documents\saampleout\George_Washington_statue_draped_in_Palestinian_flag_on_DC_campus.json
[youtube] Extracting URL: https://www.youtube.com/watch?v=Ip9I9tUZU1c
[youtube] Ip9I9tUZU1c: Downloading webpage
[youtube] Ip9I9tUZU1c: Downloading ios player API JSON
[youtube] Ip9I9tUZU1c: Downloading android player API JSON
WARNING: [youtube] Skipping player responses from android clients (got player responses for video "aQvGIIdgFDM" instead of "Ip9I9tUZU1c")
[youtube] Ip9I9tUZU1c: Downloading m3u8 information
[info] Ip9I9tUZU1c: Downloading subtitles: en-uYU-mmqFLq8
[youtube] Downloading comment section API JSON
[youtube] Downloading ~802 comments
[youtube] Sorting comments by newest first
[youtube] Downloading comment API JSON page 1 (0/~802)
[youtube] Downloading comment API JSON page 2 (0/~802)
[youtube] Downloading comment API JSON page 3 (0/~802)
[youtube] Downloading comment API JSON page 4 (0/~802)
[youtube] Downloading comment API JSON page 5 (0/~802)
[youtube] Downloading comment API JSON page 6 (0/~802)
[youtube] Downloading comment API JSON page 7 (0/~802)
[youtube] Downloading comment API JSON page 8 (0/~802)
[youtube] Downloading comment API JSON page 9 (0/~802)
[youtube] Downloading comment API JSON page 10 (0/~802)
[youtube] Downloading comment API JSON page 11 (0/~802)
[youtube] Downloading comment API JSON page 12 (0/~802)
[youtube] Downloading comment API JSON page 13 (0/~802)
[youtube] Downloading comment API JSON page 14 (0/~802)
[youtube] Downloading comment API JSON page 15 (0/~802)
[youtube] Downloading comment API JSON page 16 (0/~802)
[youtube] Downloading comment API JSON page 17 (0/~802)
[youtube] Downloading comment API JSON page 18 (0/~802)
[youtube] Downloading comment API JSON page 19 (0/~802)
[youtube] Downloading comment API JSON page 20 (0/~802)
[youtube] Downloading comment API JSON page 21 (0/~802)
[youtube] Downloading comment API JSON page 22 (0/~802)
[youtube] Downloading comment API JSON page 23 (0/~802)
[youtube] Extracted 0 comments
Comments and replies saved successfully as C:\Users\v1\Documents\saampleout\Columbia_protesters_smash_windows,hang'intifada'_banner.json

Process finished with exit code 0

It extracted 0 comments. Please help: what is the updated code?
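
The log reports "Extracted 0 comments", yet the script still prints a success message for each video. A minimal sketch of a check that makes the empty result visible is shown here. It assumes each comment dict carries a `parent` field that is `'root'` for top-level comments, and the helper names `count_comments` and `report` are hypothetical.

```python
def count_comments(info):
    """Return (top_level, replies) counts from an extracted info dict."""
    comments = info.get('comments') or []
    # Assumption: replies carry their parent's id, top-level comments carry 'root'
    replies = sum(1 for c in comments if c.get('parent', 'root') != 'root')
    return len(comments) - replies, replies


def report(info, output_path):
    """Build a status message that distinguishes an empty result from success."""
    top, replies = count_comments(info)
    title = info.get('title', 'video')
    if top + replies == 0:
        return f"WARNING: no comments extracted for {title!r} ({output_path})"
    return f"Saved {top} comments and {replies} replies to {output_path}"


if __name__ == '__main__':
    sample = {'title': 'demo', 'comments': [
        {'id': 'a', 'parent': 'root'},
        {'id': 'a.1', 'parent': 'a'},
    ]}
    print(report(sample, 'demo.json'))
    print(report({'title': 'demo'}, 'demo.json'))
```

Calling `report(info, output_path)` in place of the unconditional success message would flag videos like the two above instead of silently writing an empty list.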

Provide verbose output that clearly demonstrates the problem

Run your yt-dlp command with -vU flag added (yt-dlp -vU <your command line>)
If using API, add 'verbose': True to YoutubeDL params instead
Copy the WHOLE output (starting with [debug] Command-line config) and insert it below
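
For a script that uses the Python API, as the one in this question does, the verbose flag goes into the options dict. A minimal sketch, assuming the same kind of `ydl_opts` dict as in the question; the helper names `with_verbose` and `extract_verbose` are hypothetical.

```python
def with_verbose(ydl_opts):
    """Return a copy of the options with verbose debug output enabled."""
    opts = dict(ydl_opts)
    opts['verbose'] = True
    return opts


def extract_verbose(url, ydl_opts):
    """Run the extraction with verbose logging and return the info dict."""
    # Imported here so with_verbose() can be used without yt-dlp installed
    import yt_dlp

    with yt_dlp.YoutubeDL(with_verbose(ydl_opts)) as ydl:
        return ydl.extract_info(url, download=False)


if __name__ == '__main__':
    print(with_verbose({'getcomments': True, 'skip_download': True}))
```

The debug lines printed by a run of `extract_verbose` are the output the template asks to be pasted.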

Complete Verbose Output

No response


heyeanne34 · 2024-04-30T20:14:00Z

Here is the whole code as a .txt file. You need to set the path to the file that contains the YouTube video links and the path where the JSON files will be saved.
code comments replies.txt

pukkandan · 2024-04-30T20:18:25Z

#9775

heyeanne34 · 2024-04-30T20:24:34Z

@pukkandan is there new code? I didn't quite understand the last reply in #9775.

pukkandan · 2024-05-01T01:22:49Z

The issue is being worked on. Subscribe to the linked PR to be notified when it's merged

heyeanne34 added the question label Apr 30, 2024

pukkandan closed this as not planned Apr 30, 2024

pukkandan added the duplicate label Apr 30, 2024
