ixigua ERROR: 'utf-8' codec can't decode byte in position 0: invalid continuation/start byte #9784

Open
10 of 11 tasks
yuyoiuyoiu opened this issue Apr 25, 2024 · 5 comments
Labels
site-bug (Issue with a specific website), triage (Untriaged issue)

Comments

@yuyoiuyoiu

yuyoiuyoiu commented Apr 25, 2024

DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE

  • I understand that I will be blocked if I intentionally remove or skip any mandatory* field

Checklist

Region

China

Provide a description that is worded well enough to be understood

Cannot download videos from ixigua; extraction fails with a UnicodeDecodeError:

ERROR: 'utf-8' codec can't decode byte 0xd0 in position 1: invalid continuation byte

Provide verbose output that clearly demonstrates the problem

  • Run your yt-dlp command with -vU flag added (yt-dlp -vU <your command line>)
  • If using API, add 'verbose': True to YoutubeDL params instead
  • Copy the WHOLE output (starting with [debug] Command-line config) and insert it below

Complete Verbose Output

yt-dlp.exe -vU -i -o "Z:\%(upload_date)s  %(title)s.%(ext)s" --ignore-config --hls-prefer-native --add-metadata --merge-output-format mp4 --cookies-from-browser firefox --referer https://www.ixigua.com/ https://www.ixigua.com/7205984803529851429

[debug] Command-line config: ['-vU', '-i', '-o', 'Z:\\%(upload_date)s  %(title)s.%(ext)s', '--ignore-config', '--hls-prefer-native', '--add-metadata', '--merge-output-format', 'mp4', '--cookies-from-browser', 'firefox', '--referer', 'https://www.ixigua.com/', 'https://www.ixigua.com/7205984803529851429']
[debug] Encodings: locale cp936, fs utf-8, pref cp936, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version [email protected] from yt-dlp/yt-dlp-master-builds [ff38a011d] (win_exe)
[debug] Python 3.8.10 (CPython AMD64 64bit) - Windows-10-10.0.19045-SP0 (OpenSSL 1.1.1k  25 Mar 2021)
[debug] exe versions: ffmpeg 6.1.1-full_build-www.gyan.dev (setts), ffprobe 6.1.1-full_build-www.gyan.dev
[debug] Optional libraries: Cryptodome-3.20.0, brotli-1.1.0, certifi-2024.02.02, curl_cffi-0.5.10, mutagen-1.47.0, requests-2.31.0, sqlite3-3.35.5, urllib3-2.2.1, websockets-12.0
[debug] Proxy map: {'http': 'http://127.0.0.1:7890', 'https': 'http://127.0.0.1:7890', 'ftp': 'http://127.0.0.1:7890'}
Extracting cookies from firefox
[debug] Extracting cookies from: "C:\Users\Mechrevo\AppData\Roaming\Mozilla\Firefox\Profiles\vah3au8q.default-release\cookies.sqlite"
Extracted 930 cookies from firefox
[debug] Request Handlers: urllib, requests, websockets, curl_cffi
[debug] Loaded 1810 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp-master-builds/releases/latest
Latest version: [email protected] from yt-dlp/yt-dlp-master-builds
yt-dlp is up to date ([email protected] from yt-dlp/yt-dlp-master-builds)
[Ixigua] Extracting URL: https://www.ixigua.com/7205984803529851429
[Ixigua] 720598480352985142: Downloading webpage
ERROR: 'utf-8' codec can't decode byte 0xe4 in position 0: invalid continuation byte
Traceback (most recent call last):
  File "yt_dlp\YoutubeDL.py", line 1606, in wrapper
  File "yt_dlp\YoutubeDL.py", line 1741, in __extract_info
  File "yt_dlp\extractor\common.py", line 734, in extract
  File "yt_dlp\extractor\ixigua.py", line 69, in _real_extract
  File "yt_dlp\extractor\ixigua.py", line 53, in _media_selector
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 0: invalid continuation byte
yuyoiuyoiu added the site-bug (Issue with a specific website) and triage (Untriaged issue) labels Apr 25, 2024
@wwwsec

wwwsec commented Apr 30, 2024

I also encountered the same problem.

yt-dlp -vU "https://www.ixigua.com/7362315298386477620?logTag=a06efde5924b709e0c36" --referer https://www.ixigua.com/ --cookies-from-browser chrome

[debug] Command-line config: ['-vU', 'https://www.ixigua.com/7362315298386477620?logTag=a06efde5924b709e0c36', '--referer', 'https://www.ixigua.com/', '--cookies-from-browser', 'chrome']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version [email protected] from yt-dlp/yt-dlp [ff07792] (pip)
[debug] Python 3.12.0 (CPython x86_64 64bit) - macOS-10.16-x86_64-i386-64bit (OpenSSL 3.0.13 30 Jan 2024)
[debug] exe versions: ffmpeg 7.0 (setts), ffprobe 7.0
[debug] Optional libraries: Cryptodome-3.20.0, brotli-1.1.0, certifi-2024.02.02, mutagen-1.47.0, requests-2.31.0, sqlite3-3.41.2, urllib3-2.2.1, websockets-12.0
[debug] Proxy map: {}
Extracting cookies from chrome
[debug] Extracting cookies from: "/Users/yuecl/Library/Application Support/Google/Chrome/Default/Cookies"
[debug] using find-generic-password to obtain password from OSX keychain
Extracted 2044 cookies from chrome
[debug] cookie version breakdown: {'v10': 2109, 'other': 0, 'unencrypted': 5}
[debug] Request Handlers: urllib, requests, websockets
[debug] Loaded 1810 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: [email protected] from yt-dlp/yt-dlp
yt-dlp is up to date ([email protected] from yt-dlp/yt-dlp)
[Ixigua] Extracting URL: https://www.ixigua.com/7362315298386477620?logTag=a06efde5924b709e0c36
[Ixigua] 7362315298386477620: Downloading webpage
ERROR: 'utf-8' codec can't decode byte 0xaf in position 1: invalid start byte
Traceback (most recent call last):
  File "/opt/anaconda3/envs/py3.12/lib/python3.12/site-packages/yt_dlp/YoutubeDL.py", line 1606, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/py3.12/lib/python3.12/site-packages/yt_dlp/YoutubeDL.py", line 1741, in __extract_info
    ie_result = ie.extract(url)
                ^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/py3.12/lib/python3.12/site-packages/yt_dlp/extractor/common.py", line 734, in extract
    ie_result = self._real_extract(url)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/py3.12/lib/python3.12/site-packages/yt_dlp/extractor/ixigua.py", line 70, in _real_extract
    formats = list(self._media_selector(json_data.get('videoResource')))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/py3.12/lib/python3.12/site-packages/yt_dlp/extractor/ixigua.py", line 54, in _media_selector
    'url': base64.b64decode(media['main_url']).decode(),
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xaf in position 1: invalid start byte
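
The traceback points at the expression `base64.b64decode(media['main_url']).decode()`, which raises whenever the decoded bytes are not valid UTF-8. A minimal sketch of a guarded version follows; `decode_main_url` is a hypothetical helper for illustration, not part of yt-dlp, and returning `None` only avoids the crash, it does not recover a usable URL.

```python
import base64
import binascii
from typing import Optional


def decode_main_url(encoded: str) -> Optional[str]:
    """Return the decoded text, or None if the payload is not base64-encoded UTF-8."""
    try:
        return base64.b64decode(encoded).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        # Either the base64 itself is malformed or the bytes are not UTF-8 text.
        return None


# A plain base64-encoded URL decodes as expected.
plain = base64.b64encode(b"https://example.com/video.mp4").decode("ascii")
print(decode_main_url(plain))

# Bytes that are not UTF-8 (0x94 cannot start a sequence) yield None instead of raising.
opaque = base64.b64encode(b"\x94\xaf").decode("ascii")
print(decode_main_url(opaque))
```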

@dirkf
Contributor

dirkf commented May 13, 2024

The second problem URL now returns a 404, with {exception404: {value: true}} as the _SSR_HYDRATED_DATA.

With the original problem URL and the one from #9915, I find this:

The first media['main_url'] for the problem URL is tQRwVYaSp5CCeLzt5lflIHJfNcXjjFoUzsaVIG12nzJNB+Wm3fcF3iA1zUrqXYFS83rSkJ44h2wph6Cwd0G0Eq4+aLAX2xLMbYxH0DSTPfrtp4wyuW9EKPcjM5xttyvJsmYdXwvST9D7uSiv/5m+YSWfCt+MIh89BhDPE/MeqheJ1agxFo3+tg4E6T1Ckw2tjXyirvypUaVABAQl0sxxzWeRoJDf0SyaA7igZ/HIZllp/NfJBEBdPTujlWTDqxcZqOzxaM0vC6mUiBiFkPhztjDLuGemz1W2CTlbrMPnEDvGW7NKj+FF09lVDFPqln++dfX9OFK/Q+w37dviII5onMoso/D074eqOYNsUGMkP7qCItQl+5EYAHYjFn+oGTaI4H8Szu/yFBiAujVfdzVXtBJ2gkYKCrf8Nf07rnT6suPUNIq56rsph3FPhkWxgzhplgAhaPuJWnYxKtENoQwoc9wBDcF5ad37W68sCzmEH+bIOfe/fKID1tw+Mp6N5crMZP10iS+zIVfV9v7ozETKkU30MmJ9MjoEbUGLO0FFYvUQLvyEcJgWbFi6RZ7bPG3RkZ6LLviub7rP559y9K49F7/2TrT0IrD7J9Oz4TQ4t6H0tuQ8ZSdBdy4GlTyVmF+BABsXiZWW9VA6Sfx4wJ2U9Nr6C+7sTjEFuGv424Wxx5ASYhLie4ESQ3Tsx920CT2NoBgnYOooAF6xNVNrds9q89agjtVQikhI3RP5j36F4nWV3QiQS7j6rEwpVLID0ZXXzg43JXhbjrhFowCvP65l6jmo1bXp2AOZ6Zem5xIaIauHawk1x3+yu3qWhCha6aOeQKu7Pl+1PiEptO9C7cKFXISqdEuk3ACvlsmMHhSjpUb3xkW6c00Ulnh2HmVUsygT.

For #9915, the equivalent is OHE88w1UNY4AsuDA9xYbHXvinfi7oLrHOQqoGDdgWBRS9sU2aEaZiJrrtX1kULNXTQtOltRV7PASVXq5A+Yk58PNg+hGm0NIM1BoxqvFBnLtFkDiZYSudm0S0eVD+2mR8n/AVzrFtKLf5DYTEB/V9RfvAxhfw6tQ7gB7b++jpk6wujmbhKLFWT3UcIh7HUUl2r2+AnWP79hRF7QGez40rIPMink9I5/dLXED2sPCJmqkDeV+U8Ji31JeibU5BmewfEqJJTChi3e7JAzFBkpPgQU95WIyuT8qYk5n7ykrkyB5uB0Wawkql86RZEx4L1LoZhKBOcHfK/s5+twRhl/xQ4w+pAjifYzgY36S+d33syJoJFc0nb344WSWJH2N5UYOrYd4S04/kv7ZxqB7MhzdnzyXlTK6m4JquLMJ9N21zZh0hVbCmqrNspxc/MFCiL+jyCttj8IRyGB15WmnHllMSLeN8XCQU/uUTd9xMtTtiSmJ+OEc7g2G0aZvfO+a2N3Bo3Qea6Ad//WWNHeGStWK59xsocz2viSus60SMlxIT3qOva/caTawO6f2b7gyarA5vMMC5A6cB0+uvFXoRxYbxCInz6qYBewJ86pP3KN8Axsdotv8uCV30l3JoJ/PoXTnN5sL5b7P4dD/fM43+ijSCfM81XgRWdBoV5gyg9lhCyEv+jZWDIUw8d/+Z6SEHYnqG4byijXDMZgkU+6KWDU3xeZ6FqU0sCAeugo6XE5WaC5qqgEmSJ9yK+K/lxerBkb2akm3m0aYP1b5ZfF0eXJcyoOycemShkbYldpYxykNppM=.
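
As a quick check of the two values quoted above, the sketch below base64-decodes only their first eight characters and reports whether the resulting bytes are UTF-8. `probe` is a hypothetical helper name. Neither prefix decodes to text beginning with `http`, which suggests (as an inference, not something confirmed in this thread) that `main_url` is no longer a plainly base64-encoded URL.

```python
import base64


def probe(prefix: str) -> str:
    """Decode a base64 prefix and describe whether the bytes are valid UTF-8."""
    raw = base64.b64decode(prefix)
    try:
        raw.decode("utf-8")
        verdict = "valid UTF-8"
    except UnicodeDecodeError as exc:
        verdict = f"not UTF-8 ({exc.reason} at position {exc.start})"
    # Show the leading bytes in hex so they can be compared with the error message.
    return f"{raw[:6].hex(' ')}: {verdict}"


# First eight characters of the two main_url values quoted above.
print(probe("tQRwVYaS"))
print(probe("OHE88w1U"))
```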

@Holmes-pengge

root@gh-cs-01:/home/lipeng/ixigua# yt-dlp -vU --referer 'https://www.ixigua.com/6996881461559165471?wid_try=1' --cookies ixigua_cookies.txt --download-archive archive.txt -f "ba/b" --ignore-errors 'https://www.ixigua.com/6996881461559165471?wid_try=1'
[debug] Command-line config: ['-vU', '--referer', 'https://www.ixigua.com/6996881461559165471?wid_try=1', '--cookies', 'ixigua_cookies.txt', '--download-archive', 'archive.txt', '-f', 'ba/b', '--ignore-errors', 'https://www.ixigua.com/6996881461559165471?wid_try=1']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version [email protected] from yt-dlp/yt-dlp [ff07792] (zip)
[debug] Python 3.10.12 (CPython x86_64 64bit) - Linux-6.5.0-27-generic-x86_64-with-glibc2.35 (OpenSSL 3.0.2 15 Mar 2022, glibc 2.35)
[debug] exe versions: ffmpeg 4.4.2 (setts), ffprobe 4.4.2
[debug] Optional libraries: certifi-2020.06.20, requests-2.31.0, secretstorage-3.3.1, sqlite3-3.37.2, urllib3-2.2.1, websockets-11.0.3
[debug] Proxy map: {}
[debug] Request Handlers: urllib, requests
[debug] Loaded 1810 extractors
[debug] Loading archive file 'archive.txt'
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: [email protected] from yt-dlp/yt-dlp
yt-dlp is up to date ([email protected] from yt-dlp/yt-dlp)
[Ixigua] Extracting URL: https://www.ixigua.com/6996881461559165471?wid_try=1
[Ixigua] 6996881461559165471: Downloading webpage
ERROR: 'utf-8' codec can't decode byte 0x94 in position 0: invalid start byte
Traceback (most recent call last):
  File "/usr/local/bin/yt-dlp/yt_dlp/YoutubeDL.py", line 1606, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/bin/yt-dlp/yt_dlp/YoutubeDL.py", line 1741, in __extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/yt-dlp/yt_dlp/extractor/common.py", line 734, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/bin/yt-dlp/yt_dlp/extractor/ixigua.py", line 69, in _real_extract
    formats = list(self._media_selector(json_data.get('videoResource')))
  File "/usr/local/bin/yt-dlp/yt_dlp/extractor/ixigua.py", line 53, in _media_selector
    'url': base64.b64decode(media['main_url']).decode(),
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x94 in position 0: invalid start byte

@Holmes-pengge

As described above, I added: (1) sending the cookie ttwid=1;max-age=86400, (2) adding the query param wid_try=1, and (3) sending a Referer with the same URL.
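
For reference, the three steps listed in this comment can be sketched with the standard library as below. `build_request` is a hypothetical name, and the request is only constructed, not sent. One assumption: only the `ttwid=1` name/value pair goes into the `Cookie` header, since `max-age` is a cookie attribute rather than part of what a client sends back. The log in this comment shows the same UnicodeDecodeError with these steps applied, so this is an illustration of the steps, not a fix.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import Request


def build_request(video_url: str) -> Request:
    """Build (but do not send) a request carrying the three tweaks from the comment."""
    parts = urlsplit(video_url)
    query = dict(parse_qsl(parts.query))
    query["wid_try"] = "1"  # (2) add the wid_try=1 query parameter
    url = urlunsplit(parts._replace(query=urlencode(query)))
    headers = {
        "Cookie": "ttwid=1",  # (1) send the ttwid cookie (attribute max-age omitted, see note)
        "Referer": url,       # (3) send a Referer equal to the request URL
    }
    return Request(url, headers=headers)


req = build_request("https://www.ixigua.com/6996881461559165471")
print(req.full_url)
print(req.get_header("Referer"))
print(req.get_header("Cookie"))
```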


@Holmes-pengge

Holmes-pengge commented May 14, 2024

Has anyone already solved this problem? If so, could you share some pointers?
