This private tracker torrent file has a file path that includes a Unicode character which is being parsed incorrectly:
\x00BD, chr(189): Vulgar Fraction One Half (½)
I noticed it because after loading the file with the Torrent class, the calculated info_hash was different from the original torrent.
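To illustrate why a single changed byte in a file path changes the info_hash, here is a minimal sketch. It assumes the usual definition of the info hash (SHA-1 over the bencoded info dictionary) and uses a small hand-written encoder rather than this library's own classes. The file name `1\xbd.txt` and the other dictionary values are made up for the example.

```python
import hashlib


def bencode(value):
    """Minimal bencode encoder for ints, bytes, str, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings, sorted by their raw bytes.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(info):
    """SHA-1 hex digest of the bencoded info dictionary."""
    return hashlib.sha1(bencode(info)).hexdigest()


def make_info(path_component):
    # Hypothetical single-file info dictionary, for illustration only.
    return {
        "name": b"demo",
        "piece length": 16384,
        "pieces": b"\x00" * 20,
        "files": [{"length": 1, "path": [path_component]}],
    }


raw_path = b"1\xbd.txt"  # 0xBD is "½" in Latin-1, but invalid as UTF-8
lossy_path = raw_path.decode("utf-8", errors="replace").encode("utf-8")

print(info_hash(make_info(raw_path)))
print(info_hash(make_info(lossy_path)))  # differs from the line above
```

Keeping path components as raw bytes from read to write avoids the mismatch, because the bytes that get hashed are never altered.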
Screenshots, taken in a hex editor, of the original torrent file and of a new one created with Torrent.to_file from the same data:
Original: [Screenshot 2023-09-26 142545]
Created with the Torrent class: [Screenshot 2023-09-26 142612]
When I use the Bencode class to read and write the torrent, the character is parsed correctly and the hashes match.
Here's a version of the original torrent without the tracker URL:
431f76f60e05250df162c90a73ab8377dc4ca9c8.zip
Screenshot of the terminal output when reading the file with the Torrent class (the file name is the correct SHA-1 hash): [Screenshot 2023-09-26 151205]
EF BF BD means that the filename contains bytes that are not valid UTF-8, and we tried to parse it as UTF-8. What encoding does your filesystem use for filenames?
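For reference, EF BF BD is the UTF-8 encoding of U+FFFD, the Unicode replacement character. The snippet below is a small sketch of how it can end up in a rewritten file. It assumes the decoder substitutes invalid bytes (Python's `errors="replace"`), which may not be exactly what the library does internally, and the file name is made up.

```python
raw = b"1\xbd.txt"  # hypothetical path component with a lone Latin-1 byte 0xBD

# 0xBD is a UTF-8 continuation byte with no lead byte, so it is invalid here
# and gets replaced with U+FFFD during decoding.
decoded = raw.decode("utf-8", errors="replace")
reencoded = decoded.encode("utf-8")

print(repr(decoded))       # '1\ufffd.txt'
print(reencoded.hex(" "))  # 31 ef bf bd 2e 74 78 74
```

The original byte cannot be recovered from the replacement character, so writing the decoded string back produces different bytes than were read.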
I'm on Windows 11, which uses Unicode to encode file paths, if I understood correctly.
I think this specific torrent used latin-1 encoding for the file paths, so I guess this is very much a corner case.
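One possible way to handle such paths, sketched below, is to try strict UTF-8 first and fall back to latin-1 while remembering which encoding was used, so the original bytes can be restored when writing. The helper `decode_path_component` is hypothetical and not part of the library. The fallback is only a guess, since latin-1 accepts any byte sequence.

```python
from typing import Tuple


def decode_path_component(raw: bytes) -> Tuple[str, str]:
    """Return (text, encoding). Hypothetical helper, not the library's API."""
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # latin-1 maps every byte 0x00-0xFF to one code point, so this never
        # fails, but it cannot tell whether latin-1 was really intended.
        return raw.decode("latin-1"), "latin-1"


raw = b"1\xbd.txt"  # made-up file name containing chr(189)
text, encoding = decode_path_component(raw)

print(text, encoding)                # 1½.txt latin-1
print(text.encode(encoding) == raw)  # True: the round trip keeps the bytes
```

Because the round trip reproduces the original bytes, the info hash stays the same even if the guessed encoding is wrong for display purposes.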
Hm, latin-1... This comment seems to be relevant: #2 (comment)