Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes for Non-ASCII filenames #117

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

Lex-2008
Copy link

No description provided.

Apply decodeURI to filenames (received from server) before comparing
them to name of file we're going to upload.

Issue was that server send percent-encodes non-URL-friendly symbols,
(space " " becomes "%20"), while local files might have them.

As a result, you could upload file called "a file.txt" multiple times:
UI shows it as several files with same filename, but actually DAV
overwrites them without prompt.
Otherwise, if new filename contained non-ASCII characters, like german
"ẞß", for example, error happened in HTTP.ts:39 (in `method` function,
on a line which says `const request = new Request(url, ...`):

TypeError: Request constructor: Cannot convert value in
record<ByteString, ByteString> branch of
(sequence<sequence<ByteString>> or record<ByteString, ByteString>) to
ByteString because the character at index 53 has value 1099 which is
greater than 255.
Otherwise, a file called "a file" is downloaded as "a%20file".
@Lex-2008
Copy link
Author

Lex-2008 commented Jan 8, 2023

on the other side, I wonder if, instead of finding all places where decodeURI must be used, - maybe it's better to simply decodeURI filenames in server response to PROPFIND request?

@dom111 dom111 self-assigned this Jan 8, 2023
@dom111 dom111 self-requested a review January 8, 2023 16:56
@dom111 dom111 added the bug label Jan 8, 2023
@dom111
Copy link
Owner

dom111 commented Jan 8, 2023

Hey there, thanks again for flagging this! It'd be great to have a test-case for this (that we can test against different server implementations if required perhaps?) Do you have a specific filename/server configuration that this fails with? If we could confirm this, write a test and then fix, that'd be ideal in my opinion!

@@ -18,8 +18,11 @@ export const handleFileUpload = async (

state.setCollection(collection);

// NOTE: we can't use entry.name === encodeURI(file.name) here,
// because server and encodeURI() might use different case for
// %-encoded values, like "%D0" vs "%d0"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

S: To avoid this problem we could toUpperCase()/toLowerCase() too. If you think it's valuable, but I have no problem with this approach either.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this won't work because toUpperCase()/toLowerCase() will also change case of other letters in filename. For example, if server says that it has a file called A%2dFile.txt, and we try to upload a file called A%2DFile.txt (different case of letter "D"), then toUpperCase() and toLowerCase() will turn it to A%2DFILE.TXT and a%2dfile.txt, respectively, which won't match filename on the server.

We could convert both filenames to upper case, but then we would assume that backend server case-insensitive, and uploading a file called FILE.TXT overwrites a file called file.txt. Which is actually a case only on Windows.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely. Another alternative could be decoding and re-encoding so that the case (should?) be consistent as it's using the browser to perform the encode both sides. For .toUpperCase() we could .replace(/%[0-9a-f]{2}/g, (s) => s.toUpperCase()) although I wonder if we'll encounter problems with values like %252d/%252D etc.

@dom111
Copy link
Owner

dom111 commented Jan 8, 2023

on the other side, I wonder if, instead of finding all places where decodeURI must be used, - maybe it's better to simply decodeURI filenames in server response to PROPFIND request?

Yeah, this makes sense. I don't know if we'd ever need the "wrong" filename. Happy for this approach too!

@Lex-2008
Copy link
Author

Hi again and sorry for the delayed response!

Thanks for suggestion regarding testcase! I now found them in tests/functional subdir, wonder if I can run them manually step-by-step.

I tried to explain filenames in commit messages, but not clear enough :) Let me try again:


First commit, Properly compare filenames before uploading:

  1. upload a file called a file.txt.
  2. Try to upload a file with the same name again.

Expected: you should be asked if you want to overwrite the file.

Currently, file is uploaded (and overwritten) without warning, and UI shows several files with same filename.


Second commit, URI-encode destination filename when renaming:

  1. Upload a file or create directory with any name, say, test.
  2. Rename it to teẞt.

Expected: it should be renamed both in UI and on server.

Currently, you get an error in console, like this:

TypeError: Request constructor: Cannot convert value in record<ByteString, ByteString> branch of (sequence<sequence> or record<ByteString, ByteString>) to ByteString because the character at index 53 has value 1099 which is greater than 255.


Third commit, Decode percent-encoded symbols in download file name

  1. Upload a file called a file.txt.
  2. Reload the page.
  3. Download it.

Expected: downloaded file is called a file.txt.

Currently, it's called a%20file.txt.


But now I also found another issue, to which I don't have a fix: Percent-encoded strings in filenames are decoded during upload:

  1. Upload a file called a%20file.txt.

Expected: uploaded file is called a%20file.txt both in UI and on the server.

Currently, it's called a file.txt both in UI and on the server.

I suspect that it's because filename is passed as-is to the server, and when server sees percent-encoded strings like %20 in URL, it decodes it to . I guess we need to apply encodeURI or encodeURIComponent to filename before sending it to server.

@Lex-2008
Copy link
Author

Note to self: also, "Download" links should %-encode square brackets [ in filenames. Otherwise, clicking the "Download" button works, but right_click-copy_link doesn't: it copies square brackets as-is, which later failed to be properly parsed if pasted into plaintext emails.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants