
Wrong encoding type in response when using "serve s3" - should not be url-encoded #7836

Open
nanawel opened this issue May 11, 2024 · 4 comments

Comments

@nanawel

nanawel commented May 11, 2024

I'm currently testing the rclone serve s3 feature under v1.66. It works with some clients but not with others. After a few hours of digging, I noticed that, unlike other S3 services (MinIO in my comparison), rclone returns the value inside <Key> (the filename) as a URL-encoded string, even though the response contains no <EncodingType> node indicating that this encoding is in use.

While the specs are a bit unclear, in that case the client should expect a plain UTF-8 string. This leads to problems when the client reuses the key and URL-encodes it again before sending, for example, a HeadObject request: the request returns a 404.

The doc says: "Encoding type used by Amazon S3 to encode the object keys in the response. Responses are encoded only in UTF-8." So unless you pass the content-encoding=url parameter in the query, returned filenames should not be encoded that way.

Expected: <Key>Æther Realm/[2011] Odin Will Provide/cover.jpg</Key>
Actual: <Key>%C3%86ther+Realm/%5B2011%5D+Odin+Will+Provide/cover.jpg</Key>
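
The mismatch can be reproduced outside rclone. The following is a minimal Python sketch, not code from rclone or aws-sdk-php. It assumes the client treats the listed key as a literal UTF-8 string and percent-encodes it again when building the HeadObject path.

```python
from urllib.parse import quote, quote_plus, unquote

# Key as it exists in the bucket (the "Expected" value above).
key = "Æther Realm/[2011] Odin Will Provide/cover.jpg"

# Form-style URL encoding that keeps '/' reproduces the "Actual" value.
listed = quote_plus(key, safe="/")

# Assumption: with no <EncodingType> in the response, the client takes
# `listed` as the literal key and percent-encodes it for the request path.
request_path = quote(listed, safe="/")

# The server percent-decodes the path once and looks up the result.
looked_up = unquote(request_path)

print(listed)
print(looked_up == key)  # False: the lookup misses the real object
```

After one round of decoding, the server is left with the still-encoded string rather than the real key, which is consistent with the 404 described above.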

I've tried with two different rclone backends: WebDAV and the local filesystem. The result is the same.

The test was performed with Azuracast, which uses the official aws-sdk-php library for the heavy lifting, so we can assume the problem is not caused by an invalid custom adapter.

@ncw
Member

ncw commented May 11, 2024

I'm pretty sure this is a bug in our fork of rclone/gofakes3

I think that the code unconditionally uses URL encoding. It writes this in the output though.

  <EncodingType>url</EncodingType>

Whereas the code should look at the content-encoding to turn it on or off.

Can you create a matching issue here https://github.com/rclone/gofakes3/issues (link to this issue) and then I'll see if we can find someone to fix it - or maybe you'd like to have a go?
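
A rough illustration of the behaviour described above, written in Python rather than Go and not taken from gofakes3: keys are encoded, and <EncodingType> is emitted, only when the request asks for it. The function name render_listing is hypothetical, the XML is simplified, and the query parameter is assumed to be encoding-type, which is the name the S3 API reference uses for what this thread calls content-encoding.

```python
from urllib.parse import parse_qs, quote_plus
from xml.sax.saxutils import escape


def render_listing(query_string: str, keys: list[str]) -> str:
    """Build a simplified <ListBucketResult> body (hypothetical helper)."""
    params = parse_qs(query_string)
    # Encode only when the client explicitly requested URL encoding.
    url_encode = params.get("encoding-type", [""])[0] == "url"

    parts = ["<ListBucketResult>"]
    if url_encode:
        # Announce the encoding so the client knows it has to decode keys.
        parts.append("<EncodingType>url</EncodingType>")
    for key in keys:
        # Otherwise emit the raw key, escaped only for XML.
        shown = quote_plus(key, safe="/") if url_encode else escape(key)
        parts.append(f"<Contents><Key>{shown}</Key></Contents>")
    parts.append("</ListBucketResult>")
    return "".join(parts)


print(render_listing("list-type=2", ["Æther Realm/cover.jpg"]))
print(render_listing("list-type=2&encoding-type=url", ["Æther Realm/cover.jpg"]))
```

The point of the sketch is that the encoded keys and the <EncodingType> node always appear together or not at all.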

@nanawel
Author

nanawel commented May 11, 2024

> I think that the code unconditionally uses URL encoding. [...]
> Whereas the code should look at the content-encoding to turn it on or off.

Well, this is how I understand the code too, but it's not what's happening. The <ListBucketResult> never has the <EncodingType> node. You can easily check that using a basic GET with curl host:8080/SomeBucket | grep EncodingType

> Can you create a matching issue here

Sure. I didn't see there was a specific subproject.
There it is: rclone/gofakes3#4

> or maybe you'd like to have a go?

Unfortunately I don't speak Go. ☹️

@ncw
Member

ncw commented May 11, 2024

Thanks for making the other issue. We need both, I think.

I captured this from an actual http transaction from rclone serve s3
  <EncodingType>url</EncodingType>

So rclone does send it sometimes.

Can you capture the http request that is being generated, maybe using wireshark?

I wonder if it isn't working for v1 listings as I did a V2 listing.

@nanawel
Author

nanawel commented May 12, 2024

I'm giving up on the Azuracast/rclone/S3 solution for now, as it unfortunately has too many flaws and I'm not sure it will be stable enough in the long term.

> I wonder if it isn't working for v1 listings as I did a V2 listing.

You're right, sorry: my curl command was making a V1 request. When using ?list-type=2 to switch to V2, the <EncodingType> node seems to always be present in the response.

However, it seems that the proper behaviour is to encode the keys only if the client asked for it with ?content-encoding=url. This is the part I'm not sure I fully understand. I also opened a PR for what I think is also a bug in the SDK: aws/aws-sdk-php#2918
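
On the client side, the same rule could be applied when reading a listing: decode keys only when the response declares <EncodingType>url</EncodingType>. This is a Python sketch for illustration, not the aws-sdk-php code. The name listed_keys is hypothetical, and the sample XML omits the namespace that real S3 responses carry.

```python
import xml.etree.ElementTree as ET
from urllib.parse import unquote_plus


def listed_keys(xml_body: str) -> list[str]:
    """Return object keys from a simplified <ListBucketResult> body."""
    root = ET.fromstring(xml_body)
    # Decode only if the server says the keys are URL-encoded.
    encoded = root.findtext("EncodingType") == "url"
    keys = [el.text or "" for el in root.iter("Key")]
    # '+' stands for a space in the encoded output shown in this thread.
    return [unquote_plus(k) for k in keys] if encoded else keys


with_type = (
    "<ListBucketResult><EncodingType>url</EncodingType>"
    "<Contents><Key>%C3%86ther+Realm/cover.jpg</Key></Contents>"
    "</ListBucketResult>"
)
without_type = (
    "<ListBucketResult>"
    "<Contents><Key>100%25+done.txt</Key></Contents>"
    "</ListBucketResult>"
)
print(listed_keys(with_type))
print(listed_keys(without_type))
```

With this rule, a key that legitimately contains a percent sign or a plus is left untouched when no encoding type is declared.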

@edc-w edc-w added this to the Known Problem milestone May 13, 2024

3 participants