Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-ASCII characters in CSP policy. #473

Closed
antosart opened this issue Mar 8, 2021 · 9 comments
Closed

Non-ASCII characters in CSP policy. #473

antosart opened this issue Mar 8, 2021 · 9 comments
Labels
i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response.

Comments

@antosart
Copy link
Member

antosart commented Mar 8, 2021

When a CSP policy contains non-ASCII characters, it looks like:

  • according to the spec, the whole policy should be discarded.
  • Firefox tries to parse the CSP and discards only the source expression (assuming the non-ASCII character is in a source expression).
  • Chrome and Safari try to parse the CSP and discard the whole directive containing the non-ASCII character (but not other directives).

See also https://bugs.chromium.org/p/chromium/issues/detail?id=1172497

I think we should agree on a behaviour here.

@annevk
Copy link
Member

annevk commented Mar 8, 2021

It would be nice if parsing in general was a lot tighter (see also #397). (And HTML and HTTP had their own entry points of sorts.)

Aligning on Chrome/Safari seems reasonable as a first step, but it would be nice to be more accommodating to non-ASCII, especially for domain names. Supporting non-ASCII in HTML and decoding the HTTP header values using UTF-8 wouldn't be a very complicated upgrade I think, though we'd need solid test coverage of course.

@dveditz
Copy link
Member

dveditz commented Mar 8, 2021

punycode is perfectly valid ASCII and matches the DNS records. UTF-8 is a reasonable coding, but we'd have to specify (and test) that implementations have to apply the full set of IDNA rules before doing any domain name matching.

@bakkot
Copy link

bakkot commented Mar 8, 2021

See also #6, #414, #434.

@chrishtr
Copy link

Also: https://bugzilla.mozilla.org/show_bug.cgi?id=1570722

@annevk
Copy link
Member

annevk commented Mar 23, 2021

Thanks for highlighting that, that seems to indicate ignoring altogether when faced with bytes > 0x7F might be better. As we clarify the parsing requirements here, one thing that would be good is to ensure that

Content-Security-Policy: A, B

and

Content-Security-Policy: A
Content-Security-Policy: B

parse identically, for any value of A and B. We are slowly fixing this for all HTTP headers so they all parse more robustly in the face of header combining intermediaries.

You can do this by building upon https://fetch.spec.whatwg.org/#concept-header-list-get or https://fetch.spec.whatwg.org/#concept-header-list-get-decode-split in Fetch.

@chrishtr
Copy link

Hi @annevk You suggested that aligning on Chrome/Safari behavior seems reasonable as a first step. How about we do that as a next step, then come back to the other suggestions as a followup?

I'd like to help get issue 1570722 fixed; @karlcow mentioned it was a significant source of compatibility problems between the browsers.

antosart added a commit that referenced this issue Mar 24, 2021
The algorithm for parsing Content Security Policies was accepting a serialized CSP (list), i.e. a string matching the CSP grammar, but it was being called with a string or a byte sequence as argument. The spec was not saying anything about what to do when parsing policies containing non-ASCII characters.

This change clarifies that along the lines of the discussion on #473.
github-actions bot added a commit that referenced this issue Mar 24, 2021
…e algorithm for parsing Content Security Policies was accepting a serialized CSP (list), i.e. a string matching the CSP grammar, but it was being called with a string or a byte sequence as argument. The spec was not saying anything about what to do when parsing policies containing non-ASCII characters.This change clarifies that along the lines of the discussion on #473.

SHA: 49ae457
Reason: push, by @antosart

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@seongil-wi
Copy link

Hello,

I'm curious why the spec aligns with chrome's behavior rather than firefox's behavior when Non-ASCII characters are in CSP (though it would be nicer to accept Non-ASCII characters in the first place).

It is quite reasonable for developers to include Non-ASCII characters in their CSPs. Elements that the developer wants to block securely by entering values ​​for the directive can be completely ignored by a single Non-ASCII character.
Reference:
[1] https://bugs.chromium.org/p/chromium/issues/detail?id=1172497
[2] https://bugs.chromium.org/p/chromium/issues/detail?id=1291844

I'm not sure if chrome/safari's behavior is reasonable "in terms of the security provided by CSP". I followed up on all the other issues, but I'm not sure about this. If you have time, please elaborate on this.

Thanks!

@antosart
Copy link
Member Author

My reasoning here was that by dropping the whole directive, the page would (likely) still work. Suppose you specify default-src https://example<some-non-ASCII-char>.org. If we drop the whole directive, default-src will not block anything and your page will still work. If we drop only the source expression, then you would be left with default-src, which blocks everything.

As for accepting non-ASCII characters in the first place, I would like first to have a common infrastructure to parse HTTP headers as utf-8, and not move forward along with CSP on this.

@xfq xfq added the i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. label Oct 21, 2022
@antosart
Copy link
Member Author

I would close this, since #486 clarified the behaviour in the spec.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response.
Projects
None yet
Development

No branches or pull requests

7 participants