Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial sections on security and privacy considerations #99

Merged
merged 4 commits into from
Oct 14, 2022

Conversation

fsteeg
Copy link
Member

@fsteeg fsteeg commented Sep 6, 2022

@fsteeg fsteeg requested a review from wetneb September 6, 2022 14:42
<h3>Other considerations</h3>
<ul>
<li>Reconciliation services and clients MAY support user authentication (see <a href="#authentication">authentication</a>).</li>
<li>Reconciliation features don't expose information about the underlying platform to origins, they don't allow origins to send data to the underlying platform, access other devices, device sensors, a user agent's native UI,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By "origin" do you mean "client"? That would read more natural to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's confusing. In this case, I think it's both client and server for the first two instances and server for the third instance (see here & here for some background). I'll turn this into a draft for now and revisit it later. Maybe we could add an explanation of origin, or I'll reword to use client and server only. Thanks for the review!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the confusing 'origin' terminology, added another security consideration that came to mind while rewording (which already shows the benefit of clearer language here!), and linked to some existing definitions for consistency. Would be great if you could take another look!

@fsteeg fsteeg marked this pull request as draft September 13, 2022 07:45
@fsteeg fsteeg self-assigned this Sep 13, 2022
@fsteeg fsteeg marked this pull request as ready for review October 11, 2022 14:47
@fsteeg fsteeg assigned wetneb and unassigned fsteeg Oct 11, 2022
@fsteeg fsteeg requested a review from wetneb October 11, 2022 14:47
<li>Reconciliation services SHOULD guard against malicious code when querying their underlying data store with data from <a>reconciliation queries</a> to avoid code [[injection]].</li>
<li>Reconciliation services SHOULD NOT parse the JSON requests sent by clients using a code execution mechanism like JavaScript's <code>eval()</code>, since requests could contain malicious code.</li>
<li>Reconciliation clients that display data from <a href="#preview-service">preview services</a> SHOULD ensure that the data is encoded, sanitized, or isolated to prevent [[XSS]] and [[CSRF]] attacks.</li>
<li>Reconciliation clients SHOULD provide a configuration option to disallow calls to localhost, with the default set to disallow, as a malicious service could cause the client to make arbitrary POST requests to itself.</li>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand this attack fully and I feel like I should, because it might apply to OpenRefine. Is it worth elaborating in the specs or do you want to explain that separately?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I also stopped there yesterday and had to think about it for a bit. So maybe it'd be good to elaborate in the specs, if we agree it makes sense and applies to us. I've got this from the ActivityPub and Webmention specs, which both mention it. Maybe we could adapt and combine the explanations from these specs like this to make it clearer:

"If a reconciliation client has any services that listen on localhost and don't require authentication, it's possible for a malicious reconciliation service to provide URLs in its manifest that could cause the client to make arbitrary POST requests to itself. Therefore, if your reconciliation client does permit making requests to localhost URLs for development purposes, it SHOULD be a configuration option which defaults to off."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, which of the URLs provided in the manifest are used for POST requests? The suggest services use GET, so does the preview service, property proposal and so on… The only POST requests are made to the reconciliation endpoint itself (but that is not a URL that the manifest is able to define).

I am not saying this is by design (it is probably rather by luck), but as far as I can tell we are safe here. And it should only get better with #91 since manifests can no longer point to arbitrary URLs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right, both the actual query and the data extension POSTs are made to the service endpoint, and the URLs in the manifest are only used for GET requests. Removed the localhost consideration in a062854.

<ul>
<li>Reconciliation clients MAY introduce new state that persists across browsing sessions for persisting <a>reconciliation queries</a>,
the service's <a>reconciliation results</a>, and/or the user's selection of matches.</li>
<li>Reconciliation clients running in a browser’s <em>Private Browsing</em> or <em>Incognito</em> mode SHOULD NOT store <a>reconciliation queries</a>, the service's <a>reconciliation results</a>, or the user's selection of matches.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I never thought about that.
Is Private browsing/Incognito mode actually meant to be detected and acted upon by websites? My understanding was that it was just implemented client side, with the client discarding cookies, localStorage and other things like that.

This is different from the Do Not Track header, which is specifically there to be announced by the client and acted upon by the server, but this feature seems to be deprecated.

Given that detecting Private browsing/Incognito mode seems to be quite fiddly, I am not sure we should assume that clients do that (for instance, to prevent their own backend from saving information).

Could it be that you got this idea from W3C recommendations which are primarily designed for new DOM/Javascript APIs? For those, it definitely makes sense to consider which state the browser is supposed to keep. But because I do not think we expect people to implement a reconciliation client natively in a browser (meaning, in a web rendering engine), it feels less fitting here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

detecting Private browsing/Incognito mode seems to be quite fiddly [...] Could it be that you got this idea from W3C recommendations which are primarily designed for new DOM/Javascript APIs?

Ah, interesting. Yes I got this from the W3C questionnaire, and I would have expected private mode to be a normal thing to check via browser API, but that's actually considered leaking of private mode (though it still seems to happen). And I guess it makes sense to consider private mode a purely first-party thing, which is a matter of the browser only, and no business of any site visited.

Somewhere in between I guess are browser plugins, where checking for private mode seems to be possible. And I guess there it can still provide privacy too: if only the browser part of the plugin handles that information and never sends it to the service. So maybe we could clarify that section by adding something like "For reconciliation clients running as web applications, this is typically handled by the browser, and data will be stored on the web application's server side only."

It's a bit tricky with OpenRefine being a reconciliation client, running as a client-server application, with the client part running in the browser, and the server part often also running on the client machine. But I think with the added wording above we cover considering the issue in general (which for browser plugin clients might be important), and also that in clients like OpenRefine, the client's server part does actually store the data, even if the client part is running in private mode.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reconciliation clients running as web applications, this is typically handled by the browser, and data will be stored on the web application's server side only.

I am happy with this formulation, but what about reconciliation clients not running as web applications? For desktop software or mobile apps, there is no notion of private browsing, right?

So it seems like:

  • for clients which are web applications (and can therefore be run in private browsing), they do not need to worry about this since the private browsing is handled by the browser
  • for clients which are not web applications, the notion of private browsing does not even apply in the first place

-> so perhaps we don't need that paragraph at all? Or did I miss something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed in today's meeting, the remaining thing are browser extensions, which seem to be disabled in private mode by default, but can be enabled both in Firefox and Chrome. I've made the private mode consideration specific to extensions in 9b27f21.

URLs defined in the manifest are only used for GET requests
For regular web clients, the browser will handle this
@fsteeg fsteeg merged commit 90b5148 into master Oct 14, 2022
@fsteeg fsteeg deleted the security-privacy branch October 14, 2022 09:02
fsteeg added a commit that referenced this pull request Jul 13, 2023
URLs defined in the manifest are only used for GET requests
fsteeg added a commit that referenced this pull request Jul 13, 2023
For regular web clients, the browser will handle this
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants