URLs often contain potentially sensitive information, in the sense that it could identify the user. Some of these cases can be resolved using canonical meta tags, but that mostly applies when the information is present in the query string (example: example.com/editprofile?id=1234 => example.com/editprofile).
However, there are cases where information that is part of the path can be considered sensitive, but there is no canonical URL to reduce these to, as the URLs do point to different content. For example, example.com/profiles/[id] might reveal information about a user's activity if the URL is only accessible after logging in and the ids are specific to the user.
Approach: Anonymizing data in the Auditorium
Operators define rules for anonymizing data in the Auditorium, most likely using a pattern / regexp based approach like:
/profiles/{user_id:[a-z0-9]*} is defined as an anonymization rule
Pageviews on /profiles/alice and /profiles/bob will be collapsed and aggregated into /profiles/user_id
Rules are stored on the server. On load, the Auditorium fetches the ruleset and applies it to the query results before displaying them.
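As a rough illustration of the display-time approach, the sketch below compiles a rule such as /profiles/{user_id:[a-z0-9]*} into a regular expression and collapses matching paths before counting pageviews. All names (`compileRule`, `anonymizePath`, `aggregatePageviews`) are hypothetical and not part of Offen, and the rule syntax is only assumed from the example above. Sub-patterns containing curly braces (e.g. `{2,3}` quantifiers) are not handled.

```typescript
// Hypothetical compiled form of a rule like "/profiles/{user_id:[a-z0-9]*}".
interface CompiledRule {
  matcher: RegExp;
  replacement: string;
}

// Escape characters that carry special meaning in a regular expression.
function escapeRegExp(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Turn a rule pattern into a matcher plus the collapsed path it maps to.
function compileRule(pattern: string): CompiledRule {
  const placeholder = /\{([A-Za-z_][A-Za-z0-9_]*):([^{}]+)\}/g;
  let source = "";
  let replacement = "";
  let cursor = 0;
  let match: RegExpExecArray | null;
  while ((match = placeholder.exec(pattern)) !== null) {
    const literal = pattern.slice(cursor, match.index);
    // Literal parts are escaped, the sub-pattern is used as written.
    source += escapeRegExp(literal) + "(?:" + match[2] + ")";
    // In the collapsed path the placeholder name stands in for the value.
    replacement += literal + match[1];
    cursor = match.index + match[0].length;
  }
  const tail = pattern.slice(cursor);
  source += escapeRegExp(tail);
  replacement += tail;
  return { matcher: new RegExp("^" + source + "$"), replacement };
}

// Return the collapsed path of the first matching rule, or the path itself.
function anonymizePath(path: string, rules: CompiledRule[]): string {
  for (const rule of rules) {
    if (rule.matcher.test(path)) {
      return rule.replacement;
    }
  }
  return path;
}

// Count pageviews per path after applying the rules.
function aggregatePageviews(paths: string[], rules: CompiledRule[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const path of paths) {
    const key = anonymizePath(path, rules);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

const rules = [compileRule("/profiles/{user_id:[a-z0-9]*}")];
const counts = aggregatePageviews(["/profiles/alice", "/profiles/bob", "/about"], rules);
// counts now holds "/profiles/user_id" with 2 views and "/about" with 1 view.
```

Because the stored events stay untouched, changing the ruleset and re-running the aggregation is enough to apply a new rule to historic data.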
Pros
Easy to implement technically
Rules can be changed at any time and will be applied retroactively, i.e. leaks can be fixed after they have been discovered
Cons
Pattern language for defining rules might be complicated for some operators. (How are bad patterns handled?)
Data is only anonymized at display time, i.e. operators who go to great lengths can undo the application of these rules.
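One possible answer to the question of bad patterns is to validate a rule when it is saved and reject it with a readable reason. The sketch below is only an assumption about how that could look; `validateRulePattern` and the checks it performs are hypothetical, and the rule syntax is taken from the example above.

```typescript
interface ValidationResult {
  ok: boolean;
  reason?: string;
}

// Check a rule pattern before storing it, so broken rules never reach the ruleset.
function validateRulePattern(pattern: string): ValidationResult {
  if (pattern.charAt(0) !== "/") {
    return { ok: false, reason: "pattern must start with a slash" };
  }
  const placeholder = /\{([^{}:]*):([^{}]*)\}/g;
  let found = 0;
  let match: RegExpExecArray | null;
  while ((match = placeholder.exec(pattern)) !== null) {
    found++;
    const name = match[1] ?? "";
    const subPattern = match[2] ?? "";
    if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(name)) {
      return { ok: false, reason: `invalid placeholder name "${name}"` };
    }
    try {
      // The RegExp constructor throws a SyntaxError for malformed expressions.
      new RegExp(subPattern);
    } catch (err) {
      const detail = err instanceof Error ? err.message : String(err);
      return { ok: false, reason: `invalid expression for "${name}": ${detail}` };
    }
  }
  // Any brace left after removing well-formed placeholders is a malformed one.
  if (/[{}]/.test(pattern.replace(placeholder, ""))) {
    return { ok: false, reason: "unbalanced or malformed placeholder" };
  }
  if (found === 0) {
    return { ok: false, reason: "pattern contains no placeholder" };
  }
  return { ok: true };
}

const valid = validateRulePattern("/profiles/{user_id:[a-z0-9]*}");
const brokenExpression = validateRulePattern("/profiles/{user_id:[a-z}");
const unclosed = validateRulePattern("/profiles/{user_id:[a-z0-9]*");
```

Here `valid.ok` is true, while the other two results carry `ok: false` and a `reason` that could be shown to the operator.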
Approach: Anonymizing data on collection
Operators define rules for anonymizing data when deploying the Offen script, either using a data attribute or a JavaScript global.
The script reads these rules and sanitizes the data before it is encrypted.
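To make the collection-time flow more concrete, here is a minimal sketch of how a script could parse rules from a configuration string and rewrite a URL before the event is handed to encryption. Everything here is an assumption for illustration: the JSON array format, the attribute name mentioned in the comment, and the functions `parseRules` and `sanitizeHref` do not exist in Offen.

```typescript
interface Rule {
  matcher: RegExp;
  replacement: string;
}

// Compile "/profiles/{user_id:[a-z0-9]*}" into a matcher and a collapsed path.
function compileRule(pattern: string): Rule {
  let source = "";
  let replacement = "";
  // Splitting on a capturing group keeps the placeholders in the result.
  for (const part of pattern.split(/(\{[^{}]+\})/)) {
    const placeholder = /^\{([^:{}]+):([^{}]+)\}$/.exec(part);
    if (placeholder !== null) {
      source += "(?:" + placeholder[2] + ")";
      replacement += placeholder[1];
    } else {
      source += part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
      replacement += part;
    }
  }
  return { matcher: new RegExp("^" + source + "$"), replacement };
}

// Parse the raw configuration value, assumed to be a JSON array of patterns.
// In a browser it might come from a hypothetical attribute, e.g.
// document.currentScript.getAttribute("data-anonymize-rules").
function parseRules(raw: string | null): Rule[] {
  if (raw === null || raw === "") {
    return [];
  }
  const parsed: unknown = JSON.parse(raw);
  if (!Array.isArray(parsed)) {
    throw new Error("rules must be a JSON array of patterns");
  }
  return parsed.map((pattern) => compileRule(String(pattern)));
}

// Rewrite only the path of an absolute URL, keeping origin, query and fragment.
function sanitizeHref(href: string, rules: Rule[]): string {
  const parts = /^([a-z][a-z0-9+.-]*:\/\/[^/?#]*)([^?#]*)(.*)$/i.exec(href);
  if (parts === null) {
    return href;
  }
  const origin = parts[1] ?? "";
  const path = parts[2] ?? "";
  const rest = parts[3] ?? "";
  for (const rule of rules) {
    if (rule.matcher.test(path)) {
      return origin + rule.replacement + rest;
    }
  }
  return href;
}

const collectionRules = parseRules('["/profiles/{user_id:[a-z0-9]*}"]');
const sanitized = sanitizeHref("https://example.com/profiles/alice?tab=posts", collectionRules);
// sanitized is "https://example.com/profiles/user_id?tab=posts"
```

Only the sanitized value would be passed on to encryption and storage in this sketch, which is why such a rule cannot be undone later.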
Pros
Data is never stored in non-anonymized form, so privacy is preserved perfectly
Cons
Deployment is relatively complicated for operators, as there is no graphical UI that gives immediate feedback (the Auditorium could contain a "tester" applet though, or an offen rule subcommand could be added to the command)
Rules cannot be applied retroactively: data that has been altered by a rule cannot be restored or changed afterwards.