Categorical columns often have strong dependencies that are considered public and should be preserved. For example, a table with State and ZIP code columns should not waste privacy budget generating spurious combinations of state and ZIP code, because the mapping between State and ZIP is public knowledge. This is common in "Star Schema" dimensional designs. One simple way to handle this would be to let the transformers concatenate these values into a single column before fitting, and split them apart again when transforming back.
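The concatenate-then-split idea could look roughly like the sketch below. `CompositeColumnEncoder`, its method names, and the separator choice are all hypothetical and are not part of any existing transformer API; the sample State/ZIP pairs are illustrative only. The domain is passed in explicitly as a list of valid combinations, on the assumption that it comes from a public mapping.

```python
from typing import Dict, List, Sequence, Tuple

SEP = "\x1f"  # ASCII unit separator, assumed not to occur in real values


class CompositeColumnEncoder:
    """Hypothetical helper: treats dependent columns as one categorical key."""

    def __init__(self) -> None:
        self.keys: List[str] = []
        self.index_of: Dict[str, int] = {}

    def fit(self, combos: Sequence[Tuple[str, ...]]) -> "CompositeColumnEncoder":
        # Concatenate each valid combination into a single key and index it.
        self.keys = sorted({self._join(c) for c in combos})
        self.index_of = {k: i for i, k in enumerate(self.keys)}
        return self

    def transform(self, rows: Sequence[Tuple[str, ...]]) -> List[int]:
        # Each row of dependent columns becomes one integer category.
        return [self.index_of[self._join(r)] for r in rows]

    def inverse_transform(self, codes: Sequence[int]) -> List[Tuple[str, ...]]:
        # Split the combined key back into the original columns.
        return [tuple(self.keys[c].split(SEP)) for c in codes]

    @staticmethod
    def _join(parts: Tuple[str, ...]) -> str:
        for p in parts:
            if SEP in p:
                raise ValueError("value contains the separator character")
        return SEP.join(parts)


# Illustrative public (State, ZIP) pairs.
public_pairs = [("WA", "98101"), ("WA", "98052"), ("OR", "97201")]
encoder = CompositeColumnEncoder().fit(public_pairs)
codes = encoder.transform([("OR", "97201"), ("WA", "98052")])
restored = encoder.inverse_transform(codes)
```

Because the synthesizer only ever sees the combined key, it cannot emit a State/ZIP pair that is absent from the fitted domain.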
This is related to #534, because combinations of public dimensions can still fingerprint individuals (and indeed, the risk is higher, because the attributes needed for the fingerprint are public). The transformer would therefore need to ensure a full cross join of the public dimensions, or properly generate spurious combinations, or alternatively redact combinations with DPSU/DPNE. Even with that caveat, this feature could be useful.
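One possible reading of "ensure a full cross join" is that the set of combined keys should come from the public mapping rather than from the combinations that happen to appear in the private data, so that the presence or absence of a key reveals nothing. The sketch below assumes that reading. The function name and the sample pairs are hypothetical, and adding noise to the counts is left to whatever mechanism runs downstream.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple

Pair = Tuple[str, str]


def counts_over_public_domain(
    public_pairs: Sequence[Pair], private_rows: Sequence[Pair]
) -> Dict[Pair, int]:
    """Hypothetical helper: histogram keyed by every public combination."""
    observed = Counter(private_rows)
    # Every public pair gets a cell, including pairs with no private rows.
    # Rows whose pair is not in the public domain are silently dropped.
    return {pair: observed.get(pair, 0) for pair in public_pairs}


public_pairs = [("WA", "98101"), ("WA", "98052"), ("OR", "97201")]
private_rows = [("WA", "98101"), ("WA", "98101"), ("OR", "97201")]
hist = counts_over_public_domain(public_pairs, private_rows)
```

Here `("WA", "98052")` still appears in `hist` with a count of zero, so the set of keys is identical no matter which individuals are in the private table.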