Support mixed types in Privacy Metrics #134

Open
npatki opened this issue Nov 23, 2021 · 0 comments
Labels
feature request Request for a new feature

npatki commented Nov 23, 2021

Problem Description

The Privacy Metrics assume an adversarial attack model where a user with access to a few key_fields might be able to predict sensitive_fields.

I understand that we need to fit different models depending on whether the sensitive_fields are categorical or numeric. However, the metrics also expect all the key_fields to be of that same type. Does this need to be the case? What if I think some categorical columns might be crucial in leaking numeric data (and vice versa)?

Expected behavior

Depending on the type of the sensitive_fields, it would be nice to convert the input columns so that they are compatible with the tests.

  1. If the sensitive_fields are numeric, we can convert categorical key_fields to numeric, similar to how we do it in KSTestExtended.
  2. If the sensitive_fields are categorical, it may be possible to bin the numeric key_fields into categories.
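
The two conversions above could be sketched roughly as follows. This is only an illustration using pandas: align_key_fields and n_bins are hypothetical names, not part of the existing metrics API, and the integer codes from pd.factorize are a placeholder that may differ from the encoding KSTestExtended actually uses. The sketch also assumes that sensitive_fields which are not all numeric should be treated as categorical.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def align_key_fields(data, key_fields, sensitive_fields, n_bins=4):
    """Hypothetical helper: return a copy of `data` whose key_fields match
    the type (numeric vs. categorical) of the sensitive_fields."""
    out = data.copy()
    # Assumption: sensitive fields count as numeric only if all of them are.
    sensitive_numeric = all(is_numeric_dtype(data[col]) for col in sensitive_fields)

    for col in key_fields:
        key_numeric = is_numeric_dtype(data[col])
        if sensitive_numeric and not key_numeric:
            # Categorical key -> numeric: integer codes in order of first
            # appearance (placeholder encoding, see the note above).
            codes, _ = pd.factorize(data[col])
            out[col] = codes
        elif not sensitive_numeric and key_numeric:
            # Numeric key -> categorical: equal-width bins, labelled by
            # bin index and stored as strings.
            out[col] = pd.cut(data[col], bins=n_bins, labels=False).astype(str)
    return out


data = pd.DataFrame({
    "age": [20, 30, 40, 50],
    "city": ["a", "b", "a", "c"],
    "income": [1.0, 2.0, 3.0, 4.0],
    "job": ["x", "y", "x", "y"],
})

# Numeric sensitive field: the categorical key "city" becomes numeric.
numeric_case = align_key_fields(data, ["age", "city"], ["income"])

# Categorical sensitive field: the numeric key "age" is binned.
categorical_case = align_key_fields(data, ["age", "city"], ["job"])
```

Integer codes impose an arbitrary order on the categories, so a real implementation would probably want a more careful encoding for option 1.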

Additional context

  • What should the user API be? It would be ideal to guide the user into making a choice (to drop the columns or convert them)
  • Should we be converting the columns ourselves, or should we expect users to do this first (e.g. using a transformer)?
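
One possible shape for the "guide the user into making a choice" idea is sketched below. Everything here is an assumption for discussion: select_key_fields and its on_mismatch option are hypothetical names and do not exist in the library. The default raises an error that names the mismatched columns and points to the available choices, and "drop" removes them; a "convert" option could be added alongside once the conversion behavior is agreed on.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def select_key_fields(data, key_fields, sensitive_fields, on_mismatch="error"):
    """Hypothetical helper: decide what to do with key_fields whose type
    differs from that of the sensitive_fields."""
    # Assumption: sensitive fields count as numeric only if all of them are.
    sensitive_numeric = all(is_numeric_dtype(data[col]) for col in sensitive_fields)
    mismatched = [
        col for col in key_fields
        if is_numeric_dtype(data[col]) != sensitive_numeric
    ]

    if not mismatched:
        return list(key_fields)
    if on_mismatch == "drop":
        # Keep only the key fields that already match the sensitive type.
        return [col for col in key_fields if col not in mismatched]
    if on_mismatch == "error":
        expected = "numeric" if sensitive_numeric else "categorical"
        raise ValueError(
            f"key_fields {mismatched} are not {expected} like the sensitive_fields. "
            "Pass on_mismatch='drop' to ignore them, or convert them first."
        )
    raise ValueError(f"Unknown on_mismatch option: {on_mismatch!r}")


data = pd.DataFrame({
    "age": [20, 30, 40, 50],
    "city": ["a", "b", "a", "c"],
    "income": [1.0, 2.0, 3.0, 4.0],
})

# "city" is categorical while "income" is numeric, so it is dropped here.
kept = select_key_fields(data, ["age", "city"], ["income"], on_mismatch="drop")
```

Raising by default keeps the current strict behavior visible, while the error message tells the user which columns are affected and what they can do about it.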
@npatki added the feature request, documentation, and metrics labels and removed the documentation label on Nov 23, 2021
@npatki transferred this issue from sdv-dev/SDV on Jun 10, 2022
@npatki removed the metrics label on Jul 14, 2022