
Add a metric to evaluate anonymization #8

Open
zuberek opened this issue Aug 2, 2019 · 2 comments
Labels
feature request Request for a new feature

Comments

@zuberek

zuberek commented Aug 2, 2019

Description

Having an easy way of measuring the privacy of synthesized data would be very useful for users of the tool. It could be added on top of the existing evaluation metrics sdv-dev/SDV#52.

An easy way to measure it would be to calculate the average Euclidean distance from each synthetic record to its closest neighbour in the real data. This metric was used in the TableGAN paper. However, it would only apply to numerical data, which in the case of SDV is sometimes not enough.

It could also be implemented in a way that lets the user specify which fields to use in the evaluation, so that more sensitive fields are taken into account while others can be ignored.
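A minimal sketch of what such a metric could look like, assuming real and synthetic data arrive as pandas DataFrames with the same schema. The `fields` argument and the min-max scaling are illustrative choices for this issue, not an existing SDMetrics API:

```python
import numpy as np
from scipy.spatial import cKDTree


def mean_distance_to_closest_record(real, synthetic, fields=None):
    """Average Euclidean distance from each synthetic row to its
    nearest real row, computed over the selected numerical fields.

    Higher values suggest synthetic rows sit farther from any real
    row, i.e. are less likely to leak individual records.
    """
    # Restrict the comparison to the requested fields, if any.
    if fields is not None:
        real = real[fields]
        synthetic = synthetic[fields]

    # Euclidean distance only makes sense for numerical columns.
    real = real.select_dtypes(include=np.number)
    synthetic = synthetic[real.columns]

    # Min-max scale both tables using the real data's ranges so that
    # no single column dominates the distance.
    mins, maxs = real.min(), real.max()
    spans = (maxs - mins).replace(0, 1)
    real_scaled = (real - mins) / spans
    synth_scaled = (synthetic - mins) / spans

    # Nearest-neighbour search from each synthetic row into the real rows.
    tree = cKDTree(real_scaled.to_numpy())
    distances, _ = tree.query(synth_scaled.to_numpy(), k=1)
    return float(distances.mean())
```

With this framing, a score of 0 would mean every synthetic row coincides exactly with some real row, while larger scores indicate more separation; categorical fields would still need a separate treatment.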

@csala
Contributor

csala commented Jul 2, 2020

Transferring this to SDMetrics

@csala csala transferred this issue from sdv-dev/SDV Jul 2, 2020
@poornima-sivanand

This feature would be extremely useful. Could anyone share whether this has been added, or point me to resources that show how to assess the privacy of data generated by CTGAN, DeepEcho and Copulas?

@npatki npatki added the feature request Request for a new feature label Jul 14, 2022