
Add a metric to evaluate anonymization #8

Open
zuberek opened this issue Aug 2, 2019 · 2 comments
Labels
feature request Request for a new feature

Comments

@zuberek

zuberek commented Aug 2, 2019

Description

Having an easy way of measuring the privacy of synthesized data would be very useful for users of the tool. It could be added on top of the existing evaluation metrics sdv-dev/SDV#52.

An easy way to measure it would be to calculate the average Euclidean distance from each synthetic record to its closest neighbour in the real data. This metric was used in the TableGAN paper. However, it would only apply to numerical data, which in the case of SDV is sometimes not enough.

It could also be implemented in a way that lets the user specify which fields to use in the evaluation, so that more sensitive fields are taken into account while others can be ignored.
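A minimal sketch of what such a metric could look like, assuming real and synthetic data arrive as pandas DataFrames with the same schema. The `fields` argument and the min-max scaling are illustrative choices for this issue, not an existing SDMetrics API:

```python
import numpy as np
from scipy.spatial import cKDTree


def mean_distance_to_closest_record(real, synthetic, fields=None):
    """Average Euclidean distance from each synthetic row to its
    nearest real row, computed over the selected numerical fields.

    Higher values suggest synthetic rows sit farther from any real
    row, i.e. are less likely to leak individual records.
    """
    # Restrict the comparison to the requested fields, if any.
    if fields is not None:
        real = real[fields]
        synthetic = synthetic[fields]

    # Euclidean distance only makes sense for numerical columns.
    real = real.select_dtypes(include=np.number)
    synthetic = synthetic[real.columns]

    # Min-max scale both tables using the real data's ranges so that
    # no single column dominates the distance.
    mins, maxs = real.min(), real.max()
    spans = (maxs - mins).replace(0, 1)
    real_scaled = (real - mins) / spans
    synth_scaled = (synthetic - mins) / spans

    # Nearest-neighbour search from each synthetic row into the real rows.
    tree = cKDTree(real_scaled.to_numpy())
    distances, _ = tree.query(synth_scaled.to_numpy(), k=1)
    return float(distances.mean())
```

With this framing, a score of 0 would mean every synthetic row coincides exactly with some real row, while larger scores indicate more separation; categorical fields would still need a separate treatment.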

@csala
Contributor

csala commented Jul 2, 2020

Transferring this to SDMetrics

@csala csala transferred this issue from sdv-dev/SDV Jul 2, 2020
@poornima-sivanand

This feature would be extremely useful. Could anyone share whether this has been added, or point me to resources that show how to assess the privacy of data generated by CTGAN, DeepEcho and Copulas?

@npatki npatki added the feature request Request for a new feature label Jul 14, 2022