Open
Description
NaN values should be supported by the numerical privacy metrics, but currently they raise ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
The code below reproduces this issue:
import pandas as pd
from sdmetrics.single_table.privacy import NumericalLR

data = pd.DataFrame({
    'key': [1, 2, None],
    'sensitive': [1, 2, 3]
})
privacy_metric = NumericalLR.compute(
    data,
    data,
    key_fields=['key'],
    sensitive_fields=['sensitive']
)
print(privacy_metric)  # this will print nan
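
Until NaN values are handled by the metric itself, one possible workaround is to drop the rows that have a NaN in any key or sensitive field before calling compute. The sketch below shows only that preprocessing step on the same data. The helper name drop_nan_rows is hypothetical and not part of SDMetrics, and dropping rows changes the data the metric sees, so this may not match the behavior the fix should eventually have.

```python
import pandas as pd


def drop_nan_rows(data: pd.DataFrame, fields: list[str]) -> pd.DataFrame:
    """Return a copy of `data` without the rows that have NaN in any of `fields`.

    Hypothetical helper for illustration; it is not part of SDMetrics.
    """
    return data.dropna(subset=fields).reset_index(drop=True)


data = pd.DataFrame({
    'key': [1, 2, None],
    'sensitive': [1, 2, 3],
})

# Keep only the rows where both the key and the sensitive field are present.
clean = drop_nan_rows(data, ['key', 'sensitive'])
print(clean)
```

The cleaned frame could then be passed to NumericalLR.compute in place of data, with the same key_fields and sensitive_fields arguments as in the reproduction above.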