Numerical Privacy Metrics should support NaN values #58

Open
@fealho

Description

Numerical privacy metrics should support NaN values, but they currently raise ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

The code below reproduces this issue:

import pandas as pd
from sdmetrics.single_table.privacy import NumericalLR

data = pd.DataFrame({
    'key': [1, 2, None],
    'sensitive': [1, 2, 3]
})

privacy_metric = NumericalLR.compute(
    data,
    data,
    key_fields=['key'],
    sensitive_fields=['sensitive']
)

print(privacy_metric) # this will print nan
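
Until NaN values are supported, one possible workaround is to drop the rows with missing values before calling the metric. The snippet below is only a sketch of that idea, not a fix for the metric itself: `drop_rows_with_missing` is a hypothetical helper (it is not part of sdmetrics), and discarding rows may not be acceptable for every dataset.

```python
import pandas as pd


def drop_rows_with_missing(data, columns):
    """Return a copy of `data` without rows that have NaN in any of `columns`.

    Hypothetical helper for illustration only; not part of sdmetrics.
    """
    return data.dropna(subset=columns).reset_index(drop=True)


data = pd.DataFrame({
    'key': [1, 2, None],
    'sensitive': [1, 2, 3]
})

# Remove the row whose 'key' is missing before computing the metric.
clean = drop_rows_with_missing(data, ['key', 'sensitive'])

print(len(clean))  # 2 rows remain

# Assumption: the cleaned frame could then be passed in place of `data`:
# NumericalLR.compute(clean, clean, key_fields=['key'], sensitive_fields=['sensitive'])
```

Dropping rows is used here only because it is the simplest option; imputing the missing keys would be another approach, with different effects on the resulting score.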


Labels: bug (Something isn't working), data:single-table (Related to tabular datasets)
