Numerical data passed to a categorical privacy metric should raise an error #59

Open
@fealho

Description

The same applies in reverse: categorical data passed to a numerical privacy metric. Currently, if the wrong data type is passed, the metric silently returns `nan`. It should raise an error instead.

The code below reproduces this behavior:

import pandas as pd
from sdmetrics.single_table.privacy import CategoricalCAP


data = pd.DataFrame({   # data containing only numerical values
    'key': [1.4, 10.12, 3.4],
    'sensitive': [10.9, 9.8, 8.8]
})

score = CategoricalCAP.compute(  # privacy metric that is only meant to work with categorical values
    data,
    data,
    key_fields=['key'],
    sensitive_fields=['sensitive']
)

print(score)  # prints `nan` instead of raising an error
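
For illustration, here is a minimal sketch of the kind of check that could raise instead of returning `nan`. It is not the sdmetrics implementation: `check_categorical_fields` is a hypothetical helper, and treating every non-boolean numeric dtype (including integers) as "numerical" is an assumption that the maintainers may want to relax. It only relies on pandas dtype helpers.

```python
import pandas as pd
from pandas.api.types import is_bool_dtype, is_numeric_dtype


def check_categorical_fields(data, fields):
    """Hypothetical helper: raise if any of ``fields`` holds numerical data.

    Assumption: any numeric dtype other than bool counts as numerical.
    """
    numerical = [
        field for field in fields
        if is_numeric_dtype(data[field]) and not is_bool_dtype(data[field])
    ]
    if numerical:
        raise TypeError(
            'Categorical privacy metrics require categorical columns, '
            f'but these columns are numerical: {numerical}'
        )


data = pd.DataFrame({   # same numerical-only data as in the reproduction
    'key': [1.4, 10.12, 3.4],
    'sensitive': [10.9, 9.8, 8.8],
})

try:
    check_categorical_fields(data, ['key', 'sensitive'])
except TypeError as error:
    message = str(error)
    print(message)
```

A check like this could run at the start of `compute` on `key_fields` and `sensitive_fields`, with a mirrored check (rejecting non-numeric columns) for the numerical privacy metrics.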
