Does CSTest quantify the synthesis of missing values? #71

Open
npatki opened this issue Aug 10, 2021 · 0 comments
Labels
question General question about the software

Comments

Contributor

npatki commented Aug 10, 2021

If I have a table with some missing values, I want the synthetic data to contain missing values too, ideally in the same ratio. Is CSTest an appropriate signal of this? If it isn't, should we modify it so that it is?
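To make "the same ratio" concrete, here is a minimal sketch of the quantity in question. It is plain Python, not SDV or SDMetrics code; the helper `null_ratio` and the toy columns are hypothetical, and missing values are assumed to be represented as `None`.

```python
def null_ratio(values):
    """Fraction of entries that are missing (represented here as None)."""
    if not values:
        raise ValueError("column is empty")
    return sum(v is None for v in values) / len(values)


# Toy columns, made up for illustration only.
real_column = ["a", None, "b", "a", None, "c", "a", "b", "c", "a"]
synthetic_column = ["a", "b", None, "a", "c", "a", None, None, "b", "a"]

real_ratio = null_ratio(real_column)            # 2 missing out of 10
synthetic_ratio = null_ratio(synthetic_column)  # 3 missing out of 10

# A gap of 0 would mean the synthetic column reproduces the real missing-value ratio.
ratio_gap = abs(real_ratio - synthetic_ratio)
```

A metric that is sensitive to missing values should get worse as `ratio_gap` grows.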

Details: From the API reference:

> This function applies the single column CSTest metric to all the discrete columns found in the table and then returns the average of all the scores obtained.

I know that the SDV internally creates a new, discrete binary column representing whether each value in a column is null. But I don't know whether this column is used in the CSTest computation, because it's dropped before the synthetic data is returned.
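One way to picture what "using the null column" could mean is sketched below, assuming CSTest is a chi-squared goodness-of-fit test on category frequencies. This is an illustration only and does not show what SDV or SDMetrics actually do internally: the function names are hypothetical, missing values are assumed to be `None`, and the p-value uses the closed form for one degree of freedom, which applies because the indicator has exactly two categories.

```python
import math


def null_indicator(values):
    """Binary column: True where the value is missing (None), else False."""
    return [v is None for v in values]


def null_chi_squared_pvalue(real_values, synthetic_values):
    """Chi-squared p-value comparing null/non-null counts (hypothetical helper).

    Expected counts come from the real column's null ratio, scaled to the
    size of the synthetic column. A p-value near 1 means the ratios agree.
    """
    real_flags = null_indicator(real_values)
    synthetic_flags = null_indicator(synthetic_values)
    if not real_flags or not synthetic_flags:
        raise ValueError("both columns must be non-empty")

    p_null = sum(real_flags) / len(real_flags)
    n = len(synthetic_flags)
    observed = [sum(synthetic_flags), n - sum(synthetic_flags)]
    expected = [p_null * n, (1 - p_null) * n]

    # Degenerate case: the real column is all null or has no nulls, so the
    # statistic is undefined. Treat an exact match as 1.0, anything else as 0.0.
    if min(expected) == 0:
        return 1.0 if observed == expected else 0.0

    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(statistic / 2))


# Toy data: the real column is 20% null.
real_column = [None] * 20 + ["x"] * 80
matching_synthetic = [None] * 20 + ["y"] * 80   # same null ratio
no_null_synthetic = ["y"] * 100                 # nulls not synthesized at all

p_matching = null_chi_squared_pvalue(real_column, matching_synthetic)
p_no_nulls = null_chi_squared_pvalue(real_column, no_null_synthetic)
```

In this sketch `p_matching` is close to 1 and `p_no_nulls` is close to 0, so a score of this kind would register missing values that were not synthesized. Whether the existing CSTest behaves this way is exactly the open question.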

@npatki npatki transferred this issue from sdv-dev/SDV Aug 12, 2021
@npatki npatki added the question General question about the software label Aug 16, 2021