Problem Description
As mentioned in #70, the current implementation of CSTest might not be entirely correct. Before applying CSTest, we currently normalize the frequencies of each category, but scipy expects larger numbers (its documentation mentions that the frequencies should be at least 5).
Expected behavior
I'm filing this feature request to track the logic and utility of CSTest. If this is something we want to continue supporting, how can we change it to make it more accurate?
Additional context
Note that in #142, we are adding the TVComplement metric, a categorical similarity metric based on the Total Variation Distance. So even if we deprecate CSTest, there will still be another measure for categorical similarity.
I also noticed this behaviour: I was getting very high p-values even for synthetic data that did not seem to fit the real data very well.
Furthermore, the scipy.stats.chisquare method is a one-way test and expects the number of observations to match:
"the sum of the observed and expected frequencies must be the same for the test to be valid"...
Hi @MartinKratky, yes, the frequencies must match, and scipy recommends more than 13 data points. Currently, SDMetrics normalizes the frequencies so that they sum to 1, which is invalid for the test, so I would not rely on the p-value that is currently returned.
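To illustrate why normalizing is a problem, here is a minimal sketch in plain Python. It is not the SDMetrics code: the category counts are made up, `pearson_chi2` and `p_value_df2` are hypothetical helpers, and the closed-form p-value only holds for exactly three categories (2 degrees of freedom). The statistic is the same Pearson sum that `scipy.stats.chisquare` computes.

```python
import math

def pearson_chi2(observed, expected):
    """Pearson chi-square statistic: sum((O - E)^2 / E)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_df2(statistic):
    """Chi-square survival function, valid only for 2 degrees of freedom."""
    return math.exp(-statistic / 2)

# Hypothetical counts for three categories (so df = 2).
real_counts = [500, 300, 200]
synthetic_counts = [400, 350, 250]
n = sum(synthetic_counts)

# Case 1: frequencies normalized so that they sum to 1.
real_freq = [c / sum(real_counts) for c in real_counts]
synthetic_freq = [c / n for c in synthetic_counts]
stat_normalized = pearson_chi2(synthetic_freq, real_freq)

# Case 2: raw counts, with expected counts rescaled to the observed total.
expected_counts = [f * n for f in real_freq]
stat_counts = pearson_chi2(synthetic_counts, expected_counts)

print("normalized:", stat_normalized, p_value_df2(stat_normalized))
print("counts:    ", stat_counts, p_value_df2(stat_counts))
```

The statistic computed on counts is `n` times the statistic computed on normalized frequencies, so normalizing discards the sample size and pushes the p-value towards 1 regardless of how different the two distributions are.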
Why do you plan to deprecate CSTest?
We have not yet made a final decision on this. However, we have added the TVComplement metric if you'd like to play around with it. It calculates similarity using the Total Variation Distance.
Most synthetic data users are not interested in hypothesis testing (deciding whether the synthetic and real data are different); rather, they are interested in quantifying the difference. We find that a distance metric, not a p-value, is more suitable for this goal.
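For reference, a score based on the Total Variation Distance can be sketched as follows. This is an illustration of the idea under my own assumptions, not necessarily how SDMetrics implements TVComplement: the function name `tv_complement` and the sample values are hypothetical.

```python
from collections import Counter

def tv_complement(real_values, synthetic_values):
    """Return 1 - TVD between the category frequencies of two columns.

    TVD = 0.5 * sum over categories of |real_freq - synthetic_freq|.
    A score of 1.0 means identical frequencies; 0.0 means no overlap.
    """
    real_counts = Counter(real_values)
    synthetic_counts = Counter(synthetic_values)
    n_real = len(real_values)
    n_synthetic = len(synthetic_values)

    # Consider every category seen in either column; Counter returns 0
    # for categories that are missing from one side.
    categories = set(real_counts) | set(synthetic_counts)
    tvd = 0.5 * sum(
        abs(real_counts[c] / n_real - synthetic_counts[c] / n_synthetic)
        for c in categories
    )
    return 1 - tvd

# Hypothetical columns with three categories.
real = ["a"] * 5 + ["b"] * 3 + ["c"] * 2
synthetic = ["a"] * 4 + ["b"] * 4 + ["c"] * 2
score = tv_complement(real, synthetic)
print(score)
```

Unlike a p-value, this score depends only on the category frequencies, so it quantifies how far apart the two distributions are rather than testing whether they differ.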