Release Automatic Stattest Selection · evidentlyai/evidently

Release scope:

For small data with <= 1000 observations in the reference dataset:

For numerical features (n_unique > 5): two-sample Kolmogorov-Smirnov test.
For categorical features or numerical features with n_unique <= 5: chi-squared test.
For binary categorical features (n_unique <= 2): proportion difference test for independent samples based on Z-score.
All tests use a 0.95 confidence level by default.
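
To illustrate the binary case, here is a minimal sketch of a two-sample proportion Z-test using only the Python standard library. The helper name `proportion_ztest` and the decision rule (flag drift when the p-value falls below 1 - confidence, i.e. 0.05 for the default 0.95) are assumptions made for this example, not Evidently's actual implementation.

```python
import math


def proportion_ztest(successes_ref, n_ref, successes_cur, n_cur):
    """Two-sided Z-test for the difference of two independent proportions.

    Hypothetical helper for illustration; returns (z_score, p_value).
    """
    p_ref = successes_ref / n_ref
    p_cur = successes_cur / n_cur
    # Pooled proportion under the null hypothesis that both samples share one rate.
    pooled = (successes_ref + successes_cur) / (n_ref + n_cur)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_ref + 1 / n_cur))
    if std_err == 0:
        # Both samples are all-0 or all-1: no observable difference.
        return 0.0, 1.0
    z_score = (p_ref - p_cur) / std_err
    # Two-sided p-value from the standard normal: erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z_score) / math.sqrt(2))
    return z_score, p_value


CONFIDENCE = 0.95  # default confidence level from the release notes

# Assumed decision rule: drift when p_value < 1 - confidence.
z, p = proportion_ztest(successes_ref=30, n_ref=100, successes_cur=60, n_cur=100)
drift_detected = p < 1 - CONFIDENCE
print(f"z={z:.3f}, p_value={p:.6f}, drift={drift_detected}")
```

The pooled standard error is the usual choice when the null hypothesis is that both samples share one proportion.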

For larger data with > 1000 observations in the reference dataset:

For numerical features (n_unique > 5): Wasserstein Distance.
For categorical features or numerical features with n_unique <= 5: Jensen–Shannon divergence.
All tests use a threshold of 0.1 by default.
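
The selection rules above can be sketched as a single function. This is an illustrative reading of the release notes, not the library's code: the function name, the `"num"`/`"cat"` labels, and the returned test names are hypothetical. It also assumes that numerical features with n_unique <= 5 are handled as categorical, so the binary rule applies to them too (the notes only mention binary categorical features).

```python
SMALL_DATA_LIMIT = 1000    # reference rows; from the release notes
LOW_CARDINALITY_LIMIT = 5  # n_unique cut-off; from the release notes


def select_stattest(feature_type, n_unique, reference_size):
    """Return (test_name, (parameter_name, default_value)) for one feature.

    feature_type is "num" or "cat" (hypothetical labels for this sketch).
    """
    # Assumption: low-cardinality numerical features are treated as categorical.
    treat_as_categorical = (
        feature_type == "cat" or n_unique <= LOW_CARDINALITY_LIMIT
    )

    if reference_size <= SMALL_DATA_LIMIT:
        default = ("confidence", 0.95)
        if not treat_as_categorical:
            return "kolmogorov_smirnov", default
        # Assumption: the binary rule takes precedence over chi-squared.
        if n_unique <= 2:
            return "z_proportion", default
        return "chi_squared", default

    default = ("threshold", 0.1)
    if not treat_as_categorical:
        return "wasserstein_distance", default
    return "jensen_shannon_divergence", default


print(select_stattest("num", n_unique=120, reference_size=800))
print(select_stattest("cat", n_unique=2, reference_size=800))
print(select_stattest("num", n_unique=4, reference_size=5000))
```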

Added options for setting a custom statistical test for the Categorical and Numerical Target Drift Dashboard/Profile:
cat_target_stattest_func: Defines a custom statistical test to detect target drift in CatTargetDrift.
num_target_stattest_func: Defines a custom statistical test to detect target drift in NumTargetDrift.
Added options for setting a custom drift detection threshold for the Categorical and Numerical Target Drift Dashboard/Profile:
cat_target_threshold: Optional[float] = None
num_target_threshold: Optional[float] = None
These thresholds depend heavily on the selected stattest: generally, each is either a threshold for the p_value or a threshold for a distance.
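
A rough sketch of how a custom test and its threshold might fit together. Only the option names `cat_target_stattest_func` and `cat_target_threshold` come from these notes. The callable signature (reference values, current values, returns one float), the `is_drift` helper, the plain dict standing in for the options object, and the comparison directions are all assumptions for illustration.

```python
from collections import Counter


def total_variation_distance(reference, current):
    """Hypothetical custom stattest: distance in [0, 1] between category shares.

    Expects two non-empty sequences of category labels.
    """
    ref_counts, cur_counts = Counter(reference), Counter(current)
    categories = set(ref_counts) | set(cur_counts)
    return 0.5 * sum(
        abs(ref_counts[c] / len(reference) - cur_counts[c] / len(current))
        for c in categories
    )


def is_drift(score, threshold, score_is_p_value):
    """Assumed convention: a small p-value or a large distance signals drift."""
    if score_is_p_value:
        return score < threshold
    return score >= threshold


# Plain dict standing in for the real options object (assumption).
options = {
    "cat_target_stattest_func": total_variation_distance,
    "cat_target_threshold": 0.2,  # read as a distance threshold for this test
}

reference_target = ["a"] * 70 + ["b"] * 30
current_target = ["a"] * 40 + ["b"] * 60

score = options["cat_target_stattest_func"](reference_target, current_target)
drift = is_drift(score, options["cat_target_threshold"], score_is_p_value=False)
print(f"score={score:.3f}, drift={drift}")
```

The same threshold value means different things depending on the test: 0.05 is a typical p-value cut-off, while a distance threshold has to be chosen on the scale of the distance in use.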

Fixes:
#207
