Add tools for testing and make an example with CPI #265

wants to merge 79 commits into base: main
79 commits
2524418
remove not necessary function for testing
lionelkusch May 23, 2025
3688c9c
add a check on the number of features for X
lionelkusch May 23, 2025
ee9cf84
add assertion on the model
lionelkusch May 23, 2025
be4170c
improve the check_fit
lionelkusch May 23, 2025
2f2779c
update data_generation
lionelkusch May 23, 2025
3a228b2
add function for the generation of data for the tests
lionelkusch May 23, 2025
5348e89
improve test for CPI
lionelkusch May 23, 2025
bf7383f
fix tests
lionelkusch May 23, 2025
6aa3f41
improve generation of model
lionelkusch May 23, 2025
dfcc77c
change multivariate function but need to fix the function reid
lionelkusch May 23, 2025
7fc28dc
change the generation of data
lionelkusch May 23, 2025
74d8462
Add TODO
lionelkusch May 23, 2025
0ce7094
add tests for scenario
lionelkusch May 23, 2025
297e26d
fix tests
lionelkusch May 26, 2025
b105d5d
Update src/hidimstat/_utils/scenario.py
lionelkusch May 26, 2025
03d41fb
Fix docstring in the PR
lionelkusch May 26, 2025
463d35b
fix noise_mag
lionelkusch Jun 6, 2025
055bede
improve test of knockoff
lionelkusch Jun 6, 2025
d0df863
small improvement
lionelkusch Jun 6, 2025
062817c
fix estimation of variance
lionelkusch Jun 6, 2025
a6dbabd
clean noise_std
lionelkusch Jun 10, 2025
59ea869
change name of the sigma
lionelkusch Jun 11, 2025
daab920
add the possibility of continuous support
lionelkusch Jun 11, 2025
c46ee47
fix test by increasing the snr and find a good seed
lionelkusch Jun 11, 2025
e78fb42
fix test for noise_std
lionelkusch Jun 11, 2025
a67015d
fix tests
lionelkusch Jun 11, 2025
acaeb78
fix tests
lionelkusch Jun 11, 2025
b73fadb
fix error
lionelkusch Jun 11, 2025
7fa9153
fix dcrt example
lionelkusch Jun 11, 2025
8e0dcd3
fix bug in example by modification of example
lionelkusch Jun 11, 2025
166983e
fix size of tests
lionelkusch Jun 11, 2025
6b80ca3
Merge branch 'main' into PR_test_cpi
lionelkusch Jun 11, 2025
da36c6e
Apply suggestions from code review
lionelkusch Jun 11, 2025
1d179d7
fix time and seed for test
lionelkusch Jun 11, 2025
3ba565f
Fix number of threadpool to 1
lionelkusch Jun 11, 2025
2100e4e
increase coverage
lionelkusch Jun 11, 2025
7f2ab38
Merge branch 'main' into PR_test_cpi
lionelkusch Jun 11, 2025
8faba22
Merge branch 'main' into PR_test_cpi
lionelkusch Jun 11, 2025
447be48
Merge branch 'main' into PR_test_cpi
lionelkusch Jun 11, 2025
d618c46
Merge branch 'main' into PR_test_cpi
lionelkusch Jun 11, 2025
a9c2dc1
Merge branch 'main' into PR_test_cpi
lionelkusch Jun 12, 2025
8cd762f
Merge branch 'main' into PR_test_cpi
lionelkusch Jun 12, 2025
aa5fc39
add a warning
lionelkusch Jun 12, 2025
64805e5
add test for warning
lionelkusch Jun 12, 2025
3afe488
Modify snr for better approach
lionelkusch Jun 13, 2025
d268310
Apply suggestions from code review
lionelkusch Jun 16, 2025
8cf3a80
Update src/hidimstat/_utils/scenario.py
lionelkusch Jun 16, 2025
1b75430
modify sentences of the tests
lionelkusch Jun 16, 2025
ddc2ba7
fix situation of 0 support
lionelkusch Jun 16, 2025
c1b2c0a
change name of noise including spatial information
lionelkusch Jun 16, 2025
0e4dfc5
change name of the noise function
lionelkusch Jun 16, 2025
695626f
Change parameter continous
lionelkusch Jun 16, 2025
847e598
fix range of parameter for rho_noise_time
lionelkusch Jun 16, 2025
db8d671
fix message of the error
lionelkusch Jun 16, 2025
27ac293
fix message of error
lionelkusch Jun 16, 2025
9c3ee05
fix test
lionelkusch Jun 16, 2025
04aed38
fix error in name parameters
lionelkusch Jun 16, 2025
f6c0e99
fix name in the error
lionelkusch Jun 16, 2025
bb5a236
fix tests
lionelkusch Jun 16, 2025
64ff9ad
fix generation of data parameters
lionelkusch Jun 16, 2025
cd73f6f
fix the mismatch of feature
lionelkusch Jun 16, 2025
319d02c
fix error when the support of noise was zero
lionelkusch Jun 16, 2025
eee2700
fix bug in the assertion
lionelkusch Jun 16, 2025
624c040
fix assertion
lionelkusch Jun 16, 2025
0f43f07
Merge branch 'main' into PR_test_cpi
lionelkusch Jun 16, 2025
a3d6e9a
Update src/hidimstat/_utils/scenario.py
lionelkusch Jun 18, 2025
476ea57
transform beta in boolean array
lionelkusch Jun 18, 2025
3219c33
rename noise serial
lionelkusch Jun 18, 2025
2cd2ecf
Improve comment of shuffle
lionelkusch Jun 18, 2025
2bb005e
done
lionelkusch Jun 18, 2025
1a2039b
remove assertion on number of jobs
lionelkusch Jun 18, 2025
1bbb3b8
Improve docstring
lionelkusch Jun 18, 2025
9c16ab6
fix bug in tests
lionelkusch Jun 18, 2025
fb18bd0
fix tests cpi
lionelkusch Jun 18, 2025
dae8f8d
Remove unnecessary tests
lionelkusch Jun 18, 2025
158a293
fix test noise_std
lionelkusch Jun 18, 2025
2b39b65
replace n_times by n_targets
lionelkusch Jun 19, 2025
8006811
fix change name
lionelkusch Jun 19, 2025
27cbb7d
Merge branch 'main' into PR_test_cpi
lionelkusch Jun 20, 2025
4 changes: 2 additions & 2 deletions examples/plot_2D_simulation_example.py
@@ -69,7 +69,7 @@
ensemble_clustered_inference_pvalue,
)
from hidimstat.statistical_tools.p_values import zscore_from_pval
-from hidimstat._utils.scenario import multivariate_simulation
+from hidimstat._utils.scenario import multivariate_simulation_spatial

#############################################################################
# Specific plotting functions
@@ -171,7 +171,7 @@ def plot(maps, titles):
smooth_X = 1.0 # level of spatial smoothing introduced by the Gaussian filter

# generating the data
-X_init, y, beta, epsilon, _, _ = multivariate_simulation(
+X_init, y, beta, epsilon, _, _ = multivariate_simulation_spatial(
n_samples, shape, roi_size, sigma, smooth_X, seed=1
)

24 changes: 17 additions & 7 deletions examples/plot_dcrt_example.py
@@ -15,7 +15,7 @@
import matplotlib.pyplot as plt
import numpy as np
from hidimstat.dcrt import dcrt_zero, dcrt_pvalue
-from hidimstat._utils.scenario import multivariate_1D_simulation
+from hidimstat._utils.scenario import multivariate_simulation

plt.rcParams.update({"font.size": 21})

@@ -43,8 +43,14 @@
# Nominal false positive rate
alpha = 5e-2

-X, y, _, __ = multivariate_1D_simulation(
-n_samples=n, n_features=p, support_size=n_signal, rho=rho, seed=sim_ind
+X, y, beta_true, _, _, _ = multivariate_simulation(
+n_samples=n,
+n_features=p,
+support_size=n_signal,
+rho=rho,
+snr=snr,
+shuffle=True,
+seed=sim_ind,
)

# Applying a reLu function on the outcome y to get non-linear relationships
@@ -55,8 +61,10 @@
variables_important_lasso, pvals_lasso, ts_lasso = dcrt_pvalue(
selection_features, X_res, sigma2, y_res
)
-typeI_error["Lasso"].append(sum(pvals_lasso[n_signal:] < alpha) / (p - n_signal))
-power["Lasso"].append(sum(pvals_lasso[:n_signal] < alpha) / (n_signal))
+typeI_error["Lasso"].append(
+sum(pvals_lasso[np.logical_not(beta_true)] < alpha) / (p - n_signal)
+)
+power["Lasso"].append(sum(pvals_lasso[beta_true] < alpha) / (n_signal))

## dcrt Random Forest ##
selection_features, X_res, sigma2, y_res = dcrt_zero(
@@ -65,8 +73,10 @@
rvariables_important_forest, pvals_forest, ts_forest = dcrt_pvalue(
selection_features, X_res, sigma2, y_res
)
-typeI_error["Forest"].append(sum(pvals_forest[n_signal:] < alpha) / (p - n_signal))
-power["Forest"].append(sum(pvals_forest[:n_signal] < alpha) / (n_signal))
+typeI_error["Forest"].append(
+sum(pvals_forest[np.logical_not(beta_true)] < alpha) / (p - n_signal)
+)
+power["Forest"].append(sum(pvals_forest[beta_true] < alpha) / (n_signal))

#############################################################################
# Plotting the comparison
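
The dcrt hunks above replace positional slices such as `pvals_lasso[:n_signal]` with indexing by `beta_true`, presumably because `shuffle=True` no longer guarantees that the support occupies the first `n_signal` columns. A minimal NumPy sketch of that bookkeeping follows. The p-values are made up for illustration and `error_and_power` is a hypothetical helper, not part of hidimstat:

```python
import numpy as np


def error_and_power(pvals, support, alpha=0.05):
    """Return (type-I error, power) for p-values given a boolean support mask."""
    support = np.asarray(support, dtype=bool)
    pvals = np.asarray(pvals, dtype=float)
    rejected = pvals < alpha
    n_signal = int(support.sum())
    n_null = support.size - n_signal
    # Guard against empty groups to avoid dividing by zero.
    type_one = rejected[~support].sum() / n_null if n_null else 0.0
    power = rejected[support].sum() / n_signal if n_signal else 0.0
    return float(type_one), float(power)


# Illustrative values only: features 1 and 3 carry signal.
support = np.array([False, True, False, True, False])
pvals = np.array([0.20, 0.01, 0.03, 0.30, 0.70])

type_one, power = error_and_power(pvals, support)
```

With these values one of the three null features is rejected and one of the two support features is detected, so the mask-based counts do not depend on where the support sits in the column order.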
2 changes: 1 addition & 1 deletion examples/plot_importance_classification_iris.py
@@ -116,7 +116,7 @@ def run_one_fold(X, y, model, train_index, test_index, vim_name="CPI", groups=No
GridSearchCV(SVC(kernel="rbf"), {"C": np.logspace(-3, 3, 10)}),
]
cv = KFold(n_splits=5, shuffle=True, random_state=0)
-groups = {ft: i for i, ft in enumerate(dataset.feature_names)}
+groups = {ft: [i] for i, ft in enumerate(dataset.feature_names)}
out_list = Parallel(n_jobs=5)(
delayed(run_one_fold)(
X, y, model, train_index, test_index, vim_name=vim_name, groups=groups
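
The one-line change above wraps each column index in a list, so every group maps to a list of columns even when it holds a single feature. The sketch below illustrates the difference. It assumes the feature names that scikit-learn's `load_iris` reports (hard-coded here to avoid the import), and `columns_of` is a hypothetical helper, not a hidimstat function:

```python
feature_names = [
    "sepal length (cm)",
    "sepal width (cm)",
    "petal length (cm)",
    "petal width (cm)",
]

# Before the change: each group maps to a bare integer.
groups_scalar = {ft: i for i, ft in enumerate(feature_names)}
# After the change: each group maps to a list of column indices.
groups_list = {ft: [i] for i, ft in enumerate(feature_names)}


def columns_of(groups, name):
    """Return the column indices of one group as a list (hypothetical helper)."""
    return list(groups[name])


columns = columns_of(groups_list, "petal width (cm)")

# A bare integer is not iterable, so the same call fails on the old mapping.
try:
    columns_of(groups_scalar, "petal width (cm)")
    scalar_ok = True
except TypeError:
    scalar_ok = False
```

Any consumer that iterates over the columns of a group would hit the same `TypeError` with the old mapping, which is one plausible reason for the change.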
12 changes: 9 additions & 3 deletions examples/plot_knockoff_aggregation.py
@@ -31,7 +31,7 @@
model_x_knockoff_pvalue,
)
from hidimstat.statistical_tools.multiple_testing import fdp_power
-from hidimstat._utils.scenario import multivariate_1D_simulation_AR
+from hidimstat._utils.scenario import multivariate_simulation


#############################################################################
@@ -75,9 +75,15 @@
# ---------------------------------------------------------------------
def single_run(n_samples, n_features, rho, sparsity, snr, fdr, n_bootstraps, seed=None):
# Generate data
-X, y, _, non_zero_index = multivariate_1D_simulation_AR(
-n_samples, n_features, rho=rho, sparsity=sparsity, seed=seed, snr=snr
+X, y, beta_true, _, _, _ = multivariate_simulation(
+n_samples,
+n_features,
+rho=rho,
+support_size=int(n_features * sparsity),
+snr=snr,
+seed=seed,
)
+non_zero_index = np.where(beta_true)[0]

# Use model-X Knockoffs [1]
selected, test_scores, threshold, X_tildes = model_x_knockoff(
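
The knockoff hunk above makes two conversions: a `sparsity` fraction becomes an integer `support_size`, and the returned `beta_true` becomes an array of indices. The sketch below reproduces both steps in isolation. It assumes `beta_true` is boolean (or at least zero outside the support), and it builds a stand-in mask with NumPy instead of calling the simulator:

```python
import numpy as np

n_features = 20
sparsity = 0.25

# Number of non-null features, truncated toward zero as in the hunk above.
support_size = int(n_features * sparsity)

# Hypothetical stand-in for the `beta_true` returned by the simulator.
rng = np.random.default_rng(0)
beta_true = np.zeros(n_features, dtype=bool)
beta_true[rng.choice(n_features, size=support_size, replace=False)] = True

# np.where returns a tuple with one index array per dimension; [0] selects
# the array for the only axis of a 1-D mask.
non_zero_index = np.where(beta_true)[0]
```

Because `int()` truncates, a product such as `0.25 * 10` yields a support of 2 features, not 3, which is worth keeping in mind when choosing `sparsity`.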
32 changes: 17 additions & 15 deletions pyproject.toml
@@ -49,13 +49,8 @@ doc = [
"sphinx-prompt >= 1.0.0",
"seaborn",
]
-plotting = [
-"matplotlib>=3.9.0",
-]
-style = [
-"black >= 24.4.2",
-"isort >= 5.13.2"
-]
+plotting = ["matplotlib>=3.9.0"]
+style = ["black >= 24.4.2", "isort >= 5.13.2"]
# For running unit and docstring tests
test = [
"iniconfig >= 0.1, < 3",
@@ -67,7 +62,8 @@ test = [
"pytest-xdist[psutil]",
"pytest-html",
"pytest-timeout",
-"pytest-durations"
+"pytest-durations",
+"pytest-env",
]

[project.urls]
@@ -96,20 +92,26 @@ minversion = "8.0"
pythonpath = "src"
testpaths = "test"
addopts = [
-"-ra", # short test summary info
+"-ra", # short test summary info
"--import-mode=importlib", # better control of importing packages
-"--showlocals", # show local variables in tracebacks
-"--strict-config", # warnings from parsing the pytest configuration raise an error
-"--strict-markers", # undefined markers raise an error
+"--showlocals", # show local variables in tracebacks
+"--strict-config", # warnings from parsing the pytest configuration raise an error
+"--strict-markers", # undefined markers raise an error
# pytest-randomly
# "--randomly-dont-reset-seed", # turn off the reset of the random seed
# pytest-xdist option
-"-n=auto", # automatically define the number of processes
+"-n=auto", # automatically define the number of processes
]
markers = [
"slow: marks tests as slow (deselect with '-m \"not slow\"')",
]
# pytest-timeout
-timeout= 60 # an individual test should not take more than 60 s
-session_timeout=1200 # all the tests should run within 20 min
+timeout = 60 # an individual test should not take more than 60 s
+session_timeout = 1200 # all the tests should run within 20 min


+[tool.pytest_env]
+OPENBLAS_NUM_THREADS = 1
+BLIS_NUM_THREADS = 1
+MKL_NUM_THREADS = 1
+OMP_NUM_THREADS = 1
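
The `[tool.pytest_env]` table asks the pytest-env plugin to export these variables for the test session, presumably so that each pytest-xdist worker started by `-n=auto` uses a single BLAS/OpenMP thread instead of oversubscribing the machine. A rough manual equivalent is sketched below, for instance for a script run outside pytest. The helper name `pin_blas_threads` is hypothetical, and the sketch assumes it runs before NumPy or SciPy are imported, since these libraries typically read the variables when they load:

```python
import os

# Thread-count variables taken from the [tool.pytest_env] table above.
THREAD_VARS = (
    "OPENBLAS_NUM_THREADS",
    "BLIS_NUM_THREADS",
    "MKL_NUM_THREADS",
    "OMP_NUM_THREADS",
)


def pin_blas_threads(n_threads=1):
    """Set each thread-count variable to n_threads and return the new values."""
    for name in THREAD_VARS:
        # Environment values must be strings.
        os.environ[name] = str(n_threads)
    return {name: os.environ[name] for name in THREAD_VARS}


pinned = pin_blas_threads()
```

Keeping the setting in `pyproject.toml` has the advantage that every contributor and CI job gets the same thread limits without touching their shell configuration.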