Similar to causal-learn, and in the quest for feature parity so that we converge on a best-of-both implementation, we want to add caching of CI test values as a base feature for all skeleton learning algorithms. Starting an issue to track this...
We want to cache the explicit p-values so that users can re-run the entire algorithm with a set of different alpha values; more generally, any re-run of the algorithm then becomes trivial.
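To make the benefit concrete, here is a minimal sketch (the cache contents and the `(x_var, y_var, conditioning_set)` keys are purely illustrative): once p-values are stored rather than accept/reject decisions, sweeping alpha is just a re-thresholding pass over the cache.

```python
# Illustrative cache contents: (x_var, y_var, conditioning_set) -> p-value.
cached_pvalues = {
    ("x", "y", ()): 0.20,
    ("x", "y", ("z",)): 0.03,
}

# Re-running the skeleton step at a new alpha is a lookup, not a re-test:
for alpha in (0.01, 0.05):
    independent = {key for key, p in cached_pvalues.items() if p > alpha}
```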
Implementation Thoughts
Caching can be implemented with joblib. We want caching to be a function of the dataset, so we would first compute a hash of the dataset and use it as the cache folder location: `.dodiscover/<dataset_hash>`. The cache would then save to a private folder (similar to what many packages do), and we can easily clear it through the joblib API.
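A minimal sketch of the dataset-keyed cache location, assuming a `pandas.DataFrame` input and using `joblib.hash` to fingerprint it (the helper name `get_dataset_cache` is hypothetical):

```python
from pathlib import Path

import joblib
import pandas as pd
from joblib import Memory


def get_dataset_cache(data: pd.DataFrame) -> Memory:
    """Return a joblib ``Memory`` rooted at ``.dodiscover/<dataset_hash>``."""
    # joblib.hash produces a deterministic fingerprint of the dataset
    # contents, so the same data always maps to the same cache folder.
    dataset_hash = joblib.hash(data)
    return Memory(location=str(Path(".dodiscover") / dataset_hash), verbose=0)
```

Clearing the cache then goes through the joblib API, e.g. `get_dataset_cache(data).clear(warn=False)`.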
Then, as a function of `x_var`, `y_var`, `conditioning_set`, and the conditioning test used, we would let `joblib.Memory` cache the p-values for us. However, another problem we need to figure out is how best to parallelize the existing CI tests using `joblib.Parallel`; this is not trivial from what I've seen so far.
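A sketch of how the p-value caching could look, assuming the helper above and a DataFrame `data`. Here `_run_ci_test` is a hypothetical stand-in (a bare-bones Fisher-z partial-correlation test) for whatever test the skeleton learner actually dispatches to. Note `ignore=["data"]`: the `Memory` location already encodes the dataset hash, so re-hashing the full dataset on every call would be redundant.

```python
import numpy as np
from scipy import stats


def _run_ci_test(data, x_var, y_var, conditioning_set, test_name):
    """Hypothetical CI test; a real version would dispatch on ``test_name``."""
    cols = [x_var, y_var, *conditioning_set]
    corr = np.corrcoef(data[cols].to_numpy(), rowvar=False)
    prec = np.linalg.pinv(corr)
    # Partial correlation of (x, y) given the conditioning set.
    r = -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])
    z = 0.5 * np.log((1 + r) / (1 - r))
    stat = np.sqrt(len(data) - len(conditioning_set) - 3) * abs(z)
    return 2 * (1 - stats.norm.cdf(stat))


memory = get_dataset_cache(data)
# Cached on (x_var, y_var, conditioning_set, test_name); the dataset itself
# is excluded from the key since the cache folder is already dataset-specific.
cached_ci_test = memory.cache(_run_ci_test, ignore=["data"])
```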
Assuming we can implement the parallelization, the joblib caching would come almost for free, and the two would work well together without us having to write any of the "file saving and file opening" code ourselves; that is all abstracted away.
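Following the joblib example linked below, a sketch of how the two compose (the task list is purely illustrative):

```python
from joblib import Parallel, delayed

# Candidate (x_var, y_var, conditioning_set) triples from one skeleton level.
tasks = [("x", "y", ()), ("x", "y", ("z1",)), ("x", "y", ("z2",))]

# The first run computes and persists each p-value; subsequent runs (e.g.
# after changing alpha) hit the on-disk cache instead of re-running the tests.
pvalues = Parallel(n_jobs=-1)(
    delayed(cached_ci_test)(data, x, y, S, "fisherz") for (x, y, S) in tasks
)
```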
xref: https://joblib.readthedocs.io/en/latest/auto_examples/nested_parallel_memory.html#sphx-glr-auto-examples-nested-parallel-memory-py
cc: @jaron-lee