All notable changes to this project will be documented in this file.
- (mid) Add missing binary constructor to Midline model. Now all models have a binary and trinary constructor.
- Add rules to ruff.
- Make suite testable with pytest.
- Switch to pytest for testing.
1.2.2 - 2024-06-25
- (mid) Correct contra state dist evo. Fixes #85.
Previously, the model did not correctly marginalize over the possible time when a tumor can grow over the midline. It simply assumed that it did from the onset.
- (uni) Remove outdated docstring paragraph. Fixes #88.
- Bump pre-commit versions.
- Use ruff to fix lint and format code.
- Remove upper cap in dependencies because of this.
- risk() meth requires involvement. Fixes #87.
We figured it does not make sense to allow passing involvement=None into the risk() method just to have it return 1. The exception is the midline class, where involvement may reasonably be None while midext isn't.
Also, I ran ruff over some files, fixing some code style issues.
1.2.1 - 2024-05-28
- (uni) load_patient_data should accept None.
- (mid) Correct type hint of marginalize.
- (graph) Wrong dict when trinary.
The to_dict() method returned a wrong graph dictionary when trinary due to growth edges. This is fixed now.
- Skip marginalize only when safe.
The marginalization should only be skipped (and 1 returned) when the entire disease state of interest is None. In the midline case, this disease state includes the midline extension.
Previously, only the involvement pattern was checked. Now, the model is more careful about when to take shortcuts.
- (graph) Modify mermaid graph.
The get_mermaid() and get_mermaid_url() methods now accept arguments that allow some modifications of the output.
- (uni) Add __repr__().
- (uni) Use pandas map instead of apply.
This saves us a couple of lines in the load_patient_data method and is more readable.
- Branch 'main' into 'dev'.
- Remains of callbacks.
Some callback functionality that was tested in a pre-release has been forgotten in the code base and is now deleted.
1.2.0 - 2024-03-29
- (mid) obs_dist may return 3D array.
- Fix unknown version in title.
- Add missing blank before list.
- (mid) Add comment about midext marginalizing.
- (mid) Add posterior_state_dist() method.
The Midline model now has a posterior_state_dist() method, too.
- (types) Base Model has state dist methods.
Both state_dist() and posterior_state_dist() have been added to the types.Model base class.
- Add marginalize() method.
With this new method, one can marginalize a (prior or posterior) state distribution over all states that match a provided involvement.
It is used e.g. to refactor the code of the risk() methods.
- (types) Add obs_dist and marginalize.
The types.Model abstract base class now also has the methods obs_dist and marginalize for better autocomplete support in editors.
- Remove plain test risk.
- (types) Improve type hints for inv. pattern.
- Rename "diagnose" to "diagnosis" when noun.
When used as a noun, "diagnosis" is correct, not "diagnose".
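The marginalization described for the marginalize() method in this release can be illustrated with a small, self-contained sketch. It is not the package's implementation: the function name marginalize_sketch, the LNL names, and the tuple-based state encoding are assumptions made purely for illustration.

```python
from itertools import product

# Hypothetical lymph node levels and all binary hidden states (0 = healthy, 1 = involved).
LNLS = ["I", "II", "III"]
STATES = list(product([0, 1], repeat=len(LNLS)))


def marginalize_sketch(state_dist, involvement):
    """Sum the probabilities of all states that match a partial involvement.

    `involvement` maps LNL names to True, False, or None. A missing entry or
    None means "unknown", so that LNL does not restrict the matching states.
    """
    total = 0.0
    for prob, state in zip(state_dist, STATES):
        matches = all(
            involvement.get(lnl) is None or state[i] == int(involvement[lnl])
            for i, lnl in enumerate(LNLS)
        )
        if matches:
            total += prob
    return total


# A uniform distribution over the 8 states, used only as toy input.
uniform = [1 / len(STATES)] * len(STATES)
prob_level_two = marginalize_sketch(uniform, {"II": True, "III": None})
```

With an empty involvement, every state matches and the function simply returns the sum of the distribution.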
1.1.0 - 2024-03-20
- (utils) Add safe_set_params() function.
This checks whether the params are a dict, list, or None and handles them accordingly. Just a convenience function that helped refactor some methods.
- Allow to pass state distributions to posterior_state_dist() and risk() methods. Fixes #80.
With this, one can use precomputed state distributions to speed up computing the posterior or risk for multiple scenarios.
- Use safe_set_params() across models.
- Add checks for midline risk. Related #80.
- (mid) Fix wrong assumption in risk test.
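The dict/list/None handling mentioned for safe_set_params() in this release might look roughly like the sketch below. The actual signature and behaviour of the package's function are not stated in this changelog, so the names safe_set_params_sketch and ToyModel, as well as the details of the dispatch, are assumptions.

```python
class ToyModel:
    """Hypothetical stand-in for a model that offers a set_params() method."""

    def __init__(self):
        self.params = {"a": 0.0, "b": 0.0}

    def set_params(self, *args, **kwargs):
        # Positional values are assigned in parameter order, keywords by name.
        for name, value in zip(list(self.params), args):
            self.params[name] = value
        for name, value in kwargs.items():
            if name in self.params:
                self.params[name] = value


def safe_set_params_sketch(model, params=None):
    """Forward `params` to model.set_params() depending on its type (assumed behaviour)."""
    if params is None:
        return  # nothing to set
    if isinstance(params, dict):
        model.set_params(**params)  # dict -> keyword arguments
    else:
        model.set_params(*params)  # list/tuple -> positional arguments


model = ToyModel()
safe_set_params_sketch(model, None)
safe_set_params_sketch(model, [0.1, 0.2])
safe_set_params_sketch(model, {"b": 0.5})
```

The point of such a helper is that callers no longer need to repeat the type check before every call to set_params().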
1.0.0 - 2024-03-18
- (uni) Catch error when apply to empty data. Fixes #79.
For some reason, using apply on an empty DataFrame has an entirely different return type than when it is not empty. This caused the issue #79 and has now been fixed.
- (bi) Data reload loads wrong side.
Now the data does not get reloaded anymore, which was actually unnecessary in the first place.
- (uni) Return correctly in get_spread_params.
- (mid) Consume & return params in same order.
- (uni) Allow mapping=None when loading data.
- (uni) Check if loading empty data works. Related #79.
- (uni) Make sure likelihood is deterministic.
- ⚠ BREAKING (uni) Shorten two (unused) method names.
- ⚠ BREAKING helpers are now utils.
- (type) Add type definition for graph dict.
- (diag) Use partials to save parametric dist.
- Branch 'main' into 'dev'.
- Branch '79-loading-an-empty-dataframe-raises-error' into 'dev'.
1.0.0.rc2 - 2024-03-06
Implementing the [lymixture] brought to light a shortcoming in the way the data and diagnose matrices are computed and stored. As mentioned in issue #77, their rows are now aligned with the patient data, which may have some advantages for different use cases.
Also, since this is probably the last pre-release, I took the liberty to go over some method names once more and make them clearer.
- Don't use fake T-stage for BN model. Related #77.
Since we now have access to the full diagnose matrix by default, there is no need for the Bayesian network T-stage fix anymore.
- (uni) Reload data when modalities change.
Because we only store those diagnoses that are relevant to the model under the "_model" header in the patient_data table, we need to reload the patient data whenever we modify the modalities.
- Update to slightly changed API.
- (bi) Add bilateral quickstart to docs.
- (mod) Add utils to check for modality changes.
- (uni) Make data & diagnose matrices faster. Related #77.
The last change caused a dramatic slowdown (factor 500) of the data and diagnose matrix access, because it needed to index them from a DataFrame. Now, I implemented a basic caching scheme with a patient data cache version that brought back the original speed. Also, apparently del dataframe[column] is much slower than dataframe.drop(columns). I replaced the former with the latter and now the tests are fast again.
- ⚠ BREAKING Rename methods for brevity & clarity.
Method names have been changed, e.g. comp_dist_evolution() has been renamed to state_dist_evo(), which is both shorter and (imho) clearer.
- (uni) Move data/diag matrix generation.
- Update to slightly changed API.
- (uni) Check reset of data on modality change.
Added a test to make sure the patient data gets reloaded when the modalities change. This test is still failing.
- Finally suppress all PerformanceWarnings.
- ⚠ BREAKING Store data & diagnose matrices in data. Fixes #77.
Instead of weird, dedicated UserDicts, I simply use the patient data to store the data encoding and diagnose probabilities for each patient. This has the advantage that the entire matrix (irrespective of T-stage) is aligned with the patients.
- ⚠ BREAKING (bi) Shorten kwargs.
The (uni|ipsi|contra)lateral_kwargs in the Bilateral constructor were shortened by removing the "lateral".
- Branch 'main' into 'dev'.
- Branch '77-diagnose-matrices-not-aligned-with-data' into 'dev'.
- Unused helpers.
1.0.0.rc1 - 2024-03-04
This release hopefully represents the last major change before releasing version 1.0.0. It was necessary because during the implementation of the midline model, managing the symmetries in a transparent and user-friendly way became impossible in the old implementation.
Now, a composite pattern is used for both the modalities and the distributions over diagnose times. This further separates the logic and will allow more hierarchical models based on the ones provided here to work seamlessly almost out of the box. This may become relevant with the mixture model.
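To give an idea of how a composite can delegate getting and setting parameters through a tree of models, here is a minimal sketch. The classes Leaf and Composite and the key-naming scheme are hypothetical and only meant to illustrate the pattern, not the package's actual classes. The sketch assumes that set_params() consumes positional arguments one by one and hands the remainder on.

```python
class Leaf:
    """Hypothetical leaf holding a single parameter."""

    def __init__(self, value=0.0):
        self.value = value

    def get_params(self):
        return {"value": self.value}

    def set_params(self, *args):
        """Consume the first positional argument and return the unused rest."""
        if args:
            self.value, *rest = args
            return tuple(rest)
        return args


class Composite:
    """Hypothetical composite that delegates to its named children."""

    def __init__(self, **children):
        self.children = children

    def get_params(self):
        # Prefix each child's parameter names with the child's own name.
        return {
            f"{name}_{key}": val
            for name, child in self.children.items()
            for key, val in child.get_params().items()
        }

    def set_params(self, *args):
        # Every child uses up what it needs and passes the rest along.
        for child in self.children.values():
            args = child.set_params(*args)
        return args


tree = Composite(
    ipsi=Composite(early=Leaf(), late=Leaf()),
    contra=Composite(early=Leaf(), late=Leaf()),
)
leftover = tree.set_params(0.1, 0.2, 0.3, 0.4, 0.5)
```

Because leaves and composites share the same two methods, a caller does not need to know how deep the tree is. Any positional arguments that no leaf consumed are returned to the caller.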
- First version of midline module added.
- (diag) Delete frozen distributions when params change.
- (diag) Correct max time & params.
The max_time is now correctly accessed and set. Also, the distribution params are not used up by synched distributions, but only by the distributions in composite leafs.
- (graph) Avoid warning for micro mod setting.
- ⚠ BREAKING Make likelihood work with emcee again.
The way the likelihood was defined, it did not actually play nicely with how the emcee package works. This is now fixed.
- (bi) Fix uninitialized is_symmetric dict.
- (mid) Add missing dict in init.
- (mid) Update call to transition_matrix() & state_list.
- (mid) Finish draw_patients method.
Some bugs in the method for drawing synthetic patients from the Midline were fixed. This seems to be working now.
- (mid) Improve midline docstrings slightly.
- Go over set_params() docstrings.
- Update quickstart guide to new API.
- Adapt tests to new API (now passing).
- Update index & fix some docstrings.
- Fix some typos and cross-references.
- (helper) Add popfirst() and flatten().
Two new helper functions in relation to getting and setting params.
- (type) Add model ABC to inherit from.
I added an abstract base class from which all model-like classes should inherit. It defines all the methods that need to be present in a model.
The idea behind this is that any subclass of this can be part of a composite that correctly delegates getting/setting parameters, diagnose time distributions, and modalities.
- ⚠ BREAKING (graph) Add __hash__ to edge, node, graph.
This replaces the dedicated parameter_hash() method.
- (mod) Add method to delete modality del_modality().
- Add more get/set params methods.
- (mid) Implement set_params.
- (mid) Implement the load_patient_data meth.
- (mid) Finish midline (feature complete).
- Complete set/get methods on model classes.
The Unilateral, Bilateral, and Midline model now all have the eight methods set_tumor_spread_params, set_lnl_spread_params, set_spread_params, set_params, get_tumor_spread_params, get_lnl_spread_params, get_spread_params, and get_params.
- (mid) Reimplement the midline evolution.
The midline evolution that Lars Widmer worked on is now reimplemented. However, although this implementation is analogous to the one used in previous versions of the code and should thus work, it is still untested at this point.
- Add helper to draw diagnoses.
The new helper function draw_diagnoses is a re-implementation of the Unilateral class's method with the same name for easier reusing.
- (mid) Allow marginalization over unknown midline extension.
This is implemented differently than before: If data with unknown midline extension is added, it gets loaded into an attribute named unknown, which is a Bilateral model only used to store that data and generate diagnose matrices.
- Move timing data.
- Make changelog super detailed.
- (mid) Split likelihood method.
- Fix long-running test.
- Add integration tests with emcee.
- Add checks for bilateral symmetries.
- (mid) Add first check of set_params() method.
- (mid) Check likelihood function.
- Added doc strings.
- Non-mixture midline implemented.
Fixed the non-mixture midline extension model and added documentation.
- ⚠ BREAKING Make get_params() uniform and chainable.
The API of all get_params() methods is now nice and uniform, allowing arbitrary chaining of these methods.
- ⚠ BREAKING Make set_params() uniform and chainable.
The API of all set_params() methods is now nice and uniform, allowing arbitrary chaining of these methods.
- ⚠ BREAKING Make set_params() not return kwargs.
It does make sense to "use up" the positional arguments one by one in the set_params() methods, but doing the same thing with keyword arguments is pointless, difficult and error prone.
- ⚠ BREAKING (graph) Replace name with get_name().
In the Edge class, the name property is replaced by a function get_name() that is more flexible and allows us to have edge names without underscores when we need it.
- ⚠ BREAKING (bi) Reintroduce is_symmetric attribute.
This will once again manage the symmetry of the Bilateral class's different ipsi- and contralateral attributes.
- ⚠ BREAKING (diag) Use composite for distributions.
Instead of a dict that holds the T-stages and corresponding distributions over diagnose times, this implements them as a composite pattern. This replaces the dict-like API entirely with methods. This has several advantages:
- It is more explicit and thus more readable
- The composite pattern is designed to work naturally with tree-like structures, which we have here when dealing with bilateral models.
- ⚠ BREAKING (mod) Use composite for modalities.
Instead of a dict that holds the names and corresponding sens/spec for diagnostic modalities, this implements them as a composite pattern. This replaces the dict-like API entirely with methods. This has several advantages:
- It is more explicit and thus more readable
- The composite pattern is designed to work naturally with tree-like structures, which we have here when dealing with bilateral models.
- ⚠ BREAKING (uni) Transform to composite pattern.
Use the new composite pattern for the distribution over diagnose times and modalities.
- (bi) Update for new composite API.
- ⚠ BREAKING (mod) Shorten to sens/spec.
Also, add a clear_modalities() and a clear_distributions() method to the respective composites.
- (matrix) Use hashables over arg0 cache.
Instead of using this weird arg0_cache for the observation and transition matrix, I use the necessary arguments only, which are all hashable now.
- ⚠ BREAKING Adapt risk to likelihood call signature.
- (type) Add risk to abstract methods.
- (type) Abstract methods raise error.
- Branch 'yoel-dev' into 'dev'.
- Branch '74-synchronization-is-unreadable-and-error-prone' into 'dev'. Fixes #74.
- Branch 'main' into 'dev'.
- Branch 'add-midext-evolution' into 'dev'.
- Unused helper functions.
1.0.0.a6 - 2024-02-15
With this (still alpha) release, we most notably fixed a long unnoticed bug in the computation of the Bayesian network likelihood.
- (uni) Leftover kwargs now correctly returned in assign_params()
- ⚠ BREAKING (uni) Remove is_<x>_shared entirely, as it was unused anyways. Fixes #72.
- T-stage mapping may be dictionary or callable
- (uni) Raise exception when there are no tumors or LNLs in graph
- Fix typo in modalities
- (uni) Check constructor raises exceptions
- Check the Bayesian network likelihood
- (uni) Trinary params are shared by default
- (uni) Prohibit setting max_time
- ⚠ BREAKING Change likelihood() API: We don't allow setting the data via the likelihood() anymore. It convoluted the method and setting it beforehand is more explicit anyways.
1.0.0.a5 - 2024-02-06
In this alpha release we fixed more bugs and issues that emerged during more rigorous testing.
Most notably, we backed away from storing the transition matrix in a model's instance, because it created opaque and confusing calls to functions trying to delete it when parameters were updated.
Instead, the function computing the transition matrix is now globally cached using a hash function from the graph representation. This has the drawback of slightly more computation time when calculating the hash. But the advantage is that e.g. in a bilateral symmetric model, the transition matrix of the two sides is only ever computed once when (synched) parameters are updated.
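The idea of caching the transition matrix globally, keyed by a hash of the graph, can be sketched as follows. This is only an illustration under assumptions: ToyGraph, the edge name "T->II", and transition_matrix_sketch are hypothetical, and functools.lru_cache stands in for whatever caching mechanism the package uses.

```python
from functools import lru_cache

CALLS = {"count": 0}  # counts how often the matrix is actually computed


class ToyGraph:
    """Hypothetical graph whose hash depends on its edge parameters."""

    def __init__(self, edge_params):
        self.edge_params = dict(edge_params)

    def __hash__(self):
        return hash(tuple(sorted(self.edge_params.items())))

    def __eq__(self, other):
        return isinstance(other, ToyGraph) and self.edge_params == other.edge_params


@lru_cache(maxsize=128)
def transition_matrix_sketch(graph):
    """Compute a toy 2x2 transition matrix; results are cached per graph hash."""
    CALLS["count"] += 1
    p = graph.edge_params["T->II"]
    return ((1 - p, p), (0.0, 1.0))


# Two sides of a symmetric bilateral model share the same parameters ...
ipsi = ToyGraph({"T->II": 0.3})
contra = ToyGraph({"T->II": 0.3})
matrix_ipsi = transition_matrix_sketch(ipsi)
matrix_contra = transition_matrix_sketch(contra)  # served from the cache
calls_when_symmetric = CALLS["count"]

# ... and a changed parameter yields a new hash, hence a fresh computation.
contra.edge_params["T->II"] = 0.5
matrix_changed = transition_matrix_sketch(contra)
calls_after_change = CALLS["count"]
```

No explicit deletion of a stored matrix is needed in this scheme: stale entries are simply never looked up again, at the price of computing the hash on every call.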
- (graph) Assume nodes is dictionary, not a list. Fixes #64.
- (uni) Update draw_patients() method to output LyProX style data. Fixes #65.
- (bi) Update bilateral data generation method to also generate LyProX style data. Fixes #65.
- (bi) Syntax error in init_synchronization. Fixes #69.
- (uni) Remove need for transition matrix deletion via a global cache. Fixes #68.
- (uni) Use cached matrices & simplify stuff. Fixes #68.
- (uni) Observation matrix only property, not cached anymore
- Fix typos & formatting errors in docstrings
- (graph) Implement graph hash for global cache of transition matrix
- (helper) Add an arg0 cache decorator that caches based on the first argument only
- (matrix) Use cache for observation & diagnose matrices. Fixes #68.
- Update dependencies & classifiers
- Variables inside generate_transition()
- Make doctests discoverable by unittest
- Update tests to changed API
- (uni) Assert format & distribution of drawn patients
- (uni) Allow larger delta for synthetic data distribution
- (bi) Check bilateral data generation method
- Check the bilateral model with symmetric tumor spread
- Make sure delete & recompute synced edges' tensor work
- Adapt tests to changed Edge API
- (bi) Evaluate transition matrix recomputation
- Update tests to match new transition matrix code
- Update trinary unilateral tests
- ⚠ BREAKING Compute transition tensor globally. Fixes #69.
- ⚠ BREAKING Make transition matrix a method instead of a property. Fixes #40.
- ⚠ BREAKING Make observation matrix a method instead of a property. Fixes #40.
- Add coverage test dependency back into project
- Unused files and directories
1.0.0.a4 - 2023-12-12
- Use lnls.keys() consistently everywhere
- Warn about symmetric params in asymmetric graph
- Make allowed_states accessible
- Provide base keyword argument to compute_encoding(). This is necessary for the trinary model (see #45)
- Ensure confusion matrix of trinary diagnostic modality has correct shape
- Make diagnostic encoding always binary
- Correct joint state/diagnose matrix (fixes #61)
- Send kwargs to both assign_params methods (fixes #60)
- Enable two-way sync between lookup dicts (fixes #62)
- Add "see also" to get/set methods, thereby making them reference each other
- Add trinary & keywords in encoding: When computing the risk for a certain pattern in a trinary model, one may now provide different keywords like "macro" to differentiate between different involvements of interest.
- Add convenience constructors to create binary and trinary bilateral models
- Allow bilateral model with an asymmetric graph structure
- Add get/set methods to DistributionsUserDict, which makes all get_params() and set_params() methods consistent across their occurrences
- Pull initialization of ipsi- & contralateral models out of Bilateral model's __init__()
- Restructure Bilateral model's __init__() method slightly
- Cover bilateral risk computation
- Cover unilateral risk method
- Check asymmetric model implementation
- Check binary/trinary & allowed_states
- Add trinary likelihood test
- Add risk check for trinary model
- Add checks for delegation of attributes & setting of params
- Check cached_property delegation works
- Check param assign thoroughly
- Don't use custom subclass of cached_property that forbids setting and use the default cached_property instead
- Encode symmetries of Bilateral model in a special dict called is_symmetric with keys "tumor_spread", "lnl_spread", and "modalities"
1.0.0.a3 - 2023-12-06
Fourth alpha release. @YoelPH noticed some more bugs that have been fixed now. Most notably, the risk prediction raised exceptions because of a missing matrix transpose (.T).
- Raise ValueError if diagnose time parameters are invalid (Fixes #53)
- Use names of LNLs in unilateral comp_encoding() (Fixes #56)
- Wrong shape in unilateral posterior computation (missing .T) (Fixes #57)
- Wrong shape in bilateral joint posterior computation (missing .T) (Fixes #57)
- Add info on diagnose time distribution's ValueError
- ValueError raised in diagnose time distribution's set_params
- Check comp_encoding_diagnoses() for shape and dtype
- Test unilateral posterior state distribution for shape and sum
- Test bilateral posterior joint state distribution for shape and sum
1.0.0.a2 - 2023-09-15
Third alpha release. I am pretty confident that the lymph.models.Unilateral class works as intended since it does yield the same results as the 0.4.3 version.
The lymph.models.Bilateral class is presumably finished now as well, although there may still be issues with that. It does however compute a likelihood if asked to do so, and the results don't look implausible. So, it might be worth giving it a spin.
Also, I am now quite satisfied with the look and usability of the new API. Hopefully, this means only minor changes from here on.
- (bi) Sync callback was wrong way around
- Assigning new modalities now preserves the trigger_callbacks
- Set diag_time_dists params won't fail anymore
- (bi) Don't change dict size during modalities sync
- (bi) Delegated generator attribute now resets
- (bi) Make modalities/diag_time_dists syncable
- (uni) Evolution is now running through all time-steps
- Switch to MyST style sphinx theme
- 🛠️ Start with bilateral quickstart guide
- (uni) Reproduce llh with old and new model
- Re-implement bilateral model class
- (bi) Continue rewriting bilateral class
- (helper) Add DelegatorMixin to helpers
- (uni) Use delegator to pull graph attrs up
- (bi) Add delegation of uni attrs to bilateral
- (bi) Reimplement joint state/obs dists & llh
- (uni) Allow global setting of micro & growth
- (uni) Reimplement Bayesian network model
- (log) Add logging to sync callback creation
- Get params also as iterator
- (uni) Get only edge/dist params
- state_list is now a member of the graph.Representation & computation of involvement pattern encoding is separate function now
- subclasses of cached_property are used for e.g. transition and observation matrix instead of convoluted descriptors
- (uni) Add tests w.r.t. delegator mixin
- (bi) Check the delegation of ipsi attrs
- (bi) Check sync for bilateral model
- Refactor out fixtures from test suite
- Make sure bilateral llh is deterministic
- Catch warnings for cleaner output
- (uni) Add likelihood value tests
- assign_params & joint posterior
- ⚠ BREAKING (graph) Remove edge_params lookup in favour of an edge dictionary in the graph.Representation
- ⚠ BREAKING The edge's and dist's get_params() and set_params() methods now have the same function signature, making a combined loop over both possible
- (bi) Rewrite the bilateral risk method
- ⚠ BREAKING Allow setting params as positional & keyword arguments in both the likelihood and the risk method
- Bump codecov action to v3
- Branch 'main' into 'dev'
- Branch 'dev' into 'reimplement-bilateral'
- Branch 'delegation-pattern' into 'dev'
- Branch 'dev' into 'reimplement-bilateral'
- Branch 'remove-descriptors' into 'reimplement-bilateral'
- Branch 'reimplement-bilateral' into 'dev'
1.0.0.a1 - 2023-08-30
Second alpha release, aimed at testing the all new implementation. See these issues for an idea of what this tries to address.
- (matrix) Wrong shape of observation matrix for trinary model
- Fix wrong python version in rtd config file
- Remove outdated sampling tutorial
- Remove deprecated read-the-docs config
- Tell read-the-docs to install extra requirements
- Execute quickstart notebook
- Check correct shapes for trinary model matrices
1.0.0.a0 - 2023-08-15
This alpha release is a reimplementation of most of the package's API. It aims to solve some issues that accumulated for a while.
- parameters can now be assigned centrally via an assign_params() method, either using args or keyword arguments. This resolves #46
- expensive operations generally look expensive now, and do not just appear as if they were attribute assignments. Fixes #40
- computations around the likelihood and risk predictions are now more modular. I.e., several conditional and joint probability vectors/matrices can now be computed conveniently and are not buried in large methods. Resolves issue #41
- support for the trinary model was added. This means lymph node levels (LNLs) can be in one of three states (healthy, microscopic involvement, macroscopic metastasis), instead of only two (healthy, involved). Resolves #45
- module, class, method, and attribute docstrings should now be more detailed and helpful. We switched from strictly adhering to Numpy-style docstrings to something more akin to Python's core library docstrings. I.e., parameters and behaviour are explained in natural language.
- quickstart guide has been adapted to the new API
- all matrices related to the underlying hidden Markov model (HMM) have been decoupled from the Unilateral model class
- the representation of the directed acyclic graph (DAG) that determined the directions of spread from tumor to and among the LNLs has been implemented in a separate class of which an instance provides access to it as an attribute of Unilateral
- access to all parameters of the graph (i.e., the edges) is bundled in a descriptor holding a UserDict
Almost the entire API has changed. I'd therefore recommend having a look at the quickstart guide to see how the new model is used, although most of the core concepts are still the same.
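The difference between the binary and the trinary model mentioned in this release can be made concrete by enumerating the hidden states. The sketch below assumes an integer encoding (0 = healthy, 1 = microscopic involvement or, in the binary case, simply involved, 2 = macroscopic metastasis). Both this encoding and the function name state_list_sketch are illustrative assumptions, not the package's API.

```python
from itertools import product


def state_list_sketch(num_lnls, is_trinary=False):
    """Enumerate all joint states of `num_lnls` lymph node levels.

    Each LNL takes one of 2 (binary) or 3 (trinary) values, so the number of
    joint states is 2**num_lnls or 3**num_lnls, respectively.
    """
    base = 3 if is_trinary else 2
    return list(product(range(base), repeat=num_lnls))


binary_states = state_list_sketch(3)
trinary_states = state_list_sketch(3, is_trinary=True)
```

For three LNLs, this gives 8 joint states in the binary case and 27 in the trinary case, which is why matrices over the hidden states grow much faster for trinary models.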
0.4.3 - 2022-09-02
- incomplete involvement for unilateral risk method does not raise KeyError anymore. Fixes issue #38
0.4.2 - 2022-08-24
- fix the issue of docs failing to build
- remove outdated line in install instructions
- move conf.py back into source dir
- bundle sphinx requirements
- update the quickstart & sampling notebooks
- more stable sphinx-build & update old index
- fine-tune git-chglog settings to my needs
- start with a CHANGELOG
- add description to types of allowed commits
0.4.1 - 2022-08-23
- pyproject.toml referenced wrong README & LICENSE
0.4.0 - 2022-08-23
- delete unnecessary utils
- fix pyproject.toml typo
- add pre-commit hook to check commit msg