Releases: rmnldwg/lymph
1.2.3
What's New
This release is very minor. I am only publishing this tiny update so that I can depend on all models having the `binary` and `trinary` constructors in lyscripts.
Features
- (mid) Add missing `binary` constructor to `Midline` model. Now all models have a `binary` and `trinary` constructor.
Styling
- Add rules to ruff.
Testing
- Make suite testable with pytest.
Ci
- Switch to pytest for testing.
1.2.2
What's New
Bug Fixes
- (mid) Correct contra state dist evo. Fixes #85.
Previously, the model did not correctly marginalize over the possible
time when a tumor can grow over the midline. It simply assumed that it
did from the onset.
Documentation
- (uni) Remove outdated docstring paragraph. Fixes #88.
Miscellaneous Tasks
- Bump pre-commit versions.
Styling
- Use ruff to fix lint and format code.
Build
- Remove upper cap in deps.
Change
- `risk()` method requires `involvement`. Fixes #87.
We figured it does not make sense to allow passing `involvement=None`
into the `risk()` method just to have it return 1. The exception is
the midline class, where `involvement` may reasonably be `None` while
`midext` isn't.
Also, I ran ruff over some files, fixing some code style issues.
1.2.1
Changelog
All notable changes to this project will be documented in this file.
What's New
Bug fixes and two tiny features.
Bug Fixes
- (uni) `load_patient_data` should accept `None`.
- (mid) Correct type hint of `marginalize`.
- (graph) Wrong dict when trinary.
The `to_dict()` method returned a wrong graph dictionary when trinary
due to growth edges. This is fixed now.
- Skip `marginalize` only when safe.
The marginalization should only be skipped (and 1 returned) when the
entire disease state of interest is `None`. In the midline case, this
disease state includes the midline extension.
Previously, only the involvement pattern was checked. Now, the model is
more careful about when to take shortcuts.
Features
- (graph) Modify mermaid graph.
The `get_mermaid()` and `get_mermaid_url()` methods now accept arguments
that allow some modifications of the output.
- (uni) Add `__repr__()`.
Refactor
- (uni) Use pandas `map` instead of `apply`.
This saves us a couple of lines in the `load_patient_data` method and is
more readable.
Merge
- Branch 'main' into 'dev'.
Remove
- Remains of callbacks.
Some callback functionality that was tested in a pre-release has been
forgotten in the code base and is now deleted.
1.2.0
What's New
This feature update brings methods to the models that allow a more modular use of them. Otherwise, nothing spectacular.
Bug Fixes
- (mid) `obs_dist` may return 3D array.
Documentation
- Fix unknown version in title.
- Add missing blank before list.
- (mid) Add comment about midext marginalizing.
Features
- (mid) Add `posterior_state_dist()` method.
The `Midline` model now has a `posterior_state_dist()` method, too.
- (types) Base `Model` has state dist methods.
Both `state_dist()` and `posterior_state_dist()` have been added to the
`types.Model` base class.
- Add `marginalize()` method.
With this new method, one can marginalize a (prior or posterior) state
distribution over all states that match a provided involvement.
It is used e.g. to refactor the code of the `risk()` methods.
- (types) Add `obs_dist` and `marginalize`.
The `types.Model` abstract base class now also has the methods
`obs_dist` and `marginalize` for better autocomplete support in editors.
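To illustrate what marginalizing a state distribution over an involvement pattern means, here is a minimal standalone sketch. It does not use the lymph API: the helper names `all_states` and `marginalize_over`, the dict-based involvement pattern, and the numbers are assumptions made purely for illustration.

```python
from itertools import product


def all_states(lnls):
    """Enumerate every binary state as a tuple of 0/1, one entry per LNL."""
    return list(product([0, 1], repeat=len(lnls)))


def marginalize_over(state_dist, lnls, involvement):
    """Sum the probabilities of all states that match ``involvement``.

    ``involvement`` maps LNL names to True/False. LNLs that are missing or
    set to None count as unknown and therefore match both states.
    """
    total = 0.0
    for state, prob in zip(all_states(lnls), state_dist):
        matches = all(
            involvement.get(lnl) is None or bool(value) == involvement[lnl]
            for lnl, value in zip(lnls, state)
        )
        if matches:
            total += prob
    return total


lnls = ["II", "III"]
# Hypothetical distribution over the states (0,0), (0,1), (1,0), (1,1).
state_dist = [0.5, 0.1, 0.3, 0.1]
risk_of_level_ii = marginalize_over(state_dist, lnls, {"II": True})
```

With an empty involvement pattern every state matches, so the sum is simply the total probability mass of the distribution.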
Testing
- Remove plain test risk.
Change
- (types) Improve type hints for inv. pattern.
- Rename "diagnose" to "diagnosis" when noun.
When used as a noun, "diagnosis" is correct, not "diagnose".
Full diff: 1.1.0...1.2.0
1.1.0
What's New
With this feature update, it becomes possible to speed up repeated risk predictions by passing precomputed state distributions to the respective methods. Computing these state distributions is the most expensive part of most models.
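The benefit of reusing a precomputed state distribution can be sketched with a toy class. Everything here is hypothetical: `ToyModel`, its state names, and the keyword `given_state_dist` are stand-ins chosen for illustration and are not taken from the lymph documentation.

```python
class ToyModel:
    """Stand-in for a model whose state distribution is costly to compute."""

    def __init__(self):
        self.num_computations = 0

    def state_dist(self):
        # Pretend this is the expensive part of the model.
        self.num_computations += 1
        return {"healthy": 0.6, "II": 0.3, "II+III": 0.1}

    def risk(self, involved_states, given_state_dist=None):
        # Reuse the caller's distribution if one was passed in.
        dist = self.state_dist() if given_state_dist is None else given_state_dist
        return sum(dist[state] for state in involved_states)


model = ToyModel()
precomputed = model.state_dist()  # computed exactly once
scenarios = [["II"], ["II+III"], ["II", "II+III"]]
risks = [model.risk(s, given_state_dist=precomputed) for s in scenarios]
```

Without the precomputed distribution, each of the three scenarios would trigger its own call to `state_dist()`.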
Features
- (utils) Add `safe_set_params()` function.
This checks whether the params are a dict, list, or None and handles
them accordingly. Just a convenience function that helped refactor some methods.
- Allow to pass state distributions to `posterior_state_dist()` and `risk()` methods. Fixes #80.
With this, one can use precomputed state distributions to speed up
computing the posterior or risk for multiple scenarios.
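A helper that dispatches on the type of the params, as described for `safe_set_params()`, might look roughly like the sketch below. This is not the library's implementation: the function name `safe_set_params_sketch`, its signature, and the `Recorder` class are assumptions for illustration.

```python
def safe_set_params_sketch(model, params=None):
    """Forward ``params`` to ``model.set_params()`` depending on its type."""
    if params is None:
        return  # nothing to set
    if isinstance(params, dict):
        model.set_params(**params)  # dict -> keyword arguments
    else:
        model.set_params(*params)  # list or tuple -> positional arguments


class Recorder:
    """Hypothetical model that only records how set_params() was called."""

    def __init__(self):
        self.calls = []

    def set_params(self, *args, **kwargs):
        self.calls.append((args, kwargs))


recorder = Recorder()
safe_set_params_sketch(recorder, None)
safe_set_params_sketch(recorder, [0.1, 0.2])
safe_set_params_sketch(recorder, {"spread": 0.3})
```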
Refactor
- Use `safe_set_params()` across models.
Testing
- Add checks for midline risk. Related #80.
- (mid) Fix wrong assumption in risk test.
Full Changelog: 1.0.0...1.1.0
1.0.0
Finally 🎉
Eventually, I did manage to decide on an API that I want to stick with for the foreseeable future.
If you have used the previous version 0.4.3, then forget everything you knew about that and head over to the documentation to learn everything from scratch. The core concepts stay the same though.
Full diff since last release candidate: 1.0.0.rc2...1.0.0
Full diff since 0.4.3: 0.4.3...1.0.0
Bug Fixes
- (uni) Catch error when `apply` to empty data. Fixes #79.
For some reason, using `apply` on an empty `DataFrame` has an entirely
different return type than when it is not empty. This caused the issue
#79 and has now been fixed.
- (bi) Data reload loads wrong side.
Now the data does not get reloaded anymore, which was actually
unnecessary in the first place.
- (uni) Return correctly in `get_spread_params`.
- (mid) Consume & return params in same order.
- (uni) Allow `mapping=None` when loading data.
Testing
- (uni) Check if loading empty data works. Related #79.
- (uni) Make sure likelihood is deterministic.
Change
- ⚠ BREAKING (uni) Shorten two (unused) method names.
- ⚠ BREAKING `helpers` are now `utils`.
- (type) Add type definition for graph dict.
- (diag) Use partials to save parametric dist.
1.0.0.rc2
What's New
Implementing the lymixture brought to light a shortcoming in the way the data and diagnose matrices are computed and stored. As mentioned in issue #77, their rows are now aligned with the patient data, which may have some advantages for different use cases.
Also, since this is probably the last pre-release, I took the liberty to go over some method names once more and make them clearer.
All changes: 1.0.0.rc1...1.0.0.rc2
Bug Fixes
- Don't use fake T-stage for BN model. Related #77.
Since we now have access to the full diagnose matrix by default, there
is no need for the Bayesian network T-stage fix anymore.
- (uni) Reload data when modalities change.
Because we only store those diagnoses that are relevant to the model
under the "_model" header in the `patient_data` table, we need to reload
the patient data whenever we modify the modalities.
Documentation
- Update to slightly changed API.
- (bi) Add bilateral quickstart to docs.
Features
- (mod) Add utils to check for modality changes.
Performance
- (uni) Make data & diagnose matrices faster. Related #77.
The last change caused a dramatic slowdown (factor 500) of the data and
diagnose matrix access, because it needed to index them from a
`DataFrame`. Now, I implemented a basic caching scheme with a patient
data cache version that brought back the original speed.
Also, apparently `del dataframe[column]` is much slower than
`dataframe.drop(columns)`. I replaced the former with the latter and now
the tests are fast again.
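A caching scheme keyed on a data version, as mentioned above, can be sketched as follows. The class `CachedDataModel` and its attributes are hypothetical and plain lists stand in for the `DataFrame`. The point is only that bumping a version counter on every data change invalidates whatever was cached before.

```python
class CachedDataModel:
    """Toy model that rebuilds its data matrix only when the data changed."""

    def __init__(self):
        self._patient_data = []
        self._cache_version = 0
        self._matrix_cache = {}  # maps cache version -> data matrix
        self.num_builds = 0

    def load_patient_data(self, rows):
        self._patient_data = list(rows)
        self._cache_version += 1  # invalidates everything cached so far

    def data_matrix(self):
        if self._cache_version not in self._matrix_cache:
            self.num_builds += 1
            matrix = [[float(x) for x in row] for row in self._patient_data]
            # Keep only the entry for the current version.
            self._matrix_cache = {self._cache_version: matrix}
        return self._matrix_cache[self._cache_version]


cached_model = CachedDataModel()
cached_model.load_patient_data([[1, 0], [0, 1]])
first = cached_model.data_matrix()
second = cached_model.data_matrix()  # served from the cache
```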
Refactor
- ⚠ BREAKING Rename methods for brevity & clarity.
Method names have been changed, e.g. `comp_dist_evolution()` has been
renamed to `state_dist_evo()` which is both shorter and (imho) clearer.
- (uni) Move data/diag matrix generation.
Testing
- Update to slightly changed API.
- (uni) Check reset of data on modality change.
Added a test to make sure the patient data gets reloaded when the
modalities change. This test is still failing.
- Finally suppress all `PerformanceWarnings`.
Change
- ⚠ BREAKING Store data & diagnose matrices in data. Fixes #77.
Instead of weird, dedicated `UserDict`s, I simply use the patient data
to store the data encoding and diagnose probabilities for each patient.
This has the advantage that the entire matrix (irrespective of T-stage)
is aligned with the patients.
- ⚠ BREAKING (bi) Shorten kwargs.
The `(uni|ipsi|contra)lateral_kwargs` in the `Bilateral` constructor
were shortened by removing the "lateral".
Merge
- Branch 'main' into 'dev'.
- Branch '77-diagnose-matrices-not-aligned-with-data' into 'dev'.
Remove
- Unused helpers.
1.0.0.rc1
What's New
This release hopefully represents the last major change before releasing version 1.0.0. It was necessary because during the implementation of the midline model, managing the symmetries in a transparent and user-friendly way became impossible in the old implementation.
Now, a composite pattern is used for both the modalities and the distributions over diagnose times. This further separates the logic and will allow more hierarchical models based on the ones provided here to work seamlessly almost out of the box. This may become relevant with the mixture model.
Big thanks to @YoelPH for implementing large parts of the midline model! 👏🏻
The full diff can be found here: 1.0.0.a6...1.0.0.rc1
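As a rough illustration of the composite pattern mentioned above, the sketch below lets a parent object delegate the setting of a diagnostic modality to its children. The class `ModalityComposite`, its method signatures, and the sensitivity/specificity values are hypothetical and do not reproduce lymph's actual classes.

```python
class ModalityComposite:
    """Node in a tree of models: leafs store modalities, parents delegate."""

    def __init__(self, children=None):
        self._children = children or {}
        self._modalities = {}  # only used by leafs

    @property
    def is_leaf(self):
        return not self._children

    def set_modality(self, name, spec, sens):
        if self.is_leaf:
            self._modalities[name] = (spec, sens)
        else:
            # A parent stores nothing itself and forwards to every child.
            for child in self._children.values():
                child.set_modality(name, spec, sens)

    def get_modality(self, name):
        if self.is_leaf:
            return self._modalities[name]
        # All children were set identically, so the first one is representative.
        first_child = next(iter(self._children.values()))
        return first_child.get_modality(name)


ipsi, contra = ModalityComposite(), ModalityComposite()
bilateral = ModalityComposite({"ipsi": ipsi, "contra": contra})
bilateral.set_modality("CT", spec=0.76, sens=0.81)
```

Because the parent only delegates, the two sides cannot silently drift apart when modalities are set via the parent.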
Add
- Midline module added. This makes the code feature complete again (compared to version 0.4.3). It also implements the evolution of the midline extension as a random variable.
Bug Fixes
- (diag) Delete frozen distributions when params change.
- (diag) Correct max time & params.
The `max_time` is now correctly accessed and set. Also, the distribution
params are not used up by synched distributions, but only by the
distributions in composite leafs.
- (graph) Avoid warning for micro mod setting.
- ⚠ BREAKING Make likelihood work with emcee again.
The way the likelihood was defined, it did not actually play nicely with
how the emcee package works. This is now fixed.
- (bi) Fix uninitialized `is_symmetric` dict.
- (mid) Add missing dict in init.
- (mid) Update call to `transition_matrix()` & `state_list`.
- (mid) Finish `draw_patients` method.
Some bugs in the method for drawing synthetic patients from the
`Midline` were fixed. This seems to be working now.
Documentation
- (mid) Improve midline docstrings slightly.
- Go over `set_params()` docstrings.
- Update quickstart guide to new API.
- Adapt tests to new API (now passing).
- Update index & fix some docstrings.
- Fix some typos and cross-references.
Features
- (helper) Add `popfirst()` and `flatten()`.
Two new helper functions in relation to getting and setting params.
- (type) Add model ABC to inherit from.
I added an abstract base class from which all model-like classes should
inherit. It defines all the methods that need to be present in a model.
The idea behind this is that any subclass of this can be part of a
composite that correctly delegates getting/setting parameters,
diagnose time distributions, and modalities.
- ⚠ BREAKING (graph) Add `__hash__` to edge, node, graph.
This replaces the dedicated `parameter_hash()` method.
- (mod) Add method to delete modality `del_modality()`.
- Add more get/set params methods.
- (mid) Implement `set_params`.
- (mid) Implement the `load_patient_data` method.
- (mid) Finish midline (feature complete).
- Complete set/get methods on model classes.
The `Unilateral`, `Bilateral`, and `Midline` models now all have the eight
methods `set_tumor_spread_params`, `set_lnl_spread_params`,
`set_spread_params`, `set_params`, `get_tumor_spread_params`,
`get_lnl_spread_params`, `get_spread_params`, and `get_params`.
- (mid) Reimplement the midline evolution.
The midline evolution that Lars Widmer worked on is now reimplemented.
However, although this implementation is analogous to the one used in
previous versions of the code and should thus work, it is still untested
at this point.
- Add helper to draw diagnoses.
The new helper function `draw_diagnoses` is a re-implementation of the
`Unilateral` class's method with the same name for easier reusing.
- (mid) Allow marginalization over unknown midline extension.
This is implemented differently than before: If data with unknown
midline extension is added, it gets loaded into an attribute named
`unknown`, which is a `Bilateral` model only used to store that data and
generate diagnose matrices.
Miscellaneous Tasks
- Move timing data.
- Make changelog super detailed.
Refactor
- (mid) Split likelihood method.
Testing
- Fix long-running test.
- Add integration tests with emcee.
- Add checks for bilateral symmetries.
- (mid) Add first check of `set_params()` method.
- (mid) Check likelihood function.
Add
- Added doc strings.
Change
- Non-mixture midline implemented.
Fixed the non-mixture midline extension model and added documentation.
- ⚠ BREAKING Make `get_params()` uniform and chainable.
The API of all `get_params()` methods is now nice and uniform, allowing
arbitrary chaining of these methods.
- ⚠ BREAKING Make `set_params()` uniform and chainable.
The API of all `set_params()` methods is now nice and uniform,
allowing arbitrary chaining of these methods.
- ⚠ BREAKING Make `set_params()` not return kwargs.
It does make sense to "use up" the positional arguments one by one in
the `set_params()` methods, but doing the same thing with keyword
arguments is pointless, difficult and error prone.
- ⚠ BREAKING (graph) Replace `name` with `get_name()`.
In the `Edge` class, the `name` property is replaced by a function
`get_name()` that is more flexible and allows us to have edge names
without underscores when we need it.
- ⚠ BREAKING (bi) Reintroduce `is_symmetric` attribute.
This will once again manage the symmetry of the `Bilateral` class's
different ipsi- and contralateral attributes.
- ⚠ BREAKING (diag) Use composite for distributions.
Instead of a dict that holds the T-stages and corresponding
distributions over diagnose times, this implements them as a composite
pattern. This replaces the dict-like API entirely with methods. This has
several advantages:
  - It is more explicit and thus more readable
  - The composite pattern is designed to work naturally with tree-like
    structures, which we have here when dealing with bilateral models.
- ⚠ BREAKING (mod) Use composite for modalities.
Instead of a dict that holds the names and corresponding
sens/spec for diagnostic modalities, this implements them as a composite
pattern. This replaces the dict-like API entirely with methods. This has
several advantages:
  - It is more explicit and thus more readable
  - The composite pattern is designed to work naturally with tree-like
    structures, which we have here when dealing with bilateral models.
- ⚠ BREAKING (uni) Transform to composite pattern.
Use the new composite pattern for the distribution over diagnose times
and modalities.
- (bi) Update for new composite API.
- ⚠ BREAKING (mod) Shorten to sens/spec.
Also, add a `clear_modalities()` and a `clear_distributions()` method to
the respective composites.
- (matrix) Use hashables over arg0 cache.
Instead of using this weird `arg0_cache` for the observation and
transition matrix, I use the necessary arguments only, which are all
hashable now.
- ⚠ BREAKING Adapt risk to likelihood call signature.
- (type) Add risk to abstract methods.
- (type) Abstract methods raise error.
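The idea of chainable `set_params()` methods that "use up" positional arguments but leave keyword arguments alone can be sketched like this. The classes `Leaf` and `Composite` and the parameter names are hypothetical. The sketch only mirrors the behaviour described in the list above and is not lymph's code.

```python
class Leaf:
    """Holds named parameters and consumes one positional value per name."""

    def __init__(self, names):
        self.params = dict.fromkeys(names, 0.0)

    def set_params(self, *args, **kwargs):
        args = list(args)
        for name in self.params:
            if args:
                self.params[name] = args.pop(0)  # consume one positional value
            if name in kwargs:
                self.params[name] = kwargs[name]  # keywords are read, never consumed
        return tuple(args)  # leftovers go to the next set_params() in the chain


class Composite:
    """Passes the remaining positional arguments from child to child."""

    def __init__(self, **children):
        self.children = children

    def set_params(self, *args, **kwargs):
        for child in self.children.values():
            args = child.set_params(*args, **kwargs)
        return args


chain_model = Composite(ipsi=Leaf(["a", "b"]), contra=Leaf(["c"]))
leftover = chain_model.set_params(0.1, 0.2, 0.3, 0.4)
```

Only the unused positional values are returned, so a caller can feed them into the next object without having to track which keyword arguments were already applied.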
Merge
- Branch 'yoel-dev' into 'dev'.
- Branch '74-synchronization-is-unreadable-and-error-prone' into 'dev'. Fixes #74.
- Branch 'main' into 'dev'.
- Branch 'add-midext-evolution' into 'dev'.
Remove
- Unused helper functions.
1.0.0.a6
What's New
With this (still alpha) release, we most notably fixed a long unnoticed bug in the computation of the Bayesian network likelihood.
Bug Fixes
- (uni) Leftover `kwargs` now correctly returned in `assign_params()`
- ⚠ BREAKING (uni) Remove `is_<x>_shared` entirely, as it was unused anyways. Fixes #72.
- T-stage mapping may be dictionary or callable
- (uni) Raise exception when there are no tumors or LNLs in graph
Documentation
- Fix typo in modalities
Testing
- (uni) Check constructor raises exceptions
- Check the Bayesian network likelihood
Change
- (uni) Trinary params are shared by default
- (uni) Prohibit setting `max_time`
- ⚠ BREAKING Change `likelihood()` API: We don't allow setting the data via the `likelihood()` method anymore. It convoluted the method and setting it beforehand is more explicit anyways.
1.0.0.a5
What's New
In this alpha release we fixed more bugs and issues that emerged during more rigorous testing.
Most notably, we backed away from storing the transition matrix in a model's instance, because it created opaque and confusing calls to functions trying to delete it when parameters were updated.
Instead, the function computing the transition matrix is now globally cached using a hash function from the graph representation. This has the drawback of slightly more computation time when calculating the hash. But the advantage is that e.g. in a bilateral symmetric model, the transition matrix of the two sides is only ever computed once when (synched) parameters are updated.
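The global caching described above can be sketched with `functools.lru_cache` and a hashable graph stand-in. `ToyGraph`, its single spread probability, and the 2x2 matrix are assumptions for illustration. The real graph representation and its hash are more involved.

```python
from functools import lru_cache


class ToyGraph:
    """Hashable stand-in for a graph; the hash depends only on its parameters."""

    def __init__(self, spread_probs):
        self.spread_probs = tuple(spread_probs)

    def __hash__(self):
        return hash(self.spread_probs)

    def __eq__(self, other):
        return isinstance(other, ToyGraph) and self.spread_probs == other.spread_probs


@lru_cache(maxsize=128)
def transition_matrix(graph):
    """Globally cached computation, keyed on the graph's hash and equality."""
    p = graph.spread_probs[0]
    return ((1.0 - p, p), (0.0, 1.0))


ipsi_graph = ToyGraph([0.3])
contra_graph = ToyGraph([0.3])  # synched parameters, so equal hash
ipsi_matrix = transition_matrix(ipsi_graph)
contra_matrix = transition_matrix(contra_graph)  # cache hit, not recomputed
```

Changing a parameter yields a different hash, so no explicit deletion of stale matrices is needed. The price is computing the hash on every call.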
Bug Fixes
- (graph) Assume `nodes` is dictionary, not a list. Fixes #64
- (uni) Update `draw_patients()` method to output LyProX style data. Fixes #65
- (bi) Update bilateral data generation method to also generate LyProX style data. Fixes #65
- (bi) Syntax error in `init_synchronization`. Fixes #69
- (uni) Remove need for transition matrix deletion via a global cache. Fixes #68
- (uni) Use cached matrices & simplify stuff. Fixes #68
- (uni) Observation matrix only property, not cached anymore
Documentation
- Fix typos & formatting errors in docstrings
Features
- (graph) Implement graph hash for global cache of transition matrix
- (helper) Add an `arg0` cache decorator that caches based on the first argument only
- (matrix) Use cache for observation & diagnose matrices. Fixes #68
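A decorator that caches on the first argument only could look roughly like the sketch below. The name `arg0_cache_sketch` and the decorated function are hypothetical. This shows the general technique, not the helper shipped in the package.

```python
from functools import wraps


def arg0_cache_sketch(func):
    """Cache results using only the first positional argument as the key."""
    cache = {}

    @wraps(func)
    def wrapper(arg0, *args, **kwargs):
        if arg0 not in cache:
            cache[arg0] = func(arg0, *args, **kwargs)
        return cache[arg0]

    wrapper.cache = cache  # exposed so callers can inspect or clear it
    return wrapper


@arg0_cache_sketch
def observation_matrix(key, modalities):
    # `modalities` is a dict and thus unhashable, so it cannot be part of the key.
    return [[spec, 1 - sens] for spec, sens in modalities.values()]


first_obs = observation_matrix("v1", {"CT": (0.76, 0.81)})
second_obs = observation_matrix("v1", {"CT": (0.5, 0.5)})  # same key: cached result
```

The second call returns the cached matrix even though the modalities differ, which shows the trade-off of this approach: the caller must change the first argument whenever any other input changes.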
Miscellaneous Tasks
- Update dependencies & classifiers
Refactor
- Variables inside `generate_transition()`
Testing
- Make doctests discoverable by unittest
- Update tests to changed API
- (uni) Assert format & distribution of drawn patients
- (uni) Allow larger delta for synthetic data distribution
- (bi) Check bilateral data generation method
- Check the bilateral model with symmetric tumor spread
- Make sure delete & recompute synced edges' tensor work
- Adapt tests to changed `Edge` API
- (bi) Evaluate transition matrix recomputation
- Update tests to match new transition matrix code
- Update trinary unilateral tests
Change
- ⚠ BREAKING Compute transition tensor globally. Fixes #69
- ⚠ BREAKING Make transition matrix a method instead of a property. Fixes #40
- ⚠ BREAKING Make observation matrix a method instead of a property. Fixes #40
Ci
- Add coverage test dependency back into project
Remove
- Unused files and directories