Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow multiple snp_transcripts in plot_diplotype_clustering_advanced() #703

Conversation

jonbrenas
Copy link
Collaborator

@jonbrenas jonbrenas commented Dec 12, 2024

Resolves #600.

@KellyLBennett and Nana Amoako needed this functionality earlier today so I sped it up a bit. There are currently no tests for plot_diplotype_clustering_advanced() so the best I can say is that it worked in the notebook. I will add tests when I have time.

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@jonbrenas jonbrenas self-assigned this Dec 12, 2024
@jonbrenas jonbrenas marked this pull request as draft December 12, 2024 15:48
@jonbrenas jonbrenas marked this pull request as ready for review December 13, 2024 13:02
Copy link
Member

@alimanfoo alimanfoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much @jonbrenas. Couple of small comments. Could you also post a screenshot of this working?

malariagen_data/anoph/dipclust.py Outdated Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not clear why there are any changes to this file included in this PR?

Copy link
Collaborator Author

@jonbrenas jonbrenas Dec 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AnophelesDipClustAnalysis needs to inherit gene_cnv from AnophelesCnvFrequencyAnalysis (see comment below) which created a loop in the inheritance tree. Because AnophelesDipClustAnalysis is a subclass of AnophelesCnvFrequencyAnalysis and AnophelesSnpFrequencyAnalysis, when AnophelesDataResource inherits AnophelesDipClustAnalysis, it also inherits them so they don't need to be included in the list of inherited classes anymore.

malariagen_data/anoph/dipclust.py Show resolved Hide resolved
malariagen_data/anoph/dipclust.py Outdated Show resolved Hide resolved
@jonbrenas
Copy link
Collaborator Author

Here is a picture (the chosen genes were Cyp6aa1 and Cyp6p15p):

Screenshot 2024-12-13 at 18 02 24

We might want to add titles to make things clearer.

…_clustering' of github.com:malariagen/malariagen-data-python into 600-adding-option-for-multiple-transcripts-to-diplotype_clustering
@jonbrenas jonbrenas marked this pull request as draft December 13, 2024 19:05
@jonbrenas jonbrenas marked this pull request as ready for review December 13, 2024 19:30
@sanjaynagi
Copy link
Collaborator

awesome.

@leehart leehart requested review from leehart and alimanfoo and removed request for alimanfoo January 24, 2025 11:25
@leehart
Copy link
Collaborator

leehart commented Jan 31, 2025

Thanks @jonbrenas . There seems to be some general disagreement around how to approach the issue of changing parameters and causing breaking changes. I'll try to formulate a diplomatic strategy in order to resolve my PR #691 , which might hopefully shed some light on agreeable way forward and also apply to this case, as a package-wide policy.

@leehart
Copy link
Collaborator

leehart commented Jan 31, 2025

@jonbrenas @alimanfoo I'm coming around to the idea that we should, as many other packages do, maintain support for deprecated parameters for a limited time (although I'm not sure how we'll decide when to drop support) but issue a DeprecationWarning preparing them for the upgrade path.

This introduces the problem of kicking the can down the road, in terms of when to eventually drop support completely, but alleviates the problem of users' code suddenly not working without any warning or clue how to fix it, assuming they haven't read the upgrade notes or latest docs.

This strategy would particularly apply to situations like this were we naturally want to change the name and nature of a parameter in order to suit some added or extended functionality. It would allow us the freedom to develop the user-facing interface (the public function signatures) towards clarity and semantic consistency without risking too much friction, allowing smoother and friendlier transitions towards API improvements in functionality without risking clarity.

But I'm still not sure what the policy should be for how long we allow deprecated parameters to linger around and clutter up the interface and codebase. I suppose we could either have a time-based approach, where we completely drop features if they have been deprecated in this manner for over, say, 6 months. Or we could have a release-based approach, where we drop deprecated features after having supported them for, say, 2 unrelated version increments. I guess this creates a sort of tsunami of delayed feature drops and a big cull whenever we bite the bullet and actually cut a major release. We could inform users ahead of time when the feature would be dropped, e.g. in the deprecation warning and docs. Another approach might be to somehow monitor the usage of the old parameter, and finally drop it when usage falls below a certain percentage, in favour of the new parameter... but that sounds hard. I suppose we could also decide on some kind of mixed policy, e.g. whichever comes first: a new major release or 6 months of deprecation.

Putting those details aside, I can certainly see how this new approach should (if adopted) reduce the number of major releases that would otherwise be made under a policy of semantic versioning and fearless development, e.g. whenever we slightly tweak the name of one parameter, as well as being generally more user-friendly.

@sanjaynagi
Copy link
Collaborator

sanjaynagi commented Feb 5, 2025 via email

@sanjaynagi
Copy link
Collaborator

sanjaynagi commented Feb 5, 2025 via email

Copy link
Collaborator

@leehart leehart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jonbrenas . I reckon this is good to go, unless there's anything more you think needs doing here.

@leehart
Copy link
Collaborator

leehart commented Feb 5, 2025

Merging soon unless objections.

@leehart leehart merged commit 0197957 into master Feb 6, 2025
10 checks passed
@leehart leehart deleted the 600-adding-option-for-multiple-transcripts-to-diplotype_clustering branch February 6, 2025 09:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow multiple snp_transcripts in plot_diplotype_clustering_advanced()
4 participants