How configurable should the culler be? #9

rkdarst · 2020-07-15T08:57:55Z

Now that the culler is installable, a question is: how should the culler be customized? Previously, it was via making a copy and modifying it. Now, there is one blessed, easy to install version - what happens to others?

In one way, I am very happy it's a separate repo now, so that it's straightforward (possibly not easy, though) to maintain a long-running fork. On the other hand, it means now I need to actually think about what's the best way, and coordinate...

Alternative options

Based on really quick thinking:

Do nothing, blessed version is installable, other versions installed manually via pip install https://... from own github fork in special branch.
Encourage different custom situations to be submitted upstream. But, then we need to decide what is worth accepting, and then we get into a huge mess where we decide an API without much planning (e.g. in my case, I use a culltime element in the spawner state to decide when it can be culled.)
Encourage/support people to still copy/(rename) the script file when needed. Naming the script init.py doesn't encourage this much, but this does have a benefit that it's clear that it's a different thing. I can imagine that (1) will get confusing when the same module intentionally has different functions on different people's systems, but we can just accept that.

Who would use this feature?

I have two hubs and they both have customized cullers. I've seen enough other issues about the culler that I imagine that others customize it, too.

(Optional): Suggest a solution

At least document how we suggest local customizations
Think about what kind of customizations we would accept. e.g. new command line options=OK, things that dig into spawner state to make culling choices=not maintainable.
I would feel happy renaming init.py to back to cull_idle_servers.py and importing the minimum in init, but I don't think that can be justified (to make cases where this is copied manually elsewhere more obvious).

I hope to make a PR soon that will demonstrate something we probably don't take.

The text was updated successfully, but these errors were encountered:

welcome · 2020-07-15T08:57:56Z

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other other community members to contribute more effectively.

You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

manics · 2020-07-15T09:24:57Z

This sounds similar to the discussions in #8 and #4 which were leaning towards some sort of plugin system, where jupyterhub-idle-culler almost becomes a Jupyter cronjob manager, though more structured than juts running scripts.

rkdarst · 2020-07-15T18:39:44Z

To me, a plugin system seemed like a bit overkill - I see JupyterHub as the "cron manager" that runs the scripts. But it is a good point that there is a lot of duplication to do a little bit more. I'll comment there, but I think the question of what to do in the core would still be relevant. but we'll see...

1kastner · 2020-07-16T08:48:52Z

In #8 we first discussed the option of a plugin system but I am definitely open to other versions as well. I prefer a system where it is easy to find reference implementations. Hence I support alternative 2 to have some versions upstream or maybe another alternative and in addition a collection of the most noteworthy forks somewhere, maybe in the readme?

My reasoning is this: Whoever is new to Jupyter in general and JupyterHub in specific might feel overwhelmed when they need to gather all information by browsing random github repositories (I mean probably interesting forks). I prefer that the jupyterhub-idle-culler project (in some abstract sense) offers some guidance for those new users. This reduces the re-invention of the wheel over and over again. I understand that some scripts might be very instance-specific and at the same time I like to share my results with fellow Jovians in an efficient way and adapt solutions from others as easily as possible.

Openly speaking I do not have deep knowledge about with which programming patterns this aim is achieved in the most time-efficient way. For me people (i.e. communication and sharing existing solution) come before the beauty of code. To avoid confusion about what the culler service does, maybe logging can help? Still we need to be careful though.

1kastner · 2020-08-23T08:22:56Z

I am curious about your opinions as well, @manics and @rkdarst !

manics · 2020-08-28T12:12:29Z

I think we can separate the discussion around avoiding user confusion from the decision on implementation.

For example, if we go down the route of separate plugins/extensions in their own repositories (and separate pypi packages) then jupyterhub-idle-culler or some other meta-package could bundle together all the recommended ones using install_requires.

Other factors to consider when deciding whether to have multiple implementations in this repo vs separate ones is ease of testing and maintenance. Regardless of which option we choose as @rkdarst mentioned we still need to decide on an API.

1kastner · 2020-08-29T12:40:23Z

That seems to be a nice idea! In another context @yuvipanda suggested a pluggy-based plugin system. Whenever we need a plugin hook in the lengthy cull_idle function (it spans from row 102 to 308) we could insert a pluggy hook which by default is a function without any effect. The places of where to insert them we could decide based on the current two ideas #4 and #8. This would leave us with the question of which parameters will be handed it since the API should be more or less stable over time and that restricts the freedom of development inside this package. But I can't predict the future so I guess if the API is a bit iterative in the beginning that would be totally fine for me. I guess we need to learn from the use-cases what is really needed.

rcthomas · 2020-09-28T18:53:04Z

I was just taking a look here at how I could slot in custom cull logic and noticed that unlike a bunch of other JupyterHub stuff this isn't using traitlets. Any reason we don't just convert to that and make the cull logic a configurable Callable?

1kastner · 2020-09-29T11:19:43Z

@rcthomas none that I am aware of.

manics · 2020-09-29T11:29:49Z

I don't know how much work will be involved in converting it to use traitlets and a configuration file, but it sounds like it should be easier than implementing a plugin system.

meeseeksmachine · 2020-10-20T08:54:42Z

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/a-cull-idle-user-service-that-deletes-pvs/4742/16

consideRatio · 2021-04-17T19:54:20Z

I opened #25 to discuss adding some general purpose hooks. JupyterHub's Spawner and Authenticator class have various hooks for example that can be customized, and I think those serve a very useful hatch for users without making maintenance of the project problematic.

If someone wants to email someone as part of culling the user i think a hook is the right way to help them accomplish it.
If someone wants to cull users based on custom logic, then i think a hook returning true/false given input of server activity, username, etc would make sense.

I opened the issue as I felt #25 was as far as I think we should go with configuration flags etc, and that the rest should be left for custom logic implemented in some way from the end user.

rkdarst added the enhancement New feature or request label Jul 15, 2020

manics mentioned this issue Jul 16, 2020

Adding the option to delete persisted user data as well. #8

Open

manics mentioned this issue Oct 6, 2020

Deleting named servers should delete the corresponding PVC jupyterhub/kubespawner#446

Closed

consideRatio mentioned this issue Apr 17, 2021

Support custom culling logic through hooks (like Spawner's pre_spawn_hook etc) #25

Open

manics mentioned this issue Dec 14, 2021

Integrate jupyterhub-idle-culler with KubeIngressProxy #42

Open

manics mentioned this issue Feb 20, 2023

Consider switching to argparse for nicer command line handling and help #44

Open

manics mentioned this issue Nov 20, 2024

Replace tornado options with traitlets #83

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How configurable should the culler be? #9

How configurable should the culler be? #9

rkdarst commented Jul 15, 2020

welcome bot commented Jul 15, 2020

manics commented Jul 15, 2020 •

edited

Loading

rkdarst commented Jul 15, 2020

1kastner commented Jul 16, 2020

1kastner commented Aug 23, 2020

manics commented Aug 28, 2020

1kastner commented Aug 29, 2020

rcthomas commented Sep 28, 2020

1kastner commented Sep 29, 2020

manics commented Sep 29, 2020

meeseeksmachine commented Oct 20, 2020

consideRatio commented Apr 17, 2021

How configurable should the culler be? #9

How configurable should the culler be? #9

Comments

rkdarst commented Jul 15, 2020

Alternative options

Who would use this feature?

(Optional): Suggest a solution

welcome bot commented Jul 15, 2020

manics commented Jul 15, 2020 • edited Loading

rkdarst commented Jul 15, 2020

1kastner commented Jul 16, 2020

1kastner commented Aug 23, 2020

manics commented Aug 28, 2020

1kastner commented Aug 29, 2020

rcthomas commented Sep 28, 2020

1kastner commented Sep 29, 2020

manics commented Sep 29, 2020

meeseeksmachine commented Oct 20, 2020

consideRatio commented Apr 17, 2021

manics commented Jul 15, 2020 •

edited

Loading