Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow scripted modules to declare and lazily install pip requirements #7707

Open
allemangD opened this issue Apr 22, 2024 · 1 comment
Open
Labels
make-simple-things-simple "Make simple things simple and complex things possible" - to improve Slicer usability

Comments

@allemangD
Copy link
Contributor

allemangD commented Apr 22, 2024

Continuation of #7697; tightly coupled to #7171 and #6913.

After discussion and feedback at the weekly hangout Apr 16 (see #7697 (comment) and #7697 (comment)) I think my first issue conflated three separate tasks:

  • Declare and install dependencies
  • Run modules in an isolated subprocess (CLI modules)
  • Run modules in an isolated environment (venv, conda, poetry, etc)

I'm interested in tackling only the first of these, and since CLI have questionable value anyway it doesn't make sense to require CLI usage to get the benefit here. Really, what we want is a way for scripted module's logic to depend on pypi packages, and only install those packages when they are needed. Doing this purely in the Python layer allows all scripted modules to benefit.

Draft implementation and features

I have a Python 3.9 draft implementation of a LazyImportGroup context manager, based on the LazyLoader and lazy_import recipe in the standard library importlib documentation.

https://github.com/allemangD/lazyloader - see the doctests for LazyImportGroup for more details.

The important principles here are:

  • Consolidate. Keep all import statements at top-level, all dependencies in requirements.txt.
  • Lazy. Do not install or import expensive libraries until they are used.
  • Reliable. Do not import libraries unless they are guaranteed to be available in a user's environment.

These were not mentioned during the discussion last week, but I think they are important:

  • Support static analysis.
  • Be explicit about when "magic" occurs.

Usage requires the extension developer place a requirements.txt in a python package, installed via RESOURCES argument of the existing CMake macros. The anchor and path for that requirements.txt resource are given as an argument to LazyImportGroup context manager, wrapping top-level imports.

with LazyImportGroup('lib:requirements.txt'):
  import lib.logic
  import itk

These imports are real import statements that are detected by static analysis, and run-time magic only occurs with imports within the context manager. These imports produce proxy modules that defer install and real import until first usage:

def onApplyButton(self):
  lib.logic.do_the_task(...)
  #        ^ attribute access triggers install and import

def onComputeButton(self):
  itk.Fpfh....
  #  ^ attribute access triggers install and import

Modules imported within a LazyImportGroup can safely have bare top-level imports of their dependencies. This is probably recommended usage for this tool:

qt-scripted-modules
|- libCompute
|   |- __init__.py
|   |- requirements.txt
|   \- logic.py
\- Compute.py

Above, __init__.py and logic.py can safely include their dependencies at top level, as long as Compute.py imports them in a lazy group:

# requirements.txt
A
B
# train.py
import A
# __init__.py
import B
from .logic import ...
# Compute.py
with LazyImportGroup('libCompute:requirements.txt'):
  import libCompute
  import libCompute.logic

Only the first attribute access of a module in a group will trigger the installation for all imports in that group. There may be multiple groups with different requirements to denote multiple feature sets, and only the used features will be installed. Each group is installed at most a single time.

with LazyImportGroup('lib:training-requirements.txt'):
   import X
   import Y

with LazyImportGroup('libinference:requirements.txt'):
   import Z

The first attribute access of X or Y will install training-requirements.txt, and the first attribute access of Z will install inference-requirements.txt.

A fast pip install --no-deps --dry-run --report check is first used to ensure all the items in requirements.txt are satisfied. If so, no action is taken, the module is imported, and the proxy module is updated. If not, the requirements are first installed with pip install -r.

The dry-run report is used to produce a summary of packages to be installed. The Slicer implementation should show this to the user with slicer.util.confirmOkCancelDisplay.

Not implemented in the draft, but the report and install steps could accept a -c constraints.txt argument to pin Slicer libraries like numpy, SimpleITK, etc. Also the requirements of other groups could be inspected to detect incompatibilities with messages like: "Module X wants to install package Y which is incompatible with module Z. Proceed?"


I've tested these proxy objects on plain python packages, loadable packages, and even itk's complex lazy-loaded metapackage. It works on all of these, although there are some caveats on object identity and safe import order. See the doctests for details.

Outstanding questions

  • Where to place this? Suggest slicer.LazyImportGroup.
  • Bikeshed the name? Signature? It also does pip installations.
  • Best practices on confirmation dialog?
  • Should constraints files or compatibility checks between modules be part of this initial feature? Or should they be delayed till we see usage in practice.

From prior discussions: cc @jcfr @pieper @lassoan @ebrahimebrahim

@allemangD allemangD added the make-simple-things-simple "Make simple things simple and complex things possible" - to improve Slicer usability label Apr 22, 2024
@allemangD
Copy link
Contributor Author

allemangD commented Apr 23, 2024

Feedback and planning following the Hangout today Apr 23.

General feedback

  • really need constraints cross-checking
    • "module X is breaking the environment for module Y"
  • need to know when a package is downgraded / uninstalled
    • log this in slicer.util.pip_install so that extension developers see "module X broke everything" in the logs
    • constraints cross-checking in slicer.util.pip_install
  • option to allow companion file ${module}-requirements.txt sibling to ${module}.py. guarantees unique requirements.txt
    • Incremental step for eg. SlicerMorph with many modules. Not tenable (yet) to move logic into packages.
  • make requirements optional to trigger lazy loading without pip installations

Docs feedback

  • migration guide
    • break logic to new file
  • new extension best practices
  • migration from lazy-loaded logic to pip-installable logic.
  • update extension template
  • debugging guide.
    • halted import means the package may not be available to the user at this point. use the lazy import, or move the import to a guarded point
    • look for breaking environment messages in logs
    • brief summary of the important points on lazy import group. refer to tests as docs for nuances

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
make-simple-things-simple "Make simple things simple and complex things possible" - to improve Slicer usability
Development

No branches or pull requests

1 participant