document POPROX pipeline deviation
mdekstrand committed Jan 11, 2025
1 parent 275b43b commit b287d65
Showing 2 changed files with 40 additions and 0 deletions.
38 changes: 38 additions & 0 deletions docs/guide/pipeline.rst
@@ -441,3 +441,41 @@ Finally, you can directly pass configuration parameters to the component constru
}>

See :ref:`conventions` for more conventions for component design.

POPROX and Other Integrators
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

One of LensKit's :ref:`design principles <principles>` is “use the pieces you
want”. That extends to the pipeline code — while the pipeline components
included with LensKit use LensKit's data structures like
:class:`~lenskit.data.ItemList` and :class:`~lenskit.data.RecQuery`, the
pipeline itself is fully generic. Components can accept and return any types,
and the pipeline code makes no assumptions about the kinds of data routed
through the pipeline, the structure of the pipeline, or the presence or absence
of any particular components. The only aspects of component interface or
behavior defined by the pipeline are that:

- Pipeline components are callable objects that accept their inputs as keyword
  parameters.
- Configurable components extend the :class:`Component` interface and use
  Pydantic models to house their configurable options (following that
  interface's requirements, such as defining a ``config`` attribute to store
  the configuration).
- Components can be constructed with either zero arguments or a single
  configuration model argument.
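
As a rough illustration of these conventions, the sketch below defines a
hypothetical ``TopN`` component. It deliberately avoids importing LensKit or
Pydantic so that it runs on its own: ``TopNConfig`` is a standard-library
dataclass standing in for a Pydantic model, and ``TopN`` does not extend
:class:`Component`. Treat it as an outline of the shape of a component under
those assumptions, not as real LensKit code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TopNConfig:
    # Stand-in for a Pydantic model (assumption); ``n`` is a made-up option.
    n: int = 3


class TopN:
    """Hypothetical component that keeps the first ``n`` entries of a list."""

    config: TopNConfig

    def __init__(self, config: Optional[TopNConfig] = None):
        # Constructible with zero arguments or a single configuration object.
        self.config = config if config is not None else TopNConfig()

    def __call__(self, *, items: List[str]) -> List[str]:
        # Inputs arrive as keyword parameters; the data types are our own
        # choice (a plain list of strings here, no LensKit data structures).
        return items[: self.config.n]


default = TopN()
short = TopN(TopNConfig(n=1))
print(default(items=["a", "b", "c", "d"]))  # ['a', 'b', 'c']
print(short(items=["a", "b", "c", "d"]))    # ['a']
```

Nothing in ``TopN`` refers to :class:`~lenskit.data.ItemList` or any other
LensKit data structure, which is the point: the calling convention and the
configuration convention are all that a component of this kind has to follow.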

The exception to this is training support: :meth:`Pipeline.train` takes a
LensKit dataset and trains components implementing the
:class:`~lenskit.training.Trainable` protocol. But it is entirely possible to
handle model training outside of the pipeline and ignore the LensKit ``train``
method. You can also use the method with a different input data object; this
will fail static typechecking, but :meth:`Pipeline.train` doesn't actually care
what the type of its first argument is, and will pass it as-is to the component
``train()`` methods.
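
The pass-through behavior described above can be pictured with a small
stand-alone sketch. This is not LensKit's implementation: ``SupportsTrain``,
``WordCounter``, ``upper``, and ``train_all`` are hypothetical names invented
for illustration, and the sketch only assumes that training means "call
``train(data)`` on each component that has such a method, without inspecting
``data``".

```python
from typing import Any, Dict, List, Protocol, runtime_checkable


@runtime_checkable
class SupportsTrain(Protocol):
    # Hypothetical stand-in for a trainable-component protocol.
    def train(self, data: Any) -> None: ...


class WordCounter:
    """Hypothetical component trained on a plain list of strings."""

    def __init__(self) -> None:
        self.counts: Dict[str, int] = {}

    def train(self, data: Any) -> None:
        # The component decides what ``data`` must look like.
        for word in data:
            self.counts[word] = self.counts.get(word, 0) + 1

    def __call__(self, *, word: str) -> int:
        return self.counts.get(word, 0)


def upper(*, text: str) -> str:
    # A component with no ``train`` method; training skips it.
    return text.upper()


def train_all(components: List[Any], data: Any) -> None:
    # Forward ``data`` unchanged to every component that can be trained.
    for comp in components:
        if isinstance(comp, SupportsTrain):
            comp.train(data)


counter = WordCounter()
train_all([counter, upper], ["news", "sports", "news"])
print(counter(word="news"))  # 2
```

Because the training data is only ever handed to the components, a static type
checker is the only party that would object to passing something other than a
LensKit dataset.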

One example of an integrator that uses the pipeline without the rest of
LensKit's data structures is POPROX_: the POPROX recommender design uses its own
data structures, like a Pydantic-backed ``ArticleSet``, instead of
:class:`~lenskit.data.ItemList` and friends, and expects components to be
pre-trained by other code. It still uses the LensKit pipeline to wire these
components together.

.. _POPROX: https://docs.poprox.ai/reference/recommender/pipeline.html
2 changes: 2 additions & 0 deletions docs/guide/principles.rst
@@ -1,3 +1,5 @@
.. _principles:

Design Goals and Principles
===========================
