An experiment in generating an empty list default for multivalued slots in a LinkML model.
It is difficult or impossible in LinkML to specify that you want an empty list as the default value for a slot that is both required and multivalued. This repository contains some materials to help demonstrate this, as well as a workaround.
For more context, see the following issues:
- If you are using Nix flakes, run `nix develop` to install dependencies. Otherwise, ensure you have Python 3.10 installed.
- Create a virtual environment with `python -m venv venv` and activate it with `. ./venv/bin/activate`.
- Install the Python dependencies with `pip install -r requirements.txt`.
- Generate a Pydantic model by running `gen-pydantic personinfo_busted.yaml > personinfo_busted.py`. Note that setting both `multivalued=true` and `required=true` causes the type of the `aliases3` field to be correctly inferred as `List[str]` (rather than `Optional[List[str]]`, as for `aliases2`, which is not required), but that it is not possible to set a default value of `[]` directly: the Pydantic generator has no way to encode list values in the `ifabsent` attribute, and attempts to do so generate the string `"[]"`.
- Generate another Pydantic model by running `gen-pydantic personinfo_workaround.yaml > personinfo_workaround.py`. Note that this model sets a globally unique but incorrectly typed default on `aliases3`.
- Run `gen-pydantic personinfo_workaround.yaml | sed 's/"aliases3dummy"/[]/' > personinfo_workaround.py`. Note that the definition of `aliases3` now correctly has an empty list as its default value.
- Fire up Python and run this script:

      from personinfo_workaround import Person
      p = Person(name2='Ada Byron', aliases2=['Lady Byron'])
      p

  Note that `p` contains the expected values for all fields, including a default `[]` for `aliases3`.
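Where `sed` is not available, the same substitution could be done in Python. The following is a minimal sketch, not part of this repository: the function name `patch_default` is hypothetical, and the sample line only stands in for the generator's output, whose exact shape may differ.

```python
# Quoted placeholder string, matching the pattern in the sed command above.
PLACEHOLDER = '"aliases3dummy"'


def patch_default(source: str, placeholder: str = PLACEHOLDER) -> str:
    """Replace the first occurrence of the placeholder on each line with [].

    This mirrors sed's s/.../.../ without the g flag, which substitutes
    only the first match per line.
    """
    lines = source.split("\n")
    return "\n".join(line.replace(placeholder, "[]", 1) for line in lines)


# Hypothetical stand-in for one line of generated code (an assumption,
# not copied from real gen-pydantic output).
sample = 'aliases3: List[str] = Field("aliases3dummy")'
print(patch_default(sample))  # prints: aliases3: List[str] = Field([])
```

To patch a whole file, read the generated module as text, pass it through `patch_default`, and write the result back.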
The need to post-process like this is unfortunate, but it seems to be necessary because of unresolved issues with defaults for multivalued slots in LinkML ([1], [2]).
It is possible to generate correct Pydantic models from LinkML schemata featuring required, multivalued attributes with a list-valued default, but it currently requires a transformation of the generated code.
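Because the transformation is a plain text substitution, it may be worth checking that it actually took effect. The sketch below uses the standard `ast` module to check whether a field's default is an empty list literal, either bare or as the first argument of a call such as `Field([])`. The helper name `default_is_empty_list` and both sample sources are assumptions made for illustration; real generated code may be shaped differently.

```python
import ast


def default_is_empty_list(source: str, field: str) -> bool:
    """Return True if `field` has an annotated default that is an empty list.

    Accepts either a bare `[]` or a call whose first positional argument
    is `[]`, for example `Field([])`.
    """
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.AnnAssign)
            and isinstance(node.target, ast.Name)
            and node.target.id == field
            and node.value is not None
        ):
            value = node.value
            # Unwrap a call such as Field([], ...) to its first argument.
            if isinstance(value, ast.Call) and value.args:
                value = value.args[0]
            return isinstance(value, ast.List) and not value.elts
    return False


# Hypothetical stand-ins for the generated class before and after patching.
unpatched = 'class Person:\n    aliases3: List[str] = Field("aliases3dummy")\n'
patched = "class Person:\n    aliases3: List[str] = Field([])\n"

print(default_is_empty_list(unpatched, "aliases3"))  # prints: False
print(default_is_empty_list(patched, "aliases3"))    # prints: True
```

The source is only parsed, never executed, so the check does not need Pydantic or the generated module's imports to be installed.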