Often fields in a task can depend on other values in the task. For example, a
task's max-runtime
may depend on the platform
. To handle this, you
could re-define max-runtime
in each task's definition like so:
tasks:
taskA:
platform: android
worker:
max-runtime: 7200
taskB:
platform: ios
worker:
max-runtime: 7200
taskC:
platform: windows
worker:
max-runtime: 3600
taskD:
platform: mac
worker:
max-runtime: 1800
...
This is simple, but if you have lots of tasks it's also tedious and makes updating the configuration a pain. To avoid this duplication you could use a :doc:`transform </concepts/transforms>`:
@transforms.add
def set_max_runtime(config, tasks):
for task in tasks:
if task["platform"] in ("android", "ios"):
task["worker"]["max-runtime"] = 7200
elif task["platform"] == "windows":
task["worker"]["max-runtime"] = 3600
else:
task["worker"]["max-runtime"] = 1800
yield task
This works but now we've hardcoded constants into our code logic far away from the task's original definition! Besides this is pretty verbose and it can get complicated if you want to be able to change these constants per task.
Another way to accomplish the same thing is to use Taskgraph's "keyed by"
feature. This can be used in combination with the task-defaults
key to
express the same logic directly in the kind.yml
file:
task-defaults:
worker:
max-runtime:
by-platform:
(ios|android): 7200
windows: 3600
default: 1800
tasks:
taskA:
platform: android
taskB:
platform: windows
taskC:
platform: mac
...
The structure under the by-platform
key is resolved to a single value using
the :func:`~taskgraph.util.schema.resolve_keyed_by` utility function. When
"keying by" another attribute in the task, you must call this utility later on
in a transform:
from taskgraph.util.schema import resolve_keyed_by
@transforms.add
def resolve_max_runtime(config, tasks):
for task in tasks:
resolve_keyed_by(task, "worker.max-runtime", f"Task {task['label']")
yield task
In this example, :func:`~taskgraph.util.schema.resolve_keyed_by` takes the root container object (aka, the task), the subkey to operate on, and a descriptor that will be used in any exceptions that get raised.
Exact matches are used immediately. If no exact matches are found, each
alternative is treated as a regular expression, matched against the whole
value. Thus android.*
would match android-arm/debug
. If nothing
matches as a regular expression, but there is a default
alternative, it is
used. Otherwise, an exception is raised and graph generation stops.
By default when you use the pattern by-<name>
and then feed it into
:func:`~taskgraph.util.schema.resolve_keyed_by`, <name>
is assumed to be a
valid top-level key in the task definition. However, sometimes you want to key
by some other value that is either nested deeper in the task definition, or not
even known ahead of time!
For this reason you can specify additional context via **kwargs
. Typically
it will make the most sense to use this following a prior transform that sets
some value that's not known statically. This comes up frequently when splitting
a task from one definition into several. For example:
tasks:
task:
platforms: [android, windows, mac]
worker:
max-runtime:
by-platform:
(ios|android): 7200
windows: 3600
default: 1800
@transforms.add
def split_platforms(config, tasks):
for task in tasks:
for platform in task.pop("platforms"):
new_task = deepcopy(task)
# ...
resolve_keyed_by(
new_task,
"worker.max-runtime",
task["label"],
platform=platform,
)
yield task
Here we did not know the value of "platform" ahead of time, but it was still possible to use it in a "keyed by" statement thanks to the ability to pass in extra context.
Note
A good rule of thumb is to only consider using "keyed by" in
task-defaults
or in a task definition that will be split into many
tasks down the line.
The subkey in :func:`~taskgraph.util.schema.resolve_keyed_by` is expressed in dot path notation with each part of the path representing a nested dictionary. If any part of the subkey is a list, each item in the list will be operated on. For example, consider this excerpt of a task definition:
worker:
artifacts:
- name: foo
path:
by-platform:
windows: foo.zip
default: foo.tar.gz
- name: bar
path:
by-platform:
windows: bar.zip
default: bar.tar.gz
With the associated transform:
@transforms.add
def resolve_artifact_paths(config, tasks):
for task in tasks:
resolve_keyed_by(task, "worker.artifacts.path", task["label"])
yield task
In this example, Taskgraph resolves by-platform
in both the foo and bar
artifacts.
Note
Calling resolve_keyed_by
on a subkey that doesn't contain a by-*
field is a no-op.
Having fields of a task that may or may not be keyed by another field, can cause problems for any schemas your transforms define. For that reason Taskgraph provides the :func:`~taskgraph.util.schema.optionally_keyed_by` utility function.
It can be used to generate a valid schema that allows a field to either use "keyed by" or not. For example:
from taskgraph.util.schema import Schema, optionally_keyed_by
schema = Schema({
# ...
Optional("worker"): {
Optional("max-run-time"): optionally_keyed_by("platform", int),
},
})
transforms.add_validate(schema)
The example above allows both of the following task definitions:
taskA:
worker:
max-run-time: 3600
taskB:
worker:
max-run-time:
by-platform:
windows: 7200
default: 3600
If there are more than one fields that another field may be keyed by, it can be specified like this:
Optional("max-run-time"): optionally_keyed_by("platform", "build-type", int)
In this example either by-platform
or by-build-type
may be used. You
may specify fields as you like this way, as long as the last argument to
:func:`~taskgraph.util.schema.optionally_keyed_by` is the type of the field
after resolving is finished (or if keyed by is unused).