-
I tested the I had to update my custom loader because the I opened PR #334 fixing the DSL tests.
-
Great, thanks for the feedback! Your test fixes PR is merged into the WiP branch now. I also added a fix for the failing integration tests. I'm going to have a think about Anchors and Aliases... knew this was feeling too easy so far 😆 Currently the new code won't deserialise custom anchors like this:
So I'll have to think a bit about what shapes anchors can take. Could be we just discard them outright from the PipelineBody and only keep them if they're list-like. Current validation that doesn't work:

    for k, v in mapping.items():
        if k == 'context_parser':
            context_parser = v
            continue
        # >>> might need to get rid of the following structure checks
        if not isinstance(v, Sequence):
            raise PipelineDefinitionError(
                "step group must be sequence/list.")
        else:
            if isinstance(v, (str, bytes, bytearray)):
                raise PipelineDefinitionError(
                    "step group must be a list, not a string")
        # <<< END of might need to get rid of the following structure checks
        step_groups[k] = [Step.from_step_definition(
            step_def) for step_def in v]
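To make the effect of those structure checks concrete, here is a minimal self-contained sketch. PipelineDefinitionError is redefined locally as a stand-in, and check_step_group / is_step_group are hypothetical helper names, so none of this is pypyr's actual code. It illustrates why the string check is separate (str, bytes and bytearray are themselves Sequences) and why a mapping, such as a custom anchor section might produce, is rejected.

```python
from collections.abc import Sequence


class PipelineDefinitionError(Exception):
    """Local stand-in for pypyr's error type (assumption for this sketch)."""


def check_step_group(value):
    """Hypothetical helper: raise unless value is a list-like step group."""
    if not isinstance(value, Sequence):
        raise PipelineDefinitionError("step group must be sequence/list.")
    # str, bytes and bytearray are Sequences too, so reject them explicitly.
    if isinstance(value, (str, bytes, bytearray)):
        raise PipelineDefinitionError(
            "step group must be a list, not a string")
    return value


def is_step_group(value):
    """Return True if value passes the structure checks, else False."""
    try:
        check_step_group(value)
    except PipelineDefinitionError:
        return False
    return True
```

With these checks a list or tuple of steps passes, while a plain string or a dict fails, which is exactly the case that bites when an anchor section is a mapping rather than a list.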
-
About the I didn't add it on #334 because I would suggest using Dataclass is the built-in option, so we would not need to add new dependencies. attrs is more robust and would enable us to add validations as described in #116. What do you think?
-
DataClass: pypyr is showing its age... it started on 3.6, where DataClass wasn't a thing yet. Now that the minimum supported version is 3.7, it's for sure something to think about! I'm likely not to want to add to the current churn because:
Attrs: awesome lib, but I'm trying to avoid adding dependencies - pypyr gets downloaded a lot as a CI/CD tool, so I try to keep the payload light. Additionally, given how little of the power of attrs we'd actually be using, I'd feel bad pulling it in just for that.
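For reference, a small sketch of what the stdlib-only route can look like. The dataclasses module (Python 3.7+) has no built-in validators the way attrs does, but a `__post_init__` hook can carry simple checks. RetrySketch and its fields are hypothetical names made up for this illustration, not anything in pypyr.

```python
from dataclasses import dataclass


@dataclass
class RetrySketch:
    """Hypothetical settings object, only here to illustrate the pattern."""
    max_attempts: int = 1
    sleep_seconds: float = 0.0

    def __post_init__(self):
        # Runs right after the generated __init__, so invalid values
        # fail at construction time rather than later on use.
        if self.max_attempts < 1:
            raise ValueError("max_attempts must be at least 1")
        if self.sleep_seconds < 0:
            raise ValueError("sleep_seconds must not be negative")
```

The generated `__init__`, `__eq__` and `__repr__` come for free; the trade-off versus attrs is that every validation has to be written by hand in `__post_init__`.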
-
Remaining tests with failures:
-
Re the Reference & Anchor problem:
It's not possible to have a pre-determined list of "known" step-group names, because part of the joy of pypyr is that step-group names can dynamically inject at runtime from anywhere - this could be input to the pipeline, or dynamic/unknown result of any given step could be re-used in a
A potential solution is to make the following BREAKING changes:
It so happens all of the current documented examples use step-group name
A potential solution is to make this a rule, so that
A mitigation could be, if a step_group named common is found, to attempt to parse
Counter-point: I have no idea how many people even know that this reference & anchor feature exists, I haven't seen any usages in the wild.
Instead of
-
@yaythomas, this comment is a bit "meta"...
-
ADR for the change: https://github.com/pypyr/pypyr/blob/classify/docs/adr/0006-pipeline-as-code-api.md
-
Remaining TODO:
-
This discussion is to get some feedback about a (big!) new feature:
ADR 0006 Pipeline As Code
new pipeline api
Create a Python API that models pypyr pipelines. This will allow coders to create their pypyr pipelines directly in Python code (rather than in yaml files). It will also help with validating pipelines in advance, without needing to run them first (see #116 & #229).
@lucasrcezimbra has already done some great exploratory work here in draft PR #332! 🙌 🏆
There is a Work in Progress branch that shows what we're up to here: main...classify
See here for an example of how an API consumer would create pipelines with classes: https://github.com/pypyr/pypyr/blob/classify/tests/integration/pypyr/DELETE_ME_wip_pipeline_as_api_test.py
This code is pretty much backwards compatible - current API consumers should NOT notice a difference, other than that malformed pipelines will fail sooner than they used to (some validation errors will now happen when parsing the pipeline, rather than when running it).
current status
This code runs (unit tests are all passing, but test coverage for newly introduced code is still missing). I've tested the principle: it runs pipelines, and I think it's probably about right in terms of the actual end-to-end functionality. If we do decide to go ahead with this, next steps would be:
a) check if all failing tests are failing for the right reasons, and not because we overlooked some logic
b) update all the failing tests so they're working again
c) general clean-up - I worked in a hurry, so this is more along the lines of a Proof-of-Concept than finessed and tidy code 😅
feedback
None of this is final as of yet, so if there is any community feedback, concerns, wishlist items.... now is the time to get involved!
Please use this discussion thread for the new feature.
I'm still thinking through whether the PipelineBody class and its helpers like create_steps_group, and how the pipelinerunner uses them with the new run_pipeline_body entry point, are a) easy enough and b) future-proof, so this is by no means necessarily final, and I'm very interested in hearing your feedback!
Provisional Release Notes
[These provisional notes are a summary of the thoughts/progress in the rest of this discussion thread, and I'll keep updating them as the latest ideas are refined]
new features
create pipelines in code
In the new major version of pypyr you will be able to create pipelines in Python code rather than in yaml:
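For a rough idea of what modelling a pipeline as Python objects (instead of a yaml mapping) can look like, here is a sketch built on locally defined stand-in classes. SketchStep and SketchPipelineBody are hypothetical names invented for this illustration; they do not reflect the real class signatures on the classify branch, so check the linked integration test for the actual API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SketchStep:
    """Stand-in for a step: a name plus optional input arguments."""
    name: str
    in_args: dict = field(default_factory=dict)


@dataclass
class SketchPipelineBody:
    """Stand-in for a pipeline body: named step groups of steps."""
    step_groups: Dict[str, List[SketchStep]] = field(default_factory=dict)


# Roughly what this yaml would express:
# steps:
#   - name: pypyr.steps.echo
#     in:
#       echoMe: hello
pipeline = SketchPipelineBody(step_groups={
    'steps': [SketchStep('pypyr.steps.echo', {'echoMe': 'hello'})],
})
```

The point of the shape is that a step group is always a list of step objects, so a malformed pipeline cannot be constructed silently the way a free-form mapping can.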
immediate validation on load
Pipeline validation now happens when a pipeline first loads, rather than only later during the execution phase. pypyr used to validate the structure of each step just-in-time as it was running that step. In the new version the validation happens when the pipeline loads (before it runs). This means you do not have to wait until a long-running pipeline has progressed until a failure point to validate that your pipeline structure is correct.
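The difference between validating at load and validating just-in-time can be sketched as follows. This is a simplified illustration under stated assumptions, not pypyr's implementation: PipelineDefinitionError is a local stand-in, the helper names are hypothetical, and the structural rule (a step is either a string or a dict with a name key) is reduced to the bare minimum.

```python
class PipelineDefinitionError(Exception):
    """Local stand-in for pypyr's error type (assumption for this sketch)."""


def validate_step(step_def):
    """Hypothetical check: a step is a str or a dict with a 'name' key."""
    if isinstance(step_def, str):
        return
    if isinstance(step_def, dict) and 'name' in step_def:
        return
    raise PipelineDefinitionError(f"malformed step: {step_def!r}")


def load_pipeline(step_defs):
    """Validate every step when loading, before anything runs."""
    for step_def in step_defs:
        validate_step(step_def)
    return list(step_defs)


def run_pipeline(step_defs, executed):
    """Load (and so validate) first, then 'run' by recording each step."""
    for step_def in load_pipeline(step_defs):
        executed.append(step_def)


executed = []
try:
    # The malformed last step is caught at load, so the slow first
    # step never starts.
    run_pipeline(['long_running_step', {'comment': 'no name key'}], executed)
except PipelineDefinitionError as err:
    load_error = str(err)
```

Here nothing is executed at all, whereas with just-in-time validation the long-running first step would have completed before the error surfaced.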
Note that this validation is for overall pipeline structure. Validation of an individual step's in arguments still happens on step execution, and not when the pipeline first loads.
breaking changes
The new functionality does come at a cost though... there will be 2 unavoidable BREAKING CHANGES in the new version:
PipelineDefinition in custom loader
If you have a custom pipeline loader that returns a PipelineDefinition, the pipeline property is now a PipelineBody rather than a Mapping.
If your custom loader returns a Mapping it will keep on working as before, you do NOT need to change anything:
If, however, you are returning a PipelineDefinition from your custom loader, you will need to migrate to the new way of doing things:
won't work anymore
new
Amend your PipelineDefinition.pipeline to take a PipelineBody rather than a Mapping:
Note that you also now have the option of building your pipeline in code, rather than loading it from yaml.
No more arbitrary yaml in pipelines
If you have any arbitrary yaml in your pipelines, this now HAS to be under the _meta key.
won't work anymore
Previously, you could have arbitrary custom yaml sections that were NOT valid pypyr step-groups:
new
In the new version, you have to move arbitrary yaml under a _meta key:
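If you load pipelines as dicts yourself, the move can be sketched with a small helper like the one below. move_to_meta is a hypothetical migration helper, not part of pypyr, and my_custom_section is a made-up key. The caller has to list the keys to move explicitly, because step-group names are arbitrary and cannot be told apart from custom sections automatically.

```python
def move_to_meta(pipeline, keys):
    """Return a copy of pipeline with the given top-level keys under _meta.

    Hypothetical helper for illustration only. The input is not modified.
    """
    migrated = {k: v for k, v in pipeline.items() if k not in keys}
    moved = {k: pipeline[k] for k in keys if k in pipeline}
    if moved:
        # Merge into any _meta section that already exists.
        meta = dict(migrated.get('_meta') or {})
        meta.update(moved)
        migrated['_meta'] = meta
    return migrated


old = {
    'my_custom_section': {'owner': 'team-a'},
    'steps': [{'name': 'pypyr.steps.echo', 'in': {'echoMe': 'hi'}}],
}
new = move_to_meta(old, ['my_custom_section'])
```

After the call, new keeps the steps group at the top level and carries my_custom_section under _meta, while old is left untouched.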