ploomber.tasks.Link is not usable #1148

Open
marr75 opened this issue Dec 3, 2023 · 2 comments

Comments

@marr75
Contributor

marr75 commented Dec 3, 2023

With a source attribute, ploomber.tasks.Link cannot be instantiated. Without a source, the task fails validation.

It is not currently possible to use ploomber.tasks.Link in a pipeline spec.

@edublancas
Contributor

I think it's because when we added Link, we only had the Python API (not the pipeline.yaml API), and we never worked on ensuring it'd work with pipeline.yaml. Feel free to open a PR!

@marr75
Contributor Author

marr75 commented Dec 5, 2023

@edublancas I will. I could use a little guidance from you, though.

Locally, I've got this signature for Link:

class Link(Task):
    ...
    def __init__(self, source, product, dag, name=None):
        kwargs = dict(hot_reload=dag._params.hot_reload)
        self._source = type(self)._init_source(kwargs)
        super().__init__(product, dag, name, None)

And tasks using Link tend to look like:

  # Dummy task to wrap success stories exported from hubspot
  - name: success-stories
    source: ""
    product: "{{PRODUCTS_DIR}}/success-stories.csv"
    class: Link
    product_class: File

Which isn't terrible, but the blank source, the class, and the product_class could all be a little confusing.

I don't think I can get around the source issue without quite a bit of rewiring in the spec task validation (which strictly looks for source without OO/protocol-based validation). The product_class issue may be solvable by checking whether product is path-like or URL-like.
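
A rough sketch of what that path-like vs. URL-like check could look like, using only the standard library. The helper name infer_product_kind and its "path"/"url" return values are hypothetical and not part of ploomber. It also assumes placeholders such as {{PRODUCTS_DIR}} are either already rendered or harmless to the check.

```python
from urllib.parse import urlparse


def infer_product_kind(product: str) -> str:
    """Hypothetical helper: classify a product string as "url" or "path"."""
    parsed = urlparse(product)
    # Require a multi-character scheme plus a network location so that
    # Windows drive letters (e.g. "C:\\data\\out.csv") are not treated as URLs.
    if len(parsed.scheme) > 1 and parsed.netloc:
        return "url"
    # Anything else is assumed to be a filesystem path.
    return "path"


if __name__ == "__main__":
    for candidate in (
        "{{PRODUCTS_DIR}}/success-stories.csv",
        "https://example.com/success-stories.csv",
    ):
        print(candidate, "->", infer_product_kind(candidate))
```

Under that assumption, a "path" result could default product_class to File, so the spec entry would not need to spell it out. What a "url" result should map to is left open here.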

I suppose I could make any task whose source satisfies source.lower() == "link" get a class of Link. Maybe that kills two birds with one stone?
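
As a minimal sketch of that idea, operating on a plain dict rather than ploomber's actual spec classes: the function name apply_link_shorthand is made up for illustration, and where this would hook into the real spec validation is not shown.

```python
def apply_link_shorthand(task_spec: dict) -> dict:
    """Hypothetical pre-validation step: treat source "link" as class Link."""
    spec = dict(task_spec)  # shallow copy so the caller's dict is untouched
    source = spec.get("source")
    if isinstance(source, str) and source.lower() == "link":
        # Only fill in the class when the user did not set one explicitly.
        spec.setdefault("class", "Link")
    return spec


if __name__ == "__main__":
    task = {
        "name": "success-stories",
        "source": "link",
        "product": "{{PRODUCTS_DIR}}/success-stories.csv",
    }
    print(apply_link_shorthand(task))
```

With something like this, the task entry would carry source: link instead of a blank source plus an explicit class, at the cost of reserving "link" as a special source value.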

Let me know your thoughts.
