-
Notifications
You must be signed in to change notification settings - Fork 65
refactor common sources to work with dlt+ #624
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
refactor common sources to work with dlt+ #624
Conversation
8bfdf17
to
e50a666
Compare
@@ -28,7 +29,7 @@ | |||
|
|||
|
|||
@dlt.source | |||
def asana_source() -> Any: # should be Sequence[DltResource]: | |||
def asana_source(access_token: str = dlt.secrets.value) -> Sequence[DltResource]: | |||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can just add the access token here, even though its not used, so that it shows up in the config
or remove it from the resources too and pass it down? but that seems less elegant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it is not used, why add it? it will confuse users that do not look into the code and set a value there. Why do you want to add it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
every resource needs this secret, so if they dont find it they wont run.
but as the code derives the secrets and configs from the source that gets linked (the method above) it won't pick it up otherwise. (or is that a bug? )
marcin mentioned the clean way to write sources would be to give the secret to the source and pass it to each resrouce, but I think in this example it's actually cleaner as it is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you should forward the token to the resources by instantiating them. Just putting it there without consuming it is an antipattern and I don't think it will work if users create named sources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how do you like the idea of moving all resource definitions into the source method?
That was Alona's idea, who said we want to encourage users to group resources anyways...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that way access_token can just be a property of the source and all methods inside have access to it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for changing this. I think with the newest changes from @rudolfix on the core devel branch, inner resources now also pick up configuration values automatically, but let's keep it like this for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, but I have some small quesionts
@@ -28,7 +29,7 @@ | |||
|
|||
|
|||
@dlt.source | |||
def asana_source() -> Any: # should be Sequence[DltResource]: | |||
def asana_source(access_token: str = dlt.secrets.value) -> Sequence[DltResource]: | |||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it is not used, why add it? it will confuse users that do not look into the code and set a value there. Why do you want to add it?
from dlt.sources import DltResource | ||
|
||
from .helpers import get_stats_with_retry, parse_response | ||
|
||
|
||
@dlt.source(name="bing_webmaster") | ||
def source( | ||
site_urls: List[str] = None, site_url_pages: Iterable[DictStrStr] = None | ||
site_urls: List[str] = dlt.config.value, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these should be "optional", no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so. If they were not present, what page would you want to get the traffic statistics for then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you are right. But we have to update the docstring of this source then, it makes it seem as though they are optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look good, but there is at least one test you need to adapt, please check the output. Can you also talk to @anuunchin to figure out which test failures are expected because we do not have credentials and which ones you can't ignore?
Also did you test these changes out with dlt+ to see wether all verified sources can be at least created?
6d9da63
to
40e0f16
Compare
on the failing tests: |
Related Issues
note for reviever
assuming you have checked out verified-sources at the same height as dlt-plus you can do this in the playground folder to test them
then check
dlt.yml
and.dlt/secrets.toml
for changes