Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEAT: parameterised headers rest_api_source #2084

Open
wants to merge 14 commits into
base: devel
Choose a base branch
from

Conversation

ArneDePeuter
Copy link

@ArneDePeuter ArneDePeuter commented Nov 21, 2024

Description

This pull request addresses the feature request discussed in issue #2071. It introduces dynamic parameter resolution for headers, allowing both keys and values to be resolved dynamically.

Key Features

  • Dynamic Header Resolution: Supports resolving both header keys and values at runtime using specified parameters.
  • Header params can also be bound to provided params.
  • You can still define headers in the client, these will always be used. Endpoint specific parameters extend this dict.

Usage

The feature is demonstrated with the following example:

{
    "name": "chicken",
    "endpoint": {
        "path": "chicken",
        "headers": {"{token}": "{token}", "num": "2"},
        "params": {
            "token": {
                "type": "resolve",
                "field": "token",
                "resource": "authenticate",
            },
        },
    },
}

Tests

Comprehensive tests have been added to validate the implementation and ensure its correctness.

Copy link

netlify bot commented Nov 21, 2024

Deploy Preview for dlt-hub-docs canceled.

Name Link
🔨 Latest commit b3d53e8
🔍 Latest deploy log https://app.netlify.com/sites/dlt-hub-docs/deploys/6752cca042d6990008a5667a

@ArneDePeuter
Copy link
Author

Excuse me for the failed checks, I'll have a look at it!

@ArneDePeuter
Copy link
Author

All checks should pass now, also added an extra test!

@burnash burnash self-assigned this Nov 24, 2024
@burnash
Copy link
Collaborator

burnash commented Nov 24, 2024

Hey @ArneDePeuter thanks for the PR and for adding tests! I'll review it soon.

"headers": {
"{token}": "{token}",
"num": "2",
"nested": {"nested": "{token}", "{token}": "other"},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ArneDePeuter could you elaborate on the use case the nested headers?

Copy link
Author

@ArneDePeuter ArneDePeuter Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this for the sake of supporting every use case. After some research tough... headers are mostly flat key value pairs. So I also now question wether this is needed. 😬 I'll remove this functionality!

@ArneDePeuter
Copy link
Author

@burnash I removed the recursive parameter resolution for headers, as this was not useful.

@ArneDePeuter
Copy link
Author

ArneDePeuter commented Nov 25, 2024

I now have following question. In: dlt/sources/rest_api/__init__.py:366 would it be useful that resolved params gets extended with the incremental params? This way incrementals can be in the url as in the headers.

Implemented as this for example:

# dlt/sources/rest_api/__init__.py:366 
if incremental_object:
  params = _set_incremental_params(
      params,
      incremental_object,
      incremental_param,
      incremental_cursor_transform,
  )
  resolved_params.extend(params)

for item in items:
  formatted_path, formatted_headers, parent_record = process_parent_data_item(
      path, item, resolved_params, include_from_parent, headers
  )

Secondly, I find it confusing that parameter definitions are intertwined with query params. Is this something that has been brought up before?

@burnash
Copy link
Collaborator

burnash commented Nov 27, 2024

@burnash I removed the recursive parameter resolution for headers, as this was not useful.

Thanks for the update, @ArneDePeuter

I now have following question. In: dlt/sources/rest_api/init.py:366 would it be useful that resolved params gets extended with the incremental params? This way incrementals can be in the url as in the headers.

I don't see a need to implement incremental in headers. I haven't seen a use case for incremental parametes in headers yet. If such a case exists, I would suggest implementing a dlt source using RESTClient rather than the rest_api source.

Secondly, I find it confusing that parameter definitions are intertwined with query params. Is this something that has been brought up before?

Good point. I have already received this feedback from someone. The way the rest_api source defines parameters is loosely based on how parameters are defined in the OpenAPI specification. I don't find the current implementation too confusing, as the end user has control over how path parameters are named and resolved. They are also excluded from the query parameters. However, I'm open to suggestions on how to improve this.

If we extend params to include header params (as in this PR), it might become confusing. I was thinking about different ways to address this. I was considering different ways to address this. One option is to take inspiration from the suggested configuration for query string parameters in #1978 and define the location of the parameter in a location key. For example, from your scenario:

"endpoint": {
    "path": "issues/comments",
    "headers": {"token": "{token}", "static_param":  region}
    "params": {
        "token": {
            "type": "resolve",
            "resource": "authenticate",
            "field": "token",
            "location": "header"
        }
    },
},

What do you think? Would you be willing to update this PR to handle the location key?

@ArneDePeuter
Copy link
Author

ArneDePeuter commented Nov 27, 2024

Thanks for your response!
Ill have a look at it in the upcoming days.

@ArneDePeuter
Copy link
Author

ArneDePeuter commented Nov 28, 2024

@burnash I’ve added the location parameter as requested. By default, its value is set to 'path', ensuring backward compatibility. Hope you like it! The only supported locations currently are 'path' and 'header'.

Personally I think that this feature adds redundant information, I liked the fact that a param was not bound to a specific location. Params are unique by name so if users would want to make it location based they could use name conventions such as 'header_<paraname>'. I feel like this location field adds a layer of complexity, I would like to hear your opinion on this.

@burnash
Copy link
Collaborator

burnash commented Jan 6, 2025

Personally I think that this feature adds redundant information, I liked the fact that a param was not bound to a specific location. Params are unique by name so if users would want to make it location based they could use name conventions such as 'header_<paraname>'. I feel like this location field adds a layer of complexity, I would like to hear your opinion on this.

@ArneDePeuter, sorry for the delayed response. I thought it through and I agree with your point: the location field does not really fix the problem while adding complexity. However, I still don't think putting everything in params is the ideal solution either.
I have a different idea on how to fix this, here's the proposal #2190
It should make the interpolation consistent across path, query and header params.
Let me know what you think! Sorry for giving you a wrong lead in the first place with the location field.
Once the proposal is finalized, I'll implement it first. After that, we'll see what is needed in this PR to merge it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants