
RestAPI incorrectly reports resource runtime for non-paginated requests #3518

@FridayPush

Description

dlt version

1.18.2

Describe the problem

When the RestAPI source pulls a resource with a single, non-paginated request that has a long time to first byte (TTFB), the timing report does not reflect how long the request actually took. In the following logs the request takes ~6 minutes to get the response and returns a large (~100 MB) JSON array with 34k records in it.

------------------------------- Extract rest_api -------------------------------
Resources: 0/1 (0.0%) | Time: 0.00s | Rate: 0.00/s

------------------------------- Extract rest_api -------------------------------
Resources: 0/1 (0.0%) | Time: 423.05s | Rate: 0.00/s
my_slow_api_resource: 34727  | Time: 0.00s | Rate: 2141994044.24/s

------------------------------- Extract rest_api -------------------------------
Resources: 1/1 (100.0%) | Time: 423.06s | Rate: 0.00/s
my_slow_api_resource: 34727  | Time: 0.01s | Rate: 2406218.01/s

These logs were captured at INFO level with progress reported every 20s. In an isolated run the discrepancy is easy to see, but this resource normally runs in a batch of 12, and because none of this vendor's endpoints are paginated, every resource reports a time of <1s, so the slow one does not stand out.
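To make the size of the discrepancy concrete, the figures from the final progress line can be plugged into a few lines of Python. This is only arithmetic on the logged values; attributing the whole 423.06s collector time to the resource is an assumption made for illustration.

```python
# Figures copied from the final progress block in the log above.
records = 34727
wall_clock_s = 423.06        # collector total: "Time: 423.06s"
reported_rate = 2406218.01   # resource line: "Rate: 2406218.01/s"

# Elapsed time implied by the per-resource rate that was reported.
implied_elapsed_s = records / reported_rate

# Rate if the whole wait for the response were attributed to the resource
# (an assumption for illustration, not a value dlt printed).
wall_clock_rate = records / wall_clock_s

print(f"implied elapsed: {implied_elapsed_s:.4f}s")
print(f"wall-clock rate: {wall_clock_rate:.2f}/s")
```

The reported rate implies the resource ran for roughly a hundredth of a second, while the wall-clock figure gives a rate on the order of tens of records per second.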

Expected behavior

The resource should record the start of its timing before it makes its first request, not after the first request completes.
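The difference between the two starting points can be sketched in plain Python. This is not dlt's collector code; `slow_single_request` and `measure` are hypothetical names, and the generator merely stands in for a non-paginated request that blocks before yielding all of its records at once.

```python
import time
from typing import Iterator, Tuple


def slow_single_request(delay_s: float, n: int) -> Iterator[int]:
    """Hypothetical stand-in for a non-paginated request: block, then yield all records."""
    time.sleep(delay_s)  # simulates the long time to first byte
    yield from range(n)


def measure(delay_s: float, n: int) -> Tuple[float, float]:
    """Return (elapsed counted from the first item, elapsed counted from before the request)."""
    start_before_request = time.monotonic()
    start_at_first_item = None
    for _ in slow_single_request(delay_s, n):
        if start_at_first_item is None:
            # A timer started here misses the whole wait for the response.
            start_at_first_item = time.monotonic()
    end = time.monotonic()
    if start_at_first_item is None:
        # No items were yielded; treat the first-item timer as never having run.
        start_at_first_item = end
    return end - start_at_first_item, end - start_before_request


from_first_item, from_before_request = measure(delay_s=0.2, n=1000)
print(f"from first item:     {from_first_item:.4f}s")
print(f"from before request: {from_before_request:.4f}s")
```

Only the second measurement includes the delay, which is the number this issue asks the per-resource line to report.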

Steps to reproduce

Create a Flask endpoint that sleeps 30s before returning a response while keeping the connection open. Observe that the dlt log reports a time of <1s for the resource.
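A minimal sketch of such a slow endpoint follows. It swaps Flask for the standard library's `http.server` so that it runs without extra dependencies; the names `make_slow_server` and `timed_fetch` and the `{"id": i}` record shape are assumptions, not part of the original report.

```python
import json
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import List, Tuple


def make_slow_server(delay_s: float, n_records: int) -> ThreadingHTTPServer:
    """Build a server whose GET handler sleeps, then returns one JSON array."""

    class SlowHandler(BaseHTTPRequestHandler):
        def do_GET(self) -> None:
            time.sleep(delay_s)  # connection stays open while the client waits
            body = json.dumps([{"id": i} for i in range(n_records)]).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format: str, *args) -> None:
            pass  # silence per-request logging

    # Port 0 asks the OS for any free port.
    return ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)


def timed_fetch(url: str) -> Tuple[List[dict], float]:
    """Fetch the URL and return (records, seconds measured from before the request)."""
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    start = time.monotonic()
    with opener.open(url) as resp:
        records = json.load(resp)
    return records, time.monotonic() - start


def start_in_background(server: ThreadingHTTPServer) -> str:
    """Serve in a daemon thread and return the base URL."""
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return f"http://127.0.0.1:{server.server_address[1]}"
```

Starting the server with `start_in_background(make_slow_server(30.0, 100))` and pointing a RestAPI resource at the returned base URL should give a setup comparable to the one described; `timed_fetch` gives a wall-clock baseline to compare against the time dlt logs for the resource.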

Operating system

macOS

Runtime environment

Virtual Machine

Python version

3.11

dlt data source

RestAPI Source

dlt destination

DuckDB

Other deployment details

No response

Additional information

No response

Metadata

Assignees

No one assigned

    Labels

    bug: Something isn't working

    Type

    No type

    Projects

    Status
    Todo

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
