Queries stuck in FINISHING time #463

Open
1 task
KerenMousseri opened this issue Jun 2, 2024 · 3 comments
Labels
bug Something isn't working

KerenMousseri commented Jun 2, 2024

Expected behavior

Query results should be successfully received by the client.

Actual behavior

In our Trino cluster, we are facing an issue where some queries remain stuck in the FINISHING state for an extended period before eventually failing with the error message: "Query was abandoned by the client, as it may have exited or stopped checking for query results."

After conducting some investigation, it appears that this issue predominantly occurs when querying Trino using the Python client. Here's a breakdown of the observed flow:

  1. In the main module, we execute the TrinoQuery.execute function with our query.
  2. This function initiates a POST request to the Trino coordinator.
  3. Subsequently, it sends a GET request to the nextUri to retrieve the initial batch of query results.
  4. As the results start arriving, the query state transitions to FINISHING.
  5. The execution of the execute function ends.
  6. Following this, the cursor.fetchall() function in the main module iterates over the nextUris, yielding each received row to the client. However, after a certain duration of fetching query results, the query fails with the "query abandon" error (as mentioned above).
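The flow above can be sketched as a minimal simulation. This is not the client's actual implementation: `fake_post`, `fake_get`, `FAKE_PAGES`, and the URIs are hypothetical stand-ins for the coordinator, used only to show that each page is reachable solely through the `nextUri` of the page before it.

```python
from typing import Dict, Iterator, Optional

# Hypothetical in-memory stand-in for the coordinator: each URI maps to a
# page of rows plus the URI of the following page (None on the last page).
FAKE_PAGES: Dict[str, dict] = {
    "/page/0": {"data": [], "nextUri": "/page/1"},
    "/page/1": {"data": [[1], [2]], "nextUri": "/page/2"},
    "/page/2": {"data": [[3]], "nextUri": None},
}


def fake_post(sql: str) -> dict:
    """Stand-in for the initial POST: it returns only a pointer to page 0."""
    return {"nextUri": "/page/0"}


def fake_get(uri: str) -> dict:
    """Stand-in for a GET on a nextUri."""
    return FAKE_PAGES[uri]


def iterate_rows(sql: str) -> Iterator[list]:
    """Follow the nextUri chain and yield rows one page at a time."""
    next_uri: Optional[str] = fake_post(sql).get("nextUri")
    while next_uri is not None:
        page = fake_get(next_uri)
        # Rows are handed to the caller here. If the caller is slow, the
        # next GET is delayed by the same amount.
        yield from page.get("data", [])
        next_uri = page.get("nextUri")


rows = list(iterate_rows("SELECT 1"))
```

Because the generator only issues the next GET once the caller asks for more rows, any time spent processing a page is time during which the coordinator hears nothing from the client.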

Any assistance on resolving this significant issue would be greatly appreciated.

Thank you!!

Steps To Reproduce

  1. Is it advisable to incorporate heartbeats to the coordinator while fetching results?

  2. Would it be feasible to fetch multiple nextUris in parallel? I'm uncertain about this possibility due to the need to access nextUris as a linked list.
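Regarding the second question: since each `nextUri` is only known after the previous response arrives, the pages themselves cannot be requested in parallel. One alternative is to keep polling in a background thread and buffer rows locally, so slow row processing does not delay the next request. The following is a sketch under assumptions, not a confirmed fix: `drain_in_background` is a hypothetical helper, `rows_source` is any row iterator, and whether a cursor can safely be driven from a worker thread needs to be checked. As far as I know, the server-side timeout behind the "abandoned" error is controlled by the `query.client.timeout` property.

```python
import queue
import threading
from typing import Any, Iterable, Iterator, Tuple


def drain_in_background(rows_source: Iterable[Any]) -> Iterator[Any]:
    """Pull rows from rows_source in a worker thread and yield them here.

    The buffer is unbounded, so the whole result set may end up in memory
    if the consumer is much slower than the producer.
    """
    buffer: "queue.Queue[Tuple[str, Any]]" = queue.Queue()

    def producer() -> None:
        try:
            for row in rows_source:
                buffer.put(("row", row))
            buffer.put(("done", None))
        except Exception as exc:
            # Forward the failure to the consuming thread.
            buffer.put(("error", exc))

    threading.Thread(target=producer, daemon=True).start()

    while True:
        kind, payload = buffer.get()
        if kind == "done":
            return
        if kind == "error":
            raise payload
        yield payload


# Usage with a plain iterator standing in for the result stream.
collected = list(drain_in_background(iter([[1], [2], [3]])))
```

The trade-off is memory: the polling cadence no longer depends on the consumer, but nothing limits how many rows pile up in the buffer.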

Log output

No response

Operating System

Windows

Trino Python client version

0.326.0

Trino Server version

439

Python version

3.9.3

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
@hashhar hashhar added the bug Something isn't working label Jun 6, 2024
@hashhar hashhar self-assigned this Jun 6, 2024

njalan commented Sep 10, 2024

@hashhar Is there any progress on this? We are facing the same issue.

Member

hashhar commented Sep 12, 2024

This is hard to reproduce, and it is unclear whether the cause in your case is the same as in our reproduction.

So we plan to add more debug logging; then, when someone is able to reproduce the issue, we can look at the logs to figure out what is going wrong.

Probably here - https://github.com/trinodb/trino-python-client/blob/a87566794d9a9eefdd481a95f001ce2e37e20531/trino/client.py#L846C1-L846C65
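Until that logging lands, anyone who can reproduce this could already capture what the client and its HTTP layer emit today. A minimal sketch using only the standard `logging` module, assuming the client logs under the `trino` logger namespace and that requests go through `urllib3`; `enable_debug_logging` is a hypothetical helper name.

```python
import logging
import sys
from typing import Iterable, TextIO


def enable_debug_logging(
    logger_names: Iterable[str] = ("trino", "urllib3"),
    stream: TextIO = sys.stderr,
) -> logging.Handler:
    """Attach a DEBUG-level handler to the given logger namespaces."""
    handler = logging.StreamHandler(stream)
    handler.setLevel(logging.DEBUG)
    # Timestamps and thread names help line client logs up with server logs.
    handler.setFormatter(
        logging.Formatter(
            "%(asctime)s %(levelname)s %(name)s [%(threadName)s] %(message)s"
        )
    )
    for name in logger_names:
        logger = logging.getLogger(name)
        logger.setLevel(logging.DEBUG)
        logger.addHandler(handler)
    return handler
```

Call `enable_debug_logging()` once before opening the connection, then attach the output together with the query ID so it can be matched against the coordinator logs.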

Member

hashhar commented Dec 5, 2024

Might be related or not - trinodb/trino#22989 (comment)

There's debug logs from the client there + matching server logs.
