Description
We’re encountering issues with Databricks SQL when attempting to update a row that contains a large JSON array field. Originally, we tried inserting the entire JSON directly into the column, but this failed due to request size limitations.
To address that, we redesigned the approach (see the sketch after this list) to:
- Split the full JSON into smaller chunks (~100 items)
- Append each chunk incrementally to the same row/column using separate UPDATE statements
- Commit each chunk using a new thread and session to ensure SQLAlchemy thread safety
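For reference, this is a minimal sketch of what that flow looks like on our side. The connection URL, table name (`dmt_table`), and key column (`dmt_id`) are placeholders; only `dmt_data` and the ~100-item chunk size come from the actual setup, and the `concat(coalesce(...))` append is one way to express the incremental UPDATE, not necessarily the exact statement we run.

```python
# Sketch of the chunked-append approach, assuming the databricks-sqlalchemy
# dialect. Table/key names and the connection URL are placeholders.
import json
import threading

from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker

engine = create_engine(
    "databricks://token:<access-token>@<host>"
    "?http_path=<http-path>&catalog=<catalog>&schema=<schema>"
)
SessionFactory = sessionmaker(bind=engine)

CHUNK_SIZE = 100  # roughly 100 JSON array items per UPDATE


def chunk_items(items, size=CHUNK_SIZE):
    """Split the full JSON array into smaller slices."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def append_chunk(row_id, chunk):
    """Append one serialized chunk to dmt_data using its own session."""
    session = SessionFactory()  # fresh session per thread for thread safety
    try:
        session.execute(
            text(
                "UPDATE dmt_table "
                "SET dmt_data = concat(coalesce(dmt_data, ''), :chunk) "
                "WHERE dmt_id = :row_id"
            ),
            {"chunk": json.dumps(chunk), "row_id": row_id},
        )
        session.commit()
    finally:
        session.close()


def upload_json(row_id, items):
    """Send each chunk in a new thread, committed one at a time."""
    for chunk in chunk_items(items):
        worker = threading.Thread(target=append_chunk, args=(row_id, chunk))
        worker.start()
        worker.join()  # keep appends ordered
```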
Despite chunking, the request eventually fails once the `dmt_data` field grows large enough (presumably ~1–2 MB compressed). The SQL API returns:
```
(databricks.sql.exc.RequestError) Error during request to server.
ExecuteStatement command can only be retried for codes 429 and 503
```
This confirms that each UPDATE's request body is still exceeding Databricks SQL's internal limits, even though we’re only appending small pieces.
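To help correlate the failures with payload size, one option is a small diagnostic wrapper that logs how many bytes each appended chunk adds and surfaces the underlying connector exception. This is a hypothetical sketch reusing the assumed table/column names from the snippet above; `databricks.sql.exc.RequestError` is the connector exception quoted in the traceback, which SQLAlchemy wraps in a `DBAPIError`.

```python
# Hypothetical diagnostics around each append (table/key names assumed).
import json
import logging

from sqlalchemy import text
from sqlalchemy.exc import DBAPIError
from databricks.sql.exc import RequestError

logger = logging.getLogger(__name__)


def append_chunk_logged(session, row_id, chunk):
    payload = json.dumps(chunk)
    stmt = text(
        "UPDATE dmt_table "
        "SET dmt_data = concat(coalesce(dmt_data, ''), :chunk) "
        "WHERE dmt_id = :row_id"
    )
    # Log how much data travels with this ExecuteStatement call.
    logger.info(
        "appending %d bytes to dmt_data for row %s",
        len(payload.encode("utf-8")), row_id,
    )
    try:
        session.execute(stmt, {"chunk": payload, "row_id": row_id})
        session.commit()
    except DBAPIError as exc:
        # SQLAlchemy wraps the DBAPI exception; .orig is the original
        # databricks.sql.exc.RequestError shown in the traceback above.
        if isinstance(exc.orig, RequestError):
            logger.error("ExecuteStatement rejected for row %s: %s", row_id, exc.orig)
        session.rollback()
        raise
```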
What We’re Looking For:
- Confirmation of the exact request body size limit for INSERT/UPDATE operations over Databricks SQL
- Recommended practice for incrementally updating a single JSON column that grows over time