API to create or update records in bulk #2844

seancolsen · 2023-04-25T19:28:16Z

Limitations of the current API

Creating and modifying records requires one request per record

Use cases

Pasting data into cells requires modifying existing records and also (according to the spec) creating new records too (when the pasted data extends beyond the viewable records).
We may eventually want to implement other features like setting multiple cell values to null, or doing something akin to the "fill" feature that spreadsheets offer. This would require modifying records in bulk.

Similar work

Add bulk delete endpoint #2641 added a new endpoint to delete records in bulk
Add existing_import API #1442 added a a new endpoint to create records in bulk

In Matrix @silentninja said:

I don't think you can use the existing_import API for pasting the data into cells because the existing_import API creates a temporary Table from a datafile and then copies the data from the temporary Table to the Table the data has to be imported into, meanwhile the case of pasting data into cells is a much simpler task. So creating a new temporary Table and copying data from that temporary table might be an overkill for the copy-paste feature

A "Paste" example

User copies these eight cells:
User pastes the copied data into the sheet at the following starting position:
The front end uses the API to:
- Update records 249 and 250, modifying First Name and Last Name.
- Create two new records by supplying values for First Name and Last Name.
The API responds with the full records for all the records modified and created, allowing the front end to display the new data.

Thoughts on API design

I want to leave this ticket somewhat open-ended to start, in case the backend team would like to propose an API design.
It would be nice to perform these changes within a single transaction. From a UX perspective, if something goes wrong, no changes should be made. That would mean creating one API to handle both updating records and creating records.
How does the API behave when we tell it to update a record which no longer exists?

Pretend we're following the "Paste" example above but in between steps 1. and 2. another user deletes record 249.
- One possibility would be to create a new record with id 249.
- Another possibility would be to fail. I think I prefer this behavior.
For the copy-paste use case at least, we can rely on our our data being rectangular. That is, we don't need to update one field in record 249 and a different field in record 250 — we need to update all the same fields in all records. I think it's fine to bake this assumption into the API, especially since I predict it may be necessary for a performant implementation. For example, we'd would want to avoid a naive implementation which makes one UPDATE query per row.

Assuming we do bake this "rectangular" assumption into the API behavior, it may be worth considering the ways in which the API design convey that behavior. For example, we might be inclined to design the request body like this:
```
[
  { "1": 249, "2": null,     "3": "Lloyd"     },
  { "1": 250, "2": "Shelby", "3": "Shepherd"  },
  {           "2": "Robert", "3": "Mccormick" },
  {           "2": "Dawn",   "3": null        }
]
```
But with that design, an API consumer could submit a rather strange request to update two records like this:
```
[
  { "1": 249,               "3": "Lloyd" },
  { "1": 250, "2": "Shelby"              }
]
```
Because of the potential for requests like the the above, we'd need to clearly communicate the behavior of the API under these circumstances. For record 249, column 2, would the API leave the value untouched? Or would it set the value to NULL? Or maybe the request would fail?

Just as food for brainstorming, a request schema like this might go further towards communicating this "rectangular" requirement to API consumers. I don't know.
```
{
  "columns": [1, 2, 3],
  "data": [
    [ 249, null,     "Lloyd"      ],
    [ 250, "Shelby", "Shepherd"   ],
    [ null, "Robert", "Mccormick" ],
    [ null, "Dawn",   null        ]
  ]
}
```
Unlike our proper records endpoint, the front end will not have any good way to display cell-level errors for this paste feature, or for the other features I'm imagining we'd build with this API. For that reason, the requirements around the error response are more loose. When the request fails due to some data problem (e.g. constraint violation, etc) the front end will display a toast message. The API should provide enough info in the error message to help the user fix their problem, but it certainly doesn't need to list all the error, or even be super specific about which cell caused the error.

The text was updated successfully, but these errors were encountered:

seancolsen · 2023-04-25T19:39:36Z

@silentninja I need this API for the "paste" feature that I'm implementing as part of the Usability Improvements project, for which you are a helper. Would you be able to help me out with this? We first need to agree on the API design. Would you be able to propose a design which specifies the endpoint, HTTP verb, request schema, and response schema? I'm not sure who else should be involved in approving the API design, but it might be worth soliciting feedback more broadly once we have a clearer proposal. After we have an approved design, I'll also need a backend engineer to take on the implementation. Might that be you as well?

silentninja · 2023-04-27T14:34:58Z

"past" feature

Is this meant to be "paste feature"?

I can work on the API design and the implementation. I will start working on this once I finish up tasks related v0.1.2 milestone.

seancolsen · 2023-04-27T14:36:44Z

Is this meant to be "paste feature"?

Yes. Thank you. Fixed.

I can work on the API design and the implementation.

Excellent. Thanks!

silentninja · 2023-06-01T15:50:39Z

Because of the potential for requests like the the above, we'd need to clearly communicate the behavior of the API under these circumstances. For record 249, column 2, would the API leave the value untouched? Or would it set the value to NULL? Or maybe the request would fail?

The request will FAIL. Being strict with the request body is much easier and safer than assuming that the user knows about the API behavior.

How does the API behave when we tell it to update a record which no longer exists?

Pretend we're following the "Paste" example above but in between steps 1. and 2. another user deletes record 249.

One possibility would be to create a new record with id 249.
Another possibility would be to fail. I think I prefer this behavior.

The "Failing Behaviour" is hard to implement in SQL. The "Record Creation" behavior is quite easy to implement. The steps for the simple implementation of the UPSERT operation would look like

Insert the set of values along with the id field
If a row exists with the correct id value, update the other values for that row
If a row does not exist, set the id field value to the next SEQUENCE and create a new record.

This could be accomplished easily using the INSERT ... ON CONFLICT Postgres statement. I prefer to go with this implementation.

In order to accomplish the "Failing Behaviour" we would have to

Find the row with the id value and update the values if it exists
If the row does not exist and if the id value is given. Throw an error
If the row does not exist and if the id value is not given. set the id field value to the next SEQUENCE and Create a new record.

Postgres does not natively support this implementation. So we would have to come up with a very complicated logic that is prone to concurrency issues mentioned in the various threads in the POSTGRES DOCS. I would prefer to avoid going in that direction.

seancolsen · 2023-06-26T20:26:59Z

Your plan looks good to me, @silentninja

seancolsen · 2023-11-01T16:45:31Z

Moving to "High Priority" milestone since this work is 80% complete but Dom will no longer be working on it. I'd like to finish this up soon so that we can make use of this existing work before accumulating too many merge conflicts.

seancolsen · 2023-11-01T16:57:27Z

@mathemancer I'm assigning this to you so that we can get this finished up soon. It can wait until after the 0.1.4 release.

seancolsen added status: draft type: enhancement New feature or request work: backend Related to Python, Django, and simple SQL restricted: maintainers Only maintainers can resolve this issue labels Apr 25, 2023

seancolsen added this to the Backlog milestone Apr 25, 2023

seancolsen assigned silentninja May 24, 2023

silentninja assigned seancolsen and unassigned silentninja Jun 26, 2023

silentninja assigned silentninja and unassigned seancolsen Jun 26, 2023

silentninja added ready Ready for implementation and removed status: draft labels Jun 26, 2023

seancolsen mentioned this issue Jul 6, 2023

Frontend support for pasting from the system clipboard into Mathesar cells #2812

Open

seancolsen mentioned this issue Aug 11, 2023

Implement paste into cells #3143

Open

dmos62 mentioned this issue Sep 7, 2023

2023-08 Backend fixes #3144

Closed

dmos62 assigned dmos62 and unassigned silentninja Sep 19, 2023

dmos62 added status: started and removed ready Ready for implementation labels Sep 19, 2023

seancolsen linked a pull request Oct 25, 2023 that will close this issue

Paste API #3222

Draft

7 tasks

seancolsen unassigned dmos62 Nov 1, 2023

seancolsen modified the milestones: Backlog, High priority Nov 1, 2023

seancolsen assigned mathemancer Nov 1, 2023

seancolsen added ready Ready for implementation and removed status: started labels Dec 5, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

API to create or update records in bulk #2844

API to create or update records in bulk #2844

seancolsen commented Apr 25, 2023

seancolsen commented Apr 25, 2023 •

edited

silentninja commented Apr 27, 2023 •

edited

seancolsen commented Apr 27, 2023

silentninja commented Jun 1, 2023 •

edited

seancolsen commented Jun 26, 2023

seancolsen commented Nov 1, 2023 •

edited

seancolsen commented Nov 1, 2023

API to create or update records in bulk #2844

API to create or update records in bulk #2844

Comments

seancolsen commented Apr 25, 2023

Limitations of the current API

Use cases

Similar work

A "Paste" example

Thoughts on API design

seancolsen commented Apr 25, 2023 • edited

silentninja commented Apr 27, 2023 • edited

seancolsen commented Apr 27, 2023

silentninja commented Jun 1, 2023 • edited

seancolsen commented Jun 26, 2023

seancolsen commented Nov 1, 2023 • edited

seancolsen commented Nov 1, 2023

seancolsen commented Apr 25, 2023 •

edited

silentninja commented Apr 27, 2023 •

edited

silentninja commented Jun 1, 2023 •

edited

seancolsen commented Nov 1, 2023 •

edited