Proposed curio api #34

hannahhoward · 2024-08-20T05:39:57Z

Proposes an HTTP API to curio that could be used by either Storacha or another hot storage market provider for PDP proofs

magik6k · 2024-08-20T09:59:34Z

rfc/curio-api-rfc.md

+
+With the additional stipulation that they should probably ONLY accept v2 Piece CID: https://github.com/filecoin-project/FIPs/blob/master/FRCs/frc-0069.md
+
+### PUT /piece/{piece cid v2}


Might require a UCAN header for auth

Also for scalability it would be a good idea to have the main endpoint tell you where to go with the data.

Unfortunately redirecting PUT/POST with 3xx is really wonky (pretty much broken with the Go http.Client), so it might just require some two-endpoint setup.

ah ok. I get it -- our main endpoint is like this too! :)
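The two-endpoint setup suggested above could look roughly like the following sketch. It is only an illustration: the `/piece/{cid}/location` path, the `uploadUrl` JSON field, and the `/upload` path are hypothetical names that do not appear in the RFC. The client first asks the main endpoint where the data should go, then issues the PUT directly against the returned URL, so no request with a body ever has to follow a 3xx redirect.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// uploadTarget is a hypothetical response body from the main endpoint.
type uploadTarget struct {
	UploadURL string `json:"uploadUrl"`
}

// uploadPiece asks the main endpoint where to send the data, then PUTs the
// bytes to that URL. It returns the status code of the PUT.
func uploadPiece(c *http.Client, mainURL, pieceCID string, data []byte) (int, error) {
	// Step 1: a small body-less request, so nothing large has to be replayed.
	resp, err := c.Post(mainURL+"/piece/"+pieceCID+"/location", "application/json", nil)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("location lookup failed: %s", resp.Status)
	}
	var target uploadTarget
	if err := json.NewDecoder(resp.Body).Decode(&target); err != nil {
		return 0, err
	}

	// Step 2: PUT straight to the node that will hold the data.
	req, err := http.NewRequest(http.MethodPut, target.UploadURL, bytes.NewReader(data))
	if err != nil {
		return 0, err
	}
	put, err := c.Do(req)
	if err != nil {
		return 0, err
	}
	defer put.Body.Close()
	return put.StatusCode, nil
}

// demo wires both endpoints to in-process test servers and returns the PUT
// status plus the number of bytes the storage server received.
func demo() (int, int, error) {
	stored := 0
	storage := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		b, _ := io.ReadAll(r.Body)
		stored = len(b)
		w.WriteHeader(http.StatusCreated)
	}))
	defer storage.Close()

	front := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		json.NewEncoder(w).Encode(uploadTarget{UploadURL: storage.URL + "/upload"})
	}))
	defer front.Close()

	status, err := uploadPiece(front.Client(), front.URL, "examplecid", []byte("hello"))
	return status, stored, err
}

func main() {
	status, stored, err := demo()
	fmt.Println(status, stored, err)
}
```

Authorization (for example the UCAN header mentioned above) is left out of the sketch; it would presumably be attached to both requests.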

magik6k · 2024-08-20T10:06:37Z

rfc/curio-api-rfc.md

+
+*TODO: do we need an interim response given this is a chain transaction with a place to fetch the set-id later?*
+
+### GET /proof-sets/{set-id}


Maybe link to some spec which lays out what those proof-sets look like / what they are

Truth be told I'm just betting based off the PDP service contract doc

magik6k · 2024-09-30T09:39:19Z

rfc/curio-api-rfc.md

+There are additional considerations we should discuss:
+1. Authorization -- In storacha's network, it's important that the original end user maintain control of authorization for any action performed (including retrieval). We accomplish this through UCANs. We should discuss how we can maintain this without forcing curio to implement a full UCAN authorization process.
+2. Aggregation - storacha's data is at times extremely small (<1mb in certain cases). Our understanding is that economically, it makes more sense to do some light aggregation of data before adding it to the proof set. The proposal below outlines a facility for doing this. While storacha would store pieces as it receives them, we would add them to the proof set in a separate step, with a root that could optionally be an aggregate of several pieces.
+3. IPNI announcements -- we plan to use IPNI announcements in a specific way with our pieces. Our understanding is the curio IPNI flow is in flux. We can try to integrate your IPNI api or just do it ourselves.


IPNI is pretty much implemented now, with the following schema that coordinates it on the curio side:

-- Table for storing IPNI ads
CREATE TABLE ipni (
    order_number BIGSERIAL PRIMARY KEY, -- Unique increasing order number
    ad_cid TEXT NOT NULL,
    context_id BYTEA NOT NULL, -- abi.PieceInfo in Curio
    -- metadata column is not required as Curio only supports one type of metadata (HTTP)
    is_rm BOOLEAN NOT NULL,
    previous TEXT, -- previous ad will only be null for first ad in chain
    provider TEXT NOT NULL, -- peerID from libp2p, this is main identifier on IPNI side
    addresses TEXT NOT NULL, -- HTTP retrieval server addresses
    signature BYTEA NOT NULL,
    entries TEXT NOT NULL, -- CID of first link in entry chain
    unique (ad_cid)
);

CREATE TABLE ipni_head (
    provider TEXT NOT NULL PRIMARY KEY, -- PeerID from libp2p, this is the main identifier
    head TEXT NOT NULL, -- ad_cid from the ipni table, representing the head of the ad chain
    FOREIGN KEY (head) REFERENCES ipni(ad_cid) ON DELETE RESTRICT -- Prevents deletion if it's referenced
);

-- This table stores metadata for ipni ad entry chunks. This metadata is used to reconstruct the original ad entry from
-- on-disk .car block headers or from data in the piece index database.
CREATE TABLE ipni_chunks (
    cid TEXT PRIMARY KEY, -- CID of the chunk
    piece_cid TEXT NOT NULL, -- Related Piece CID
    chunk_num INTEGER NOT NULL, -- Chunk number within the piece. Chunk 0 has no "next" link.
    first_cid TEXT, -- In case of db-based chunks, the CID of the first cid in the chunk
    start_offset BIGINT, -- In case of .car-based chunks, the offset in the .car file where the chunk starts
    num_blocks BIGINT NOT NULL, -- Number of blocks in the chunk
    from_car BOOLEAN NOT NULL, -- Whether the chunk is from a .car file or from the database
    CHECK (
        (from_car = FALSE AND first_cid IS NOT NULL AND start_offset IS NULL) OR
        (from_car = TRUE AND first_cid IS NULL AND start_offset IS NOT NULL)
    ),
    UNIQUE (piece_cid, chunk_num)
);

Now, IPNI likes larger ads, so ideally storacha would create aggregate ads for multiple pieces; we can extend ipni_chunks to support reading from storacha-stored pieces (though really technically just the piececid works fine there)
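For anyone writing rows into `ipni_chunks` from outside Curio, the CHECK constraint in the schema can be mirrored on the client side before an insert is attempted. The following is a minimal sketch, not Curio's code: the `ipniChunk` struct and `validateChunk` function are hypothetical names, and nil pointers stand in for SQL NULL.

```go
package main

import (
	"errors"
	"fmt"
)

// ipniChunk mirrors the columns of the ipni_chunks table.
// Nil pointers represent SQL NULL. The type name is illustrative only.
type ipniChunk struct {
	CID         string
	PieceCID    string
	ChunkNum    int
	FirstCID    *string // set only for db-based chunks
	StartOffset *int64  // set only for .car-based chunks
	NumBlocks   int64
	FromCar     bool
}

// validateChunk applies the same rule as the table's CHECK constraint:
// .car-based chunks carry start_offset and no first_cid, and db-based
// chunks carry first_cid and no start_offset.
func validateChunk(c ipniChunk) error {
	if c.FromCar {
		if c.FirstCID != nil || c.StartOffset == nil {
			return errors.New("car-based chunk needs start_offset and no first_cid")
		}
		return nil
	}
	if c.FirstCID == nil || c.StartOffset != nil {
		return errors.New("db-based chunk needs first_cid and no start_offset")
	}
	return nil
}

func main() {
	off := int64(0)
	first := "examplecid"
	fmt.Println(validateChunk(ipniChunk{FromCar: true, StartOffset: &off}))
	fmt.Println(validateChunk(ipniChunk{FromCar: false, FirstCID: &first}))
	fmt.Println(validateChunk(ipniChunk{FromCar: true, FirstCID: &first}))
}
```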

magik6k · 2024-09-30T09:58:08Z

rfc/curio-api-rfc.md

+## Piece Storage
+
+
+### POST /piece


We definitely need to define how authorization works on this endpoint. This can't just be entirely open.

Also should define the lifecycle of the uploaded data somehow:

How long is it expected to stick around in storage after upload before being included in a proof-set? When should the data be removed if not added to a proof set?

Signalling for expected indexing with IPNI / ipfs-type (trustless gateway/bitswap) retrievals, and who can retrieve the piece?

What is the contract for retrieval - is it retrievable atomically when the notify hook is called? After inclusion in a proof set?

Also what are the size bounds for pieces that you expect curio to support? We can support even very large pieces (100G+), but I don't think a client-push model is a good idea above 1GB, where managing short-term buffers becomes a real concern, and download retry becomes non-optional.
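One way to picture the size bound is a simple switch between transfer modes, sketched below. Everything here is an assumption for illustration: the threshold is taken as 1 GiB (the comment only says "1GB"), and the "provider-pull" mode, where curio downloads the piece itself and can retry, is only one possible alternative to client push and is not something the RFC specifies.

```go
package main

import "fmt"

// transferMode names how piece data reaches the provider.
type transferMode string

const (
	modePush transferMode = "client-push"   // client uploads the bytes in the request
	modePull transferMode = "provider-pull" // hypothetical: provider fetches and can retry
)

// pushLimitBytes is an assumed cutoff of 1 GiB.
const pushLimitBytes int64 = 1 << 30

// chooseTransfer picks client push up to and including the limit,
// and provider pull for anything larger.
func chooseTransfer(size int64) transferMode {
	if size > pushLimitBytes {
		return modePull
	}
	return modePush
}

func main() {
	fmt.Println(chooseTransfer(512 << 10)) // a small piece
	fmt.Println(chooseTransfer(100 << 30)) // a 100 GiB piece
}
```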

feat(pdp): proposed curio api

e3ed7cd

refactor(api): refactor api to be more specific

b9abced

magik6k mentioned this pull request Sep 30, 2024

[WIP] feat: PDP filecoin-project/curio#227

Draft

27 tasks
