Proposed curio api #34
proposes an HTTP API to curio that could be used by either Storacha or another hot storage market provider for PDP proofs
rfc/curio-api-rfc.md
With the additional stipulation that they should probably ONLY accept v2 Piece CID: https://github.com/filecoin-project/FIPs/blob/master/FRCs/frc-0069.md

### PUT /piece/{piece cid v2}
Might require a UCAN header for auth
Also for scalability it would be a good idea to have the main endpoint tell you where to go with the data.
Unfortunately redirecting PUT/POST with 3xx is really wonky (pretty much broken with the Go http.Client), so it might just require some two-endpoint setup.
ah ok. I get it -- our main endpoint is like this too! :)
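A minimal sketch of the two-endpoint setup discussed in this thread, under stated assumptions: the client first asks the main endpoint where to send the data, then issues a plain PUT to the returned URL, so no 3xx redirect is involved. The paths (`/piece`, `/upload/`), the JSON field names (`pieceCid`, `uploadUrl`), and the status codes are all hypothetical and not part of the RFC.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// uploadTicket is a hypothetical response body telling the client where to send the bytes.
type uploadTicket struct {
	UploadURL string `json:"uploadUrl"`
}

// newDemoMux serves both endpoints from one mux for brevity. In a real
// deployment the upload URL could point at a different host.
func newDemoMux(baseURL func() string, store *sync.Map) *http.ServeMux {
	mux := http.NewServeMux()
	// Step 1: the main endpoint only hands out an upload location.
	mux.HandleFunc("/piece", func(w http.ResponseWriter, r *http.Request) {
		var req struct {
			PieceCID string `json:"pieceCid"`
		}
		if r.Method != http.MethodPost || json.NewDecoder(r.Body).Decode(&req) != nil || req.PieceCID == "" {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(http.StatusCreated)
		json.NewEncoder(w).Encode(uploadTicket{UploadURL: baseURL() + "/upload/" + req.PieceCID})
	})
	// Step 2: the data endpoint accepts the bytes with a plain PUT.
	mux.HandleFunc("/upload/", func(w http.ResponseWriter, r *http.Request) {
		data, err := io.ReadAll(r.Body)
		if r.Method != http.MethodPut || err != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		store.Store(strings.TrimPrefix(r.URL.Path, "/upload/"), data)
		w.WriteHeader(http.StatusNoContent)
	})
	return mux
}

// uploadPiece negotiates an upload location, then PUTs the data there.
func uploadPiece(c *http.Client, mainURL, pieceCID string, data []byte) error {
	body, _ := json.Marshal(map[string]string{"pieceCid": pieceCID})
	resp, err := c.Post(mainURL+"/piece", "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusCreated {
		return fmt.Errorf("negotiate: unexpected status %s", resp.Status)
	}
	var ticket uploadTicket
	if err := json.NewDecoder(resp.Body).Decode(&ticket); err != nil {
		return err
	}
	req, err := http.NewRequest(http.MethodPut, ticket.UploadURL, bytes.NewReader(data))
	if err != nil {
		return err
	}
	putResp, err := c.Do(req)
	if err != nil {
		return err
	}
	defer putResp.Body.Close()
	if putResp.StatusCode != http.StatusNoContent {
		return fmt.Errorf("upload: unexpected status %s", putResp.Status)
	}
	return nil
}

// runDemo starts a local test server, uploads one piece, and returns what the server stored.
func runDemo(pieceCID string, data []byte) (string, error) {
	store := &sync.Map{}
	var srv *httptest.Server
	srv = httptest.NewServer(newDemoMux(func() string { return srv.URL }, store))
	defer srv.Close()
	if err := uploadPiece(srv.Client(), srv.URL, pieceCID, data); err != nil {
		return "", err
	}
	stored, _ := store.Load(pieceCID)
	got, _ := stored.([]byte)
	return string(got), nil
}

func main() {
	got, err := runDemo("examplepiececid", []byte("hello"))
	fmt.Println(got, err)
}
```

Authorization (for example the UCAN header mentioned above) is left out of the sketch; it would presumably be checked on the first request and bound to the returned upload URL.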
*TODO: do we need an interim response given this is a chain transaction with a place to fetch the set-id later?*

### GET /proof-sets/{set-id}
Maybe link to a spec that lays out what those proof sets are and what they look like
Truth be told I'm just betting based off the PDP service contract doc
There are additional considerations we should discuss:
1. Authorization -- In Storacha's network, it's important that the original end user maintain control of authorization for any action performed (including retrieval). We accomplish this through UCANs. We should discuss how we can maintain this without forcing Curio to implement a full UCAN authorization process.
2. Aggregation -- Storacha's data is at times extremely small (<1 MB in certain cases). Our understanding is that economically, it makes more sense to do some light aggregation of data before adding it to the proof set. The proposal below outlines a facility for doing this. While Storacha would store pieces as it receives them, we would add them to the proof set in a separate step, with a root that could optionally be an aggregate of several pieces.
3. IPNI announcements -- we plan to use IPNI announcements in a specific way with our pieces. Our understanding is that the Curio IPNI flow is in flux. We can try to integrate your IPNI API or just do it ourselves.
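To make the aggregation point concrete, here is a rough sketch of the "store first, add to the proof set later" flow: stored pieces are queued until their combined size crosses a threshold, and only then handed off as one batch. The `piece` and `batcher` types, the 1 MiB threshold, and the placeholder CIDs are assumptions for illustration; computing the actual aggregate root is out of scope here.

```go
package main

import "fmt"

// piece is a hypothetical record for an already-stored piece.
type piece struct {
	CID  string // placeholder string, not a real piece CID
	Size uint64 // size in bytes
}

// batcher groups stored pieces until their combined size reaches minBatch.
type batcher struct {
	minBatch uint64
	pending  []piece
	total    uint64
}

// add queues a piece and returns the full batch once the threshold is
// reached, or nil while the batch is still filling.
func (b *batcher) add(p piece) []piece {
	b.pending = append(b.pending, p)
	b.total += p.Size
	if b.total < b.minBatch {
		return nil
	}
	out := b.pending
	b.pending, b.total = nil, 0
	return out
}

func main() {
	b := &batcher{minBatch: 1 << 20} // 1 MiB threshold, chosen arbitrarily
	sizes := []uint64{300 << 10, 500 << 10, 400 << 10, 100 << 10}
	for i, size := range sizes {
		if batch := b.add(piece{CID: fmt.Sprintf("piece-%d", i), Size: size}); batch != nil {
			// This is where a root covering the batch would be added to the proof set.
			fmt.Printf("add root aggregating %d pieces\n", len(batch))
		}
	}
	fmt.Printf("%d piece(s) still pending\n", len(b.pending))
}
```

A real implementation would probably also flush on a timer so that a slow trickle of small pieces does not wait indefinitely.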
IPNI is pretty much implemented now, with the following schema that coordinates it on the curio side:
-- Table for storing IPNI ads
CREATE TABLE ipni (
    order_number BIGSERIAL PRIMARY KEY, -- Unique increasing order number
    ad_cid TEXT NOT NULL,
    context_id BYTEA NOT NULL, -- abi.PieceInfo in Curio
    -- metadata column is not required as Curio only supports one type of metadata (HTTP)
    is_rm BOOLEAN NOT NULL,
    previous TEXT, -- previous ad will only be null for first ad in chain
    provider TEXT NOT NULL, -- peerID from libp2p, this is main identifier on IPNI side
    addresses TEXT NOT NULL, -- HTTP retrieval server addresses
    signature BYTEA NOT NULL,
    entries TEXT NOT NULL, -- CID of first link in entry chain
    unique (ad_cid)
);

CREATE TABLE ipni_head (
    provider TEXT NOT NULL PRIMARY KEY, -- PeerID from libp2p, this is the main identifier
    head TEXT NOT NULL, -- ad_cid from the ipni table, representing the head of the ad chain
    FOREIGN KEY (head) REFERENCES ipni(ad_cid) ON DELETE RESTRICT -- Prevents deletion if it's referenced
);

-- This table stores metadata for ipni ad entry chunks. This metadata is used to reconstruct the original ad entry from
-- on-disk .car block headers or from data in the piece index database.
CREATE TABLE ipni_chunks (
    cid TEXT PRIMARY KEY, -- CID of the chunk
    piece_cid TEXT NOT NULL, -- Related Piece CID
    chunk_num INTEGER NOT NULL, -- Chunk number within the piece. Chunk 0 has no "next" link.
    first_cid TEXT, -- In case of db-based chunks, the CID of the first cid in the chunk
    start_offset BIGINT, -- In case of .car-based chunks, the offset in the .car file where the chunk starts
    num_blocks BIGINT NOT NULL, -- Number of blocks in the chunk
    from_car BOOLEAN NOT NULL, -- Whether the chunk is from a .car file or from the database
    CHECK (
        (from_car = FALSE AND first_cid IS NOT NULL AND start_offset IS NULL) OR
        (from_car = TRUE AND first_cid IS NULL AND start_offset IS NOT NULL)
    ),
    UNIQUE (piece_cid, chunk_num)
);
Now, IPNI likes larger ads, so ideally Storacha would create aggregate ads for multiple pieces; we can extend ipni_chunks
to support reading from Storacha-stored pieces (though technically just the piece CID works fine there)
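For anyone consuming this schema from Go, the `CHECK` constraint on `ipni_chunks` can be mirrored client-side. The sketch below is illustrative only: the `ipniChunk` struct and its `validate` method are hypothetical and not taken from the Curio codebase. Pointer fields stand in for the nullable columns.

```go
package main

import (
	"errors"
	"fmt"
)

// ipniChunk is a hypothetical Go mirror of one ipni_chunks row.
type ipniChunk struct {
	CID         string
	PieceCID    string
	ChunkNum    int
	FirstCID    *string // set only for db-based chunks
	StartOffset *int64  // set only for .car-based chunks
	NumBlocks   int64
	FromCar     bool
}

// validate applies the same rule as the table's CHECK constraint:
// exactly one of first_cid / start_offset is set, selected by from_car.
func (c ipniChunk) validate() error {
	if c.FromCar {
		if c.FirstCID != nil || c.StartOffset == nil {
			return errors.New("car-based chunk needs start_offset and no first_cid")
		}
		return nil
	}
	if c.FirstCID == nil || c.StartOffset != nil {
		return errors.New("db-based chunk needs first_cid and no start_offset")
	}
	return nil
}

func main() {
	offset := int64(4096)
	first := "example-first-cid" // placeholder, not a real CID
	carChunk := ipniChunk{CID: "chunk-a", PieceCID: "piece-a", StartOffset: &offset, NumBlocks: 10, FromCar: true}
	mixed := ipniChunk{CID: "chunk-b", PieceCID: "piece-a", ChunkNum: 1, FirstCID: &first, StartOffset: &offset, NumBlocks: 10}
	fmt.Println(carChunk.validate())
	fmt.Println(mixed.validate())
}
```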
## Piece Storage

### POST /piece
We definitely need to define how authorization works on this endpoint. This can't just be entirely open.
Also should define the lifecycle of the uploaded data somehow:
- How long is it expected to stick around in storage after upload before being included in a proof-set? When should the data be removed if not added to a proof set?
- Signalling for expected indexing with IPNI / ipfs-type (trustless gateway/bitswap) retrievals, and who can retrieve the piece?
- What is the contract for retrieval - is it retrievable atomically when the notify hook is called? After inclusion in a proof set?
Also, what are the size bounds for pieces that you expect Curio to support? We can support even very large pieces (100G+), but I don't think a client-push model is a good idea above 1GB, where managing short-term buffers becomes a real concern and download retry becomes non-optional.
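If a push-size cap along these lines were adopted, one way to enforce it on the server is Go's `http.MaxBytesReader`, which stops reading once the body exceeds the limit. This is a sketch only: the 1 GiB constant, the handler, and the `statusFor` helper are assumptions, and `http.MaxBytesError` requires Go 1.19 or later.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// maxPushBytes is a hypothetical cap for client-push uploads (1 GiB).
const maxPushBytes = 1 << 30

// limitedUpload rejects request bodies larger than limit with 413.
func limitedUpload(limit int64) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, limit)
		// A real handler would stream to storage; the sketch just discards the bytes.
		n, err := io.Copy(io.Discard, r.Body)
		var tooBig *http.MaxBytesError
		if errors.As(err, &tooBig) {
			http.Error(w, "piece too large for push upload", http.StatusRequestEntityTooLarge)
			return
		}
		if err != nil {
			http.Error(w, "read error", http.StatusBadRequest)
			return
		}
		fmt.Fprintf(w, "received %d bytes", n)
	}
}

// statusFor runs the handler in-process and reports the response status code.
func statusFor(limit int64, body string) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPut, "/piece", strings.NewReader(body))
	limitedUpload(limit)(rec, req)
	return rec.Code
}

func main() {
	// Tiny limits keep the demo fast; production would use maxPushBytes.
	fmt.Println(statusFor(4, "hi"))
	fmt.Println(statusFor(4, "hello"))
	_ = maxPushBytes
}
```

Pieces above the cap would then need a pull-based path, where the server fetches the data and can retry on its own schedule.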