Create group-by-space-provider.md #2

Open
wants to merge 1 commit into base: main


Gozala
Contributor

@Gozala Gozala commented Jul 21, 2023

https://filecoinproject.slack.com/archives/C02BZPRS9HP/p1689950046118329?thread_ts=1689873032.504899&cid=C02BZPRS9HP

building a bit on top of this, and thinking about the flow of producers sending us pieces, it looks like this would fit as part of the w3-aggregation spec as a UCAN capability.
I think later on (after migration) it can be done as an effect of `store/add` together with `piece/build` (or similar).
Contributor Author

I think capturing that in the effect is a cool idea, as it will appear in the receipt and would allow the recipient to look it up.

We should also consider how clients could submit piece CIDs to us. Perhaps we could have piece/add that users could invoke, and have piece/verify instead of piece/build; that way, if we already have both a CAR and a piece, we could avoid that effect.

In the future we could also use piece/add in place of the S3 event to kick things off, and if the user never did piece/add then it does not end up in Filecoin, and that's on them.

Contributor

ok, so piece/add followed by piece/verify, which would trigger the pipeline if verified. Still need to think more about how we would skip verify in some steps. Maybe the invocation is just triggered and the handler can decide to skip and state its reasoning in the receipt

Contributor Author

I mean if it has already been verified, it can be considered a noop. If for some reason it chooses to trust and not verify, that's ok too; in that case we verify that the request is from a trusted actor.
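
The decision described in this thread (noop when already verified, trust without verifying for known actors, verify otherwise, with the reasoning stated in the receipt) could be sketched roughly as below. This is only an illustration under assumptions: `PieceAdd`, `Receipt`, `Context`, `createContext` and `handlePieceAdd` are hypothetical names, not part of w3up or ucanto, and `verify` stands in for recomputing the piece CID from the CAR.

```typescript
// Hypothetical sketch: these names are illustrative, not the w3up or ucanto API.
type PieceAdd = { issuer: string; content: string; piece: string };

type Receipt =
  | { ok: { status: "verified" | "noop" | "trusted" } }
  | { error: { reason: string } };

interface Context {
  verified: Set<string>; // claims already checked, keyed by "content:piece"
  trusted: Set<string>; // issuer DIDs accepted without recomputing the piece
  verify: (content: string, piece: string) => boolean; // recompute and compare
}

function createContext(
  trusted: string[],
  verify: (content: string, piece: string) => boolean
): Context {
  return { verified: new Set<string>(), trusted: new Set<string>(trusted), verify };
}

function handlePieceAdd(input: PieceAdd, ctx: Context): Receipt {
  const key = `${input.content}:${input.piece}`;
  // Already verified: nothing to do, and the receipt says so.
  if (ctx.verified.has(key)) return { ok: { status: "noop" } };
  // Trusted actor: skip verification and state that in the receipt.
  if (ctx.trusted.has(input.issuer)) return { ok: { status: "trusted" } };
  // Otherwise verify before letting the piece into the pipeline.
  if (!ctx.verify(input.content, input.piece)) {
    return { error: { reason: "piece does not match content" } };
  }
  ctx.verified.add(key);
  return { ok: { status: "verified" } };
}
```

Keeping the skip reason in the receipt status means a recipient can tell a verified piece from a trusted one without re-running the handler.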


So here, I was thinking about having a `piece/offer` invocation: w3up, after building the piece CID, as well as the CF hosted APIs, would invoke `piece/offer`, which would be handled by w3filecoin, responding either queued or done if we actually already have it in an aggregate.
Contributor Author

That sounds good
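
A minimal sketch of the "queued or done" response described for `piece/offer` above, assuming an in-memory aggregator. `PieceQueue`, `OfferResult` and `seal` are hypothetical names used only for illustration; the real w3filecoin handler and its storage are not shown in this thread.

```typescript
// Hypothetical sketch of an aggregator-side `piece/offer` handler.
type OfferResult =
  | { status: "queued" }
  | { status: "done"; aggregate: string };

class PieceQueue {
  private queued = new Set<string>(); // pieces waiting for an aggregate
  private aggregateByPiece = new Map<string, string>(); // piece -> aggregate

  offer(piece: string): OfferResult {
    const aggregate = this.aggregateByPiece.get(piece);
    // We already have this piece in an aggregate: report it as done.
    if (aggregate !== undefined) return { status: "done", aggregate };
    // Otherwise queue it; using a Set makes repeated offers idempotent.
    this.queued.add(piece);
    return { status: "queued" };
  }

  // Stand-in for the aggregation pipeline: everything queued lands in `aggregate`.
  // Returns how many pieces were included.
  seal(aggregate: string): number {
    const count = this.queued.size;
    this.queued.forEach((piece) => this.aggregateByPiece.set(piece, aggregate));
    this.queued.clear();
    return count;
  }
}
```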

What I still fail to capture is how we would tie this to a provider to set the message group id. Probably this needs to be a `nb` field? I guess on older CF hosted APIs, it would just be hardcoded with the free provider for each product.
w3up could be more intelligent and check which space `store/add` used for that CAR CID and then get the registered provider for the space.
Contributor Author

I was thinking that users should interact with a system by tenant DID, meaning invocation aud is a tenant DID like web3.storage or nft.storage. That was also how I was thinking we could avoid having to introduce tags for data stored / uploaded. It also ties nicely with ucan invocation router idea.

However, the group resolution logic needs a bit more thought, because tenant:capability-provider is 1:n, not 1:1, and space:capability-provider is also 1:n.

I am actually not sure how we are going to signal which capability-provider should handle the upload when a space has multiple ones associated. Perhaps aud should be a capability-provider and not a tenant, in which case everything would fit right in. E.g. free.web3.storage and basic.web3.storage could be an aud.

I'm actually a bit fuzzy on the DB details now; worth asking @travis, as that's probably in his headspace
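
To make the 1:n problem concrete, here is a rough sketch of group resolution from the invocation audience. It assumes a simple tenant to capability-provider table; `resolveGroup`, `ProviderTable` and `exampleTable` are hypothetical, and `did:web:free.nft.storage` is an invented entry used only to show the single-provider case.

```typescript
// Hypothetical sketch: table contents and function names are illustrative only.
type Resolution = { ok: string } | { error: string };

// tenant DID -> capability-provider DIDs (1:n)
type ProviderTable = Record<string, string[]>;

const exampleTable: ProviderTable = {
  "did:web:web3.storage": ["did:web:free.web3.storage", "did:web:basic.web3.storage"],
  "did:web:nft.storage": ["did:web:free.nft.storage"], // invented for illustration
};

function resolveGroup(aud: string, table: ProviderTable): Resolution {
  // Case 1: `aud` is itself a capability-provider, so it names the group directly.
  for (const tenant of Object.keys(table)) {
    if ((table[tenant] ?? []).indexOf(aud) !== -1) return { ok: aud };
  }
  // Case 2: `aud` is a tenant; that is only unambiguous with exactly one provider.
  const providers = table[aud];
  if (providers === undefined) return { error: `unknown audience ${aud}` };
  const [only] = providers;
  if (providers.length === 1 && only !== undefined) return { ok: only };
  return { error: `ambiguous: ${aud} has ${providers.length} providers` };
}
```

With a tenant as `aud` the web3.storage case cannot be resolved without extra information, which is the argument made above for using the capability-provider as `aud`.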

Contributor Author

Talked with @travis and captured findings in storacha/w3up#839

Contributor

@travis could you give us more details on what you think here?

Contributor Author

@vasco-santos the linked issue captures what @travis and I came to agree was a good direction. Also, I think @travis may not be seeing this thread in notifications, so it might be worth messaging him out of band.

Member

Am I understanding correctly that the issue here is that we want to do "Filecoin stuff" differently depending on which plan a user has?

From a user perspective I was imagining something like the following:

  1. Alice comes in and signs up and gets a free.web3.storage plan by default, which gives them (eg) 100MB of storage
  2. Alice uploads a 50MB file, Alice has one upload provider, we use it, everything works, Alice is happy
  3. Alice uploads a 60 MB file, the system tells her "sorry, you don't have enough storage in your plan, but you can sign up for a basic plan for $10/mo that "
  4. Alice signs up for a basic plan, retries the upload, it works!

I think the issue is that we would want to get the second upload (the 60MB one) into Filecoin ASAP since it is "covered" by the basic plan, but the first upload might not be as high a priority (or in the future might not get added to Filecoin at all?) since it was uploaded as part of the free plan - is that right or am I misunderstanding?

It seems like this comes down to whether we want to push the complexity of picking a "storage plan" that an upload should be associated with out to the clients, or whether we want to commit to heuristics that we would use on our end to make decisions like the one in the previous paragraph - does that sound right? I could see advantages either way, and I do think we could build our client libraries to make some of those heuristic decisions, but overall I think I prefer trying to hide this complexity from clients and keep the protocol API a little "simpler" for end-users?

That said I do think either could work! The one thing I really want to avoid is forcing w3cli or w3ui users to manually choose a plan for each upload - that feels like unnecessary and surprising friction that could be detrimental to people using and adopting w3up - I think we can avoid that whatever direction we go though, so that's not a hard blocker afaik.

Am I on the mark here or misunderstanding the conversation entirely?

Contributor

@travis yes that is the overall discussion here.

Per storacha/w3up#839, the proposal seems to be that, for invocations like store/add, the space/allocate invocation would receive the tenant information for the space where store/add is going to allocate space. With that, on a store/add invocation the backend would look at the current plan for the provider (based on audience, so that we support spaces with multiple tenants) and:

  • write the associated provider and plan in the store-table
  • return that information in the store/add receipt

That way, when we receive the CAR file and piece/add is computed, the information already exists in our table and we can proceed with the Filecoin piece under the appropriate SLA
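
The two steps listed above can be sketched as follows, under stated assumptions: `StoreTable`, `StoreRecord`, `PlanLookup` and `storeAdd` are hypothetical names, the table is in memory, and the plan lookup is a plain callback standing in for whatever the backend uses to find the current plan for a space and audience.

```typescript
// Hypothetical sketch: an in-memory stand-in for the store-table.
type StoreRecord = { space: string; link: string; provider: string; plan: string };

class StoreTable {
  private records = new Map<string, StoreRecord>(); // keyed by CAR link

  put(record: StoreRecord): void {
    this.records.set(record.link, record);
  }

  get(link: string): StoreRecord | undefined {
    return this.records.get(link);
  }
}

// Returns the plan that is active right now for this space and audience.
type PlanLookup = (space: string, aud: string) => string | undefined;

function storeAdd(
  table: StoreTable,
  currentPlan: PlanLookup,
  input: { space: string; link: string; aud: string }
): StoreRecord | undefined {
  const plan = currentPlan(input.space, input.aud);
  if (plan === undefined) return undefined; // no plan for this audience
  // Snapshot the provider and plan at upload time.
  const record: StoreRecord = {
    space: input.space,
    link: input.link,
    provider: input.aud,
    plan,
  };
  table.put(record);
  return record; // this is what would be reported in the store/add receipt
}
```

Because the record is written once at `store/add` time, a later plan change does not rewrite earlier rows, so a lookup by CAR link during piece handling sees the plan that applied when the file was uploaded.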

Member

ok cool - that makes sense to me! I have a concern I haven't been able to fully articulate about capturing the "plan" in the store-table, but as long as the semantics are "this is the plan that was active/current when the file was uploaded" I think this sounds great.

// Piece CID
"piece": { "/": "commitment...proof" },
// Car CID
"link": { "/": "bag..." },
Contributor Author

I would call this content or a payload instead. I think names should describe what the thing is not the encoding of that thing.

"piece": { "/": "commitment...proof" },
// Car CID
"link": { "/": "bag..." },
"provider": "did:web:free.web3.storage"
Contributor Author

I think this should be the with field which right now is pretty pointless
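
Taking the two naming suggestions in these threads together (describe the CAR as `content` rather than `link`, and carry the provider in `with`), the capability fields might be shaped roughly like this. It is a sketch only: `CapabilityFields` and `describePiece` are hypothetical, the ability name is left out because it is still under discussion, and CIDs are passed in as opaque strings.

```typescript
// Hypothetical sketch of the capability fields after both renames.
type Link = { "/": string };

interface CapabilityFields {
  with: string; // the capability-provider DID, instead of a separate "provider" field
  nb: {
    piece: Link; // Piece CID
    content: Link; // CAR CID, named for what it is rather than how it is encoded
  };
}

function describePiece(provider: string, pieceCid: string, contentCid: string): CapabilityFields {
  // `with` is expected to be a DID such as did:web:free.web3.storage.
  if (provider.indexOf("did:") !== 0) {
    throw new Error(`expected a DID, got ${provider}`);
  }
  return {
    with: provider,
    nb: { piece: { "/": pieceCid }, content: { "/": contentCid } },
  };
}
```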

}
```

or perhaps we want to clearly create a w3filecoin key pair, and the whole aggregation spec should be written not as the storefront but as the aggregator (as we already discussed)
Contributor Author

That would make more sense to me. Making those independent roles gives us more flexibility; we can always choose to make the same actor play both roles.

flowchart LR

user --store/add--> storefront
storefront --piece/build--> storefront
Contributor Author

I like this. We could implement the capability on both the CF and AWS sides and then route with ucanto to the one that has the content nearby.

Comment on lines +42 to +43
storefront --piece/build--> storefront
storefront --piece/offer--> aggregator
Contributor Author

I would advise against using the same piece/ prefix across actors, as it gets very confusing quickly. Generally I suggest thinking of the prefix as an object with methods (build, offer).

An object can have a single owner, which is either a storefront or an aggregator, but it cannot be both (kind of like the Rust ownership model, but also true in the actor model etc…).

It also has other implications: when I delegate you piece/*, I have essentially given you a ref to that object, but if part of the object is owned by one actor and the other part by another, it's confusing and not always possible.

Perhaps it could be something like

Suggested change
- storefront --piece/build--> storefront
- storefront --piece/offer--> aggregator
+ storefront --piece/add--> storefront
+ storefront --segment/add--> aggregator
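
The single-owner rule argued for above can be illustrated with a small registry that refuses to let two actors own abilities under the same prefix. This is a sketch only; `CapabilityRegistry`, `prefixOf`, `register` and `route` are hypothetical names and do not reflect how ucanto routes invocations.

```typescript
// Hypothetical sketch: one owner per capability prefix ("piece", "segment", ...).
function prefixOf(ability: string): string {
  const slash = ability.indexOf("/");
  return slash === -1 ? ability : ability.slice(0, slash);
}

class CapabilityRegistry {
  private owners = new Map<string, string>(); // prefix -> owning actor

  register(owner: string, ability: string): void {
    const prefix = prefixOf(ability);
    const current = this.owners.get(prefix);
    // The prefix behaves like an object with methods: it has exactly one owner.
    if (current !== undefined && current !== owner) {
      throw new Error(`${prefix}/* is already owned by ${current}`);
    }
    this.owners.set(prefix, owner);
  }

  // Delegating or invoking `prefix/*` always resolves to a single actor.
  route(ability: string): string | undefined {
    return this.owners.get(prefixOf(ability));
  }
}
```

Under this rule the suggested `piece/add` on the storefront and `segment/add` on the aggregator register cleanly, while the original `piece/build` plus `piece/offer` split across two actors would be rejected.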

storefront --piece/build--> storefront
storefront --piece/offer--> aggregator
aggregator --aggregate/offer--> authority
authority --offer/arrange--> storefront
Contributor Author

Contributor

I was actually thinking the same before

flowchart LR

user --store/list--> storefront
storefront --piece/status--> aggregator
Contributor Author

Seems like the content → piece → aggregate mapping should live in the storefront, so it does not have to go through the aggregator to figure out the deal status.

So in fact, instead of writing to dynamo when a deal goes through, as we were going to do, perhaps we should let the storefront know, which in turn can write down the mapping it needs to be able to answer those questions. That way the aggregator / w3filecoin just takes care of aggregation and deal arrangement; after that, it's done.
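
A rough sketch of that idea, assuming the storefront keeps the mapping in memory and the aggregator calls back when a deal goes through. `StorefrontIndex`, `recordPiece`, `recordDeal` and `status` are hypothetical names; how the aggregator actually notifies the storefront is not decided in this thread.

```typescript
// Hypothetical sketch: the storefront owns the content -> piece -> aggregate mapping.
type Status = { piece?: string; aggregate?: string; deal?: string };

class StorefrontIndex {
  private pieceByContent = new Map<string, string>();
  private aggregateByPiece = new Map<string, string>();
  private dealByAggregate = new Map<string, string>();

  // Written when the storefront computes or receives the piece for some content.
  recordPiece(content: string, piece: string): void {
    this.pieceByContent.set(content, piece);
  }

  // Called when the aggregator reports that a deal for `aggregate` went through.
  recordDeal(aggregate: string, pieces: string[], deal: string): void {
    pieces.forEach((piece) => this.aggregateByPiece.set(piece, aggregate));
    this.dealByAggregate.set(aggregate, deal);
  }

  // Answers status questions locally, without asking the aggregator.
  status(content: string): Status {
    const piece = this.pieceByContent.get(content);
    if (piece === undefined) return {};
    const aggregate = this.aggregateByPiece.get(piece);
    if (aggregate === undefined) return { piece };
    const deal = this.dealByAggregate.get(aggregate);
    return deal === undefined ? { piece, aggregate } : { piece, aggregate, deal };
  }
}
```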

Gozala added a commit to storacha/specs that referenced this pull request Aug 9, 2023
…es (#71)

Based on my previous proposal to @Gozala that @Gozala covered in
storacha/RFC#2

Changes the aggregation spec introducing new capabilities and a new
role. Main reasoning for this change is the new distinction between
Storefront and Aggregator (previously had same identity), where a
Storefront (web3.storage, nftstorage) will receive/compute Filecoin
pieces (from received CAR files) and offer them to an Aggregator
(w3filecoin). The aggregator's role is to put multiple pieces together
to offer a large piece (aggregate) to a Broker (spade-proxy).

Per the above, the new invocations added are described in
https://www.notion.so/2023-07-26-aggregation-protocol-contract-sign-and-report-API-12b4a7f0e92f4b789c331322e2629817

Renamed spec from aggregation to w3 filecoin given the context increased
from aggregation to accommodate new requirements

Note that `list` will be added later on

---------

Co-authored-by: Irakli Gozalishvili <[email protected]>
3 participants