Support writing to a specific pebble cache partition #8899
base: master
Conversation
@bduffany @jdhollen @tylerwilliams What do you guys think about this approach? I want to support writing snapshots to a special partition in the cacheproxy. The cache config might look something like: https://github.com/buildbuddy-io/buildbuddy-internal/compare/ss_partition_ex?expand=1
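For readers without access to the linked internal branch, a cache config along these lines might look roughly like the sketch below. This is purely illustrative: the partition IDs and sizes are hypothetical, not taken from the linked compare.

```yaml
# Hypothetical sketch of a pebble cache config with a dedicated
# snapshot partition. IDs and sizes are illustrative only.
cache:
  pebble:
    partitions:
      - id: "default"
        max_size_bytes: 100000000000  # 100 GB
      - id: "snapshots"
        max_size_bytes: 50000000000   # 50 GB reserved for workflow snapshots
```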
First reaction: disgust. Second reaction: yeah ok, that seems pretty reasonable? I suppose an alternative is to convey this information in a side channel somehow (metadata header or something), but I feel like being explicit about it is good.
I kind of had this same reaction, but am not so opposed to using gRPC headers for this, since the partition is already driven by gRPC-header-derived info (group ID). I think it would be nice to keep the ResourceName proto 1-1 mappable to Bazel's concept of a resource name as defined in the REAPI, which is a property I think would be broken by this change and might have some weird implications. In terms of explicit vs. implicit, I would also add that optional proto fields are enforced about as loosely as optional gRPC headers; they're just more obvious from a documentation perspective.
Okay, you've convinced me. If a non-internal client wanted to write to a specific partition, using a header would be a lot easier than changing the bytestream API. I'll add the partition override to the logging context, so we can at least detect it in logs.
(force-pushed from cf7a642 to 1f263b6)
Another option instead of this PR and #8868 would be to just run an internal-only app statefulset in the baremetal cluster and point the workflow executors at that. Did we already consider that / would it work?
There are some pros and cons to that approach. Pros: it simplifies the code, and we could scale workflow-specific infrastructure separately. Cons: we'd have to manage more microservices, which makes releases longer, creates more orchestration overhead, makes things harder to set up for on-prem customers, and is more expensive. I know @tylerwilliams would prefer not to spin up a new statefulset for those reasons. My thinking was that there's not a strong enough reason to have a workflow-specific statefulset yet.
Adding a partition would mean that the other partitions have less space; do we need to run more proxies to compensate? Also, the effective space available for workflow snapshots would be reduced, I think. I wonder if there's any way we could estimate how it would affect snapshot hit rates, especially for orgs that run workflows less frequently?
We have plenty of proxy disk capacity in US-SJC right now. The proxies have been running for a few months and the caches are only about 35% full, so we should keep an eye on this, but I'm not too concerned about disk capacity issues for now.