Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read replica support for Postgres and MySQL datastores #1878

Merged
merged 3 commits into from
Jul 1, 2024

Conversation

josephschorr
Copy link
Member

No description provided.

@github-actions github-actions bot added area/CLI Affects the command line area/datastore Affects the storage system area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools) labels Apr 25, 2024
@josephschorr josephschorr marked this pull request as ready for review April 26, 2024 18:22
@josephschorr josephschorr requested review from vroldanbet and a team as code owners April 26, 2024 18:22
@benny-yamagata
Copy link

Just to clarify, this update will allow spice to use multiple reader nodes for postgres and mysql? So if I have 3 readers I would be able to pass in the entries for all of them to be used?

@josephschorr
Copy link
Member Author

@benny-yamagata Yes, the replica URI parameter is a list of URIs and the system will round robin between them

Copy link
Contributor

@vroldanbet vroldanbet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation works with the assumption individual hosts will be listed, but folks most of the time would use replicas behind a load-balancer in order to be able to scale read traffic when needed. I don't think the implementation will work for that more common use case because the datastore snapshot reader does not use a single transaction.

internal/datastore/mysql/datastore.go Outdated Show resolved Hide resolved
index uint32,
options ...Option,
) (datastore.ReadOnlyDatastore, error) {
ds, err := newMySQLDatastore(ctx, url, int(index), options...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I searched online if there is any URI query parameter we can add to enforce read only, but didn't find any. Users should make sure the credentials provided for the replica have read-only permissions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really necessary; the datastore itself ensures it is read only

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it install the read only datastore proxy on the replica?

internal/datastore/mysql/datastore_test.go Outdated Show resolved Hide resolved
internal/datastore/postgres/postgres.go Outdated Show resolved Hide resolved
internal/datastore/postgres/postgres.go Show resolved Hide resolved
internal/datastore/proxy/replicated.go Outdated Show resolved Hide resolved
internal/datastore/proxy/replicated.go Show resolved Hide resolved
internal/datastore/proxy/replicated.go Show resolved Hide resolved
return rr.chosenReader.LookupNamespacesWithNames(ctx, nsNames)
}

func (rr *replicatedReader) chooseSource(ctx context.Context) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think returning the chosen reader in chooseSource would better encapsulate this logic.

internal/datastore/proxy/replicated.go Show resolved Hide resolved
@josephschorr
Copy link
Member Author

I don't think the implementation will work for that more common use case because the datastore snapshot reader does not use a single transaction.

It uses a single connection, which means it should stay connected to the same replica

@ecordell
Copy link
Contributor

ecordell commented May 2, 2024

It uses a single connection, which means it should stay connected to the same replica

I possibly just missed it, but it looked like you were using a pgxpool.Conn for the read replicas which cycles actual connections out from under itself.

@josephschorr
Copy link
Member Author

It uses a single connection, which means it should stay connected to the same replica

I possibly just missed it, but it looked like you were using a pgxpool.Conn for the read replicas which cycles actual connections out from under itself.

Yeah, I traced it and it does use the pool. We'll have to do something else

@josephschorr josephschorr marked this pull request as draft May 2, 2024 17:00
@github-actions github-actions bot locked and limited conversation to collaborators Jun 17, 2024
@josephschorr josephschorr reopened this Jun 21, 2024
@josephschorr josephschorr marked this pull request as ready for review June 24, 2024 19:08
Copy link
Contributor

@vroldanbet vroldanbet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻 on the overall strategy, this should work. My main concerns are:

  • exposing strict via the CLI, I don't see a reason we should do that
  • cached checked revision proxy may race
  • missing tests on strictReaderQueryFuncs, clarification how in progress surfaces in a replica
  • clarification on Watch API behaviour, do we have a test? I understand it requires no extra work, it would just lag behind the primary
  • inconsistent refactor of common datastore flags

pkg/cmd/datastore/datastore.go Outdated Show resolved Hide resolved
var legacyConnPool ConnPoolConfig
RegisterConnPoolFlagsWithPrefix(flagSet, "datastore-conn", DefaultReadConnPool(), &legacyConnPool)
deprecateUnifiedConnFlags(flagSet)
RegisterConnPoolFlagsWithPrefix(flagSet, "datastore-conn-pool-read", &legacyConnPool, &opts.ReadConnPool)
RegisterConnPoolFlagsWithPrefix(flagSet, "datastore-conn-pool-write", DefaultWriteConnPool(), &opts.WriteConnPool)
RegisterConnPoolFlagsWithPrefix(flagSet, "datastore-read-replica-conn-pool", DefaultReadConnPool(), &opts.ReadReplicaConnPool)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming pattern differs from existing conn pools. Should be datastore-conn-pool-read-replica.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted all the read replica flags to start with datastore-read-replica to be explicit

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can tell you that as someone usually tinkering with these flags, having to remember 2 different naming patterns is not going to make my life easier

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its a tradeoff, but I still think having all the read replica flags sharing a common prefix makes more sense. Perhaps we should ask?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the tradeoff?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would you ever want to configure read replica pool to be not the same as the read pool?

Always: you can't use the read pool with replicas

Do we think we could use a heuristic inside SpiceDB to come up with these values rather than even exposing them at all?

Incredibly unlikely... We need to know that a replica is a replica

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always: you can't use the read pool with replicas

Configure them differently, not use the same pool.
For example, if I'm using a RDS with a read replica, the read replica is likely the same size as my primary likely with the same max connections.

Incredibly unlikely... We need to know that a replica is a replica

I'm not advocating altering the system to be unaware of read replicas. I'm advocating for the system being smart enough to discover the right size of connection pools because users are very likely to pick non-ideal values themselves

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, if I'm using a RDS with a read replica, the read replica is likely the same size as my primary likely with the same max connections.

Which still means the read pool for non-replicas will want to be configured differently, as we also have a write pool for the primary.

I'm advocating for the system being smart enough to discover the right size of connection pools because users are very likely to pick non-ideal values themselves

And... how would it do that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think a simplified version of adaptive concurrency limits (e.g. similar to tcp congestion control) would work? Basically use connection errors to adaptively discover the right number of connections for you pool -- something like this could even make the system robust to changes in the replicas or the size of the database.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, but it is IMO outside of the scope of this change (as it would apply to all pools, not just replicas), and would require a significant amount of testing.

pkg/cmd/datastore/datastore.go Show resolved Hide resolved
pkg/cmd/datastore/datastore.go Outdated Show resolved Hide resolved
Comment on lines 531 to 552
mysql.RevisionQuantization(opts.RevisionQuantization),
mysql.MaxRevisionStalenessPercent(opts.MaxRevisionStalenessPercent),
mysql.TablePrefix(opts.TablePrefix),
mysql.WatchBufferLength(opts.WatchBufferLength),
mysql.WatchBufferWriteTimeout(opts.WatchBufferWriteTimeout),
mysql.WithEnablePrometheusStats(opts.EnableDatastoreMetrics),
mysql.MaxRetries(uint8(opts.MaxRetries)),
mysql.OverrideLockWaitTimeout(1),
mysql.CredentialsProviderName(opts.ReadReplicaCredentialsProviderName),
Copy link
Contributor

@vroldanbet vroldanbet Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see it done for MySQL

jschorr edit since I can't reply: this is done

internal/datastore/proxy/replicated.go Outdated Show resolved Hide resolved
internal/datastore/proxy/replicated.go Outdated Show resolved Hide resolved
internal/datastore/postgres/strictreader.go Show resolved Hide resolved
internal/datastore/postgres/strictreader.go Show resolved Hide resolved
internal/datastore/postgres/strictreader.go Show resolved Hide resolved
@josephschorr
Copy link
Member Author

Updated

Copy link
Contributor

@vroldanbet vroldanbet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few more remarks but seems ready to go

internal/datastore/context.go Outdated Show resolved Hide resolved
internal/datastore/postgres/postgres.go Show resolved Hide resolved
internal/datastore/proxy/cachedcheckrev.go Show resolved Hide resolved
internal/datastore/proxy/cachedcheckrev.go Show resolved Hide resolved
internal/datastore/proxy/replicated.go Outdated Show resolved Hide resolved
replica datastore.ReadOnlyDatastore
primary datastore.Datastore

chosePrimary bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
chosePrimary bool
chosePrimaryInTest bool

This is accomplished by adding a datastore proxy that selects a read replica for each SnapshotReader, in a round robin fashion, and ensures that the revision being requested is available on the replica. This extra check does add some latency to the overall operation, but it should provide for the safest approach for using read replicas

Fixes authzed#1321
Fixes authzed#1320
@josephschorr
Copy link
Member Author

Updated

This will allow replicas behind load balancers to be supported (just in Postgres for now)
Copy link
Contributor

@vroldanbet vroldanbet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Great work! 🚀

@josephschorr josephschorr added this pull request to the merge queue Jul 1, 2024
Merged via the queue into authzed:main with commit 9456ce0 Jul 1, 2024
22 checks passed
@josephschorr josephschorr deleted the read-replica-support branch July 1, 2024 17:15
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area/CLI Affects the command line area/datastore Affects the storage system area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants