Automated migration setup #897

Merged
merged 12 commits into from Mar 5, 2024

Conversation

nsarrazin
Collaborator

@nsarrazin nsarrazin commented Mar 5, 2024

This PR adds a check on startup that runs migrations if needed.

How it works

Each migration is identified by its _id field, which is unique and hardcoded. When a migration is applied, we add it to the "migrationResults" collection so we can check whether it has already been run. Each migration is wrapped in a transaction to ensure consistency.
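
As a rough illustration of the bookkeeping described above, here is a minimal sketch that uses an in-memory object in place of the "migrationResults" collection. The Migration shape, the string _id, and runPendingMigrations are assumptions made for this example, not the PR's actual code, and the sketch is synchronous for brevity where real migrations would be async and transactional.

```typescript
// Hypothetical shape for this sketch; the real chat-ui type may differ.
interface Migration {
  _id: string; // unique, hardcoded identifier
  name: string;
  up: () => void; // synchronous here for brevity
}

// In-memory stand-in for the "migrationResults" collection.
const migrationResults: Record<string, true> = {};

// Runs every migration that has no recorded result and returns their names.
function runPendingMigrations(migrations: Migration[]): string[] {
  const applied: string[] = [];
  for (const migration of migrations) {
    if (migrationResults[migration._id]) {
      continue; // already applied on a previous startup
    }
    migration.up(); // the PR wraps this step in a transaction
    migrationResults[migration._id] = true; // record it so it is skipped next time
    applied.push(migration.name);
  }
  return applied;
}
```

Calling runPendingMigrations twice with the same list applies each migration only once, which is the property the startup check relies on.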

The migration check runs at the top level of hooks.server.ts to ensure it runs only once, on startup.

DB Locks

Because we can have multiple instances of chat-ui running against a single DB, we need to ensure we hold a lock before running migrations.

The code for lock-related operations is in lock.ts, along with the associated unit tests.

How it works:

  1. We have a semaphore collection with a unique index on keys.
  2. We set the lock by using insertOne; if the insert fails, a lock must already exist (see acquireLock).
  3. The lock has an updatedAt field with a TTL index, so Mongo discards it if it times out.
  4. On server startup, all instances try to acquire the lock. The one that gets it runs the migrations, and the others wait until the lock is released before starting the rest of chat-ui.
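
The steps above can be sketched without a database. The snippet below is an in-memory stand-in, not the code in lock.ts: the semaphores object plays the role of the collection with its unique index, expireStale imitates what the TTL index would do, and the 60-second timeout, the Semaphore shape, and releaseLock are assumptions for illustration.

```typescript
// Hypothetical document shape for this sketch.
interface Semaphore {
  key: string;
  updatedAt: number; // epoch milliseconds
}

const LOCK_TTL_MS = 60 * 1000; // assumed timeout, not taken from the PR

// In-memory stand-in for the semaphore collection (one entry per unique key).
const semaphores: Record<string, Semaphore> = {};

// Imitates the TTL index: drop locks whose updatedAt is too old.
function expireStale(now: number): void {
  for (const key of Object.keys(semaphores)) {
    if (now - semaphores[key].updatedAt >= LOCK_TTL_MS) {
      delete semaphores[key];
    }
  }
}

// Imitates insertOne against a unique index: fails if the key already exists.
function acquireLock(key: string, now: number): boolean {
  expireStale(now);
  if (semaphores[key]) {
    return false; // another instance holds the lock
  }
  semaphores[key] = { key: key, updatedAt: now };
  return true;
}

function releaseLock(key: string): void {
  delete semaphores[key];
}
```

An instance that gets false from acquireLock would retry until the holder releases the lock or the lock expires, then continue starting up.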

Migrations

You can find an example migration here

Migrations have an up method (required) and can have a down method (optional, and not used for now). Optionally, they can also set:

	runForHuggingChat?: "only" | "never"; // leave unspecified to run for both
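
One way to read this option is as a filter applied before a migration runs. The sketch below assumes a boolean isHuggingChat flag and a shouldRun helper, both hypothetical names for this example; only the runForHuggingChat field and its two values come from the PR description.

```typescript
interface MigrationOptions {
  runForHuggingChat?: "only" | "never"; // leave unspecified to run for both
}

// Decides whether a migration applies to the current deployment.
// `isHuggingChat` is a hypothetical flag for this sketch.
function shouldRun(migration: MigrationOptions, isHuggingChat: boolean): boolean {
  if (migration.runForHuggingChat === "only") {
    return isHuggingChat;
  }
  if (migration.runForHuggingChat === "never") {
    return !isHuggingChat;
  }
  return true; // unspecified: run everywhere
}
```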

We should port the migration from #841 when merging this PR.

@nsarrazin nsarrazin added enhancement New feature or request back This issue is related to the Svelte backend or the DB labels Mar 5, 2024
@nsarrazin
Collaborator Author

nsarrazin commented Mar 5, 2024

cc @coyotte508 might be worth taking a look if you have time, especially the following files:

src/lib/server/database.ts
src/lib/migrations/routines/01-update-search-assistants.ts
src/lib/migrations/migrations.ts
src/lib/migrations/lock.ts
Member

@coyotte508 coyotte508 left a comment


A bit of an issue with spaces with continued availability - the old app can run for a bit after migrations are run.

So ideally, migrations happen in two steps: a first PR is deployed that fills in the new data (eg search tokens), and a second PR is deployed with the migration + the feature visible for users.

Just a bit of a process to think of

src/lib/migrations/migrations.ts
src/lib/server/database.ts
@nsarrazin
Collaborator Author

So ideally, migrations happen in two steps: a first PR is deployed that fills in the new data (eg search tokens), and a second PR is deployed with the migration + the feature visible for users.

If we deploy the data generation before deploying the feature, isn't there a gap between the two deployments where new documents would be missed ? Not sure if I got it correctly 😅

@coyotte508
Member

So ideally, migrations happen in two steps: a first PR is deployed that fills in the new data (eg search tokens), and a second PR is deployed with the migration + the feature visible for users.

If we deploy the data generation before deploying the feature, isn't there a gap between the two deployments where new documents would be missed ? Not sure if I got it correctly 😅

What I mean:

  • First PR fills in searchTokens when assistants are created / updated, and does nothing else
  • Second PR has the migration + the rest of the feature

So:

  • Deploy first PR
  • When users update assistants, the searchTokens field is updated for them
  • Deploy second PR & run migration
  • Even if old code runs for a bit, no problems, as searchTokens are properly populated by both the migration and the first PR
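
The rollout above can be sketched as two independent pieces of code. This is only an illustration under assumptions: the Assistant shape, the whitespace tokenizer, and the function names are made up for the example, and the backfill here touches only documents that lack searchTokens, which the discussion does not specify.

```typescript
// Hypothetical document shape for this sketch.
interface Assistant {
  name: string;
  searchTokens?: string[];
}

// Hypothetical tokenizer; the real search token generation may differ.
function generateSearchTokens(name: string): string[] {
  return name.toLowerCase().split(/\s+/).filter((token) => token.length > 0);
}

// First PR: the write path fills in searchTokens on create or update.
function saveAssistant(store: Assistant[], name: string): Assistant {
  const assistant: Assistant = { name: name, searchTokens: generateSearchTokens(name) };
  store.push(assistant);
  return assistant;
}

// Second PR: the migration backfills documents written before the first PR.
function backfillSearchTokens(store: Assistant[]): number {
  let updated = 0;
  for (const assistant of store) {
    if (assistant.searchTokens === undefined) {
      assistant.searchTokens = generateSearchTokens(assistant.name);
      updated += 1;
    }
  }
  return updated;
}
```

Because the write path from the first PR is already deployed when the migration runs, instances that have not yet picked up the second PR still produce documents with searchTokens set.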

@nsarrazin
Collaborator Author

Oh my bad 🤦 That makes a lot of sense, will keep in mind for deploying #841 cc @mishig25

Collaborator

@mishig25 mishig25 left a comment


lgtm !

@nsarrazin nsarrazin merged commit 4dbcbb6 into main Mar 5, 2024
3 checks passed
@nsarrazin nsarrazin deleted the backend/auto_migrations branch March 5, 2024 16:36
Labels
back This issue is related to the Svelte backend or the DB enhancement New feature or request

3 participants