Skip to content

chore(ingestion): group FOR UPDATE behind env var #32275

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
May 19, 2025

Conversation

nickbest-ph
Copy link
Contributor

Important

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

Problem

The consistency gained for the SELECT FOR UPDATE on Group table is not necessary and is causing resource contention on the DB

Changes

Put the FOR UPDATE part of the Select behind an env var that can be controlled

Does this work well for both Cloud and self-hosted?

How did you test this code?

local tests, it is behind env var

@nickbest-ph nickbest-ph requested a review from pl May 16, 2025 17:25
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Summary

This PR introduces a new configuration option DISABLE_GROUP_SELECT_FOR_UPDATE to control PostgreSQL's SELECT FOR UPDATE locking in group queries, aiming to reduce database contention.

  • Added DISABLE_GROUP_SELECT_FOR_UPDATE boolean config in plugin-server/src/types.ts and config.ts with default value false
  • Modified upsertGroup in properties-updater.ts to accept optional forUpdate parameter (defaults to true)
  • Updated group processing in process-event.ts to pass !DISABLE_GROUP_SELECT_FOR_UPDATE to control locking behavior
  • Potential concern: Retry logic in upsertGroup could lead to infinite recursion if race conditions persist

4 file(s) reviewed, no comment(s)
Edit PR Review Bot Settings | Greptile

): Promise<void> {
const result = await this.postgres.query(
): Promise<number> {
const result = await this.postgres.query<{ version: string }>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we returning version as string instead of number? since we are casting it after

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit of a nodejs problem. The postgres library will return a string because the column is bigint (64-bit signed) but nodejs uses double precision floats for all numbers, so there's a possibility the value from the db will overflow.

It's a bit involved to refactor it, as we're using that value in JSON later and it doesn't play well with JS bigints. We live with it because the max integer in JS is ~2^53, so it's very unlikely we'll overflow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation, makes total sense! 👍 rest of PR looks good

})
const createdAt = DateTime.min(group?.created_at || DateTime.now(), timestamp)
const version = (group?.version || 0) + 1
const expectedVersion = (group?.version || 0) + 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we have to always evaluate version? Is it because it can be Null? What are the scenarioss in which it can be null

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the group does not exist group?.version will return undefined.

actualVersion = updatedVersion
// Track the disparity between the version on the database and the version we expected
// Without races, the returned version should be only +1 what we expected
const versionDisparity = updatedVersion - expectedVersion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

np: could be wrapped in a modulo in case some unexpected stuff happens, although I believe this is likely impossible to happen.

Copy link
Contributor

@jose-sequeira jose-sequeira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! just have the string casting thing open, but it's a small thing

@pl pl merged commit b1a18ca into master May 19, 2025
78 checks passed
@pl pl deleted the fix-remove-for-update-fetch-group branch May 19, 2025 16:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants