Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rockskip: Replace second long-running process with gRPC API #62734

Merged

Conversation

eseliger
Copy link
Member

@eseliger eseliger commented May 16, 2024

We want to migrate the LogReverseEach call to gRPC methods as well.
However, it isn't a great practice to have to keep this process running for potentially hours, as any server restart will have to make the entire process start over.

With one call per commit, we're not occupying a process slot on gitserver, and rollouts of gitserver should much less affect rockskip, as individual commits can be retried before being returned to the rockskip code.

One thing: I reimplemented what ChangedFiles does so that we can still implement integration tests on this level, but since gitserver is a separate service, a go unit test is likely long-term not the best test bed for these kinds of tests. Ideally we would spin up a small gitserver with that repo cloned, and a small program that runs the rockskip index function.
Unfortunately I think we have little precedence for these things, but that would be the highest confidence that things actually work E2E in production.

Closes #62100

Test plan:

Existing tests still pass, I let it index a repo locally and compared it to the results I got on main with a fresh index there.

@cla-bot cla-bot bot added the cla-signed label May 16, 2024
@github-actions github-actions bot added team/product-platform team/source Tickets under the purview of Source - the one Source to graph it all labels May 16, 2024
@eseliger eseliger force-pushed the es/05-16-rockskipreplacesecondlong-runningprocesswithgrpcapi branch 3 times, most recently from dc8d53c to 2cbe72e Compare May 16, 2024 18:03
@eseliger eseliger marked this pull request as ready for review May 16, 2024 18:07
@eseliger eseliger requested review from jtibshirani and a team May 16, 2024 18:08
@jtibshirani jtibshirani requested a review from a team May 16, 2024 18:55
@jtibshirani
Copy link
Member

This looks good to me!

Ideally we would spin up a small gitserver with that repo cloned, and a small program that runs the rockskip index function.

Indeed! I have a TODO to add a couple backend integration tests for Rockskip, that will at least give some coverage here (#62550)

Did we ever resolve @keegancsmith 's question here: #62260 (review) ? Maybe we were waiting on the perf numbers he mentioned?

@eseliger
Copy link
Member Author

Did we ever resolve @keegancsmith 's question here: #62260 (review) ? Maybe we were waiting on the perf numbers he mentioned?

Ah! No, curious to hear what the numbers are :) src-cli indexed in about the same time locally vs main, but it's a very small repo.

Copy link
Contributor

@ggilmore ggilmore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, but @sourcegraph/search-platform should review

Copy link
Member

@jtibshirani jtibshirani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! I feel the test coverage is as good as possible given our lack of end-to-end tests -- we've covered the ChangedFiles logic, Rockskip indexing and search, and done manual checks.

@jtibshirani
Copy link
Member

Did we ever resolve @keegancsmith 's question here: #62260 (review) ? Maybe we were waiting on the perf numbers he mentioned?

@eseliger and I chatted offline and decided we don't have a big concern here. I do think there's room to improve observability here, to be able to debug what's slow if we observe slow index times.

Base automatically changed from es/05-11-gitserveraddcommitlogapitoreplaceclient-sidecommits to main May 21, 2024 13:21
@eseliger eseliger force-pushed the es/05-16-rockskipreplacesecondlong-runningprocesswithgrpcapi branch from db73857 to 3574efb Compare May 21, 2024 13:22
Copy link
Member Author

eseliger commented May 21, 2024

Merge activity

  • May 21, 9:22 AM EDT: Graphite rebased this pull request after merging its parent, because this pull request is set to merge when ready.
  • May 21, 11:45 AM EDT: @eseliger merged this pull request with Graphite.

We want to migrate the LogReverseEach call to gRPC methods as well.
However, it isn't a great practice to have to keep this process running for potentially hours, as any server restart will have to make the entire process start over.

With one call per commit, we're not occupying a process slot on gitserver, and rollouts of gitserver should much less affect rockskip, as individual commits can be retried before being returned to the rockskip code.

Closes #62100

Test plan:

Existing tests still pass, I let it index a repo locally and that was fast and I get results for symbols instantly now, but not a Rockskip expert so requesting review from the owners.
@eseliger eseliger force-pushed the es/05-16-rockskipreplacesecondlong-runningprocesswithgrpcapi branch from 3574efb to 4dda696 Compare May 21, 2024 15:14
@eseliger eseliger merged commit c338a95 into main May 21, 2024
12 checks passed
@eseliger eseliger deleted the es/05-16-rockskipreplacesecondlong-runningprocesswithgrpcapi branch May 21, 2024 15:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed team/product-platform team/source Tickets under the purview of Source - the one Source to graph it all
Projects
None yet
Development

Successfully merging this pull request may close these issues.

gitserver: Move LogReverseEach to new gRPC pattern
3 participants