Skip to content

Scan GitHub and GitLab refs that aren't cloned by default #1918

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

rgmz
Copy link
Contributor

@rgmz rgmz commented Oct 18, 2023

Description:

This fixes #1588.

In my experience, this find significantly more secrets with a negligible performance impact.

The only issue is that these secrets are technically not a part of the repository, so refactoring may be necessary to indicate that a result comes from a historical PR/MR branch. It now outputs the source pull/merge request (based on git log --source), in case the commit only exists in the PR history and not the actual repo history, which can happen when PRs are squashed.

✅ Found verified result 🐷🔑
Detector Type: AWS
Decoder Type: PLAIN
Raw result: AKIAXYZDQCEN4B6JSJQI
Resource_type: Access key
Account: 534261010715
Is_canary: true
Message: This is an AWS canary token generated at canarytokens.org, and was not set off; learn more here: https://trufflesecurity.com/canaries
Arn: arn:aws:iam::534261010715:user/canarytokens.com@@88uc3ciwodujg5f18inco2yvu
Pull Request: 2387
Commit: e47780e2e4d2dbaa3d1e63bdfe1cf00eb2c5681b
Email: Ahrav Dutta <[email protected]>
File: pkg/gitparse/gitparse_test.go
Line: 715
Link: https://github.com/trufflesecurity/trufflehog/blob/e47780e2e4d2dbaa3d1e63bdfe1cf00eb2c5681b/pkg/gitparse/gitparse_test.go#L715
Repository: https://github.com/trufflesecurity/trufflehog.git
Timestamp: 2024-02-05 17:37:58 +0000

image

Checklist:

  • Tests passing (make test-community)?
  • Lint passing (make lint this requires golangci-lint)?

@rgmz rgmz requested a review from a team as a code owner October 18, 2023 21:27
@rgmz rgmz force-pushed the feat/additional-refspecs branch 4 times, most recently from f64e0a3 to e2fb273 Compare October 23, 2023 21:59
@rgmz rgmz force-pushed the feat/additional-refspecs branch 2 times, most recently from b2e724c to ec2de50 Compare October 30, 2023 00:06
@rgmz rgmz force-pushed the feat/additional-refspecs branch from ec2de50 to 438418c Compare January 27, 2024 17:05
@rgmz rgmz force-pushed the feat/additional-refspecs branch from 438418c to 7cb8af2 Compare April 9, 2024 15:39
@rgmz rgmz marked this pull request as draft April 12, 2024 11:25
@rgmz rgmz force-pushed the feat/additional-refspecs branch from 7cb8af2 to 92c0f83 Compare April 12, 2024 12:53
@CLAassistant
Copy link

CLAassistant commented Apr 12, 2024

CLA assistant check
All committers have signed the CLA.

@rgmz rgmz force-pushed the feat/additional-refspecs branch 10 times, most recently from 956b38d to c9a7acd Compare April 13, 2024 22:44
@rgmz rgmz marked this pull request as ready for review April 13, 2024 22:44
@rgmz rgmz requested a review from a team as a code owner April 13, 2024 22:44
@rgmz rgmz force-pushed the feat/additional-refspecs branch from c9a7acd to 9ac8dbe Compare April 18, 2024 02:32
Copy link
Collaborator

@rosecodym rosecodym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome! Just to make sure I understand: This PR has two discrete changes, right? (Pulling down all the refs and printing the source ref of found secrets.)

@rgmz rgmz mentioned this pull request Apr 23, 2024
2 tasks
@rgmz rgmz force-pushed the feat/additional-refspecs branch from 9ac8dbe to d526837 Compare May 2, 2024 12:55
@rgmz rgmz force-pushed the feat/additional-refspecs branch from 139ebe2 to 83eee14 Compare December 2, 2024 14:03
@rgmz rgmz requested a review from a team as a code owner December 2, 2024 14:03
@rgmz rgmz force-pushed the feat/additional-refspecs branch from 83eee14 to 74653a7 Compare December 15, 2024 16:48
@rgmz rgmz force-pushed the feat/additional-refspecs branch from 74653a7 to bef8f77 Compare December 31, 2024 15:19
Copy link
Collaborator

@rosecodym rosecodym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty neat - I'm only requesting changes because of the fact that the PR adds the expanded refs functionality unilaterally, no matter whether it's enabled. (I suspect this is a rebase artifact.)

Comment on lines +1446 to +1495
Email: sanitizer.UTF8(comment.GetUser().GetEmail()),
Timestamp: sanitizer.UTF8(comment.GetCreatedAt().String()),
Link: sanitizer.UTF8(comment.GetHTMLURL()),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I ask why you reordered these fields everywhere, or, alternatively, how the new order was picked? This isn't blocking or anything but it does add review noise so I'm curious about whether there's a reason for it I'm not picking up on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been a while; IIRC, I ran into issues caused by SourceMetadataFunc being a long list of strings with an arbitrary order. I tried to order the fields semantically (based on git log) so that it was easier to grok at a glance and compare uses of SourceMetadataFunc like-for-like.

commit 8d2c7e6760b9bdc4fe36fee11cb3f28d7e469203 (HEAD -> feat/additional-refspecs)
Author: Richard Gomez <[email protected]>
Date:   Fri Apr 12 08:53:18 2024 -0400

    feat(gitparse): track ref sources

diff --git a/hack/snifftest/main.go b/hack/snifftest/main.go
...

@@ -231,19 +233,23 @@ func (c *Parser) RepoPath(
) (chan *Diff, error) {
args := []string{
"-C", source,
"--no-replace-objects",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you add this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's from this hidden comment thread. GitHub's PR UI is a nightmare.

return "Merge request #" + string(mrNumber)
}

return fmt.Sprintf("%s (hidden ref)", string(ref))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does "hidden" just mean "not visible via the respective web UI?" My small concern here is that this also happens if we simply fail to parse the relevant ref because we made a mistake somewhere. What do you think of just returning the ref as-is in that case - would it be confusing?

Copy link
Contributor Author

@rgmz rgmz Jan 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does "hidden" just mean "not visible via the respective web UI?"

It means "not visible with default Git behaviour"; sometimes these refs are only discoverable via the web UI or API. The term is inspired by this blog post.

The Git ref namespace (refs/*) can store almost any path, however, Git and Git-adjacent tools only look at refs/heads/* and refs/tags/*. Other refs (if they exist) aren't fetched unless you explicitly request them from a remote.

Two examples of this are:

  1. Deleted GitHub/GitLab pull request branches being stored under refs/pull/* and refs/merge-requests/*
  2. Deleted commits that are manually fetched

My small concern here is that this also happens if we simply fail to parse the relevant ref because we made a mistake somewhere. What do you think of just returning the ref as-is in that case - would it be confusing?

I think it's worth signaling that these refs are special and don't technically exist in the repository. Returning the ref as-is is already a source of confusion (#3493).

@rgmz rgmz force-pushed the feat/additional-refspecs branch 3 times, most recently from 33d70b7 to d7953e9 Compare January 1, 2025 17:13
@rgmz rgmz force-pushed the feat/additional-refspecs branch from d7953e9 to 4176d48 Compare January 11, 2025 18:57
@rgmz rgmz force-pushed the feat/additional-refspecs branch 2 times, most recently from d8a905d to 00340e1 Compare January 27, 2025 14:11
@rgmz rgmz force-pushed the feat/additional-refspecs branch 3 times, most recently from 098735c to 0d83fcd Compare February 2, 2025 23:02
@rgmz rgmz force-pushed the feat/additional-refspecs branch 2 times, most recently from 1151f68 to caa9cf4 Compare February 15, 2025 21:03
@rgmz rgmz force-pushed the feat/additional-refspecs branch from caa9cf4 to 195094e Compare March 8, 2025 18:15
@rgmz rgmz force-pushed the feat/additional-refspecs branch 2 times, most recently from 807426d to 7b08b59 Compare March 23, 2025 22:10
@rgmz rgmz force-pushed the feat/additional-refspecs branch 2 times, most recently from 73c5795 to 063a6d2 Compare April 6, 2025 16:37
@rgmz rgmz force-pushed the feat/additional-refspecs branch 2 times, most recently from 9324aa6 to 7723d2c Compare April 18, 2025 02:29
@rgmz rgmz force-pushed the feat/additional-refspecs branch 2 times, most recently from f83d1b8 to 88353f6 Compare May 1, 2025 14:11
@rgmz rgmz force-pushed the feat/additional-refspecs branch from 88353f6 to ed04b98 Compare May 20, 2025 12:52
rgmz added 2 commits May 20, 2025 19:10
'Hidden' refs, such as 'refs/pull/1004/head' may cause confusion if reported upon. GitHub, for example, will display a banner saying that the commit doesn't belong to the repository.
This parse the output of 'git log --source' and converts it to a human-readable format, IF the ref is 'hidden'.
@rgmz rgmz force-pushed the feat/additional-refspecs branch from ed04b98 to cf22aa5 Compare May 20, 2025 23:10
@rosecodym rosecodym mentioned this pull request Jun 16, 2025
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Scan GitHub and GitLab refs that aren't pulled by default
6 participants