Support scanning bare repos #12

Open
@tarkatronic

Currently, attempting to scan a bare git repo with --repo_path fails with an error along the lines of:

Traceback (most recent call last):
  File "/home/jwilhelm/Documents/workspace/tartufo/.venv/bin/tartufo", line 11, in <module>
    load_entry_point('tartufo', 'console_scripts', 'tartufo')()
  File "/home/jwilhelm/Documents/workspace/tartufo/tartufo/cli.py", line 58, in main
    path_exclusions=path_exclusions,
  File "/home/jwilhelm/Documents/workspace/tartufo/tartufo/scanner.py", line 287, in find_strings
    for curr_commit in repo.iter_commits(branch_name, max_count=max_depth):
  File "/home/jwilhelm/Documents/workspace/tartufo/.venv/lib64/python3.7/site-packages/git/objects/commit.py", line 278, in _iter_from_process_or_stream
    finalize_process(proc_or_stream)
  File "/home/jwilhelm/Documents/workspace/tartufo/.venv/lib64/python3.7/site-packages/git/util.py", line 332, in finalize_process
    proc.wait(**kwargs)
  File "/home/jwilhelm/Documents/workspace/tartufo/.venv/lib64/python3.7/site-packages/git/cmd.py", line 414, in wait
    raise GitCommandError(self.args, status, errstr)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
  cmdline: git rev-list --max-count=1000000 1/head --
  stderr: 'fatal: bad revision '1/head'

This happens because a bare repo has a wholly different structure from a normal clone, so some git operations produce different results. The specific problem causing the error here is:

>>> import git
>>> repo = git.Repo('tartufo.git')
>>> for branch in repo.remotes.origin.fetch():
...     print(branch.name)
... 
master
split_tests
1/head
10/head
11/head
16/head
19/head
2/head
20/head
21/head
22/head
23/head
24/head
25/head
3/head
8/head
9/head
v0.0.1
v0.0.2
>>> repo = git.Repo('../tartufo')
>>> for branch in repo.remotes.origin.fetch():
...     print(branch.name)
... 
origin/master
origin/split_tests
>>>

None of the X/head names resolve to valid git revisions, so tartufo chokes on them.
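One possible explanation for the odd names, offered as an assumption rather than something confirmed above: the fetch in the bare repo may be returning refs such as refs/pull/1/head, and the printed name is a shortened form with the first two path components dropped. The sketch below uses a hypothetical short_name helper (not part of tartufo or GitPython) to show how that shortening would yield exactly the names in the session above, including origin/master for the normal clone.

```python
def short_name(ref_path):
    """Drop the first two components of a full ref path.

    Hypothetical helper: it mimics the shortening that would explain the
    names printed in the session above. It is an assumption about how
    those names were derived, not a documented guarantee.
    """
    tokens = ref_path.split("/")
    if len(tokens) < 3:
        return ref_path
    return "/".join(tokens[2:])


# Assumed full ref paths behind the printed names.
for path in (
    "refs/heads/master",
    "refs/tags/v0.0.1",
    "refs/pull/1/head",
    "refs/remotes/origin/master",
):
    print(f"{path} -> {short_name(path)}")
```

If this is what is happening, the short form 1/head is ambiguous to git rev-list, while the full path would still identify a ref.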

We should figure out a way to either scan all refs or scan only actual branches.
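As a rough sketch of the "only scan actual branches" option, the filter below keeps refs whose full path sits under refs/heads/ or refs/remotes/ and skips everything else. The helper names and the sample paths are hypothetical, and the sketch assumes the full ref path is available for each fetched item (in GitPython this would presumably be fetch_info.ref.path, which has not been verified against tartufo's code).

```python
# Hypothetical helpers, not part of tartufo: select refs by full path
# instead of by the shortened name.
BRANCH_PREFIXES = ("refs/heads/", "refs/remotes/")


def is_branch_ref(ref_path):
    """Return True if ref_path looks like a local or remote-tracking branch."""
    return ref_path.startswith(BRANCH_PREFIXES)


def branch_refs(ref_paths):
    """Keep only branch-like refs, preserving their original order."""
    return [path for path in ref_paths if is_branch_ref(path)]


# Assumed full paths corresponding to some of the names shown earlier.
sample = [
    "refs/heads/master",
    "refs/heads/split_tests",
    "refs/pull/1/head",
    "refs/tags/v0.0.1",
    "refs/remotes/origin/master",
]
print(branch_refs(sample))
```

Passing the full ref path to repo.iter_commits, rather than the short name, might also cover the "scan all refs" option, since a full path should resolve unambiguously; that would need to be tested against a real bare repo.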
