docs: update README and a few docstrings
sarina committed Sep 26, 2022
1 parent 3e3c7cd commit 7c8dd22
Showing 5 changed files with 127 additions and 27 deletions.
118 changes: 102 additions & 16 deletions README.md
@@ -2,9 +2,35 @@

just a dumb little collection of scripts i've found useful.

* ghelpers.py: github helper functions
scripts require a GITHUB_AUTH token to be defined in the local environment

* apply-labels.py: for a given github organization, applies a label uniformly
## helper functions

* `github_helpers.py`: Functions that help call the GitHub API or perform
manipulation of local filesystem files using the `git` command

* `shell_helpers.py`: Functions that call local filesystem commands, such as
`mv`, `cp`, and the base implementation of the `git` command

## general-use/could be kinda-useful for you?

* `licensing-check.py`: generates a json report of the licenses used by an org's repos

```
Usage: licensing-check.py [-h] [-P] org
Creates a report of which repos have which licenses.
positional arguments:
org Name of the organization
optional arguments:
-h, --help show this help message and exit
-P, --exclude-private
Exclude private repos from this org
```

* `apply-labels.py`: for a given github organization, applies a label uniformly
across all repos

```
@@ -26,8 +52,9 @@ optional arguments:
Exclude private repos from this org
```

* export-gh-issues.py: Export GitHub issues to json or csv. One day
will script over beta projects (when the API is ready).

* `export-gh-issues.py`: Export GitHub issues from one or more repos to json or
csv. Known limitation: likely breaks for repos with >30 issues, since pagination
is not handled.

```
> python -m export-gh-issues.py -h
@@ -51,23 +78,82 @@ optional arguments:
Only return GitHub issues with this label.
```

* Useful GitHub fields
* `parse-pr-query.py`: returns a list of prs generated from a GH query over PRs

* As mentioned above, by default the script filters the raw GitHub-returned json to a small number of what I deem are useful fields. These are: "url", "number", "title", "body", "created_at", "updated_at", "user" (github username).
```
usage: parse-pr-query.py [-h] [-Q QUERY] [-B]
* Additionally, the following two fields are returned as nested json (if json output is chosen) or as a flattened list (if csv output is chosen): "labels" and "assignees"
Takes a query over PRs and writes a json file that contains [repo_name,
pr_branch_name] for each PR result in the query. Probably will do funky
stuff if you don't include 'is:pr' in your query.
* See "sample-output/" folder for some sample outputs.
optional arguments:
-h, --help show this help message and exit
* add_depr_wkflw_issues.py: non-generalized script to, for every repo in your org,
copy a reference file onto a new branch and issue a pull request. Collects pull
request URLs into an output json.
-q QUERY, --query QUERY
GitHub query, as you'd do over in the UI. Example:
'is:pr author:YourUsername'. Defaults to Sarina's open
PRs over the openedx github org.
* bulk_merge_prs.py: given a json list of PR urls, attempts to merge them. outputs
-B, --branch-name Adds the name of the branch the PR is being made from
(head ref) to the output list for the PR
```

* `bulk_merge_prs.py`: given a json list of PR urls, attempts to merge them. outputs
a json list of failures.

* parse_output.py: parses log output from add_depr_wkflw_issues if needed.
* `checkout_all.py`: goes through a github org and checks out all its repos to a
local directory. If the repo is already checked out, switches to the repo's
default branch and pulls all upstream changes.

* `copy_file_to_repos.py`: Goes through all repos in an org, clones them, makes
a new branch, copies specific files, commits them, creates a pull request,
and merges the pull request. Requires viewing the file and changing a bunch
of variables at the end of the file.

* `fetch_gh_request_limit.py`: shows how many requests you've got left.
important: doesn't show secondary rate limit (which is not discoverable)
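
For the rate-limit check described above, here is a rough sketch of what such a call could look like. It is not the contents of `fetch_gh_request_limit.py`; the function names are made up for illustration, and it assumes the token lives in `GITHUB_AUTH` as mentioned at the top of this README. It only reads the primary (core) limit from GitHub's `/rate_limit` endpoint.

```python
import json
import urllib.request

RATE_LIMIT_URL = "https://api.github.com/rate_limit"


def summarize_rate_limit(payload):
    """Pull the primary (core) limit numbers out of a /rate_limit response body."""
    core = payload["resources"]["core"]
    return {
        "limit": core["limit"],
        "remaining": core["remaining"],
        "reset": core["reset"],  # epoch seconds when the window resets
    }


def fetch_rate_limit(token):
    """Hypothetical helper: call GET /rate_limit and return the summary dict."""
    request = urllib.request.Request(
        RATE_LIMIT_URL,
        headers={
            "Authorization": f"token {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return summarize_rate_limit(json.load(response))
```

Something like `fetch_rate_limit(os.environ["GITHUB_AUTH"])` would then print the remaining request count. The secondary rate limit still won't show up here.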

## more specific to problems i've been solving

You might be able to take inspiration from some of these scripts but you'll
definitely need to replace some hardcoded logic.

* `replace_string_with_another.py`: for each repo in your org, looks for a given
string. If the string exists, switches to a new branch, replaces the string
with a new string, commits changes, and opens a PR. Everything currently
hard-coded, but would not be terribly difficult to make this one generic.

* `replace_string_existing_branch.py`: Assumes you've run
`replace_string_with_another`, thus repos already have your working branch
defined, and you don't want to open a new set of PRs. Checks out existing
branch (or re-creates if existing was already merged) and performs another
string swap with the new strings, and makes a new commit.

Many things hard-coded but as above, wouldn't be super hard to genericize.

* `replace_string_run_edx_lint.py`: Basically, executes a command on every
repo, on an existing branch. But this is a bit more specific - it first
rolls back the previous commit, runs the command, then re-runs the original
command. Description: For each repo in your org, first looks to see if one
of the edx_lint files is present; if so, rolls back your branch, runs
edx_lint, then re-runs the core logic to swap strings defined in
`replace_string_with_another`.

This could potentially be made more generic, but is so specific I can't
imagine it's necessary.

* `add_depr_wkflw_issues.py`: non-generalized script to, for every repo in your org,
copy a reference file onto a new branch and issue a pull request. Collects pull
request URLs into an output json. Some associated files:

* `parse_output.py`: parses log output from `add_depr_wkflw_issues` if needed.

* `retry_failed_depr_wkflow_issues.py`: retries prs that failed to post correctly; takes
in a set of info required to re-post them. Could probably be made more generic; this is
good if you hit rate limits and have a list of ready-to-go branches that need PRs.

* `revise_depr_wkflw_issues.py`: honestly not sure, this was made to correct
some mistakes and is messy and undocumented. don't look at it.


* retry_failed_depr_wkflow_issues.py: self-explanatory
7 changes: 7 additions & 0 deletions export-gh-issues.py
@@ -2,6 +2,13 @@
"""
Usage:
python -m export-gh-issues.py -h
Pulls all issues from given github repo(s).
TODO:
I bet this doesn't work when a repo has >30 issues because pagination is not
handled.
"""

import argparse
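
The pagination TODO in the docstring above could probably be handled by following the `Link` header that the GitHub REST API returns on list endpoints (the default page size is 30, which matches the >30 issue symptom). A minimal sketch follows, not taken from `export-gh-issues.py`; `next_page_url`, `http_get`, and `fetch_all_issues` are hypothetical names.

```python
import json
import re
import urllib.request


def next_page_url(link_header):
    """Return the rel="next" URL from a Link header, or None if there is none."""
    if not link_header:
        return None
    match = re.search(r'<([^>]+)>;\s*rel="next"', link_header)
    return match.group(1) if match else None


def http_get(url, token):
    """Fetch one page; returns (parsed JSON list, Link header or None)."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"token {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response), response.headers.get("Link")


def fetch_all_issues(first_url, get_page):
    """Follow rel="next" links until they run out, collecting every issue."""
    issues, url = [], first_url
    while url:
        page, link_header = get_page(url)
        issues.extend(page)
        url = next_page_url(link_header)
    return issues
```

Usage might look like `fetch_all_issues("https://api.github.com/repos/<org>/<repo>/issues?per_page=100", lambda u: http_get(u, token))`. Passing the page fetcher in as an argument keeps the loop testable without hitting the network.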
10 changes: 5 additions & 5 deletions parse_my_prs.py → parse-pr-query.py
@@ -1,9 +1,9 @@
"""
usage: parse_my_prs.py [-h] [-Q QUERY] [-B]
usage: parse-pr-query.py [-h] [-Q QUERY] [-B]
Takes a json output of a query over PRs and writes a json file that contains
[repo_name, pr_branch_name] for each PR result in the query. Probably will do
funky stuff if you don't include 'is:pr' in your query.
Takes a query over PRs and writes a json file that contains [repo_name,
pr_branch_name] for each PR result in the query. Probably will do funky stuff if
you don't include 'is:pr' in your query.
optional arguments:
-h, --help show this help message and exit
@@ -73,7 +73,7 @@ def parse_prs(search_query, branch_name=False):
sys.exit("*** ERROR ***\nGITHUB_TOKEN must be defined in this environment")

parser = argparse.ArgumentParser(
description="Takes a json output of a query over PRs and writes a json file that contains [repo_name, pr_branch_name] for each PR result in the query. Probably will do funky stuff if you don't include 'is:pr' in your query."
description="Takes a query over PRs and writes a json file that contains [repo_name, pr_branch_name] for each PR result in the query. Probably will do funky stuff if you don't include 'is:pr' in your query."
)

parser.add_argument(
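
To make the "[repo_name, pr_branch_name] for each PR result" description a bit more concrete, here is a sketch of how search results could be turned into that shape. It is an assumption about the approach, not the body of `parse_prs`: it relies on the `repository_url` and `pull_request` fields that GitHub's issue search returns, and `branch_lookup` is a hypothetical callable that would fetch the PR and return its head ref.

```python
import urllib.parse

SEARCH_URL = "https://api.github.com/search/issues"


def build_search_url(query, per_page=100):
    """Encode a UI-style query such as 'is:pr author:YourUsername'."""
    params = urllib.parse.urlencode({"q": query, "per_page": per_page})
    return f"{SEARCH_URL}?{params}"


def repo_name_from_item(item):
    """repository_url looks like https://api.github.com/repos/<org>/<repo>."""
    return item["repository_url"].rstrip("/").rsplit("/", 1)[-1]


def parse_items(items, branch_lookup=None):
    """Build [repo_name] or [repo_name, branch] rows, skipping non-PR results."""
    rows = []
    for item in items:
        if "pull_request" not in item:
            continue  # a plain issue; this is the "funky stuff" without is:pr
        row = [repo_name_from_item(item)]
        if branch_lookup is not None:
            # Assumed: branch_lookup GETs the PR URL and returns head.ref
            row.append(branch_lookup(item["pull_request"]["url"]))
        rows.append(row)
    return rows
```

Filtering on the `pull_request` key is one way to avoid odd output when the query omits `is:pr`; the real script may handle that differently.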
11 changes: 10 additions & 1 deletion retry_failed_depr_wkflow_issues.py
@@ -1,8 +1,17 @@
#!/usr/bin/env python3
"""
Takes the failed PR posts from add_depr_wkflw_issues and retries them.
Retries prs that failed to post correctly; takes
in a set of info required to re-post them. Could probably be made more
generic; this is good if you hit rate limits and have a list of ready-to-go
branches that need PRs.
Currently assumes that cloning, branching, file copy, commit, and push all
succeeded, so this script simply retries making the PRs.
Must provide a qualified path to the file that has the PR data
(ex: /Users/<uname>/gh-scripting/output/failed.json). This file lists the PRs
that failed to be created correctly, each entry being a 5-tuple of:
(org, repo name, branch name, default branch name, dict that has one,
both, or none of the keys "title" and "body" of the PR)
"""
import json
import logging
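
As an illustration of the 5-tuple input format described in the docstring above, the sketch below loads such a file and maps each entry onto the fields GitHub's "create a pull request" endpoint expects (`title`, `head`, `base`, `body`). The function names and the fallback title are invented for this example and are not taken from `retry_failed_depr_wkflow_issues.py`.

```python
import json


def load_failed_prs(path):
    """Read the failed-PR file and check that every entry has five parts."""
    with open(path) as handle:
        entries = json.load(handle)
    for entry in entries:
        if len(entry) != 5:
            raise ValueError(f"expected a 5-tuple, got {len(entry)} items: {entry}")
    return entries


def build_pr_request(entry, default_title="Retry of failed PR"):
    """Turn one (org, repo, branch, default branch, details) entry into (url, payload)."""
    org, repo, branch, default_branch, details = entry
    url = f"https://api.github.com/repos/{org}/{repo}/pulls"
    payload = {
        # default_title is a placeholder for when "title" is missing
        "title": details.get("title", default_title),
        "head": branch,          # branch that was already pushed
        "base": default_branch,  # branch the PR should merge into
    }
    if "body" in details:
        payload["body"] = details["body"]
    return url, payload
```

Each `(url, payload)` pair could then be POSTed with the usual auth header; spacing the requests out may help if rate limits caused the original failures.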
8 changes: 3 additions & 5 deletions revise_depr_wkflw_issues.py
@@ -1,16 +1,14 @@
#!/usr/bin/env python3
"""
Usage:
python -m add-depr-wkflw-issues.py
python -m revise-depr-wkflw-issues.py
Requires:
GITHUB_AUTH token in local environment
Description:
Transfers reference workflow template to all repos in the org. Additionally,
if the org doesn't have issues enabled, transfers a reference issue template
and issue configuration (if issues are enabled, inheriting a more open
reference issue template set will suffice).
honestly not sure, this was made to correct some mistakes in
`add_depr_wkflw_issues` and is messy and undocumented. don't look at it.
"""

import json
