feat: script qadb-info to dump some info about the QADB #68

Merged Jan 17, 2025 (34 commits)
479c68a
feat: script `qadb-info` to dump some info about the QADB
c-dilks Jan 13, 2025
5ce2391
feat: add some options
c-dilks Jan 13, 2025
9f59ab4
feat: more options for printing datasets
c-dilks Jan 13, 2025
674f3b9
feat: calculate full runs' charge
c-dilks Jan 13, 2025
9943ea5
fix(ci): wrong dir
c-dilks Jan 13, 2025
8320597
fix: avoid relying on environment var
c-dilks Jan 13, 2025
4cc2d51
fix: version detection
c-dilks Jan 14, 2025
4e40b3c
ci: add `bin/` to `PATH`
c-dilks Jan 14, 2025
2a39373
feat: `TTree` table output
c-dilks Jan 15, 2025
ccc83f1
refactor: generalize so we can add a `misc` command more easily
c-dilks Jan 15, 2025
ea6b124
fix: prepend `$PATH`
c-dilks Jan 15, 2025
4cc2308
fix: don't check for datasets when `Command == 'print'`
c-dilks Jan 15, 2025
9a5d081
fix(ci): use `environ.sh` file to set `PATH`
c-dilks Jan 15, 2025
be36e4a
feat: `misc` command
c-dilks Jan 16, 2025
36f11a6
fix(ci): renamed option
c-dilks Jan 16, 2025
dae2fe5
feat: QA filter logic
c-dilks Jan 16, 2025
d583a0d
fix: porcelain -> simple
c-dilks Jan 16, 2025
2fcdc1a
fix: version detection
c-dilks Jan 16, 2025
9f47d78
fix: nevermind, doesn't work
c-dilks Jan 16, 2025
8f6fb00
fix: `JSON#load_file` -> `#parse` for compatibility with ruby 2.75
c-dilks Jan 16, 2025
f1cd4ed
fix: handle conflicts with `--golden`
c-dilks Jan 16, 2025
e65c58e
style: use '`' in printouts
c-dilks Jan 16, 2025
b68b792
feat: start query command
c-dilks Jan 16, 2025
6dd8e53
feat: lookup bins
c-dilks Jan 16, 2025
7788aa8
fix: check bins existence
c-dilks Jan 16, 2025
09a7ff2
fix: finish implementing `query` command
c-dilks Jan 17, 2025
cbe1370
doc: `qadb-info` guidance
c-dilks Jan 17, 2025
d4212e4
feat: example commands
c-dilks Jan 17, 2025
ff57011
Merge remote-tracking branch 'origin/main' into ls-datasets
c-dilks Jan 17, 2025
e27c978
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 17, 2025
042860f
feat: descriptions for defects
c-dilks Jan 17, 2025
a21dd9e
doc: move the rules away from the how-to part
c-dilks Jan 17, 2025
c33ee07
doc: caution
c-dilks Jan 17, 2025
6c54168
doc: charge
c-dilks Jan 17, 2025
31 changes: 26 additions & 5 deletions .github/workflows/ci.yml
@@ -17,28 +17,49 @@ concurrency:
jobs:

# get list of datasets to check
get_datasets:
info:
runs-on: ubuntu-latest
outputs:
datasets: ${{ steps.datasets.outputs.datasets }}
steps:
- name: checkout
uses: actions/checkout@v4
with: # settings needed for version number detection in qadb-info
clean: false
fetch-tags: true
fetch-depth: 0
- name: add bin to PATH
run: |
source environ.sh
echo "PATH=$PATH" | tee -a $GITHUB_ENV
- name: get data sets
id: datasets
working-directory: qadb
run: |
ls -d pass*/* | jq -Rs '{"dataset": split("\n")[:-1]}' | tee datasets.json
qadb-info print --list --no-latest --simple | jq -Rs '{"dataset": split("\n")[:-1]}' | tee datasets.json
echo datasets=$(jq -c . datasets.json) >> $GITHUB_OUTPUT
- run: qadb-info --version
- name: qadb-info summary
run: |
echo '```' >> $GITHUB_STEP_SUMMARY
qadb-info print --more | xargs -0 -I{} echo {} >> $GITHUB_STEP_SUMMARY
echo '```' >> $GITHUB_STEP_SUMMARY
- name: test qadb-info example commands
run: |
qadb-info --examples | grep -E '^\$' | sed 's;^\$ ;;' | while read cmd; do
echo "+ $cmd"
$cmd
done
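The example-command test in this step keeps only the lines of `qadb-info --examples` that begin with `$`, strips the `$ ` prompt, and runs what is left. A minimal Python sketch of that extraction follows, assuming the examples are printed one per line with a `$ ` prompt; the sample text is hypothetical, and the real output of `qadb-info --examples` may differ.

```python
import re


def extract_example_commands(examples_text):
    """Return commands from lines starting with '$', without the '$ ' prompt."""
    commands = []
    for line in examples_text.splitlines():
        if line.startswith("$"):  # same selection as: grep -E '^\$'
            # same substitution as: sed 's;^\$ ;;'
            commands.append(re.sub(r"^\$ ", "", line))
    return commands


# Hypothetical help text, for illustration only
sample = (
    "Examples:\n"
    "$ qadb-info --version\n"
    "  a description line that is not a command\n"
    "$ qadb-info print --list --no-latest --simple\n"
)
commands = extract_example_commands(sample)
for cmd in commands:
    print("+ " + cmd)
```

Note that the workflow runs each command through an unquoted `$cmd`, so the shell splits it on whitespace without interpreting quotes; example commands that rely on quoted arguments would not survive that step unchanged.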


# check consistency between Groovy and C++ APIs
test_dataset:
name: test
needs:
- get_datasets
- info
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix: ${{ fromJson(needs.get_datasets.outputs.datasets) }}
matrix: ${{ fromJson(needs.info.outputs.datasets) }}
steps:
- name: checkout
uses: actions/checkout@v4
Expand Down
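For readers less familiar with `jq`: the `get data sets` step turns a newline-separated listing into the compact JSON object that `fromJson` later expands into the job matrix. The sketch below mimics `jq -Rs '{"dataset": split("\n")[:-1]}'` followed by `jq -c .` in Python; the dataset names are hypothetical placeholders, not real QADB datasets.

```python
import json


def datasets_to_matrix(listing):
    """Build the compact matrix JSON from a newline-terminated listing."""
    # split("\n") leaves an empty string after the final newline; [:-1] drops it
    names = listing.split("\n")[:-1]
    # compact separators reproduce the single-line output of `jq -c .`
    return json.dumps({"dataset": names}, separators=(",", ":"))


# Hypothetical listing, standing in for the output of the listing command
listing = "pass1/example_a\npass2/example_b\n"
matrix_json = datasets_to_matrix(listing)
print(matrix_json)
```

As with the `jq` expression, the `[:-1]` slice assumes the listing ends with a newline; if it did not, the last dataset would be dropped.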
62 changes: 49 additions & 13 deletions README.md
@@ -14,6 +14,7 @@ CLAS12 experiment at Jefferson Lab
- [QADB Files and Tables](#files)
1. [How to Access the Faraday Cup Charge](#charge)
1. [Database Maintenance](#dev)
1. [QA Ground Rules](#rules)
1. [Contributions](#contributions)

<a name="use"></a>
@@ -37,7 +38,7 @@ bits to use in the filter.
> QADB, explaining _why_ the bit was set
> - The analyzer must decide whether or not data with the `Misc` defect bit
> should be excluded from their analysis
> - To help with this decision-making,
> - To help with this decision-making, use the `qadb-info misc` command; alternatively,
> [`Misc` summary tables are found in each dataset's directory](#files),
> which provide the comment(s) for each run
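Deciding what to do with the `Misc` bit comes down to testing one bit of a defect bitmask. The sketch below shows that test in Python, assuming the defects of a QA bin are stored as an integer bitmask; `MISC_BIT` is an assumed bit index used only for illustration, so look up the real value in the defect bit table before relying on it.

```python
MISC_BIT = 5  # ASSUMPTION: illustrative bit index, not taken from this PR


def has_bit(defect_mask, bit):
    """True if the given defect bit is set in the mask."""
    return bool((defect_mask >> bit) & 1)


def only_bit(defect_mask, bit):
    """True if the given defect bit is the only one set in the mask."""
    return defect_mask == (1 << bit)


# A bin with only the assumed Misc bit set, and one with an additional bit 0
misc_only = 1 << MISC_BIT
misc_and_other = (1 << MISC_BIT) | 1
print(has_bit(misc_only, MISC_BIT), only_bit(misc_only, MISC_BIT))
print(has_bit(misc_and_other, MISC_BIT), only_bit(misc_and_other, MISC_BIT))
```

Bins for which `only_bit` is true are the ones where the run comment matters most, since no other defect explains why the bin was flagged.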

@@ -55,13 +56,32 @@ source clas12-qadb/environ.sh # or environ.csh, if using csh
<a name="info"></a>
# QA Information

## QA Ground Rules
## Information from `qadb-info`

> [!IMPORTANT]
> The following rules are enforced for the QA procedure and the resulting QADB:
> 1. The QA procedure runs on the data as they are and does not fix any of their problems.
> 2. The QADB only provides defect identification and does not provide analysis-specific decisions.
> 3. At least two people independently perform the "manual QA" part of the QA procedure, and the results are cross checked and merged.
The program `qadb-info` may be used to get information about the QADB, including:
- available data sets
- defect bits
- FC charge, filtered by QA defects chosen by the user
- QADB lookups by run number, event number, and/or QA bin number

For usage guidance, just run:
```bash
qadb-info
```

> [!TIP]
> If `qadb-info` is not found, either:
> - type the full path to it, `./bin/qadb-info`, or
> - add `bin/` to your `$PATH`, which you can do with
> ```bash
> source environ.sh # for bash, zsh
> source environ.csh # for csh, tcsh
> ```
<!--`-->

> [!CAUTION]
> Do not call `qadb-info` in an analysis event loop, since it will run too slowly.
> Instead, use [the provided software](#software) or operate on the QADB files directly.

<a name="datasets"></a>
## Available Data Sets
@@ -89,7 +109,7 @@ The following tables describe the available data sets in the QADB. The columns a

> [!CAUTION]
> The QADB for older data sets may have some issues, and may even violate the
> above ground rules. It is **HIGHLY recommended** to
> [QA ground rules](#rules). It is **HIGHLY recommended** to
> [check the known important issues](/doc/issues.md) to see if any issues impact your analysis.

### Run Group A
@@ -312,11 +332,19 @@ chargeTree.json ─┬─ run number 1

<a name="charge"></a>
# How to Access the Faraday Cup Charge
* the charge is stored in the QADB for each QA bin, so that it is possible to
determine the amount of accumulated charge for data that satisfy your
specified QA criteria.
* see [`chargeSum.groovy`](/src/examples/chargeSum.groovy) or [`chargeSum.cpp`](/srcC/examples/chargeSum.cpp)
for usage example in an analysis event loop; basically:
The charge is stored in the QADB for each QA bin, so that it is possible to
determine the amount of accumulated charge for data that satisfy your specified
QA criteria. To calculate the charge, you'll need to add up the charge from each
bin that you include in your analysis. To help, you can either:
* use the command `qadb-info charge`; use its options to specify:
* the dataset and/or list of runs
* which defect bits you want to allow or reject
* of the runs which only have the `Misc` bit, choose those that you want to
allow or reject
* the output format
* use the software: see [`chargeSum.groovy`](/src/examples/chargeSum.groovy)
or [`chargeSum.cpp`](/srcC/examples/chargeSum.cpp) for a usage example in an
analysis event loop; basically:
* call `QADB::AccumulateCharge()` within your event loop, after your QA cuts
are satisfied; the QADB instance will keep track of the accumulated charge
you analyzed (accumulation performed per QA bin)
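The arithmetic behind "add up the charge from each bin that you include" can be sketched in a few lines of Python. This is an illustration only, not the QADB API: the list-of-dicts layout, the `defect` and `charge` keys, and the sample numbers are all hypothetical, and in practice `qadb-info charge` or `QADB::AccumulateCharge()` does this work for you.

```python
def accumulated_charge(bins, reject_mask):
    """Sum the charge of every QA bin that has none of the rejected defect bits."""
    total = 0.0
    for qa_bin in bins:
        # keep the bin only if it shares no set bits with the reject mask
        if (qa_bin["defect"] & reject_mask) == 0:
            total += qa_bin["charge"]
    return total


# Hypothetical QA bins: a defect bitmask and a charge per bin (made-up values)
bins = [
    {"defect": 0b000000, "charge": 1.5},  # no defects
    {"defect": 0b000001, "charge": 2.0},  # bit 0 set
    {"defect": 0b100000, "charge": 0.5},  # bit 5 set
]

reject_bit0 = accumulated_charge(bins, 0b000001)  # reject bins with bit 0
reject_none = accumulated_charge(bins, 0)         # accept every bin
print(reject_bit0, reject_none)
```

The same cuts must be applied to the events and to the charge sum; otherwise the accumulated charge does not correspond to the data actually analyzed.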
@@ -363,6 +391,14 @@ Documentation for QADB maintenance and revision
* `qadb/defect_definitions.json`, then use `util/makeDefectMarkdown.rb` to generate
a Markdown table for `README.md`

<a name="rules"></a>
# QA Ground Rules

> [!IMPORTANT]
> The following rules are enforced for the QA procedure and the resulting QADB:
> 1. The QA procedure runs on the data as they are and does not fix any of their problems.
> 2. The QADB only provides defect identification and does not provide analysis-specific decisions.
> 3. At least two people independently perform the "manual QA" part of the QA procedure, and the results are cross checked and merged.

<a name="contributions"></a>
# Contributions