Add Notes for run.sh Script in README.md (#46)
* improve run.sh script with comments to better show the whole process of generating the score on the benchmark

* Add Notes of run.sh to README.md

* Apply suggestions from code review

Co-authored-by: Elizabeth Campolongo <[email protected]>

* Correct the sample predictions.txt

* Create a sample of ref.csv

* Update the sample file of reference csv file

* Delete ref.csv in wrong location

* Update scores.json

* Update predictions.txt

---------

Co-authored-by: Elizabeth Campolongo <[email protected]>
work4cs and egrace479 authored Feb 14, 2025
1 parent f5725f0 commit f8b65b9
Showing 4 changed files with 13 additions and 4 deletions.
5 changes: 5 additions & 0 deletions README.md
@@ -63,3 +63,8 @@ scoring_program/
- Any requirements used by participants must be on the approved whitelist (or participants must reach out to request their addition) for security purposes.
- Scores must be saved to a `score.json` file, where the keys detailed in the `Leaderboard` section of the `competition.yaml` are given as the keys for the scores.
- This full collection of files and folders is zipped as-is to upload the bundle to CodaBench.
- `run.sh` is a bash script that simulates the scoring process used for the Leaderboard on your local machine. It works by first building the docker container and then running the bash script inside it. The script will:
  - Create a folder `/ref` for the ground-truth CSV file and a folder `/res` for the generated predictions TXT file.
  - Run `ingestion_program/ingestion.py` to get the predictions from your model and write them to the TXT file.
  - Run `scoring_program/score_combined.py` to evaluate the predictions by comparing them to the ground truth; the final scores are then written to a JSON file.
- To test with `run.sh`, you should provide your own curated validation dataset (e.g., a subsample of the train split) including images and a simulated ground-truth file.
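The steps above can be sketched as a dry-run shell script. Directory names follow the sample bundle, but the docker and script invocations are assumptions about the real `run.sh` and are echoed rather than executed:

```shell
#!/usr/bin/env bash
# Dry-run sketch of the run.sh flow, not the repository's actual script.
set -euo pipefail

OUT=sample_out
# Ground-truth csv goes in ref/, the generated predictions txt in res/.
mkdir -p "$OUT/ref" "$OUT/res"

# The real script builds and runs these inside the docker container;
# image name and arguments here are placeholders.
echo "docker build -t benchmark-scoring ."
echo "python ingestion_program/ingestion.py -> $OUT/res/predictions.txt"
echo "python scoring_program/score_combined.py -> $OUT/scores.json"
```

Keeping the `ref`/`res` split matches the layout of `sample_result_submission/sample` so the scoring step can locate both files by convention.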
4 changes: 4 additions & 0 deletions sample_result_submission/sample/ref/ref.csv
@@ -0,0 +1,4 @@
filename,hybrid_stat,ssp_indicator
2222_CAM111111_d.JPG,0,major
3333_CAM222222_d.JPG,1,minor
1234_CAM000000_d.JPG,0,mimic
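As a quick sanity check of the reference format above, the `hybrid_stat` column can be inspected with awk (the filenames are the sample's placeholder values):

```shell
# Recreate the sample reference file.
cat > ref.csv <<'EOF'
filename,hybrid_stat,ssp_indicator
2222_CAM111111_d.JPG,0,major
3333_CAM222222_d.JPG,1,minor
1234_CAM000000_d.JPG,0,mimic
EOF

# Count hybrids (hybrid_stat == 1), skipping the header row.
awk -F, 'NR > 1 && $2 == 1 { n++ } END { print n + 0 }' ref.csv   # prints 1
```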
6 changes: 3 additions & 3 deletions sample_result_submission/sample/res/predictions.txt
@@ -1,3 +1,3 @@
2222_CAM111111_d.JPG 0.0001
3333_CAM222222_d.JPG 0.3000
1234_CAM000000_d.JPG 0.0022
2222_CAM111111_d.JPG,0.0001
3333_CAM222222_d.JPG,0.3000
1234_CAM000000_d.JPG,0.0022
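As the diff shows, each prediction line pairs a filename with a comma-separated score. One way to eyeball the scores is to threshold them into 0/1 calls; the 0.5 cutoff below is purely illustrative, not the benchmark's decision rule:

```shell
# Recreate the corrected comma-separated sample predictions.
cat > predictions.txt <<'EOF'
2222_CAM111111_d.JPG,0.0001
3333_CAM222222_d.JPG,0.3000
1234_CAM000000_d.JPG,0.0022
EOF

# Turn each score into a 0/1 call at an illustrative 0.5 cutoff.
awk -F, '{ print $1, ($2 >= 0.5 ? 1 : 0) }' predictions.txt
```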
2 changes: 1 addition & 1 deletion sample_result_submission/sample/scores.json
@@ -1 +1 @@
{"A_score_major": 0.8579234972677595, "A_score_minor": 0.07692307692307693, "A_AUC": 0.8788118018570015}
{"A_score_major_recall": 0.9205479452054794, "A_score_minor_recall": 0.14285714285714285, "A_PRC_AUC": 0.9244384534994394, "A_PRC_AUC_major": 0.9713048051966092, "A_PRC_AUC_minor": 0.13997687508229012, "mimic_recall": 0.532258064516129, "mimic_PRC_AUC": 0.7453225434507835, "challenge_score": 0.3010507452960821}
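The updated scores file can be checked locally, for example by pulling out the overall `challenge_score` with Python's `json` module (rounded here for display):

```shell
# Recreate the sample scores.json from the diff above.
cat > scores.json <<'EOF'
{"A_score_major_recall": 0.9205479452054794, "A_score_minor_recall": 0.14285714285714285, "A_PRC_AUC": 0.9244384534994394, "A_PRC_AUC_major": 0.9713048051966092, "A_PRC_AUC_minor": 0.13997687508229012, "mimic_recall": 0.532258064516129, "mimic_PRC_AUC": 0.7453225434507835, "challenge_score": 0.3010507452960821}
EOF

# Extract and round the headline metric.
python3 -c 'import json; s = json.load(open("scores.json")); print(round(s["challenge_score"], 4))'   # prints 0.3011
```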
