Add Notes for run.sh Script in README.md (#46)
* improve run.sh script with comments to better show the whole process of generating the score on the benchmark

* Add Notes of run.sh to README.md

* Apply suggestions from code review

Co-authored-by: Elizabeth Campolongo <[email protected]>

* Correct the sample predictions.txt

* Create a sample of ref.csv

* Update the sample file of reference csv file

* Delete ref.csv in wrong location

* Update scores.json

* Update predictions.txt

---------

Co-authored-by: Elizabeth Campolongo <[email protected]>
work4cs and egrace479 authored Feb 14, 2025
1 parent f5725f0 commit f8b65b9
Showing 4 changed files with 13 additions and 4 deletions.
5 changes: 5 additions & 0 deletions README.md
@@ -63,3 +63,8 @@ scoring_program/
- Any requirements used by participants must be on the approved whitelist (or participants must reach out to request their addition) for security purposes.
- Scores must be saved to a `score.json` file, where the keys detailed in the `Leaderboard` section of the `competition.yaml` are given as the keys for the scores.
- This full collection of files and folders is zipped as-is to upload the bundle to CodaBench.
- `run.sh` is a bash script that simulates the scoring process used for the Leaderboard on your local machine. It works by first building the docker container and then running the bash script inside it. The script will:
  - Create a folder `/ref` for the ground-truth CSV file and a folder `/res` for the generated predictions TXT file.
  - Run `ingestion_program/ingestion.py` to get the predictions from your model and write them to the TXT file.
  - Run `scoring_program/score_combined.py` to evaluate the predictions by comparing them to the ground truth; the final scores are then written to a JSON file.
- To test with `run.sh`, you should provide your own curated validation dataset (e.g., a subsample of the train split) including images and a simulated ground-truth file.
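The steps above can be sketched as a dry-run shell script. Directory names follow the sample bundle, but the docker and script invocations are assumptions about the real `run.sh` and are echoed rather than executed:

```shell
#!/usr/bin/env bash
# Dry-run sketch of the run.sh flow, not the repository's actual script.
set -euo pipefail

OUT=sample_out
# Ground-truth csv goes in ref/, the generated predictions txt in res/.
mkdir -p "$OUT/ref" "$OUT/res"

# The real script builds and runs these inside the docker container;
# image name and arguments here are placeholders.
echo "docker build -t benchmark-scoring ."
echo "python ingestion_program/ingestion.py -> $OUT/res/predictions.txt"
echo "python scoring_program/score_combined.py -> $OUT/scores.json"
```

Keeping the `ref`/`res` split matches the layout of `sample_result_submission/sample` so the scoring step can locate both files by convention.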
4 changes: 4 additions & 0 deletions sample_result_submission/sample/ref/ref.csv
@@ -0,0 +1,4 @@
filename,hybrid_stat,ssp_indicator
2222_CAM111111_d.JPG,0,major
3333_CAM222222_d.JPG,1,minor
1234_CAM000000_d.JPG,0,mimic
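As a quick sanity check of the reference format above, the `hybrid_stat` column can be inspected with awk (the filenames are the sample's placeholder values):

```shell
# Recreate the sample reference file.
cat > ref.csv <<'EOF'
filename,hybrid_stat,ssp_indicator
2222_CAM111111_d.JPG,0,major
3333_CAM222222_d.JPG,1,minor
1234_CAM000000_d.JPG,0,mimic
EOF

# Count hybrids (hybrid_stat == 1), skipping the header row.
awk -F, 'NR > 1 && $2 == 1 { n++ } END { print n + 0 }' ref.csv   # prints 1
```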
6 changes: 3 additions & 3 deletions sample_result_submission/sample/res/predictions.txt
@@ -1,3 +1,3 @@
2222_CAM111111_d.JPG 0.0001
3333_CAM222222_d.JPG 0.3000
1234_CAM000000_d.JPG 0.0022
2222_CAM111111_d.JPG,0.0001
3333_CAM222222_d.JPG,0.3000
1234_CAM000000_d.JPG,0.0022
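As the diff shows, each prediction line pairs a filename with a comma-separated score. One way to eyeball the scores is to threshold them into 0/1 calls; the 0.5 cutoff below is purely illustrative, not the benchmark's decision rule:

```shell
# Recreate the corrected comma-separated sample predictions.
cat > predictions.txt <<'EOF'
2222_CAM111111_d.JPG,0.0001
3333_CAM222222_d.JPG,0.3000
1234_CAM000000_d.JPG,0.0022
EOF

# Turn each score into a 0/1 call at an illustrative 0.5 cutoff.
awk -F, '{ print $1, ($2 >= 0.5 ? 1 : 0) }' predictions.txt
```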
2 changes: 1 addition & 1 deletion sample_result_submission/sample/scores.json
@@ -1 +1 @@
{"A_score_major": 0.8579234972677595, "A_score_minor": 0.07692307692307693, "A_AUC": 0.8788118018570015}
{"A_score_major_recall": 0.9205479452054794, "A_score_minor_recall": 0.14285714285714285, "A_PRC_AUC": 0.9244384534994394, "A_PRC_AUC_major": 0.9713048051966092, "A_PRC_AUC_minor": 0.13997687508229012, "mimic_recall": 0.532258064516129, "mimic_PRC_AUC": 0.7453225434507835, "challenge_score": 0.3010507452960821}
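The updated scores file can be checked locally, for example by pulling out the overall `challenge_score` with Python's `json` module (rounded here for display):

```shell
# Recreate the sample scores.json from the diff above.
cat > scores.json <<'EOF'
{"A_score_major_recall": 0.9205479452054794, "A_score_minor_recall": 0.14285714285714285, "A_PRC_AUC": 0.9244384534994394, "A_PRC_AUC_major": 0.9713048051966092, "A_PRC_AUC_minor": 0.13997687508229012, "mimic_recall": 0.532258064516129, "mimic_PRC_AUC": 0.7453225434507835, "challenge_score": 0.3010507452960821}
EOF

# Extract and round the headline metric.
python3 -c 'import json; s = json.load(open("scores.json")); print(round(s["challenge_score"], 4))'   # prints 0.3011
```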
