Skip to content

Commit 0f183a3

Browse files
pufanyiLuodiankcz358
authored
scienceqa for full set (#32)
* Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 1e2ae93 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 10bbaf0 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]>
1 parent 2b8a8dc commit 0f183a3

File tree

3 files changed

+35
-3
lines changed

3 files changed

+35
-3
lines changed

lmms_eval/__main__.py

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -307,7 +307,6 @@ def print_results(args, results):
307307
else:
308308
# use the name of the config file as run name
309309
wandb_args_dict["name"] = all_args_dict["config"].split("/")[-1].split(".")[0]
310-
311310
wandb_run = wandb.init(**wandb_args_dict)
312311
is_main_process = True
313312
else:

lmms_eval/tasks/scienceqa_img/scienceqa.yaml

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
1-
dataset_path: lmms-lab/ScienceQA-IMG
2-
task: "scienceqa_img"
1+
dataset_path: lmms-lab/ScienceQA
2+
dataset_name: ScienceQA-FULL
3+
task: "scienceqa"
34
dataset_kwargs:
45
token: True
56
test_split: test
Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,32 @@
1+
dataset_path: lmms-lab/ScienceQA
2+
dataset_name: ScienceQA-IMG
3+
task: "scienceqa_img"
4+
dataset_kwargs:
5+
token: True
6+
test_split: test
7+
output_type: generate_until
8+
doc_to_visual: !function utils.sqa_doc_to_visual
9+
doc_to_text: !function utils.sqa_doc_to_text
10+
doc_to_target: !function utils.sqa_doc_to_target
11+
generation_kwargs:
12+
max_new_tokens: 16
13+
temperature: 0
14+
do_sample: False
15+
metric_list:
16+
- metric: exact_match
17+
aggregation: mean
18+
higher_is_better: true
19+
ignore_case: true
20+
ignore_punctuation: true
21+
process_results: !function utils.sqa_process_results
22+
metadata:
23+
- version: 0.0
24+
25+
model_specific_prompt_kwargs:
26+
default:
27+
pre_prompt: ""
28+
post_prompt: "\nAnswer with the option's letter from the given choices directly."
29+
model_specific_generation_kwargs:
30+
llava:
31+
image_aspect_ratio: original
32+

0 commit comments

Comments
 (0)