add_ocrbench #28
Merged
@echo840 Thanks for committing to lmms-eval~ We are going to check it soon!
Luodian approved these changes Mar 25, 2024
Luodian pushed a commit that referenced this pull request Apr 4, 2024
* add mmme
* black
* add model specific prompt and gen kwargs
* black
* add yaml config to supprot multi-model eval
* print table at the end
* refactor multi model code
* add chartqa
* black
* add ai2d
* black
* update chartqa
* blacl
* update ai2d dataset
* black
* add qwenvl
* add infovqa and docvqa
Luodian
added a commit
that referenced
this pull request
Apr 4, 2024
* Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and 
docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]>
Luodian
added a commit
that referenced
this pull request
Apr 4, 2024
* Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and 
docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]>
Luodian
added a commit
that referenced
this pull request
Apr 4, 2024
* Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and 
docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0390783 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 407bc25 Author: 
Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add 
chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Refactor CLI evaluate function and improve error logging --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]>
Luodian
added a commit
that referenced
this pull request
Apr 4, 2024
* Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and 
docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 1e2ae93 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 10bbaf0 Author: 
Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add 
chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Refactor CLI evaluate function and improve error logging --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]>
Luodian
added a commit
that referenced
this pull request
Apr 4, 2024
* Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and 
docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0390783 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 407bc25 Author: 
Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add 
chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]>
Luodian added a commit that referenced this pull request Apr 4, 2024
* Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and 
docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 1e2ae93 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 10bbaf0 Author: 
Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add 
chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]>
Luodian added a commit that referenced this pull request Apr 4, 2024
* add fuyu * Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit c7ffa8d Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun 
Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and 
utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0390783 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 407bc25 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 
09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing 
tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file
Luodian added a commit that referenced this pull request Apr 4, 2024
* add fuyu * Merge commit '0f183a394426d3bf88884b4e2258ab53406bc705' * Squashed commit of the following: commit b81ed2ce4d0e226df7a41bddd82fe1f9d46a27fc Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 0f183a3 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun 
Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and 
utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 1e2ae93 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 10bbaf0 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 
09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing 
tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file
Luodian added a commit that referenced this pull request Apr 4, 2024
…function (#35) * add fuyu * Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit c7ffa8d Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email 
protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME 
task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0390783 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 407bc25 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email 
protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix 
error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit d0c8c61 Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit f4fd4fd Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit c7ffa8d Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric 
* add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> 
Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0390783 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 407bc25 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa 
(#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun 
Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function 
in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit c7ffa8d Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * 
add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update 
doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0390783 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 407bc25 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list 
commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- 
Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets
Luodian added a commit that referenced this pull request Apr 4, 2024
…function (#35) * add fuyu * Merge commit '0f183a394426d3bf88884b4e2258ab53406bc705' * Squashed commit of the following: commit b81ed2ce4d0e226df7a41bddd82fe1f9d46a27fc Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 0f183a3 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email 
protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME 
task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 1e2ae93 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 10bbaf0 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email 
protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix 
error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 4b604e7 Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit 799a6bc Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '0f183a394426d3bf88884b4e2258ab53406bc705' * Squashed commit of the following: commit b81ed2ce4d0e226df7a41bddd82fe1f9d46a27fc Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 0f183a3 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric 
* add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> 
Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 1e2ae93 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 10bbaf0 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa 
(#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun 
Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function 
in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit 0f183a3 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * 
add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update 
doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 1e2ae93 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 10bbaf0 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list 
commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- 
Co-authored-by: Fanyi Pu <[email protected]>
Co-authored-by: kcz358 <[email protected]>
* Remove scienceqa_img task configuration
* eval scienceqa with no images
---------
Co-authored-by: Bo Li <[email protected]>
Co-authored-by: kcz358 <[email protected]>
* Update API configuration and file paths
* Refactor evaluate_by_chatgpt function in utils.py
* Add hallusion_output_vd_model.json to .gitignore
* Add timeout to API request
* Refactor file path generation and remove unnecessary suffix in log samples output names
* Refactor code and add output path handling
* Update lmms-eval API and add new models and datasets
Luodian added a commit that referenced this pull request Apr 4, 2024
* add fuyu * Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit c7ffa8dee96e228c6519154d5a00742b35caa3f2 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 
f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks 
* Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0390783595c41232352599ab78fbe5949615e982 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 407bc2500c162d8949fbaae3d11d522afd2c9f28 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: 
kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add 
qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit d0c8c61d9a23686d31c7e014f0c15d802e04ee61 Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit f4fd4fd29b45436a96fe65395f0922612f598052 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit c7ffa8dee96e228c6519154d5a00742b35caa3f2 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for 
full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update 
ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0390783595c41232352599ab78fbe5949615e982 Author: Zhang Peiyuan 
<[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 407bc2500c162d8949fbaae3d11d522afd2c9f28 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and 
QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove 
scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit c7ffa8dee96e228c6519154d5a00742b35caa3f2 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 
fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to 
supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0390783595c41232352599ab78fbe5949615e982 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 407bc2500c162d8949fbaae3d11d522afd2c9f28 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm 
progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task 
list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets * Refactor directory structure for RefCOCO+ and RefCOCOg datasets * Fix error logging in get_eval and parse_score functions * Update .gitignore and mme.yaml * Squashed commit of the following: commit 380494bb2417fae1bcc1535ad8b67df7af667619 Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:43:28 2024 +0800 black commit e46b937aeeed45f5dd574b852459bfb416d165fd Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:42:03 2024 +0800 adapt qwen to sqa, gqa, ai2d, docvqa commit 2da8f918c37495b3447b9c24e74234ad0bba8cbf Author: Li Bo <[email protected]> Date: Thu Feb 1 16:20:27 2024 +0800 [Dataset] fix hallusion benchmark, 
add saving logic inside aggregate function (#35) * add fuyu * Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit c7ffa8dee96e228c6519154d5a00742b35caa3f2 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in 
the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error 
message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0390783595c41232352599ab78fbe5949615e982 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 407bc2500c162d8949fbaae3d11d522afd2c9f28 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 
acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update 
chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit d0c8c61d9a23686d31c7e014f0c15d802e04ee61 Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit f4fd4fd29b45436a96fe65395f0922612f598052 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit c7ffa8dee96e228c6519154d5a00742b35caa3f2 Author: Pu Fanyi <[email protected]> 
Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add 
ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 
0390783595c41232352599ab78fbe5949615e982 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 407bc2500c162d8949fbaae3d11d522afd2c9f28 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> 
Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- …
Luodian added a commit that referenced this pull request on Apr 4, 2024
* add fuyu * Merge commit '0f183a394426d3bf88884b4e2258ab53406bc705' * Squashed commit of the following: commit b81ed2ce4d0e226df7a41bddd82fe1f9d46a27fc Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 0f183a394426d3bf88884b4e2258ab53406bc705 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 
0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks 
Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- …
Luodian added a commit that referenced this pull request Apr 4, 2024
* add fuyu * Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit c7ffa8dee96e228c6519154d5a00742b35caa3f2 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 
f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks 
* Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0390783595c41232352599ab78fbe5949615e982 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 407bc2500c162d8949fbaae3d11d522afd2c9f28 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: 
kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add 
qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit d0c8c61d9a23686d31c7e014f0c15d802e04ee61 Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit f4fd4fd29b45436a96fe65395f0922612f598052 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit c7ffa8dee96e228c6519154d5a00742b35caa3f2 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for 
full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update 
ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0390783595c41232352599ab78fbe5949615e982 Author: Zhang Peiyuan 
<[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 407bc2500c162d8949fbaae3d11d522afd2c9f28 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and 
QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove 
scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit c7ffa8dee96e228c6519154d5a00742b35caa3f2 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 
fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to 
supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0390783595c41232352599ab78fbe5949615e982 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 407bc2500c162d8949fbaae3d11d522afd2c9f28 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm 
progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task 
list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets * Refactor directory structure for RefCOCO+ and RefCOCOg datasets * Fix error logging in get_eval and parse_score functions * Update .gitignore and mme.yaml * Squashed commit of the following: commit 380494bb2417fae1bcc1535ad8b67df7af667619 Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:43:28 2024 +0800 black commit e46b937aeeed45f5dd574b852459bfb416d165fd Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:42:03 2024 +0800 adapt qwen to sqa, gqa, ai2d, docvqa commit 2da8f918c37495b3447b9c24e74234ad0bba8cbf Author: Li Bo <[email protected]> Date: Thu Feb 1 16:20:27 2024 +0800 [Dataset] fix hallusion benchmark, 
add saving logic inside aggregate function (#35) * add fuyu * Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit c7ffa8dee96e228c6519154d5a00742b35caa3f2 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in 
the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error 
message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0390783595c41232352599ab78fbe5949615e982 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 407bc2500c162d8949fbaae3d11d522afd2c9f28 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 
acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update 
chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit d0c8c61d9a23686d31c7e014f0c15d802e04ee61 Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit f4fd4fd29b45436a96fe65395f0922612f598052 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit c7ffa8dee96e228c6519154d5a00742b35caa3f2 Author: Pu Fanyi <[email protected]> 
Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add 
ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 
0390783595c41232352599ab78fbe5949615e982 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 407bc2500c162d8949fbaae3d11d522afd2c9f28 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> 
Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 26da729c40008f72ce3f10c932874f120f290e26 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit acbb1a1997c5159709e3b81c3f0292b2f9def109 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit b33ac32f0ff28777204eaaf27a963200024081df Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co…
Luodian added a commit that referenced this pull request Apr 4, 2024
* add fuyu * Merge commit '0f183a394426d3bf88884b4e2258ab53406bc705' * Squashed commit of the following: commit b81ed2ce4d0e226df7a41bddd82fe1f9d46a27fc Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 0f183a394426d3bf88884b4e2258ab53406bc705 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 
0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks 
* Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 1e2ae936c90a15d684926e43a38aac86935f38c5 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 10bbaf01c0a4164b6f1d2628367befccf8f39c24 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: 
kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add 
qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 4b604e75cfde49df52e4abd90be4876ed9a1b08f Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit 799a6bcb9033656115755c5169f8c342eb927d54 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '0f183a394426d3bf88884b4e2258ab53406bc705' * Squashed commit of the following: commit b81ed2ce4d0e226df7a41bddd82fe1f9d46a27fc Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 0f183a394426d3bf88884b4e2258ab53406bc705 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for 
full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update 
ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 1e2ae936c90a15d684926e43a38aac86935f38c5 Author: Zhang Peiyuan 
<[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 10bbaf01c0a4164b6f1d2628367befccf8f39c24 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and 
QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove 
scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit 0f183a394426d3bf88884b4e2258ab53406bc705 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 
fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to 
supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 1e2ae936c90a15d684926e43a38aac86935f38c5 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 10bbaf01c0a4164b6f1d2628367befccf8f39c24 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm 
progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task 
list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets * Refactor directory structure for RefCOCO+ and RefCOCOg datasets * Fix error logging in get_eval and parse_score functions * Update .gitignore and mme.yaml * Squashed commit of the following: commit 3d44977c9254d1ee5254b2ca24c8cc54984e84b0 Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:43:28 2024 +0800 black commit a38ffeb692fbeb9deebe20f65b0f3e041823e695 Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:42:03 2024 +0800 adapt qwen to sqa, gqa, ai2d, docvqa commit e24607fd5725aabb7f6db5fa457b5e6a5123c199 Author: Li Bo <[email protected]> Date: Thu Feb 1 16:20:27 2024 +0800 [Dataset] fix hallusion benchmark, 
add saving logic inside aggregate function (#35) * add fuyu * Merge commit '0f183a394426d3bf88884b4e2258ab53406bc705' * Squashed commit of the following: commit b81ed2ce4d0e226df7a41bddd82fe1f9d46a27fc Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 0f183a394426d3bf88884b4e2258ab53406bc705 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in 
the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error 
message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 1e2ae936c90a15d684926e43a38aac86935f38c5 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 10bbaf01c0a4164b6f1d2628367befccf8f39c24 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 
e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update 
chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 4b604e75cfde49df52e4abd90be4876ed9a1b08f Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit 799a6bcb9033656115755c5169f8c342eb927d54 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '0f183a394426d3bf88884b4e2258ab53406bc705' * Squashed commit of the following: commit b81ed2ce4d0e226df7a41bddd82fe1f9d46a27fc Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 0f183a394426d3bf88884b4e2258ab53406bc705 Author: Pu Fanyi <[email protected]> 
Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add 
ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 
1e2ae936c90a15d684926e43a38aac86935f38c5 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 10bbaf01c0a4164b6f1d2628367befccf8f39c24 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> 
Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request on Oct 6, 2024
* add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request on Oct 6, 2024
…lvingLMMs-Lab#33) * add fuyu * Merge commit 'c2050a435b47dfba638b6ba6a1600515a9f61b4c' * Squashed commit of the following: commit 55411a8236a6a4af45c9d3d73349d9308f1b11dd Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit c2050a4 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564 
Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * 
Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 27ab736 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit 1c11ae4 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 
+0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for 
image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
…lvingLMMs-Lab#33) * add fuyu * Merge commit '49e625761a6853595641a0a411c96168490dabad' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 49e6257 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811fac 
Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811fac Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * 
Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit da7a8df Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit a853223 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 
+0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811fac Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811fac Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for 
image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request on Oct 6, 2024
…lvingLMMs-Lab#33) * add fuyu * Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836' * Squashed commit of the following: commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit ebe4eb8 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d10 
Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d10 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * 
Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0fd4558 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit f125889 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 
+0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d10 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d10 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for 
image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request on Oct 6, 2024
…function (EvolvingLMMs-Lab#35) * add fuyu * Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 7b7f636 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list 
commit ad8d9da Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and 
integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 6ee856b Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit 4a1183c Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 
09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt 
messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 7664839 Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (EvolvingLMMs-Lab#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit 05487a4 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (EvolvingLMMs-Lab#33) * add fuyu * Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 7b7f636 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add 
dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed 
commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 6ee856b Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit 4a1183c Author: Li Bo <[email protected]> Date: Sun 
Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi 
model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text 
function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit 7b7f636 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 
<[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List 
task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 6ee856b Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit 4a1183c Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 
864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * 
add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request on Oct 6, 2024
…function (EvolvingLMMs-Lab#35) * add fuyu * Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29' * Squashed commit of the following: commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032 Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 1c9c7f9 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list 
commit 0dc9a47 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and 
integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 9d06741 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit 0d4e69f Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 
09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt 
messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 2b01738 Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (EvolvingLMMs-Lab#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit 2f61ad5 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (EvolvingLMMs-Lab#33) * add fuyu * Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29' * Squashed commit of the following: commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032 Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 1c9c7f9 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add 
dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed 
commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 9d06741 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit 0d4e69f Author: Li Bo <[email protected]> Date: Sun 
Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi 
model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text 
function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit 1c9c7f9 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 
<[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List 
task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 9d06741 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit 0d4e69f Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 
18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * 
add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request on Oct 6, 2024
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
…function (EvolvingLMMs-Lab#35) * add fuyu * Merge commit 'c2050a435b47dfba638b6ba6a1600515a9f61b4c' * Squashed commit of the following: commit 55411a8236a6a4af45c9d3d73349d9308f1b11dd Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit c2050a4 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list 
commit 4b30564 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and 
integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 27ab736 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit 1c11ae4 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 
09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt 
messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit c37504a Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (EvolvingLMMs-Lab#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit cb7b75e Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (EvolvingLMMs-Lab#33) * add fuyu * Merge commit 'c2050a435b47dfba638b6ba6a1600515a9f61b4c' * Squashed commit of the following: commit 55411a8236a6a4af45c9d3d73349d9308f1b11dd Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit c2050a4 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add 
dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed 
commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 27ab736 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit 1c11ae4 Author: Li Bo <[email protected]> Date: Sun 
Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi 
model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text 
function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit c2050a4 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 
<[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List 
task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 27ab736 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit 1c11ae4 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 
18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * 
add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
…function (EvolvingLMMs-Lab#35) * add fuyu * Merge commit '49e625761a6853595641a0a411c96168490dabad' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 49e6257 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list 
commit e811fac Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811fac Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and 
integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit da7a8df Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit a853223 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 
09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811fac Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811fac Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt 
messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 6e7cd87 Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (EvolvingLMMs-Lab#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit efd3510 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (EvolvingLMMs-Lab#33) * add fuyu * Merge commit '49e625761a6853595641a0a411c96168490dabad' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 49e6257 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add 
dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811fac Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed 
commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811fac Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit da7a8df Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit a853223 Author: Li Bo <[email protected]> Date: Sun 
Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811fac Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi 
model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811fac Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text 
function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit 49e6257 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 
<[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811fac Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811fac Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List 
task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit da7a8df Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit a853223 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 
27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811fac Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811fac Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * 
add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
…function (EvolvingLMMs-Lab#35) * add fuyu * Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed' * Squashed commit of the following: commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032 Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 6d570ac Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list 
commit f92c3d6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and 
integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit fbb7aa5 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit b8ba33c Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 
09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt 
messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 7dd84f3 Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (EvolvingLMMs-Lab#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit a781057 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (EvolvingLMMs-Lab#33) * add fuyu * Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed' * Squashed commit of the following: commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032 Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 6d570ac Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add 
dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed 
commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit fbb7aa5 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit b8ba33c Author: Li Bo <[email protected]> Date: Sun 
Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi 
model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text 
function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit 6d570ac Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 
<[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List 
task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit fbb7aa5 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit b8ba33c Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 
18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * 
add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
…function (EvolvingLMMs-Lab#35) * add fuyu * Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit d8a4f8e Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list 
commit 89545d0 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and 
integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit a2b4a2a Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit 47a6675 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 
09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt 
messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 7eefb7e Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (EvolvingLMMs-Lab#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit 81d7b9f Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (EvolvingLMMs-Lab#33) * add fuyu * Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit d8a4f8e Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add 
dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed 
commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit a2b4a2a Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit 47a6675 Author: Li Bo <[email protected]> Date: Sun 
Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi 
model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text 
function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit d8a4f8e Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 
<[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List 
task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit a2b4a2a Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit 47a6675 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 
e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * 
add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request on Oct 6, 2024
…function (EvolvingLMMs-Lab#35) * add fuyu * Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836' * Squashed commit of the following: commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit ebe4eb8 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list 
commit 5553d10 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d10 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and 
integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0fd4558 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit f125889 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 
09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d10 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d10 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt 
messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 6a4b81b Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (EvolvingLMMs-Lab#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit fab8704 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (EvolvingLMMs-Lab#33) * add fuyu * Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836' * Squashed commit of the following: commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit ebe4eb8 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add 
dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d10 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed 
commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d10 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0fd4558 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit f125889 Author: Li Bo <[email protected]> Date: Sun 
Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d10 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi 
model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d10 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text 
function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit ebe4eb8 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (EvolvingLMMs-Lab#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 
<[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d10 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d10 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List 
task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0fd4558 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (EvolvingLMMs-Lab#30) * mmmu_test * black commit f125889 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (EvolvingLMMs-Lab#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 
18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d10 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d10 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28) * add mmme * black * add model specific prompt and gen kwargs * black * 
add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request on Oct 6, 2024
* add fuyu * Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 
ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks 
* Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 6ee856b61bbb0156dd72d454430cd01a246b5e61 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 4a1183c563835c366ea54a28e1a5761a193b6704 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: 
kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add 
qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 7664839d1765e09b06e6cf59c12cb895ef71c40e Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit 05487a4e1f1dd1ab20d087399a47502716929a9b Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for 
full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update 
ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 6ee856b61bbb0156dd72d454430cd01a246b5e61 Author: Zhang Peiyuan 
<[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 4a1183c563835c366ea54a28e1a5761a193b6704 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and 
QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove 
scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 
ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to 
supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 6ee856b61bbb0156dd72d454430cd01a246b5e61 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 4a1183c563835c366ea54a28e1a5761a193b6704 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm 
progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task 
list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets * Refactor directory structure for RefCOCO+ and RefCOCOg datasets * Fix error logging in get_eval and parse_score functions * Update .gitignore and mme.yaml * Squashed commit of the following: commit 380494bb2417fae1bcc1535ad8b67df7af667619 Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:43:28 2024 +0800 black commit e46b937aeeed45f5dd574b852459bfb416d165fd Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:42:03 2024 +0800 adapt qwen to sqa, gqa, ai2d, docvqa commit 1f8780df5e89ee50f349361bb5ea7351a73e0c19 Author: Li Bo <[email protected]> Date: Thu Feb 1 16:20:27 2024 +0800 [Dataset] fix hallusion benchmark, 
add saving logic inside aggregate function (#35) * add fuyu * Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in 
the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error 
message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 6ee856b61bbb0156dd72d454430cd01a246b5e61 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 4a1183c563835c366ea54a28e1a5761a193b6704 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 
ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update 
chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 7664839d1765e09b06e6cf59c12cb895ef71c40e Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit 05487a4e1f1dd1ab20d087399a47502716929a9b Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04 Author: Pu Fanyi <[email protected]> 
Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add 
ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 
6ee856b61bbb0156dd72d454430cd01a246b5e61 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 4a1183c563835c366ea54a28e1a5761a193b6704 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> 
Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- …
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request on Oct 6, 2024
* add fuyu * Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29' * Squashed commit of the following: commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032 Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 
0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks 
* Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 9d06741f31439e6ac34764612664467239b63253 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 0d4e69f54d996672ab0471531837004f80ba9b10 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: 
kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add 
qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 2b01738ba36ee632712135d38f45ea40f1c1323a Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit 2f61ad5c3da7411eccda597afadcb64d573c5193 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29' * Squashed commit of the following: commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032 Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for 
full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update 
ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 9d06741f31439e6ac34764612664467239b63253 Author: Zhang Peiyuan 
<[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 0d4e69f54d996672ab0471531837004f80ba9b10 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and 
QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove 
scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 
fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to 
supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 9d06741f31439e6ac34764612664467239b63253 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 0d4e69f54d996672ab0471531837004f80ba9b10 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm 
progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task 
list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets * Refactor directory structure for RefCOCO+ and RefCOCOg datasets * Fix error logging in get_eval and parse_score functions * Update .gitignore and mme.yaml * Squashed commit of the following: commit 04a4076120c4d337d70992b82bf2b4fa4c700359 Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:43:28 2024 +0800 black commit b3c423a93d944a2621c1fa4192616af048e5b77c Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:42:03 2024 +0800 adapt qwen to sqa, gqa, ai2d, docvqa commit 1c5354e09283b03f1c0068d39b82f8bfa73d4184 Author: Li Bo <[email protected]> Date: Thu Feb 1 16:20:27 2024 +0800 [Dataset] fix hallusion benchmark, 
add saving logic inside aggregate function (#35) * add fuyu * Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29' * Squashed commit of the following: commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032 Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in 
the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error 
message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 9d06741f31439e6ac34764612664467239b63253 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 0d4e69f54d996672ab0471531837004f80ba9b10 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 
e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update 
chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 2b01738ba36ee632712135d38f45ea40f1c1323a Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit 2f61ad5c3da7411eccda597afadcb64d573c5193 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29' * Squashed commit of the following: commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032 Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29 Author: Pu Fanyi <[email protected]> 
Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add 
ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 
9d06741f31439e6ac34764612664467239b63253 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 0d4e69f54d996672ab0471531837004f80ba9b10 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> 
Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- …
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request on Oct 6, 2024
* add fuyu * Merge commit '708de71d7c634c51ade4443f7a8590dca74561ed' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 708de71d7c634c51ade4443f7a8590dca74561ed Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 
5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks 
* Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit e19ec39d72c2781f1f2d174094d3acfb4ada7861 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: 
kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add 
qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit af73a51ca7940095310f725544bd3473b67b412c Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit accfaffdc9ba3002757d1ee167063c7aa6a12394 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '708de71d7c634c51ade4443f7a8590dca74561ed' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 708de71d7c634c51ade4443f7a8590dca74561ed Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for 
full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update 
ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit e19ec39d72c2781f1f2d174094d3acfb4ada7861 Author: Zhang Peiyuan 
<[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and 
QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove 
scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit 708de71d7c634c51ade4443f7a8590dca74561ed Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 
fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to 
supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit e19ec39d72c2781f1f2d174094d3acfb4ada7861 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm 
progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task 
list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets * Refactor directory structure for RefCOCO+ and RefCOCOg datasets * Fix error logging in get_eval and parse_score functions * Update .gitignore and mme.yaml * Squashed commit of the following: commit 380494bb2417fae1bcc1535ad8b67df7af667619 Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:43:28 2024 +0800 black commit e46b937aeeed45f5dd574b852459bfb416d165fd Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:42:03 2024 +0800 adapt qwen to sqa, gqa, ai2d, docvqa commit f6f7adae7485defcca27deafb2b19b37733233c6 Author: Li Bo <[email protected]> Date: Thu Feb 1 16:20:27 2024 +0800 [Dataset] fix hallusion benchmark, 
add saving logic inside aggregate function (#35) * add fuyu * Merge commit '708de71d7c634c51ade4443f7a8590dca74561ed' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 708de71d7c634c51ade4443f7a8590dca74561ed Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in 
the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error 
message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit e19ec39d72c2781f1f2d174094d3acfb4ada7861 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 
dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update 
chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit af73a51ca7940095310f725544bd3473b67b412c Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit accfaffdc9ba3002757d1ee167063c7aa6a12394 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '708de71d7c634c51ade4443f7a8590dca74561ed' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 708de71d7c634c51ade4443f7a8590dca74561ed Author: Pu Fanyi <[email protected]> 
Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add 
ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 
e19ec39d72c2781f1f2d174094d3acfb4ada7861 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> 
Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- …
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add 
ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 
27ab7369c986607ad08e356e3bd951864c845e22 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 1c11ae4aeecd3305e99f3baaa54d2c5914d6a6b7 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> 
Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- …
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu * Merge commit '49e625761a6853595641a0a411c96168490dabad' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 49e625761a6853595641a0a411c96168490dabad Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 
e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks 
* Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit da7a8df0ec859a7e69bf0ace845f00ff3717ac75 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit a853223fa8da0ec1d59040768c896c1526b10dff Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: 
kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add 
qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 6e7cd871ca881e5002bbaa3dd7774d34fce12811 Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit efd3510236c5ca6948d65a7150fd7a5925902f3d Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '49e625761a6853595641a0a411c96168490dabad' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 49e625761a6853595641a0a411c96168490dabad Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for 
full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update 
ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit da7a8df0ec859a7e69bf0ace845f00ff3717ac75 Author: Zhang Peiyuan 
<[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit a853223fa8da0ec1d59040768c896c1526b10dff Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and 
QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove 
scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit 49e625761a6853595641a0a411c96168490dabad Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 
fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to 
supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit da7a8df0ec859a7e69bf0ace845f00ff3717ac75 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit a853223fa8da0ec1d59040768c896c1526b10dff Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm 
progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task 
list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets * Refactor directory structure for RefCOCO+ and RefCOCOg datasets * Fix error logging in get_eval and parse_score functions * Update .gitignore and mme.yaml * Squashed commit of the following: commit 380494bb2417fae1bcc1535ad8b67df7af667619 Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:43:28 2024 +0800 black commit e46b937aeeed45f5dd574b852459bfb416d165fd Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:42:03 2024 +0800 adapt qwen to sqa, gqa, ai2d, docvqa commit 7037fd2991af7afe522d9492878cde4b2699bc43 Author: Li Bo <[email protected]> Date: Thu Feb 1 16:20:27 2024 +0800 [Dataset] fix hallusion benchmark, 
add saving logic inside aggregate function (#35) * add fuyu * Merge commit '49e625761a6853595641a0a411c96168490dabad' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 49e625761a6853595641a0a411c96168490dabad Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in 
the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error 
message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit da7a8df0ec859a7e69bf0ace845f00ff3717ac75 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit a853223fa8da0ec1d59040768c896c1526b10dff Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 
09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update 
chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 6e7cd871ca881e5002bbaa3dd7774d34fce12811 Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit efd3510236c5ca6948d65a7150fd7a5925902f3d Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '49e625761a6853595641a0a411c96168490dabad' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 49e625761a6853595641a0a411c96168490dabad Author: Pu Fanyi <[email protected]> 
Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add 
ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 
da7a8df0ec859a7e69bf0ace845f00ff3717ac75 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit a853223fa8da0ec1d59040768c896c1526b10dff Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> 
Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- …
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu * Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed' * Squashed commit of the following: commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032 Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 6d570ac1d98a03585c8119ccb362e13ab2172fed Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 
f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks 
* Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit fbb7aa57856f800d6c18413318830f4bbc6c8157 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit b8ba33c2a349cb5b479e14af1a2d30f15ad53010 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: 
kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add 
qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 7dd84f337cf1ce906dfeb92118e6c2998707a79a Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit a781057ad07b0a60c7ef682f864be598b2436b7c Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed' * Squashed commit of the following: commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032 Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 6d570ac1d98a03585c8119ccb362e13ab2172fed Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for 
full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update 
ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit fbb7aa57856f800d6c18413318830f4bbc6c8157 Author: Zhang Peiyuan 
<[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit b8ba33c2a349cb5b479e14af1a2d30f15ad53010 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and 
QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove 
scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit 6d570ac1d98a03585c8119ccb362e13ab2172fed Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 
fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to 
supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit fbb7aa57856f800d6c18413318830f4bbc6c8157 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit b8ba33c2a349cb5b479e14af1a2d30f15ad53010 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm 
progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task 
list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets * Refactor directory structure for RefCOCO+ and RefCOCOg datasets * Fix error logging in get_eval and parse_score functions * Update .gitignore and mme.yaml * Squashed commit of the following: commit 04a4076120c4d337d70992b82bf2b4fa4c700359 Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:43:28 2024 +0800 black commit b3c423a93d944a2621c1fa4192616af048e5b77c Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:42:03 2024 +0800 adapt qwen to sqa, gqa, ai2d, docvqa commit c3b0da62994f646141456b60baaa3ee5713f38fa Author: Li Bo <[email protected]> Date: Thu Feb 1 16:20:27 2024 +0800 [Dataset] fix hallusion benchmark, 
add saving logic inside aggregate function (#35) * add fuyu * Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed' * Squashed commit of the following: commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032 Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 6d570ac1d98a03585c8119ccb362e13ab2172fed Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in 
the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error 
message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit fbb7aa57856f800d6c18413318830f4bbc6c8157 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit b8ba33c2a349cb5b479e14af1a2d30f15ad53010 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 
e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update 
chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 7dd84f337cf1ce906dfeb92118e6c2998707a79a Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit a781057ad07b0a60c7ef682f864be598b2436b7c Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed' * Squashed commit of the following: commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032 Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 6d570ac1d98a03585c8119ccb362e13ab2172fed Author: Pu Fanyi <[email protected]> 
Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add 
ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 
fbb7aa57856f800d6c18413318830f4bbc6c8157 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit b8ba33c2a349cb5b479e14af1a2d30f15ad53010 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> 
Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- …
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu * Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit d8a4f8ef094e37c987863da971cbc51637b92b43 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 
89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks 
* Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: 
kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add 
qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 7eefb7e3bb827b0e784ed0395e4125c535b6eeef Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit 81d7b9fdf3e662405e0ea358900a4c6981cc502f Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit d8a4f8ef094e37c987863da971cbc51637b92b43 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for 
full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update 
ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499 Author: Zhang Peiyuan 
<[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and 
QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove 
scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit d8a4f8ef094e37c987863da971cbc51637b92b43 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 
fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to 
supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm 
progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task 
list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets * Refactor directory structure for RefCOCO+ and RefCOCOg datasets * Fix error logging in get_eval and parse_score functions * Update .gitignore and mme.yaml * Squashed commit of the following: commit 380494bb2417fae1bcc1535ad8b67df7af667619 Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:43:28 2024 +0800 black commit e46b937aeeed45f5dd574b852459bfb416d165fd Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:42:03 2024 +0800 adapt qwen to sqa, gqa, ai2d, docvqa commit ae76855543ee127e79809843378a18aa06d90261 Author: Li Bo <[email protected]> Date: Thu Feb 1 16:20:27 2024 +0800 [Dataset] fix hallusion benchmark, 
add saving logic inside aggregate function (#35) * add fuyu * Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit d8a4f8ef094e37c987863da971cbc51637b92b43 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in 
the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error 
message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 
dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update 
chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 7eefb7e3bb827b0e784ed0395e4125c535b6eeef Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit 81d7b9fdf3e662405e0ea358900a4c6981cc502f Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit d8a4f8ef094e37c987863da971cbc51637b92b43 Author: Pu Fanyi <[email protected]> 
Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add 
ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 
a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> 
Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- …
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu * Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836' * Squashed commit of the following: commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit ebe4eb8dffcce06f7be393478d35d76de82a3836 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 
5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks 
* Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0fd45585aecf41e04bb6510cf09c0b829bd0f49d Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit f1258892713f588f8d65826f9141e38048f5ff31 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: 
kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add 
qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 6a4b81baa42b29457cbaea42043723c2332ad5ba Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit fab87047e683d9982ea0f544feb3e2fce4e1fbf4 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836' * Squashed commit of the following: commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit ebe4eb8dffcce06f7be393478d35d76de82a3836 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for 
full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update 
ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0fd45585aecf41e04bb6510cf09c0b829bd0f49d Author: Zhang Peiyuan 
<[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit f1258892713f588f8d65826f9141e38048f5ff31 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and 
QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove 
scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit ebe4eb8dffcce06f7be393478d35d76de82a3836 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 
fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to 
supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0fd45585aecf41e04bb6510cf09c0b829bd0f49d Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit f1258892713f588f8d65826f9141e38048f5ff31 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm 
progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task 
list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets * Refactor directory structure for RefCOCO+ and RefCOCOg datasets * Fix error logging in get_eval and parse_score functions * Update .gitignore and mme.yaml * Squashed commit of the following: commit 6e6fe00bf9d5fcfd351c164285c569e53f38e280 Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:43:28 2024 +0800 black commit 938c7729a9176e459531cbd00bb6f8d69691258b Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:42:03 2024 +0800 adapt qwen to sqa, gqa, ai2d, docvqa commit 2412a0072cc8840593c90e5bdeff64aa8f375bdc Author: Li Bo <[email protected]> Date: Thu Feb 1 16:20:27 2024 +0800 [Dataset] fix hallusion benchmark, 
add saving logic inside aggregate function (#35) * add fuyu * Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836' * Squashed commit of the following: commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit ebe4eb8dffcce06f7be393478d35d76de82a3836 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in 
the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error 
message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0fd45585aecf41e04bb6510cf09c0b829bd0f49d Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit f1258892713f588f8d65826f9141e38048f5ff31 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 
e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update 
chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 6a4b81baa42b29457cbaea42043723c2332ad5ba Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit fab87047e683d9982ea0f544feb3e2fce4e1fbf4 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836' * Squashed commit of the following: commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit ebe4eb8dffcce06f7be393478d35d76de82a3836 Author: Pu Fanyi <[email protected]> 
Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add 
ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 
0fd45585aecf41e04bb6510cf09c0b829bd0f49d Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit f1258892713f588f8d65826f9141e38048f5ff31 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> 
Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- …
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request on Oct 6, 2024
* add fuyu * Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 
ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks 
* Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 6ee856b61bbb0156dd72d454430cd01a246b5e61 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 4a1183c563835c366ea54a28e1a5761a193b6704 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: 
kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add 
qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 7664839d1765e09b06e6cf59c12cb895ef71c40e Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit 05487a4e1f1dd1ab20d087399a47502716929a9b Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for 
full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update 
ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 6ee856b61bbb0156dd72d454430cd01a246b5e61 Author: Zhang Peiyuan 
<[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 4a1183c563835c366ea54a28e1a5761a193b6704 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and 
QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove 
scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 
ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to 
supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 6ee856b61bbb0156dd72d454430cd01a246b5e61 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 4a1183c563835c366ea54a28e1a5761a193b6704 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm 
progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task 
list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets * Refactor directory structure for RefCOCO+ and RefCOCOg datasets * Fix error logging in get_eval and parse_score functions * Update .gitignore and mme.yaml * Squashed commit of the following: commit 380494bb2417fae1bcc1535ad8b67df7af667619 Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:43:28 2024 +0800 black commit e46b937aeeed45f5dd574b852459bfb416d165fd Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:42:03 2024 +0800 adapt qwen to sqa, gqa, ai2d, docvqa commit 1f8780df5e89ee50f349361bb5ea7351a73e0c19 Author: Li Bo <[email protected]> Date: Thu Feb 1 16:20:27 2024 +0800 [Dataset] fix hallusion benchmark, 
add saving logic inside aggregate function (#35) * add fuyu * Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in 
the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error 
message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 6ee856b61bbb0156dd72d454430cd01a246b5e61 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 4a1183c563835c366ea54a28e1a5761a193b6704 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 
ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update 
chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 7664839d1765e09b06e6cf59c12cb895ef71c40e Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit 05487a4e1f1dd1ab20d087399a47502716929a9b Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04 Author: Pu Fanyi <[email protected]> 
Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add 
ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 
6ee856b61bbb0156dd72d454430cd01a246b5e61 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 4a1183c563835c366ea54a28e1a5761a193b6704 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> 
Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit dbba2fe6447b0dfd4bb89a368f62178f2b253006 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit c09b621195878300417315a97efdec25e67dd7f5 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 864a1aba26388276b7e57717b89520fcc77b3f62 Merge: ab898e4 ad8d9da Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit ab898e4fd30bf83888125d48b80bc86b01cb5d39 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit c0ea54d49cb65b747d7e8fccac75838acabe05db Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request on Oct 6, 2024
* add fuyu * Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29' * Squashed commit of the following: commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032 Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 
0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks 
* Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 9d06741f31439e6ac34764612664467239b63253 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 0d4e69f54d996672ab0471531837004f80ba9b10 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: 
kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add 
qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 2b01738ba36ee632712135d38f45ea40f1c1323a Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit 2f61ad5c3da7411eccda597afadcb64d573c5193 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29' * Squashed commit of the following: commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032 Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for 
full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update 
ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 9d06741f31439e6ac34764612664467239b63253 Author: Zhang Peiyuan 
<[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 0d4e69f54d996672ab0471531837004f80ba9b10 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and 
QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove 
scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 
fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to 
supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 9d06741f31439e6ac34764612664467239b63253 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 0d4e69f54d996672ab0471531837004f80ba9b10 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm 
progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task 
list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets * Refactor directory structure for RefCOCO+ and RefCOCOg datasets * Fix error logging in get_eval and parse_score functions * Update .gitignore and mme.yaml * Squashed commit of the following: commit 04a4076120c4d337d70992b82bf2b4fa4c700359 Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:43:28 2024 +0800 black commit b3c423a93d944a2621c1fa4192616af048e5b77c Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:42:03 2024 +0800 adapt qwen to sqa, gqa, ai2d, docvqa commit 1c5354e09283b03f1c0068d39b82f8bfa73d4184 Author: Li Bo <[email protected]> Date: Thu Feb 1 16:20:27 2024 +0800 [Dataset] fix hallusion benchmark, 
add saving logic inside aggregate function (#35) * add fuyu * Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29' * Squashed commit of the following: commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032 Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in 
the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error 
message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 9d06741f31439e6ac34764612664467239b63253 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 0d4e69f54d996672ab0471531837004f80ba9b10 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 
e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update 
chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 2b01738ba36ee632712135d38f45ea40f1c1323a Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit 2f61ad5c3da7411eccda597afadcb64d573c5193 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29' * Squashed commit of the following: commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032 Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29 Author: Pu Fanyi <[email protected]> 
Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add 
ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 
9d06741f31439e6ac34764612664467239b63253 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 0d4e69f54d996672ab0471531837004f80ba9b10 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> 
Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 0dc9a47afe9a61214f11053dae5641716052f30f Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request on Oct 6, 2024
* add fuyu * Merge commit '708de71d7c634c51ade4443f7a8590dca74561ed' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 708de71d7c634c51ade4443f7a8590dca74561ed Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 
5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks 
* Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit e19ec39d72c2781f1f2d174094d3acfb4ada7861 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: 
kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add 
qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit af73a51ca7940095310f725544bd3473b67b412c Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit accfaffdc9ba3002757d1ee167063c7aa6a12394 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '708de71d7c634c51ade4443f7a8590dca74561ed' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 708de71d7c634c51ade4443f7a8590dca74561ed Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for 
full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update 
ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit e19ec39d72c2781f1f2d174094d3acfb4ada7861 Author: Zhang Peiyuan 
<[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and 
QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove 
scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit 708de71d7c634c51ade4443f7a8590dca74561ed Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 
fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to 
supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit e19ec39d72c2781f1f2d174094d3acfb4ada7861 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm 
progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task 
list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets * Refactor directory structure for RefCOCO+ and RefCOCOg datasets * Fix error logging in get_eval and parse_score functions * Update .gitignore and mme.yaml * Squashed commit of the following: commit 380494bb2417fae1bcc1535ad8b67df7af667619 Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:43:28 2024 +0800 black commit e46b937aeeed45f5dd574b852459bfb416d165fd Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:42:03 2024 +0800 adapt qwen to sqa, gqa, ai2d, docvqa commit f6f7adae7485defcca27deafb2b19b37733233c6 Author: Li Bo <[email protected]> Date: Thu Feb 1 16:20:27 2024 +0800 [Dataset] fix hallusion benchmark, 
add saving logic inside aggregate function (#35) * add fuyu * Merge commit '708de71d7c634c51ade4443f7a8590dca74561ed' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 708de71d7c634c51ade4443f7a8590dca74561ed Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in 
the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error 
message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit e19ec39d72c2781f1f2d174094d3acfb4ada7861 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 
dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update 
chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit af73a51ca7940095310f725544bd3473b67b412c Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit accfaffdc9ba3002757d1ee167063c7aa6a12394 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '708de71d7c634c51ade4443f7a8590dca74561ed' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 708de71d7c634c51ade4443f7a8590dca74561ed Author: Pu Fanyi <[email protected]> 
Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add 
ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 
e19ec39d72c2781f1f2d174094d3acfb4ada7861 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> 
Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu * Merge commit 'c2050a435b47dfba638b6ba6a1600515a9f61b4c' * Squashed commit of the following: commit 55411a8236a6a4af45c9d3d73349d9308f1b11dd Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit c2050a435b47dfba638b6ba6a1600515a9f61b4c Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 
4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks 
* Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 27ab7369c986607ad08e356e3bd951864c845e22 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 1c11ae4aeecd3305e99f3baaa54d2c5914d6a6b7 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: 
kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add 
qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit c37504a11db9763a0cb65e1cfc9081d8e60aa0fc Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit cb7b75e6f96a9b933557c570bea72a12b7800014 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit 'c2050a435b47dfba638b6ba6a1600515a9f61b4c' * Squashed commit of the following: commit 55411a8236a6a4af45c9d3d73349d9308f1b11dd Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit c2050a435b47dfba638b6ba6a1600515a9f61b4c Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for 
full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update 
ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 27ab7369c986607ad08e356e3bd951864c845e22 Author: Zhang Peiyuan 
<[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 1c11ae4aeecd3305e99f3baaa54d2c5914d6a6b7 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and 
QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove 
scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit c2050a435b47dfba638b6ba6a1600515a9f61b4c Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 
fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to 
supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 27ab7369c986607ad08e356e3bd951864c845e22 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 1c11ae4aeecd3305e99f3baaa54d2c5914d6a6b7 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm 
progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task 
list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets * Refactor directory structure for RefCOCO+ and RefCOCOg datasets * Fix error logging in get_eval and parse_score functions * Update .gitignore and mme.yaml * Squashed commit of the following: commit d887d8a25654322aa62cff6e94b39c262ebc8ae0 Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:43:28 2024 +0800 black commit 96b17d51b831b62da66685444f97188e1af9ad7a Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:42:03 2024 +0800 adapt qwen to sqa, gqa, ai2d, docvqa commit b94afc7866a362feb80b7e9a757a6cf2dbd78aa8 Author: Li Bo <[email protected]> Date: Thu Feb 1 16:20:27 2024 +0800 [Dataset] fix hallusion benchmark, 
add saving logic inside aggregate function (#35) * add fuyu * Merge commit 'c2050a435b47dfba638b6ba6a1600515a9f61b4c' * Squashed commit of the following: commit 55411a8236a6a4af45c9d3d73349d9308f1b11dd Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit c2050a435b47dfba638b6ba6a1600515a9f61b4c Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in 
the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error 
message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 27ab7369c986607ad08e356e3bd951864c845e22 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 1c11ae4aeecd3305e99f3baaa54d2c5914d6a6b7 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 
e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update 
chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit c37504a11db9763a0cb65e1cfc9081d8e60aa0fc Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit cb7b75e6f96a9b933557c570bea72a12b7800014 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit 'c2050a435b47dfba638b6ba6a1600515a9f61b4c' * Squashed commit of the following: commit 55411a8236a6a4af45c9d3d73349d9308f1b11dd Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit c2050a435b47dfba638b6ba6a1600515a9f61b4c Author: Pu Fanyi <[email protected]> 
Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add 
ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 
27ab7369c986607ad08e356e3bd951864c845e22 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 1c11ae4aeecd3305e99f3baaa54d2c5914d6a6b7 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> 
Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu * Merge commit '49e625761a6853595641a0a411c96168490dabad' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 49e625761a6853595641a0a411c96168490dabad Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 
e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks 
* Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit da7a8df0ec859a7e69bf0ace845f00ff3717ac75 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit a853223fa8da0ec1d59040768c896c1526b10dff Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: 
kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add 
qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 6e7cd871ca881e5002bbaa3dd7774d34fce12811 Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit efd3510236c5ca6948d65a7150fd7a5925902f3d Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '49e625761a6853595641a0a411c96168490dabad' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 49e625761a6853595641a0a411c96168490dabad Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for 
full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update 
ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit da7a8df0ec859a7e69bf0ace845f00ff3717ac75 Author: Zhang Peiyuan 
<[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit a853223fa8da0ec1d59040768c896c1526b10dff Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and 
QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove 
scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit 49e625761a6853595641a0a411c96168490dabad Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 
fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to 
supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit da7a8df0ec859a7e69bf0ace845f00ff3717ac75 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit a853223fa8da0ec1d59040768c896c1526b10dff Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 09d42b879158738f5484f31d514c6b400a418551 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit e8110aacf87bb0450db298b0993164765e0a624f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit e811faca3743a9b0c865144145198cc5eea21393 Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * Fix error handling in loading YAML config files * Squashed commit of the following: commit 15f168756d8f92f53dea87548efe606d0d1401b5 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets * Refactor directory structure for RefCOCO+ and RefCOCOg datasets * Fix error logging in get_eval and parse_score functions * Update .gitignore and mme.yaml * Squashed commit of the following: commit 380494bb2417fae1bcc1535ad8b67df7af667619 Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:43:28 2024 +0800 black commit e46b937aeeed45f5dd574b852459bfb416d165fd Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:42:03 2024 +0800 adapt qwen to sqa, gqa, ai2d, docvqa commit 7037fd2991af7afe522d9492878cde4b2699bc43 Author: Li Bo <[email protected]> Date: Thu Feb 1 16:20:27 2024 +0800 [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35) * add fuyu * Merge commit '49e625761a6853595641a0a411c96168490dabad' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 49e625761a6853595641a0a411c96168490dabad Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 6e7cd871ca881e5002bbaa3dd7774d34fce12811 Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit efd3510236c5ca6948d65a7150fd7a5925902f3d Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) --------- Co…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request on Oct 6, 2024
* add fuyu * Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed' * Squashed commit of the following: commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032 Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 6d570ac1d98a03585c8119ccb362e13ab2172fed Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit fbb7aa57856f800d6c18413318830f4bbc6c8157 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit b8ba33c2a349cb5b479e14af1a2d30f15ad53010 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 7dd84f337cf1ce906dfeb92118e6c2998707a79a Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit a781057ad07b0a60c7ef682f864be598b2436b7c Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove
scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit 6d570ac1d98a03585c8119ccb362e13ab2172fed Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 
fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to 
supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit fbb7aa57856f800d6c18413318830f4bbc6c8157 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit b8ba33c2a349cb5b479e14af1a2d30f15ad53010 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm 
progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task 
list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets * Refactor directory structure for RefCOCO+ and RefCOCOg datasets * Fix error logging in get_eval and parse_score functions * Update .gitignore and mme.yaml * Squashed commit of the following: commit 04a4076120c4d337d70992b82bf2b4fa4c700359 Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:43:28 2024 +0800 black commit b3c423a93d944a2621c1fa4192616af048e5b77c Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:42:03 2024 +0800 adapt qwen to sqa, gqa, ai2d, docvqa commit c3b0da62994f646141456b60baaa3ee5713f38fa Author: Li Bo <[email protected]> Date: Thu Feb 1 16:20:27 2024 +0800 [Dataset] fix hallusion benchmark, 
add saving logic inside aggregate function (#35) * add fuyu * Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed' * Squashed commit of the following: commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032 Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 6d570ac1d98a03585c8119ccb362e13ab2172fed Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in 
the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error 
message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit fbb7aa57856f800d6c18413318830f4bbc6c8157 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit b8ba33c2a349cb5b479e14af1a2d30f15ad53010 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 
e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update 
chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 7dd84f337cf1ce906dfeb92118e6c2998707a79a Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit a781057ad07b0a60c7ef682f864be598b2436b7c Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed' * Squashed commit of the following: commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032 Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit 6d570ac1d98a03585c8119ccb362e13ab2172fed Author: Pu Fanyi <[email protected]> 
Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add 
ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 
fbb7aa57856f800d6c18413318830f4bbc6c8157 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit b8ba33c2a349cb5b479e14af1a2d30f15ad53010 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> 
Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu * Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit d8a4f8ef094e37c987863da971cbc51637b92b43 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 
89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks 
* Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: 
kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add 
qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 7eefb7e3bb827b0e784ed0395e4125c535b6eeef Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit 81d7b9fdf3e662405e0ea358900a4c6981cc502f Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit d8a4f8ef094e37c987863da971cbc51637b92b43 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for 
full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update 
ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499 Author: Zhang Peiyuan 
<[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and 
QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove 
scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file commit d8a4f8ef094e37c987863da971cbc51637b92b43 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 
fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to 
supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm 
progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task 
list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update API configuration and file paths * Refactor evaluate_by_chatgpt function in utils.py * Add hallusion_output_vd_model.json to .gitignore * Add timeout to API request * Refactor file path generation and remove unnecessary suffix in log samples output names * Refactor code and add output path handling * Update lmms-eval API and add new models and datasets * Refactor directory structure for RefCOCO+ and RefCOCOg datasets * Fix error logging in get_eval and parse_score functions * Update .gitignore and mme.yaml * Squashed commit of the following: commit 380494bb2417fae1bcc1535ad8b67df7af667619 Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:43:28 2024 +0800 black commit e46b937aeeed45f5dd574b852459bfb416d165fd Author: jzhang38 <[email protected]> Date: Fri Feb 2 13:42:03 2024 +0800 adapt qwen to sqa, gqa, ai2d, docvqa commit ae76855543ee127e79809843378a18aa06d90261 Author: Li Bo <[email protected]> Date: Thu Feb 1 16:20:27 2024 +0800 [Dataset] fix hallusion benchmark, 
add saving logic inside aggregate function (#35) * add fuyu * Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit d8a4f8ef094e37c987863da971cbc51637b92b43 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in 
the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error 
message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit 
dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update 
chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 7eefb7e3bb827b0e784ed0395e4125c535b6eeef Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit 81d7b9fdf3e662405e0ea358900a4c6981cc502f Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43' * Squashed commit of the following: commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit d8a4f8ef094e37c987863da971cbc51637b92b43 Author: Pu Fanyi <[email protected]> 
Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add 
ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 
a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499 Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> 
Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit 992be447a9fdf701fc910177653017e3978bf56d Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit baf78ea27df4dfe5d88bc2abca707e117a4f9661 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit e323545d9f3a5e0f2219618a4b024aea3ff6e353 Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit dbe09071a986c68e6b2b60cbde501da8d498535f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit 844a47e5d49c71e5297decdf7510d8a1a214f934 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 89545d0517eb5891710f2d7191ca7b650723701e Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu * Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836' * Squashed commit of the following: commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit ebe4eb8dffcce06f7be393478d35d76de82a3836 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 
5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks 
* Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0fd45585aecf41e04bb6510cf09c0b829bd0f49d Author: Zhang Peiyuan <[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit f1258892713f588f8d65826f9141e38048f5ff31 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: 
kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add 
qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove scienceqa_img task configuration * eval scienceqa with no images --------- Co-authored-by: Bo Li <[email protected]> Co-authored-by: kcz358 <[email protected]> * Update hb_doc_to_text function to remove unnecessary line break * Add Fuyu model and update OtterHD model * Refactor model response handling and fix image processing bug * Refactor flatten method to support only getting the first element * Add support for specifying timezone in datetime string Update flatten method in OtterHD class Update get_datetime_str function in utils.py * Fix condition for checking wandb_args_dict in __main__.py * Commented out assertions for batch size in Fuyu model * Add warning message for existing output file * Fix batch size issue in OtterHD model * Squashed commit of the following: commit 6a4b81baa42b29457cbaea42043723c2332ad5ba Author: Li Bo <[email protected]> Date: Wed Jan 31 16:00:22 2024 +0800 [Datasets] add hallubench (#34) * Add hallu bench * Fix hall_b gpt eval bugs --------- Co-authored-by: kcz358 <[email protected]> commit fab87047e683d9982ea0f544feb3e2fce4e1fbf4 Author: Li Bo <[email protected]> Date: Wed Jan 31 14:23:15 2024 +0800 [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33) * add fuyu * Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836' * Squashed commit of the following: commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a Author: kcz358 <[email protected]> Date: Tue Jan 30 19:39:57 2024 +0800 Add hallu bench commit ebe4eb8dffcce06f7be393478d35d76de82a3836 Author: Pu Fanyi <[email protected]> Date: Tue Jan 30 14:52:51 2024 +0800 scienceqa for 
full set (#32) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update 
ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration * Update generation kwargs for LMMS tasks * Update lmms_eval MME task configuration and utils * Update generation_kwargs in lmms_eval tasks * Update doc_to_text function in coco and okvqa tasks * Add COCO 2017 version * Update task name in coco_test2017.yaml * Squashed commit of the following: commit 0fd45585aecf41e04bb6510cf09c0b829bd0f49d Author: Zhang Peiyuan 
<[email protected]> Date: Mon Jan 29 22:41:33 2024 +0800 Add/mmmu test (#30) * mmmu_test * black commit f1258892713f588f8d65826f9141e38048f5ff31 Author: Li Bo <[email protected]> Date: Sun Jan 28 22:19:13 2024 +0800 [Dataset Check] dataset check and add wandb logging (#29) * Remove unused code and configuration file * Remove docvqa.yaml and update vizwizvqa.yaml * lint * Add dataset_kwargs to vizwizvqa.yaml * Add dataset_kwargs to vizwizvqa.yaml * textvqa (#27) * Update textvqa.yaml and utils.py * Fix YAML formatting in textvqa.yaml and remove unused files * remove useless matric * add textvqa val & test * Update progress bar description in evaluator.py * Update submission file names in VizWizVQA tasks * Update output path to include log samples suffix * Update submission file paths in OKVQA and VizWizVQA tasks * Refactor llava-in-the-wild.yaml and utils.py * Update metric for llava evaluation * Refactor logging message in Task class * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b' * Fix formatting issues and add progress bar closing statements * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml * Update tqdm progress bar in OtterHD model * Squashed commit of the following: commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and 
QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * Fix error handling in loading YAML config files * Squashed commit of the following: commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8 Author: kcz358 <[email protected]> Date: Sun Jan 28 12:41:40 2024 +0800 Fix key bugs commit eae210c3700a59b7d5cc9de46fcb855f443096aa Author: kcz358 <[email protected]> Date: Sun Jan 28 09:46:19 2024 +0800 Black lint commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae Merge: ab898e4 fb209e4 Author: kcz358 <[email protected]> Date: Sun Jan 28 09:45:31 2024 +0800 Merge branch 'main' into kc/list_tasks_num commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed Author: kcz358 <[email protected]> Date: Sun Jan 28 09:44:23 2024 +0800 Enable list all tasks num commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f Author: kcz358 <[email protected]> Date: Sun Jan 28 09:41:32 2024 +0800 Exclude train yaml file in the task list commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b Author: Zhang Peiyuan <[email protected]> Date: Sun Jan 28 02:04:57 2024 +0800 Add InfoVQA, DocVQA, and QwenVL (#28) * add mmme * black * add model specific prompt and gen kwargs * black * add yaml config to supprot multi-model eval * print table at the end * refactor multi model code * add chartqa * black * add ai2d * black * update chartqa * blacl * update ai2d dataset * black * add qwenvl * add infovqa and docvqa * List task #num sorted * Update prompt messages for image-related tasks * Delete unused task configuration files * Remove coco_train.yaml configuration file * Update task name in mmmu.yaml * Fix error message for missing tasks * Add wandb import and integration --------- Co-authored-by: Fanyi Pu <[email protected]> Co-authored-by: kcz358 <[email protected]> * Remove 
kangreen0210
pushed a commit
to kangreen0210/LIME
that referenced
this pull request
Oct 6, 2024
add_ocrbench
Before you open a pull-request, please check if a similar issue already exists or has been closed before.
When you open a pull-request, please be sure to include the following
Thank you for your contributions!
Add the evaluation of OCRBench.