add_ocrbench #28

Merged: 4 commits merged into EvolvingLMMs-Lab:main on Mar 25, 2024

echo840 (Contributor) commented Mar 24, 2024

Add the evaluation of OCRBench.

Luodian (Contributor) commented Mar 24, 2024

@echo840 Thanks for committing to lmms-eval! We are going to check it soon.

pufanyi requested review from pufanyi and Luodian on March 25, 2024 13:07
Luodian merged commit 9dfb53a into EvolvingLMMs-Lab:main on Mar 25, 2024
1 check passed
pufanyi removed their request for review on March 25, 2024 13:11
Luodian pushed a commit that referenced this pull request Apr 4, 2024
* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to supprot multi-model eval

* print table at the end

* refactor multi model code

* add chartqa

* black

* add ai2d

* black

* update chartqa

* blacl

* update ai2d dataset

* black

* add qwenvl

* add infovqa and docvqa
Luodian added a commit that referenced this pull request Apr 4, 2024
* Remove unused code and configuration file

* Remove docvqa.yaml and update vizwizvqa.yaml

* lint

* Add dataset_kwargs to vizwizvqa.yaml

* Add dataset_kwargs to vizwizvqa.yaml

* textvqa (#27)

* Update textvqa.yaml and utils.py

* Fix YAML formatting in textvqa.yaml and remove unused files

* remove useless matric

* add textvqa val & test

* Update progress bar description in evaluator.py

* Update submission file names in VizWizVQA tasks

* Update output path to include log samples suffix

* Update submission file paths in OKVQA and VizWizVQA tasks

* Refactor llava-in-the-wild.yaml and utils.py

* Update metric for llava evaluation

* Refactor logging message in Task class

* Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

* Fix formatting issues and add progress bar closing statements

* Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

* Update tqdm progress bar in OtterHD model

* Squashed commit of the following:

commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:46:19 2024 +0800

    Black lint

commit 26da729c40008f72ce3f10c932874f120f290e26
Merge: ab898e4 fb209e4
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:45:31 2024 +0800

    Merge branch 'main' into kc/list_tasks_num

commit acbb1a1997c5159709e3b81c3f0292b2f9def109
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:44:23 2024 +0800

    Enable list all tasks num

commit b33ac32f0ff28777204eaaf27a963200024081df
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:41:32 2024 +0800

    Exclude train yaml file in the task list

commit f80465f
Author: Zhang Peiyuan <[email protected]>
Date:   Sun Jan 28 02:04:57 2024 +0800

    Add InfoVQA, DocVQA, and QwenVL (#28)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * blacl

    * update ai2d dataset

    * black

    * add qwenvl

    * add infovqa and docvqa

* Fix error handling in loading YAML config files

* Squashed commit of the following:

commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 12:41:40 2024 +0800

    Fix key bugs

commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:46:19 2024 +0800

    Black lint

commit 26da729c40008f72ce3f10c932874f120f290e26
Merge: ab898e4 fb209e4
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:45:31 2024 +0800

    Merge branch 'main' into kc/list_tasks_num

commit acbb1a1997c5159709e3b81c3f0292b2f9def109
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:44:23 2024 +0800

    Enable list all tasks num

commit b33ac32f0ff28777204eaaf27a963200024081df
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:41:32 2024 +0800

    Exclude train yaml file in the task list

commit f80465f
Author: Zhang Peiyuan <[email protected]>
Date:   Sun Jan 28 02:04:57 2024 +0800

    Add InfoVQA, DocVQA, and QwenVL (#28)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * blacl

    * update ai2d dataset

    * black

    * add qwenvl

    * add infovqa and docvqa

* List task #num sorted

* Update prompt messages for image-related tasks

* Delete unused task configuration files

* Remove coco_train.yaml configuration file

* Update task name in mmmu.yaml

* Fix error message for missing tasks

* Add wandb import and integration

---------

Co-authored-by: Fanyi Pu <[email protected]>
Co-authored-by: kcz358 <[email protected]>
Luodian added a commit that referenced this pull request Apr 4, 2024
* Remove unused code and configuration file

* Remove docvqa.yaml and update vizwizvqa.yaml

* lint

* Add dataset_kwargs to vizwizvqa.yaml

* Add dataset_kwargs to vizwizvqa.yaml

* textvqa (#27)

* Update textvqa.yaml and utils.py

* Fix YAML formatting in textvqa.yaml and remove unused files

* remove useless matric

* add textvqa val & test

* Update progress bar description in evaluator.py

* Update submission file names in VizWizVQA tasks

* Update output path to include log samples suffix

* Update submission file paths in OKVQA and VizWizVQA tasks

* Refactor llava-in-the-wild.yaml and utils.py

* Update metric for llava evaluation

* Refactor logging message in Task class

* Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

* Fix formatting issues and add progress bar closing statements

* Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

* Update tqdm progress bar in OtterHD model

* Squashed commit of the following:

commit eae210c3700a59b7d5cc9de46fcb855f443096aa
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:46:19 2024 +0800

    Black lint

commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
Merge: ab898e4 fb209e4
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:45:31 2024 +0800

    Merge branch 'main' into kc/list_tasks_num

commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:44:23 2024 +0800

    Enable list all tasks num

commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:41:32 2024 +0800

    Exclude train yaml file in the task list

commit 0a403e6
Author: Zhang Peiyuan <[email protected]>
Date:   Sun Jan 28 02:04:57 2024 +0800

    Add InfoVQA, DocVQA, and QwenVL (#28)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * blacl

    * update ai2d dataset

    * black

    * add qwenvl

    * add infovqa and docvqa

* Fix error handling in loading YAML config files

* Squashed commit of the following:

commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 12:41:40 2024 +0800

    Fix key bugs

commit eae210c3700a59b7d5cc9de46fcb855f443096aa
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:46:19 2024 +0800

    Black lint

commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
Merge: ab898e4 fb209e4
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:45:31 2024 +0800

    Merge branch 'main' into kc/list_tasks_num

commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:44:23 2024 +0800

    Enable list all tasks num

commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:41:32 2024 +0800

    Exclude train yaml file in the task list

commit 0a403e6
Author: Zhang Peiyuan <[email protected]>
Date:   Sun Jan 28 02:04:57 2024 +0800

    Add InfoVQA, DocVQA, and QwenVL (#28)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * blacl

    * update ai2d dataset

    * black

    * add qwenvl

    * add infovqa and docvqa

* List task #num sorted

* Update prompt messages for image-related tasks

* Delete unused task configuration files

* Remove coco_train.yaml configuration file

* Update task name in mmmu.yaml

* Fix error message for missing tasks

* Add wandb import and integration

---------

Co-authored-by: Fanyi Pu <[email protected]>
Co-authored-by: kcz358 <[email protected]>
Luodian added a commit that referenced this pull request Apr 4, 2024
* Remove unused code and configuration file

* Remove docvqa.yaml and update vizwizvqa.yaml

* lint

* Add dataset_kwargs to vizwizvqa.yaml

* Add dataset_kwargs to vizwizvqa.yaml

* textvqa (#27)

* Update textvqa.yaml and utils.py

* Fix YAML formatting in textvqa.yaml and remove unused files

* remove useless matric

* add textvqa val & test

* Update progress bar description in evaluator.py

* Update submission file names in VizWizVQA tasks

* Update output path to include log samples suffix

* Update submission file paths in OKVQA and VizWizVQA tasks

* Refactor llava-in-the-wild.yaml and utils.py

* Update metric for llava evaluation

* Refactor logging message in Task class

* Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

* Fix formatting issues and add progress bar closing statements

* Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

* Update tqdm progress bar in OtterHD model

* Squashed commit of the following:

commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:46:19 2024 +0800

    Black lint

commit 26da729c40008f72ce3f10c932874f120f290e26
Merge: ab898e4 fb209e4
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:45:31 2024 +0800

    Merge branch 'main' into kc/list_tasks_num

commit acbb1a1997c5159709e3b81c3f0292b2f9def109
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:44:23 2024 +0800

    Enable list all tasks num

commit b33ac32f0ff28777204eaaf27a963200024081df
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:41:32 2024 +0800

    Exclude train yaml file in the task list

commit f80465f
Author: Zhang Peiyuan <[email protected]>
Date:   Sun Jan 28 02:04:57 2024 +0800

    Add InfoVQA, DocVQA, and QwenVL (#28)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * blacl

    * update ai2d dataset

    * black

    * add qwenvl

    * add infovqa and docvqa

* Fix error handling in loading YAML config files

* Squashed commit of the following:

commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 12:41:40 2024 +0800

    Fix key bugs

commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:46:19 2024 +0800

    Black lint

commit 26da729c40008f72ce3f10c932874f120f290e26
Merge: ab898e4 fb209e4
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:45:31 2024 +0800

    Merge branch 'main' into kc/list_tasks_num

commit acbb1a1997c5159709e3b81c3f0292b2f9def109
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:44:23 2024 +0800

    Enable list all tasks num

commit b33ac32f0ff28777204eaaf27a963200024081df
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:41:32 2024 +0800

    Exclude train yaml file in the task list

commit f80465f
Author: Zhang Peiyuan <[email protected]>
Date:   Sun Jan 28 02:04:57 2024 +0800

    Add InfoVQA, DocVQA, and QwenVL (#28)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * blacl

    * update ai2d dataset

    * black

    * add qwenvl

    * add infovqa and docvqa

* List task #num sorted

* Update prompt messages for image-related tasks

* Delete unused task configuration files

* Remove coco_train.yaml configuration file

* Update task name in mmmu.yaml

* Fix error message for missing tasks

* Add wandb import and integration

* Update generation kwargs for LMMS tasks

* Update lmms_eval MME task configuration and utils

* Update generation_kwargs in lmms_eval tasks

* Update doc_to_text function in coco and okvqa tasks

* Add COCO 2017 version

* Update task name in coco_test2017.yaml

* Squashed commit of the following:

commit 0390783
Author: Zhang Peiyuan <[email protected]>
Date:   Mon Jan 29 22:41:33 2024 +0800

    Add/mmmu test (#30)

    * mmmu_test

    * black

commit 407bc25
Author: Li Bo <[email protected]>
Date:   Sun Jan 28 22:19:13 2024 +0800

    [Dataset Check] dataset check and add wandb logging (#29)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless matric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 26da729c40008f72ce3f10c932874f120f290e26
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit acbb1a1997c5159709e3b81c3f0292b2f9def109
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit b33ac32f0ff28777204eaaf27a963200024081df
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f80465f
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * blacl

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 26da729c40008f72ce3f10c932874f120f290e26
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit acbb1a1997c5159709e3b81c3f0292b2f9def109
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit b33ac32f0ff28777204eaaf27a963200024081df
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f80465f
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * blacl

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    ---------

    Co-authored-by: Fanyi Pu <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Refactor CLI evaluate function and improve error logging

---------

Co-authored-by: Fanyi Pu <[email protected]>
Co-authored-by: kcz358 <[email protected]>
Luodian added a commit that referenced this pull request Apr 4, 2024
* Remove unused code and configuration file

* Remove docvqa.yaml and update vizwizvqa.yaml

* lint

* Add dataset_kwargs to vizwizvqa.yaml

* Add dataset_kwargs to vizwizvqa.yaml

* textvqa (#27)

* Update textvqa.yaml and utils.py

* Fix YAML formatting in textvqa.yaml and remove unused files

* remove useless matric

* add textvqa val & test

* Update progress bar description in evaluator.py

* Update submission file names in VizWizVQA tasks

* Update output path to include log samples suffix

* Update submission file paths in OKVQA and VizWizVQA tasks

* Refactor llava-in-the-wild.yaml and utils.py

* Update metric for llava evaluation

* Refactor logging message in Task class

* Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

* Fix formatting issues and add progress bar closing statements

* Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

* Update tqdm progress bar in OtterHD model

* Squashed commit of the following:

commit eae210c3700a59b7d5cc9de46fcb855f443096aa
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:46:19 2024 +0800

    Black lint

commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
Merge: ab898e4 fb209e4
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:45:31 2024 +0800

    Merge branch 'main' into kc/list_tasks_num

commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:44:23 2024 +0800

    Enable list all tasks num

commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:41:32 2024 +0800

    Exclude train yaml file in the task list

commit 0a403e6
Author: Zhang Peiyuan <[email protected]>
Date:   Sun Jan 28 02:04:57 2024 +0800

    Add InfoVQA, DocVQA, and QwenVL (#28)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * blacl

    * update ai2d dataset

    * black

    * add qwenvl

    * add infovqa and docvqa

* Fix error handling in loading YAML config files

* Squashed commit of the following:

commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 12:41:40 2024 +0800

    Fix key bugs

commit eae210c3700a59b7d5cc9de46fcb855f443096aa
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:46:19 2024 +0800

    Black lint

commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
Merge: ab898e4 fb209e4
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:45:31 2024 +0800

    Merge branch 'main' into kc/list_tasks_num

commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:44:23 2024 +0800

    Enable list all tasks num

commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:41:32 2024 +0800

    Exclude train yaml file in the task list

commit 0a403e6
Author: Zhang Peiyuan <[email protected]>
Date:   Sun Jan 28 02:04:57 2024 +0800

    Add InfoVQA, DocVQA, and QwenVL (#28)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * blacl

    * update ai2d dataset

    * black

    * add qwenvl

    * add infovqa and docvqa

* List task #num sorted

* Update prompt messages for image-related tasks

* Delete unused task configuration files

* Remove coco_train.yaml configuration file

* Update task name in mmmu.yaml

* Fix error message for missing tasks

* Add wandb import and integration

* Update generation kwargs for LMMS tasks

* Update lmms_eval MME task configuration and utils

* Update generation_kwargs in lmms_eval tasks

* Update doc_to_text function in coco and okvqa tasks

* Add COCO 2017 version

* Update task name in coco_test2017.yaml

* Squashed commit of the following:

commit 1e2ae93
Author: Zhang Peiyuan <[email protected]>
Date:   Mon Jan 29 22:41:33 2024 +0800

    Add/mmmu test (#30)

    * mmmu_test

    * black

commit 10bbaf0
Author: Li Bo <[email protected]>
Date:   Sun Jan 28 22:19:13 2024 +0800

    [Dataset Check] dataset check and add wandb logging (#29)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless matric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0a403e6
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * blacl

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0a403e6
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * blacl

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    ---------

    Co-authored-by: Fanyi Pu <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Refactor CLI evaluate function and improve error logging

---------

Co-authored-by: Fanyi Pu <[email protected]>
Co-authored-by: kcz358 <[email protected]>
Luodian added a commit that referenced this pull request Apr 4, 2024
* Remove unused code and configuration file

* Remove docvqa.yaml and update vizwizvqa.yaml

* lint

* Add dataset_kwargs to vizwizvqa.yaml

* Add dataset_kwargs to vizwizvqa.yaml

* textvqa (#27)

* Update textvqa.yaml and utils.py

* Fix YAML formatting in textvqa.yaml and remove unused files

* remove useless matric

* add textvqa val & test

* Update progress bar description in evaluator.py

* Update submission file names in VizWizVQA tasks

* Update output path to include log samples suffix

* Update submission file paths in OKVQA and VizWizVQA tasks

* Refactor llava-in-the-wild.yaml and utils.py

* Update metric for llava evaluation

* Refactor logging message in Task class

* Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

* Fix formatting issues and add progress bar closing statements

* Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

* Update tqdm progress bar in OtterHD model

* Squashed commit of the following:

commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:46:19 2024 +0800

    Black lint

commit 26da729c40008f72ce3f10c932874f120f290e26
Merge: ab898e4 fb209e4
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:45:31 2024 +0800

    Merge branch 'main' into kc/list_tasks_num

commit acbb1a1997c5159709e3b81c3f0292b2f9def109
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:44:23 2024 +0800

    Enable list all tasks num

commit b33ac32f0ff28777204eaaf27a963200024081df
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:41:32 2024 +0800

    Exclude train yaml file in the task list

commit f80465f
Author: Zhang Peiyuan <[email protected]>
Date:   Sun Jan 28 02:04:57 2024 +0800

    Add InfoVQA, DocVQA, and QwenVL (#28)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * blacl

    * update ai2d dataset

    * black

    * add qwenvl

    * add infovqa and docvqa

* Fix error handling in loading YAML config files

* Squashed commit of the following:

commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 12:41:40 2024 +0800

    Fix key bugs

commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:46:19 2024 +0800

    Black lint

commit 26da729c40008f72ce3f10c932874f120f290e26
Merge: ab898e4 fb209e4
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:45:31 2024 +0800

    Merge branch 'main' into kc/list_tasks_num

commit acbb1a1997c5159709e3b81c3f0292b2f9def109
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:44:23 2024 +0800

    Enable list all tasks num

commit b33ac32f0ff28777204eaaf27a963200024081df
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:41:32 2024 +0800

    Exclude train yaml file in the task list

commit f80465f
Author: Zhang Peiyuan <[email protected]>
Date:   Sun Jan 28 02:04:57 2024 +0800

    Add InfoVQA, DocVQA, and QwenVL (#28)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to supprot multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * blacl

    * update ai2d dataset

    * black

    * add qwenvl

    * add infovqa and docvqa

* List task #num sorted

* Update prompt messages for image-related tasks

* Delete unused task configuration files

* Remove coco_train.yaml configuration file

* Update task name in mmmu.yaml

* Fix error message for missing tasks

* Add wandb import and integration

* Update generation kwargs for LMMS tasks

* Update lmms_eval MME task configuration and utils

* Update generation_kwargs in lmms_eval tasks

* Update doc_to_text function in coco and okvqa tasks

* Add COCO 2017 version

* Update task name in coco_test2017.yaml

* Squashed commit of the following:

commit 0390783
Author: Zhang Peiyuan <[email protected]>
Date:   Mon Jan 29 22:41:33 2024 +0800

    Add/mmmu test (#30)

    * mmmu_test

    * black

commit 407bc25
Author: Li Bo <[email protected]>
Date:   Sun Jan 28 22:19:13 2024 +0800

    [Dataset Check] dataset check and add wandb logging (#29)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 26da729c40008f72ce3f10c932874f120f290e26
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit acbb1a1997c5159709e3b81c3f0292b2f9def109
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit b33ac32f0ff28777204eaaf27a963200024081df
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f80465f
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 26da729c40008f72ce3f10c932874f120f290e26
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit acbb1a1997c5159709e3b81c3f0292b2f9def109
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit b33ac32f0ff28777204eaaf27a963200024081df
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f80465f
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    ---------

    Co-authored-by: Fanyi Pu <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Remove scienceqa_img task configuration

* eval scienceqa with no images

---------

Co-authored-by: Bo Li <[email protected]>
Co-authored-by: kcz358 <[email protected]>
Luodian added a commit that referenced this pull request Apr 4, 2024
* Remove unused code and configuration file

* Remove docvqa.yaml and update vizwizvqa.yaml

* lint

* Add dataset_kwargs to vizwizvqa.yaml

* Add dataset_kwargs to vizwizvqa.yaml

* textvqa (#27)

* Update textvqa.yaml and utils.py

* Fix YAML formatting in textvqa.yaml and remove unused files

* remove useless metric

* add textvqa val & test

* Update progress bar description in evaluator.py

* Update submission file names in VizWizVQA tasks

* Update output path to include log samples suffix

* Update submission file paths in OKVQA and VizWizVQA tasks

* Refactor llava-in-the-wild.yaml and utils.py

* Update metric for llava evaluation

* Refactor logging message in Task class

* Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

* Fix formatting issues and add progress bar closing statements

* Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

* Update tqdm progress bar in OtterHD model

* Squashed commit of the following:

commit eae210c3700a59b7d5cc9de46fcb855f443096aa
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:46:19 2024 +0800

    Black lint

commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
Merge: ab898e4 fb209e4
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:45:31 2024 +0800

    Merge branch 'main' into kc/list_tasks_num

commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:44:23 2024 +0800

    Enable list all tasks num

commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:41:32 2024 +0800

    Exclude train yaml file in the task list

commit 0a403e6
Author: Zhang Peiyuan <[email protected]>
Date:   Sun Jan 28 02:04:57 2024 +0800

    Add InfoVQA, DocVQA, and QwenVL (#28)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * black

    * update ai2d dataset

    * black

    * add qwenvl

    * add infovqa and docvqa

* Fix error handling in loading YAML config files

* Squashed commit of the following:

commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 12:41:40 2024 +0800

    Fix key bugs

commit eae210c3700a59b7d5cc9de46fcb855f443096aa
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:46:19 2024 +0800

    Black lint

commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
Merge: ab898e4 fb209e4
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:45:31 2024 +0800

    Merge branch 'main' into kc/list_tasks_num

commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:44:23 2024 +0800

    Enable list all tasks num

commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
Author: kcz358 <[email protected]>
Date:   Sun Jan 28 09:41:32 2024 +0800

    Exclude train yaml file in the task list

commit 0a403e6
Author: Zhang Peiyuan <[email protected]>
Date:   Sun Jan 28 02:04:57 2024 +0800

    Add InfoVQA, DocVQA, and QwenVL (#28)

    * add mmme

    * black

    * add model specific prompt and gen kwargs

    * black

    * add yaml config to support multi-model eval

    * print table at the end

    * refactor multi model code

    * add chartqa

    * black

    * add ai2d

    * black

    * update chartqa

    * black

    * update ai2d dataset

    * black

    * add qwenvl

    * add infovqa and docvqa

* List task #num sorted

* Update prompt messages for image-related tasks

* Delete unused task configuration files

* Remove coco_train.yaml configuration file

* Update task name in mmmu.yaml

* Fix error message for missing tasks

* Add wandb import and integration

* Update generation kwargs for LMMS tasks

* Update lmms_eval MME task configuration and utils

* Update generation_kwargs in lmms_eval tasks

* Update doc_to_text function in coco and okvqa tasks

* Add COCO 2017 version

* Update task name in coco_test2017.yaml

* Squashed commit of the following:

commit 1e2ae93
Author: Zhang Peiyuan <[email protected]>
Date:   Mon Jan 29 22:41:33 2024 +0800

    Add/mmmu test (#30)

    * mmmu_test

    * black

commit 10bbaf0
Author: Li Bo <[email protected]>
Date:   Sun Jan 28 22:19:13 2024 +0800

    [Dataset Check] dataset check and add wandb logging (#29)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0a403e6
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0a403e6
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    ---------

    Co-authored-by: Fanyi Pu <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Remove scienceqa_img task configuration

* eval scienceqa with no images

---------

Co-authored-by: Bo Li <[email protected]>
Co-authored-by: kcz358 <[email protected]>
Luodian added a commit that referenced this pull request Apr 4, 2024
* add fuyu

* Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2'

* Squashed commit of the following:

commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit c7ffa8d
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 26da729c40008f72ce3f10c932874f120f290e26
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit acbb1a1997c5159709e3b81c3f0292b2f9def109
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit b33ac32f0ff28777204eaaf27a963200024081df
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f80465f
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 26da729c40008f72ce3f10c932874f120f290e26
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit acbb1a1997c5159709e3b81c3f0292b2f9def109
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit b33ac32f0ff28777204eaaf27a963200024081df
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f80465f
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 0390783
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 407bc25
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file
Luodian added a commit that referenced this pull request Apr 4, 2024
* add fuyu

* Merge commit '0f183a394426d3bf88884b4e2258ab53406bc705'

* Squashed commit of the following:

commit b81ed2ce4d0e226df7a41bddd82fe1f9d46a27fc
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 0f183a3
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0a403e6
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0a403e6
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 1e2ae93
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 10bbaf0
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file
Luodian added a commit that referenced this pull request Apr 4, 2024
…function (#35)

* add fuyu

* Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2'

* Squashed commit of the following:

commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit c7ffa8d
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 26da729c40008f72ce3f10c932874f120f290e26
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit acbb1a1997c5159709e3b81c3f0292b2f9def109
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit b33ac32f0ff28777204eaaf27a963200024081df
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f80465f
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 26da729c40008f72ce3f10c932874f120f290e26
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit acbb1a1997c5159709e3b81c3f0292b2f9def109
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit b33ac32f0ff28777204eaaf27a963200024081df
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f80465f
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 0390783
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 407bc25
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit d0c8c61
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit f4fd4fd
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit c7ffa8d
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 0390783
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 407bc25
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 26da729c40008f72ce3f10c932874f120f290e26
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit acbb1a1997c5159709e3b81c3f0292b2f9def109
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit b33ac32f0ff28777204eaaf27a963200024081df
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f80465f
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 26da729c40008f72ce3f10c932874f120f290e26
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit acbb1a1997c5159709e3b81c3f0292b2f9def109
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit b33ac32f0ff28777204eaaf27a963200024081df
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f80465f
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit c7ffa8d
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 26da729c40008f72ce3f10c932874f120f290e26
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit acbb1a1997c5159709e3b81c3f0292b2f9def109
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit b33ac32f0ff28777204eaaf27a963200024081df
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f80465f
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 26da729c40008f72ce3f10c932874f120f290e26
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit acbb1a1997c5159709e3b81c3f0292b2f9def109
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit b33ac32f0ff28777204eaaf27a963200024081df
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f80465f
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 0390783
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 407bc25
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets
Luodian added a commit that referenced this pull request Apr 4, 2024
…function (#35)

* add fuyu

* Merge commit '0f183a394426d3bf88884b4e2258ab53406bc705'

* Squashed commit of the following:

commit b81ed2ce4d0e226df7a41bddd82fe1f9d46a27fc
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 0f183a3
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0a403e6
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0a403e6
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 1e2ae93
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 10bbaf0
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 4b604e7
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit 799a6bc
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit '0f183a394426d3bf88884b4e2258ab53406bc705'

    * Squashed commit of the following:

    commit b81ed2ce4d0e226df7a41bddd82fe1f9d46a27fc
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 0f183a3
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 1e2ae93
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 10bbaf0
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0a403e6
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0a403e6
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit 0f183a3
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0a403e6
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0a403e6
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 1e2ae93
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 10bbaf0
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets
Luodian added a commit that referenced this pull request Apr 4, 2024
* add fuyu

* Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2'

* Squashed commit of the following:

commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit c7ffa8dee96e228c6519154d5a00742b35caa3f2
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 26da729c40008f72ce3f10c932874f120f290e26
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit acbb1a1997c5159709e3b81c3f0292b2f9def109
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit b33ac32f0ff28777204eaaf27a963200024081df
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 26da729c40008f72ce3f10c932874f120f290e26
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit acbb1a1997c5159709e3b81c3f0292b2f9def109
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit b33ac32f0ff28777204eaaf27a963200024081df
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 0390783595c41232352599ab78fbe5949615e982
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 407bc2500c162d8949fbaae3d11d522afd2c9f28
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit d0c8c61d9a23686d31c7e014f0c15d802e04ee61
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit f4fd4fd29b45436a96fe65395f0922612f598052
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit c7ffa8dee96e228c6519154d5a00742b35caa3f2
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 0390783595c41232352599ab78fbe5949615e982
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 407bc2500c162d8949fbaae3d11d522afd2c9f28
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 26da729c40008f72ce3f10c932874f120f290e26
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit acbb1a1997c5159709e3b81c3f0292b2f9def109
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit b33ac32f0ff28777204eaaf27a963200024081df
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 26da729c40008f72ce3f10c932874f120f290e26
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit acbb1a1997c5159709e3b81c3f0292b2f9def109
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit b33ac32f0ff28777204eaaf27a963200024081df
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit c7ffa8dee96e228c6519154d5a00742b35caa3f2
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 26da729c40008f72ce3f10c932874f120f290e26
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit acbb1a1997c5159709e3b81c3f0292b2f9def109
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit b33ac32f0ff28777204eaaf27a963200024081df
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 26da729c40008f72ce3f10c932874f120f290e26
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit acbb1a1997c5159709e3b81c3f0292b2f9def109
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit b33ac32f0ff28777204eaaf27a963200024081df
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 0390783595c41232352599ab78fbe5949615e982
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 407bc2500c162d8949fbaae3d11d522afd2c9f28
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit 380494bb2417fae1bcc1535ad8b67df7af667619
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit e46b937aeeed45f5dd574b852459bfb416d165fd
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit 2da8f918c37495b3447b9c24e74234ad0bba8cbf
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit c7ffa8dee96e228c6519154d5a00742b35caa3f2
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 0390783595c41232352599ab78fbe5949615e982
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 407bc2500c162d8949fbaae3d11d522afd2c9f28
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 26da729c40008f72ce3f10c932874f120f290e26
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit acbb1a1997c5159709e3b81c3f0292b2f9def109
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit b33ac32f0ff28777204eaaf27a963200024081df
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 26da729c40008f72ce3f10c932874f120f290e26
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit acbb1a1997c5159709e3b81c3f0292b2f9def109
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit b33ac32f0ff28777204eaaf27a963200024081df
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit d0c8c61d9a23686d31c7e014f0c15d802e04ee61
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit f4fd4fd29b45436a96fe65395f0922612f598052
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2'

        * Squashed commit of the following:

        commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit c7ffa8dee96e228c6519154d5a00742b35caa3f2
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 26da729c40008f72ce3f10c932874f120f290e26
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit acbb1a1997c5159709e3b81c3f0292b2f9def109
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit b33ac32f0ff28777204eaaf27a963200024081df
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 26da729c40008f72ce3f10c932874f120f290e26
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit acbb1a1997c5159709e3b81c3f0292b2f9def109
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit b33ac32f0ff28777204eaaf27a963200024081df
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit 0390783595c41232352599ab78fbe5949615e982
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit 407bc2500c162d8949fbaae3d11d522afd2c9f28
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless metric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 26da729c40008f72ce3f10c932874f120f290e26
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit acbb1a1997c5159709e3b81c3f0292b2f9def109
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit b33ac32f0ff28777204eaaf27a963200024081df
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 26da729c40008f72ce3f10c932874f120f290e26
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit acbb1a1997c5159709e3b81c3f0292b2f9def109
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit b33ac32f0ff28777204eaaf27a963200024081df
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

 …
Luodian added a commit that referenced this pull request Apr 4, 2024
* add fuyu

* Merge commit '0f183a394426d3bf88884b4e2258ab53406bc705'

* Squashed commit of the following:

commit b81ed2ce4d0e226df7a41bddd82fe1f9d46a27fc
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 0f183a394426d3bf88884b4e2258ab53406bc705
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 1e2ae936c90a15d684926e43a38aac86935f38c5
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 10bbaf01c0a4164b6f1d2628367befccf8f39c24
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 4b604e75cfde49df52e4abd90be4876ed9a1b08f
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit 799a6bcb9033656115755c5169f8c342eb927d54
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit '0f183a394426d3bf88884b4e2258ab53406bc705'

    * Squashed commit of the following:

    commit b81ed2ce4d0e226df7a41bddd82fe1f9d46a27fc
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 0f183a394426d3bf88884b4e2258ab53406bc705
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 1e2ae936c90a15d684926e43a38aac86935f38c5
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 10bbaf01c0a4164b6f1d2628367befccf8f39c24
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit 0f183a394426d3bf88884b4e2258ab53406bc705
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 1e2ae936c90a15d684926e43a38aac86935f38c5
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 10bbaf01c0a4164b6f1d2628367befccf8f39c24
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit 3d44977c9254d1ee5254b2ca24c8cc54984e84b0
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit a38ffeb692fbeb9deebe20f65b0f3e041823e695
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit e24607fd5725aabb7f6db5fa457b5e6a5123c199
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit '0f183a394426d3bf88884b4e2258ab53406bc705'

    * Squashed commit of the following:

    commit b81ed2ce4d0e226df7a41bddd82fe1f9d46a27fc
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 0f183a394426d3bf88884b4e2258ab53406bc705
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 1e2ae936c90a15d684926e43a38aac86935f38c5
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 10bbaf01c0a4164b6f1d2628367befccf8f39c24
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit 4b604e75cfde49df52e4abd90be4876ed9a1b08f
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit 799a6bcb9033656115755c5169f8c342eb927d54
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit '0f183a394426d3bf88884b4e2258ab53406bc705'

        * Squashed commit of the following:

        commit b81ed2ce4d0e226df7a41bddd82fe1f9d46a27fc
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit 0f183a394426d3bf88884b4e2258ab53406bc705
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit 1e2ae936c90a15d684926e43a38aac86935f38c5
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit 10bbaf01c0a4164b6f1d2628367befccf8f39c24
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless metric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

 …
Luodian added a commit that referenced this pull request Apr 4, 2024
* add fuyu

* Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2'

* Squashed commit of the following:

commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit c7ffa8dee96e228c6519154d5a00742b35caa3f2
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 26da729c40008f72ce3f10c932874f120f290e26
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit acbb1a1997c5159709e3b81c3f0292b2f9def109
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit b33ac32f0ff28777204eaaf27a963200024081df
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 26da729c40008f72ce3f10c932874f120f290e26
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit acbb1a1997c5159709e3b81c3f0292b2f9def109
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit b33ac32f0ff28777204eaaf27a963200024081df
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 0390783595c41232352599ab78fbe5949615e982
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 407bc2500c162d8949fbaae3d11d522afd2c9f28
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit d0c8c61d9a23686d31c7e014f0c15d802e04ee61
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit f4fd4fd29b45436a96fe65395f0922612f598052
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit c7ffa8dee96e228c6519154d5a00742b35caa3f2
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 0390783595c41232352599ab78fbe5949615e982
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 407bc2500c162d8949fbaae3d11d522afd2c9f28
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 26da729c40008f72ce3f10c932874f120f290e26
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit acbb1a1997c5159709e3b81c3f0292b2f9def109
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit b33ac32f0ff28777204eaaf27a963200024081df
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 26da729c40008f72ce3f10c932874f120f290e26
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit acbb1a1997c5159709e3b81c3f0292b2f9def109
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit b33ac32f0ff28777204eaaf27a963200024081df
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit c7ffa8dee96e228c6519154d5a00742b35caa3f2
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 26da729c40008f72ce3f10c932874f120f290e26
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit acbb1a1997c5159709e3b81c3f0292b2f9def109
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit b33ac32f0ff28777204eaaf27a963200024081df
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 26da729c40008f72ce3f10c932874f120f290e26
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit acbb1a1997c5159709e3b81c3f0292b2f9def109
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit b33ac32f0ff28777204eaaf27a963200024081df
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 0390783595c41232352599ab78fbe5949615e982
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 407bc2500c162d8949fbaae3d11d522afd2c9f28
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit 380494bb2417fae1bcc1535ad8b67df7af667619
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit e46b937aeeed45f5dd574b852459bfb416d165fd
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit 2da8f918c37495b3447b9c24e74234ad0bba8cbf
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit c7ffa8dee96e228c6519154d5a00742b35caa3f2
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 26da729c40008f72ce3f10c932874f120f290e26
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit acbb1a1997c5159709e3b81c3f0292b2f9def109
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit b33ac32f0ff28777204eaaf27a963200024081df
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 0390783595c41232352599ab78fbe5949615e982
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 407bc2500c162d8949fbaae3d11d522afd2c9f28
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 26da729c40008f72ce3f10c932874f120f290e26
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit acbb1a1997c5159709e3b81c3f0292b2f9def109
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit b33ac32f0ff28777204eaaf27a963200024081df
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 26da729c40008f72ce3f10c932874f120f290e26
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit acbb1a1997c5159709e3b81c3f0292b2f9def109
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit b33ac32f0ff28777204eaaf27a963200024081df
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit d0c8c61d9a23686d31c7e014f0c15d802e04ee61
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit f4fd4fd29b45436a96fe65395f0922612f598052
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit 'c7ffa8dee96e228c6519154d5a00742b35caa3f2'

        * Squashed commit of the following:

        commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit c7ffa8dee96e228c6519154d5a00742b35caa3f2
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 26da729c40008f72ce3f10c932874f120f290e26
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit acbb1a1997c5159709e3b81c3f0292b2f9def109
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit b33ac32f0ff28777204eaaf27a963200024081df
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 26da729c40008f72ce3f10c932874f120f290e26
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit acbb1a1997c5159709e3b81c3f0292b2f9def109
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit b33ac32f0ff28777204eaaf27a963200024081df
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit 0390783595c41232352599ab78fbe5949615e982
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit 407bc2500c162d8949fbaae3d11d522afd2c9f28
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless metric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit 'f80465fd0f30781c8c36b46c1d6d7bba751f9e33'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 26da729c40008f72ce3f10c932874f120f290e26
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit acbb1a1997c5159709e3b81c3f0292b2f9def109
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit b33ac32f0ff28777204eaaf27a963200024081df
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit 2df0ce76ef836be1cb8ffbf3c854fe05563647b0
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit af6c7a2b8c2959495dc351e6f6eb2a442efe4e94
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 26da729c40008f72ce3f10c932874f120f290e26
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit acbb1a1997c5159709e3b81c3f0292b2f9def109
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit b33ac32f0ff28777204eaaf27a963200024081df
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit f80465fd0f30781c8c36b46c1d6d7bba751f9e33
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

                Co…
Luodian added a commit that referenced this pull request Apr 4, 2024
* add fuyu

* Merge commit '0f183a394426d3bf88884b4e2258ab53406bc705'

* Squashed commit of the following:

commit b81ed2ce4d0e226df7a41bddd82fe1f9d46a27fc
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 0f183a394426d3bf88884b4e2258ab53406bc705
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 1e2ae936c90a15d684926e43a38aac86935f38c5
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 10bbaf01c0a4164b6f1d2628367befccf8f39c24
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 4b604e75cfde49df52e4abd90be4876ed9a1b08f
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit 799a6bcb9033656115755c5169f8c342eb927d54
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit '0f183a394426d3bf88884b4e2258ab53406bc705'

    * Squashed commit of the following:

    commit b81ed2ce4d0e226df7a41bddd82fe1f9d46a27fc
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 0f183a394426d3bf88884b4e2258ab53406bc705
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 1e2ae936c90a15d684926e43a38aac86935f38c5
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 10bbaf01c0a4164b6f1d2628367befccf8f39c24
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit 0f183a394426d3bf88884b4e2258ab53406bc705
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 1e2ae936c90a15d684926e43a38aac86935f38c5
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 10bbaf01c0a4164b6f1d2628367befccf8f39c24
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit 3d44977c9254d1ee5254b2ca24c8cc54984e84b0
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit a38ffeb692fbeb9deebe20f65b0f3e041823e695
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit e24607fd5725aabb7f6db5fa457b5e6a5123c199
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit '0f183a394426d3bf88884b4e2258ab53406bc705'

    * Squashed commit of the following:

    commit b81ed2ce4d0e226df7a41bddd82fe1f9d46a27fc
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 0f183a394426d3bf88884b4e2258ab53406bc705
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless matric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 1e2ae936c90a15d684926e43a38aac86935f38c5
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 10bbaf01c0a4164b6f1d2628367befccf8f39c24
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless matric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to supprot multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * blacl

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to supprot multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * blacl

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit 4b604e75cfde49df52e4abd90be4876ed9a1b08f
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit 799a6bcb9033656115755c5169f8c342eb927d54
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit '0f183a394426d3bf88884b4e2258ab53406bc705'

        * Squashed commit of the following:

        commit b81ed2ce4d0e226df7a41bddd82fe1f9d46a27fc
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit 0f183a394426d3bf88884b4e2258ab53406bc705
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless matric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to supprot multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * blacl

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to supprot multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * blacl

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit 1e2ae936c90a15d684926e43a38aac86935f38c5
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit 10bbaf01c0a4164b6f1d2628367befccf8f39c24
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless matric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit '0a403e6f5e17c70a50983c83a132edf0fdcd98de'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to supprot multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * blacl

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 0a403e6f5e17c70a50983c83a132edf0fdcd98de
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to supprot multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * blacl

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

                Co…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add mmme

* black

* add model specific prompt and gen kwargs

* black

* add yaml config to support multi-model eval

* print table at the end

* refactor multi model code

* add chartqa

* black

* add ai2d

* black

* update chartqa

* black

* update ai2d dataset

* black

* add qwenvl

* add infovqa and docvqa
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
…lvingLMMs-Lab#33)

* add fuyu

* Merge commit 'c2050a435b47dfba638b6ba6a1600515a9f61b4c'

* Squashed commit of the following:

commit 55411a8236a6a4af45c9d3d73349d9308f1b11dd
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit c2050a4
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless matric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 4b30564
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * blacl

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 4b30564
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 27ab736
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit 1c11ae4
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
…lvingLMMs-Lab#33)

* add fuyu

* Merge commit '49e625761a6853595641a0a411c96168490dabad'

* Squashed commit of the following:

commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 49e6257
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit 09d42b879158738f5484f31d514c6b400a418551
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit e8110aacf87bb0450db298b0993164765e0a624f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit e811fac
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 15f168756d8f92f53dea87548efe606d0d1401b5
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit 09d42b879158738f5484f31d514c6b400a418551
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit e8110aacf87bb0450db298b0993164765e0a624f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit e811fac
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit da7a8df
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit a853223
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811fac
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 15f168756d8f92f53dea87548efe606d0d1401b5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811fac
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
…lvingLMMs-Lab#33)

* add fuyu

* Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed'

* Squashed commit of the following:

commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 6d570ac
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f92c3d6
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f92c3d6
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit fbb7aa5
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit b8ba33c
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
…lvingLMMs-Lab#33)

* add fuyu

* Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43'

* Squashed commit of the following:

commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit d8a4f8e
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 89545d0
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 992be447a9fdf701fc910177653017e3978bf56d
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 89545d0
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit a2b4a2a
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit 47a6675
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
…lvingLMMs-Lab#33)

* add fuyu

* Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836'

* Squashed commit of the following:

commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit ebe4eb8
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5553d10
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5553d10
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 0fd4558
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit f125889
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d10
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d10
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
…function (EvolvingLMMs-Lab#35)

* add fuyu

* Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04'

* Squashed commit of the following:

commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 7b7f636
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit c09b621195878300417315a97efdec25e67dd7f5
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 864a1aba26388276b7e57717b89520fcc77b3f62
    Merge: ab898e4 ad8d9da
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit c0ea54d49cb65b747d7e8fccac75838acabe05db
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit ad8d9da
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit c09b621195878300417315a97efdec25e67dd7f5
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 864a1aba26388276b7e57717b89520fcc77b3f62
    Merge: ab898e4 ad8d9da
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit c0ea54d49cb65b747d7e8fccac75838acabe05db
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit ad8d9da
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 6ee856b
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit 4a1183c
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 7664839
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (EvolvingLMMs-Lab#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit 05487a4
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (EvolvingLMMs-Lab#33)

    * add fuyu

    * Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 7b7f636
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (EvolvingLMMs-Lab#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 6ee856b
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (EvolvingLMMs-Lab#30)

            * mmmu_test

            * black

        commit 4a1183c
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (EvolvingLMMs-Lab#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit c09b621195878300417315a97efdec25e67dd7f5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 864a1aba26388276b7e57717b89520fcc77b3f62
            Merge: ab898e4 ad8d9da
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit c0ea54d49cb65b747d7e8fccac75838acabe05db
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit ad8d9da
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit c09b621195878300417315a97efdec25e67dd7f5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 864a1aba26388276b7e57717b89520fcc77b3f62
            Merge: ab898e4 ad8d9da
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit c0ea54d49cb65b747d7e8fccac75838acabe05db
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit ad8d9da
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit 7b7f636
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit c09b621195878300417315a97efdec25e67dd7f5
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 864a1aba26388276b7e57717b89520fcc77b3f62
    Merge: ab898e4 ad8d9da
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit c0ea54d49cb65b747d7e8fccac75838acabe05db
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit ad8d9da
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit c09b621195878300417315a97efdec25e67dd7f5
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 864a1aba26388276b7e57717b89520fcc77b3f62
    Merge: ab898e4 ad8d9da
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit c0ea54d49cb65b747d7e8fccac75838acabe05db
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit ad8d9da
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 6ee856b
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit 4a1183c
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
…function (EvolvingLMMs-Lab#35)

* add fuyu

* Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29'

* Squashed commit of the following:

commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 1c9c7f9
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0dc9a47
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0dc9a47
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 9d06741
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit 0d4e69f
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 2b01738
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (EvolvingLMMs-Lab#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit 2f61ad5
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (EvolvingLMMs-Lab#33)

    * add fuyu

    * Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29'

    * Squashed commit of the following:

    commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 1c9c7f9
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (EvolvingLMMs-Lab#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 9d06741
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (EvolvingLMMs-Lab#30)

            * mmmu_test

            * black

        commit 0d4e69f
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (EvolvingLMMs-Lab#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0dc9a47
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0dc9a47
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit 1c9c7f9
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0dc9a47
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0dc9a47
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 9d06741
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit 0d4e69f
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
…function (EvolvingLMMs-Lab#35)

* add fuyu

* Merge commit '708de71d7c634c51ade4443f7a8590dca74561ed'

* Squashed commit of the following:

commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 708de71
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5fb3e5d
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 992be447a9fdf701fc910177653017e3978bf56d
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5fb3e5d
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit e19ec39
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit 1c5dbd5
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit af73a51
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (EvolvingLMMs-Lab#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit accfaff
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (EvolvingLMMs-Lab#33)

    * add fuyu

    * Merge commit '708de71d7c634c51ade4443f7a8590dca74561ed'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 708de71
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (EvolvingLMMs-Lab#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless matric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit e19ec39
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (EvolvingLMMs-Lab#30)

            * mmmu_test

            * black

        commit 1c5dbd5
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (EvolvingLMMs-Lab#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless matric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5fb3e5d
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to supprot multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * blacl

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 992be447a9fdf701fc910177653017e3978bf56d
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5fb3e5d
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to supprot multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * blacl

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit 708de71
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless matric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5fb3e5d
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * blacl

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 992be447a9fdf701fc910177653017e3978bf56d
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5fb3e5d
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * blacl

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit e19ec39
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit 1c5dbd5
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless matric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
…function (EvolvingLMMs-Lab#35)

* add fuyu

* Merge commit 'c2050a435b47dfba638b6ba6a1600515a9f61b4c'

* Squashed commit of the following:

commit 55411a8236a6a4af45c9d3d73349d9308f1b11dd
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit c2050a4
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless matric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 4b30564
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * blacl

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 4b30564
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * blacl

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 27ab736
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit 1c11ae4
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit c37504a
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (EvolvingLMMs-Lab#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit cb7b75e
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (EvolvingLMMs-Lab#33)

    * add fuyu

    * Merge commit 'c2050a435b47dfba638b6ba6a1600515a9f61b4c'

    * Squashed commit of the following:

    commit 55411a8236a6a4af45c9d3d73349d9308f1b11dd
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit c2050a4
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (EvolvingLMMs-Lab#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 27ab736
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (EvolvingLMMs-Lab#30)

            * mmmu_test

            * black

        commit 1c11ae4
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (EvolvingLMMs-Lab#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 4b30564
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 4b30564
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit c2050a4
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 4b30564
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 4b30564
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 27ab736
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit 1c11ae4
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
…function (EvolvingLMMs-Lab#35)

* add fuyu

* Merge commit '49e625761a6853595641a0a411c96168490dabad'

* Squashed commit of the following:

commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 49e6257
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit 09d42b879158738f5484f31d514c6b400a418551
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit e8110aacf87bb0450db298b0993164765e0a624f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit e811fac
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 15f168756d8f92f53dea87548efe606d0d1401b5
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit 09d42b879158738f5484f31d514c6b400a418551
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit e8110aacf87bb0450db298b0993164765e0a624f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit e811fac
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit da7a8df
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit a853223
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811fac
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 15f168756d8f92f53dea87548efe606d0d1401b5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811fac
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 6e7cd87
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (EvolvingLMMs-Lab#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit efd3510
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (EvolvingLMMs-Lab#33)

    * add fuyu

    * Merge commit '49e625761a6853595641a0a411c96168490dabad'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 49e6257
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (EvolvingLMMs-Lab#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811fac
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 15f168756d8f92f53dea87548efe606d0d1401b5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811fac
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit da7a8df
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (EvolvingLMMs-Lab#30)

            * mmmu_test

            * black

        commit a853223
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (EvolvingLMMs-Lab#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit 09d42b879158738f5484f31d514c6b400a418551
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit e8110aacf87bb0450db298b0993164765e0a624f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit e811fac
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 15f168756d8f92f53dea87548efe606d0d1401b5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit 09d42b879158738f5484f31d514c6b400a418551
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit e8110aacf87bb0450db298b0993164765e0a624f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit e811fac
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit 49e6257
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit 09d42b879158738f5484f31d514c6b400a418551
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit e8110aacf87bb0450db298b0993164765e0a624f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit e811fac
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 15f168756d8f92f53dea87548efe606d0d1401b5
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit 09d42b879158738f5484f31d514c6b400a418551
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit e8110aacf87bb0450db298b0993164765e0a624f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit e811fac
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit da7a8df
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit a853223
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811fac
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 15f168756d8f92f53dea87548efe606d0d1401b5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811fac
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
…function (EvolvingLMMs-Lab#35)

* add fuyu

* Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed'

* Squashed commit of the following:

commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 6d570ac
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f92c3d6
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f92c3d6
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit fbb7aa5
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit b8ba33c
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 7dd84f3
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (EvolvingLMMs-Lab#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit a781057
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (EvolvingLMMs-Lab#33)

    * add fuyu

    * Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed'

    * Squashed commit of the following:

    commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 6d570ac
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (EvolvingLMMs-Lab#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit fbb7aa5
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (EvolvingLMMs-Lab#30)

            * mmmu_test

            * black

        commit b8ba33c
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (EvolvingLMMs-Lab#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f92c3d6
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f92c3d6
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit 6d570ac
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f92c3d6
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f92c3d6
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit fbb7aa5
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit b8ba33c
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
…function (EvolvingLMMs-Lab#35)

* add fuyu

* Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43'

* Squashed commit of the following:

commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit d8a4f8e
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 89545d0
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 992be447a9fdf701fc910177653017e3978bf56d
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 89545d0
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit a2b4a2a
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit 47a6675
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 7eefb7e
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (EvolvingLMMs-Lab#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit 81d7b9f
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (EvolvingLMMs-Lab#33)

    * add fuyu

    * Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit d8a4f8e
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (EvolvingLMMs-Lab#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit a2b4a2a
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (EvolvingLMMs-Lab#30)

            * mmmu_test

            * black

        commit 47a6675
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (EvolvingLMMs-Lab#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 89545d0
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 992be447a9fdf701fc910177653017e3978bf56d
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 89545d0
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit d8a4f8e
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 89545d0
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 992be447a9fdf701fc910177653017e3978bf56d
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 89545d0
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit a2b4a2a
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit 47a6675
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
…function (EvolvingLMMs-Lab#35)

* add fuyu

* Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836'

* Squashed commit of the following:

commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit ebe4eb8
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5553d10
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5553d10
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 0fd4558
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit f125889
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d10
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d10
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 6a4b81b
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (EvolvingLMMs-Lab#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit fab8704
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (EvolvingLMMs-Lab#33)

    * add fuyu

    * Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836'

    * Squashed commit of the following:

    commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit ebe4eb8
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (EvolvingLMMs-Lab#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d10
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d10
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 0fd4558
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (EvolvingLMMs-Lab#30)

            * mmmu_test

            * black

        commit f125889
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (EvolvingLMMs-Lab#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5553d10
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5553d10
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit ebe4eb8
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (EvolvingLMMs-Lab#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (EvolvingLMMs-Lab#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5553d10
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5553d10
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * blacl

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 0fd4558
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (EvolvingLMMs-Lab#30)

        * mmmu_test

        * black

    commit f125889
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (EvolvingLMMs-Lab#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (EvolvingLMMs-Lab#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless matric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d10
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d10
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (EvolvingLMMs-Lab#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu

* Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04'

* Squashed commit of the following:

commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless matric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit c09b621195878300417315a97efdec25e67dd7f5
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 864a1aba26388276b7e57717b89520fcc77b3f62
    Merge: ab898e4 ad8d9da
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit c0ea54d49cb65b747d7e8fccac75838acabe05db
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * blacl

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit c09b621195878300417315a97efdec25e67dd7f5
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 864a1aba26388276b7e57717b89520fcc77b3f62
    Merge: ab898e4 ad8d9da
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit c0ea54d49cb65b747d7e8fccac75838acabe05db
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * blacl

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 6ee856b61bbb0156dd72d454430cd01a246b5e61
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 4a1183c563835c366ea54a28e1a5761a193b6704
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless matric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 7664839d1765e09b06e6cf59c12cb895ef71c40e
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit 05487a4e1f1dd1ab20d087399a47502716929a9b
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless matric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 6ee856b61bbb0156dd72d454430cd01a246b5e61
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 4a1183c563835c366ea54a28e1a5761a193b6704
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless matric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit c09b621195878300417315a97efdec25e67dd7f5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 864a1aba26388276b7e57717b89520fcc77b3f62
            Merge: ab898e4 ad8d9da
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit c0ea54d49cb65b747d7e8fccac75838acabe05db
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to supprot multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * blacl

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit c09b621195878300417315a97efdec25e67dd7f5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 864a1aba26388276b7e57717b89520fcc77b3f62
            Merge: ab898e4 ad8d9da
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit c0ea54d49cb65b747d7e8fccac75838acabe05db
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to supprot multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * blacl

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit c09b621195878300417315a97efdec25e67dd7f5
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 864a1aba26388276b7e57717b89520fcc77b3f62
    Merge: ab898e4 ad8d9da
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit c0ea54d49cb65b747d7e8fccac75838acabe05db
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit c09b621195878300417315a97efdec25e67dd7f5
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 864a1aba26388276b7e57717b89520fcc77b3f62
    Merge: ab898e4 ad8d9da
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit c0ea54d49cb65b747d7e8fccac75838acabe05db
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 6ee856b61bbb0156dd72d454430cd01a246b5e61
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 4a1183c563835c366ea54a28e1a5761a193b6704
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit 380494bb2417fae1bcc1535ad8b67df7af667619
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit e46b937aeeed45f5dd574b852459bfb416d165fd
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit 1f8780df5e89ee50f349361bb5ea7351a73e0c19
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 6ee856b61bbb0156dd72d454430cd01a246b5e61
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 4a1183c563835c366ea54a28e1a5761a193b6704
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit c09b621195878300417315a97efdec25e67dd7f5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 864a1aba26388276b7e57717b89520fcc77b3f62
            Merge: ab898e4 ad8d9da
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit c0ea54d49cb65b747d7e8fccac75838acabe05db
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit c09b621195878300417315a97efdec25e67dd7f5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 864a1aba26388276b7e57717b89520fcc77b3f62
            Merge: ab898e4 ad8d9da
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit c0ea54d49cb65b747d7e8fccac75838acabe05db
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit 7664839d1765e09b06e6cf59c12cb895ef71c40e
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit 05487a4e1f1dd1ab20d087399a47502716929a9b
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04'

        * Squashed commit of the following:

        commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit c09b621195878300417315a97efdec25e67dd7f5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 864a1aba26388276b7e57717b89520fcc77b3f62
            Merge: ab898e4 ad8d9da
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit c0ea54d49cb65b747d7e8fccac75838acabe05db
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit c09b621195878300417315a97efdec25e67dd7f5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 864a1aba26388276b7e57717b89520fcc77b3f62
            Merge: ab898e4 ad8d9da
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit c0ea54d49cb65b747d7e8fccac75838acabe05db
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit 6ee856b61bbb0156dd72d454430cd01a246b5e61
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit 4a1183c563835c366ea54a28e1a5761a193b6704
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless metric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit c09b621195878300417315a97efdec25e67dd7f5
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 864a1aba26388276b7e57717b89520fcc77b3f62
                Merge: ab898e4 ad8d9da
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit c0ea54d49cb65b747d7e8fccac75838acabe05db
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit c09b621195878300417315a97efdec25e67dd7f5
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 864a1aba26388276b7e57717b89520fcc77b3f62
                Merge: ab898e4 ad8d9da
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit c0ea54d49cb65b747d7e8fccac75838acabe05db
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

 …
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu

* Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29'

* Squashed commit of the following:

commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0dc9a47afe9a61214f11053dae5641716052f30f
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0dc9a47afe9a61214f11053dae5641716052f30f
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 9d06741f31439e6ac34764612664467239b63253
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 0d4e69f54d996672ab0471531837004f80ba9b10
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47afe9a61214f11053dae5641716052f30f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47afe9a61214f11053dae5641716052f30f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 2b01738ba36ee632712135d38f45ea40f1c1323a
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit 2f61ad5c3da7411eccda597afadcb64d573c5193
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29'

    * Squashed commit of the following:

    commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47afe9a61214f11053dae5641716052f30f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47afe9a61214f11053dae5641716052f30f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 9d06741f31439e6ac34764612664467239b63253
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 0d4e69f54d996672ab0471531837004f80ba9b10
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0dc9a47afe9a61214f11053dae5641716052f30f
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0dc9a47afe9a61214f11053dae5641716052f30f
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0dc9a47afe9a61214f11053dae5641716052f30f
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0dc9a47afe9a61214f11053dae5641716052f30f
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 9d06741f31439e6ac34764612664467239b63253
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 0d4e69f54d996672ab0471531837004f80ba9b10
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47afe9a61214f11053dae5641716052f30f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47afe9a61214f11053dae5641716052f30f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit 04a4076120c4d337d70992b82bf2b4fa4c700359
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit b3c423a93d944a2621c1fa4192616af048e5b77c
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit 1c5354e09283b03f1c0068d39b82f8bfa73d4184
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29'

    * Squashed commit of the following:

    commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47afe9a61214f11053dae5641716052f30f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47afe9a61214f11053dae5641716052f30f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 9d06741f31439e6ac34764612664467239b63253
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 0d4e69f54d996672ab0471531837004f80ba9b10
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0dc9a47afe9a61214f11053dae5641716052f30f
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0dc9a47afe9a61214f11053dae5641716052f30f
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit 2b01738ba36ee632712135d38f45ea40f1c1323a
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit 2f61ad5c3da7411eccda597afadcb64d573c5193
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29'

        * Squashed commit of the following:

        commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0dc9a47afe9a61214f11053dae5641716052f30f
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0dc9a47afe9a61214f11053dae5641716052f30f
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit 9d06741f31439e6ac34764612664467239b63253
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit 0d4e69f54d996672ab0471531837004f80ba9b10
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless metric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 0dc9a47afe9a61214f11053dae5641716052f30f
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 0dc9a47afe9a61214f11053dae5641716052f30f
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

 …
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu

* Merge commit '708de71d7c634c51ade4443f7a8590dca74561ed'

* Squashed commit of the following:

commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 708de71d7c634c51ade4443f7a8590dca74561ed
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 992be447a9fdf701fc910177653017e3978bf56d
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit e19ec39d72c2781f1f2d174094d3acfb4ada7861
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit af73a51ca7940095310f725544bd3473b67b412c
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit accfaffdc9ba3002757d1ee167063c7aa6a12394
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit '708de71d7c634c51ade4443f7a8590dca74561ed'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 708de71d7c634c51ade4443f7a8590dca74561ed
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit e19ec39d72c2781f1f2d174094d3acfb4ada7861
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 992be447a9fdf701fc910177653017e3978bf56d
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit 708de71d7c634c51ade4443f7a8590dca74561ed
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 992be447a9fdf701fc910177653017e3978bf56d
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit e19ec39d72c2781f1f2d174094d3acfb4ada7861
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit 380494bb2417fae1bcc1535ad8b67df7af667619
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit e46b937aeeed45f5dd574b852459bfb416d165fd
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit f6f7adae7485defcca27deafb2b19b37733233c6
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit '708de71d7c634c51ade4443f7a8590dca74561ed'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 708de71d7c634c51ade4443f7a8590dca74561ed
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit e19ec39d72c2781f1f2d174094d3acfb4ada7861
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 992be447a9fdf701fc910177653017e3978bf56d
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit af73a51ca7940095310f725544bd3473b67b412c
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit accfaffdc9ba3002757d1ee167063c7aa6a12394
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit '708de71d7c634c51ade4443f7a8590dca74561ed'

        * Squashed commit of the following:

        commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit 708de71d7c634c51ade4443f7a8590dca74561ed
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 992be447a9fdf701fc910177653017e3978bf56d
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit e19ec39d72c2781f1f2d174094d3acfb4ada7861
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless metric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit dbe09071a986c68e6b2b60cbde501da8d498535f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit 844a47e5d49c71e5297decdf7510d8a1a214f934
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit 992be447a9fdf701fc910177653017e3978bf56d
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit dbe09071a986c68e6b2b60cbde501da8d498535f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit 844a47e5d49c71e5297decdf7510d8a1a214f934
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

 …
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu

* Merge commit 'c2050a435b47dfba638b6ba6a1600515a9f61b4c'

* Squashed commit of the following:

commit 55411a8236a6a4af45c9d3d73349d9308f1b11dd
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit c2050a435b47dfba638b6ba6a1600515a9f61b4c
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 27ab7369c986607ad08e356e3bd951864c845e22
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 1c11ae4aeecd3305e99f3baaa54d2c5914d6a6b7
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit c37504a11db9763a0cb65e1cfc9081d8e60aa0fc
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit cb7b75e6f96a9b933557c570bea72a12b7800014
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit 'c2050a435b47dfba638b6ba6a1600515a9f61b4c'

    * Squashed commit of the following:

    commit 55411a8236a6a4af45c9d3d73349d9308f1b11dd
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit c2050a435b47dfba638b6ba6a1600515a9f61b4c
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 27ab7369c986607ad08e356e3bd951864c845e22
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 1c11ae4aeecd3305e99f3baaa54d2c5914d6a6b7
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit c2050a435b47dfba638b6ba6a1600515a9f61b4c
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 27ab7369c986607ad08e356e3bd951864c845e22
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 1c11ae4aeecd3305e99f3baaa54d2c5914d6a6b7
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit d887d8a25654322aa62cff6e94b39c262ebc8ae0
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit 96b17d51b831b62da66685444f97188e1af9ad7a
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit b94afc7866a362feb80b7e9a757a6cf2dbd78aa8
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit 'c2050a435b47dfba638b6ba6a1600515a9f61b4c'

    * Squashed commit of the following:

    commit 55411a8236a6a4af45c9d3d73349d9308f1b11dd
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit c2050a435b47dfba638b6ba6a1600515a9f61b4c
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 27ab7369c986607ad08e356e3bd951864c845e22
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 1c11ae4aeecd3305e99f3baaa54d2c5914d6a6b7
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit c37504a11db9763a0cb65e1cfc9081d8e60aa0fc
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit cb7b75e6f96a9b933557c570bea72a12b7800014
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit 'c2050a435b47dfba638b6ba6a1600515a9f61b4c'

        * Squashed commit of the following:

        commit 55411a8236a6a4af45c9d3d73349d9308f1b11dd
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit c2050a435b47dfba638b6ba6a1600515a9f61b4c
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit 27ab7369c986607ad08e356e3bd951864c845e22
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit 1c11ae4aeecd3305e99f3baaa54d2c5914d6a6b7
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless metric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

 …
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu

* Merge commit '49e625761a6853595641a0a411c96168490dabad'

* Squashed commit of the following:

commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 49e625761a6853595641a0a411c96168490dabad
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit 09d42b879158738f5484f31d514c6b400a418551
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit e8110aacf87bb0450db298b0993164765e0a624f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit e811faca3743a9b0c865144145198cc5eea21393
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 15f168756d8f92f53dea87548efe606d0d1401b5
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit 09d42b879158738f5484f31d514c6b400a418551
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit e8110aacf87bb0450db298b0993164765e0a624f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit e811faca3743a9b0c865144145198cc5eea21393
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit da7a8df0ec859a7e69bf0ace845f00ff3717ac75
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit a853223fa8da0ec1d59040768c896c1526b10dff
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811faca3743a9b0c865144145198cc5eea21393
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 15f168756d8f92f53dea87548efe606d0d1401b5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811faca3743a9b0c865144145198cc5eea21393
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 6e7cd871ca881e5002bbaa3dd7774d34fce12811
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit efd3510236c5ca6948d65a7150fd7a5925902f3d
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit '49e625761a6853595641a0a411c96168490dabad'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 49e625761a6853595641a0a411c96168490dabad
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811faca3743a9b0c865144145198cc5eea21393
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 15f168756d8f92f53dea87548efe606d0d1401b5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811faca3743a9b0c865144145198cc5eea21393
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit da7a8df0ec859a7e69bf0ace845f00ff3717ac75
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit a853223fa8da0ec1d59040768c896c1526b10dff
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit 09d42b879158738f5484f31d514c6b400a418551
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit e8110aacf87bb0450db298b0993164765e0a624f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit e811faca3743a9b0c865144145198cc5eea21393
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 15f168756d8f92f53dea87548efe606d0d1401b5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit 09d42b879158738f5484f31d514c6b400a418551
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit e8110aacf87bb0450db298b0993164765e0a624f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit e811faca3743a9b0c865144145198cc5eea21393
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit 49e625761a6853595641a0a411c96168490dabad
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit 09d42b879158738f5484f31d514c6b400a418551
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit e8110aacf87bb0450db298b0993164765e0a624f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit e811faca3743a9b0c865144145198cc5eea21393
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 15f168756d8f92f53dea87548efe606d0d1401b5
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit 09d42b879158738f5484f31d514c6b400a418551
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit e8110aacf87bb0450db298b0993164765e0a624f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit e811faca3743a9b0c865144145198cc5eea21393
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit da7a8df0ec859a7e69bf0ace845f00ff3717ac75
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit a853223fa8da0ec1d59040768c896c1526b10dff
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811faca3743a9b0c865144145198cc5eea21393
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 15f168756d8f92f53dea87548efe606d0d1401b5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811faca3743a9b0c865144145198cc5eea21393
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit 380494bb2417fae1bcc1535ad8b67df7af667619
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit e46b937aeeed45f5dd574b852459bfb416d165fd
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit 7037fd2991af7afe522d9492878cde4b2699bc43
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit '49e625761a6853595641a0a411c96168490dabad'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 49e625761a6853595641a0a411c96168490dabad
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811faca3743a9b0c865144145198cc5eea21393
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 15f168756d8f92f53dea87548efe606d0d1401b5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811faca3743a9b0c865144145198cc5eea21393
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit da7a8df0ec859a7e69bf0ace845f00ff3717ac75
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit a853223fa8da0ec1d59040768c896c1526b10dff
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit 09d42b879158738f5484f31d514c6b400a418551
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit e8110aacf87bb0450db298b0993164765e0a624f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit e811faca3743a9b0c865144145198cc5eea21393
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 15f168756d8f92f53dea87548efe606d0d1401b5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit 09d42b879158738f5484f31d514c6b400a418551
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit e8110aacf87bb0450db298b0993164765e0a624f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit e811faca3743a9b0c865144145198cc5eea21393
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit 6e7cd871ca881e5002bbaa3dd7774d34fce12811
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit efd3510236c5ca6948d65a7150fd7a5925902f3d
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit '49e625761a6853595641a0a411c96168490dabad'

        * Squashed commit of the following:

        commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit 49e625761a6853595641a0a411c96168490dabad
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit 09d42b879158738f5484f31d514c6b400a418551
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit e8110aacf87bb0450db298b0993164765e0a624f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit e811faca3743a9b0c865144145198cc5eea21393
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 15f168756d8f92f53dea87548efe606d0d1401b5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit 09d42b879158738f5484f31d514c6b400a418551
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit e8110aacf87bb0450db298b0993164765e0a624f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit e811faca3743a9b0c865144145198cc5eea21393
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit da7a8df0ec859a7e69bf0ace845f00ff3717ac75
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit a853223fa8da0ec1d59040768c896c1526b10dff
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless metric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit 09d42b879158738f5484f31d514c6b400a418551
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit e8110aacf87bb0450db298b0993164765e0a624f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit e811faca3743a9b0c865144145198cc5eea21393
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit 15f168756d8f92f53dea87548efe606d0d1401b5
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit 09d42b879158738f5484f31d514c6b400a418551
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit e8110aacf87bb0450db298b0993164765e0a624f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit e811faca3743a9b0c865144145198cc5eea21393
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

 …
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu

* Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed'

* Squashed commit of the following:

commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 6d570ac1d98a03585c8119ccb362e13ab2172fed
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit fbb7aa57856f800d6c18413318830f4bbc6c8157
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit b8ba33c2a349cb5b479e14af1a2d30f15ad53010
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 7dd84f337cf1ce906dfeb92118e6c2998707a79a
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit a781057ad07b0a60c7ef682f864be598b2436b7c
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed'

    * Squashed commit of the following:

    commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 6d570ac1d98a03585c8119ccb362e13ab2172fed
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit fbb7aa57856f800d6c18413318830f4bbc6c8157
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit b8ba33c2a349cb5b479e14af1a2d30f15ad53010
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit 6d570ac1d98a03585c8119ccb362e13ab2172fed
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit fbb7aa57856f800d6c18413318830f4bbc6c8157
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit b8ba33c2a349cb5b479e14af1a2d30f15ad53010
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit 04a4076120c4d337d70992b82bf2b4fa4c700359
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit b3c423a93d944a2621c1fa4192616af048e5b77c
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit c3b0da62994f646141456b60baaa3ee5713f38fa
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed'

    * Squashed commit of the following:

    commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 6d570ac1d98a03585c8119ccb362e13ab2172fed
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit fbb7aa57856f800d6c18413318830f4bbc6c8157
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit b8ba33c2a349cb5b479e14af1a2d30f15ad53010
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit 7dd84f337cf1ce906dfeb92118e6c2998707a79a
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit a781057ad07b0a60c7ef682f864be598b2436b7c
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed'

        * Squashed commit of the following:

        commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit 6d570ac1d98a03585c8119ccb362e13ab2172fed
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit fbb7aa57856f800d6c18413318830f4bbc6c8157
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit b8ba33c2a349cb5b479e14af1a2d30f15ad53010
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless metric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

 …
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu

* Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43'

* Squashed commit of the following:

commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit d8a4f8ef094e37c987863da971cbc51637b92b43
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 89545d0517eb5891710f2d7191ca7b650723701e
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 992be447a9fdf701fc910177653017e3978bf56d
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 89545d0517eb5891710f2d7191ca7b650723701e
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0517eb5891710f2d7191ca7b650723701e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0517eb5891710f2d7191ca7b650723701e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 7eefb7e3bb827b0e784ed0395e4125c535b6eeef
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit 81d7b9fdf3e662405e0ea358900a4c6981cc502f
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit d8a4f8ef094e37c987863da971cbc51637b92b43
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0517eb5891710f2d7191ca7b650723701e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0517eb5891710f2d7191ca7b650723701e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 89545d0517eb5891710f2d7191ca7b650723701e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 992be447a9fdf701fc910177653017e3978bf56d
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 89545d0517eb5891710f2d7191ca7b650723701e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit d8a4f8ef094e37c987863da971cbc51637b92b43
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 89545d0517eb5891710f2d7191ca7b650723701e
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 992be447a9fdf701fc910177653017e3978bf56d
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 89545d0517eb5891710f2d7191ca7b650723701e
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0517eb5891710f2d7191ca7b650723701e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0517eb5891710f2d7191ca7b650723701e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit 380494bb2417fae1bcc1535ad8b67df7af667619
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit e46b937aeeed45f5dd574b852459bfb416d165fd
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit ae76855543ee127e79809843378a18aa06d90261
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit d8a4f8ef094e37c987863da971cbc51637b92b43
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0517eb5891710f2d7191ca7b650723701e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0517eb5891710f2d7191ca7b650723701e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 89545d0517eb5891710f2d7191ca7b650723701e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 992be447a9fdf701fc910177653017e3978bf56d
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 89545d0517eb5891710f2d7191ca7b650723701e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit 7eefb7e3bb827b0e784ed0395e4125c535b6eeef
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit 81d7b9fdf3e662405e0ea358900a4c6981cc502f
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43'

        * Squashed commit of the following:

        commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit d8a4f8ef094e37c987863da971cbc51637b92b43
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 89545d0517eb5891710f2d7191ca7b650723701e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 992be447a9fdf701fc910177653017e3978bf56d
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 89545d0517eb5891710f2d7191ca7b650723701e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless metric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit dbe09071a986c68e6b2b60cbde501da8d498535f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit 844a47e5d49c71e5297decdf7510d8a1a214f934
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 89545d0517eb5891710f2d7191ca7b650723701e
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit 992be447a9fdf701fc910177653017e3978bf56d
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit dbe09071a986c68e6b2b60cbde501da8d498535f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit 844a47e5d49c71e5297decdf7510d8a1a214f934
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 89545d0517eb5891710f2d7191ca7b650723701e
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

 …
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu

* Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836'

* Squashed commit of the following:

commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit ebe4eb8dffcce06f7be393478d35d76de82a3836
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 0fd45585aecf41e04bb6510cf09c0b829bd0f49d
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit f1258892713f588f8d65826f9141e38048f5ff31
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 6a4b81baa42b29457cbaea42043723c2332ad5ba
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit fab87047e683d9982ea0f544feb3e2fce4e1fbf4
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836'

    * Squashed commit of the following:

    commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit ebe4eb8dffcce06f7be393478d35d76de82a3836
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 0fd45585aecf41e04bb6510cf09c0b829bd0f49d
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit f1258892713f588f8d65826f9141e38048f5ff31
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless matric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to supprot multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * blacl

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to supprot multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * blacl

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit ebe4eb8dffcce06f7be393478d35d76de82a3836
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless matric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * blacl

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * blacl

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 0fd45585aecf41e04bb6510cf09c0b829bd0f49d
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit f1258892713f588f8d65826f9141e38048f5ff31
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless matric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit 6e6fe00bf9d5fcfd351c164285c569e53f38e280
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit 938c7729a9176e459531cbd00bb6f8d69691258b
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit 2412a0072cc8840593c90e5bdeff64aa8f375bdc
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836'

    * Squashed commit of the following:

    commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit ebe4eb8dffcce06f7be393478d35d76de82a3836
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless matric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 0fd45585aecf41e04bb6510cf09c0b829bd0f49d
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit f1258892713f588f8d65826f9141e38048f5ff31
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless matric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to supprot multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * blacl

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to supprot multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * blacl

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit 6a4b81baa42b29457cbaea42043723c2332ad5ba
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit fab87047e683d9982ea0f544feb3e2fce4e1fbf4
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836'

        * Squashed commit of the following:

        commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit ebe4eb8dffcce06f7be393478d35d76de82a3836
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit 0fd45585aecf41e04bb6510cf09c0b829bd0f49d
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit f1258892713f588f8d65826f9141e38048f5ff31
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless metric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

 …
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu

* Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04'

* Squashed commit of the following:

commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit c09b621195878300417315a97efdec25e67dd7f5
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 864a1aba26388276b7e57717b89520fcc77b3f62
    Merge: ab898e4 ad8d9da
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit c0ea54d49cb65b747d7e8fccac75838acabe05db
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit c09b621195878300417315a97efdec25e67dd7f5
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 864a1aba26388276b7e57717b89520fcc77b3f62
    Merge: ab898e4 ad8d9da
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit c0ea54d49cb65b747d7e8fccac75838acabe05db
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 6ee856b61bbb0156dd72d454430cd01a246b5e61
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 4a1183c563835c366ea54a28e1a5761a193b6704
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 7664839d1765e09b06e6cf59c12cb895ef71c40e
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit 05487a4e1f1dd1ab20d087399a47502716929a9b
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 6ee856b61bbb0156dd72d454430cd01a246b5e61
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 4a1183c563835c366ea54a28e1a5761a193b6704
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit c09b621195878300417315a97efdec25e67dd7f5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 864a1aba26388276b7e57717b89520fcc77b3f62
            Merge: ab898e4 ad8d9da
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit c0ea54d49cb65b747d7e8fccac75838acabe05db
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit c09b621195878300417315a97efdec25e67dd7f5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 864a1aba26388276b7e57717b89520fcc77b3f62
            Merge: ab898e4 ad8d9da
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit c0ea54d49cb65b747d7e8fccac75838acabe05db
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit c09b621195878300417315a97efdec25e67dd7f5
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 864a1aba26388276b7e57717b89520fcc77b3f62
    Merge: ab898e4 ad8d9da
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit c0ea54d49cb65b747d7e8fccac75838acabe05db
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit c09b621195878300417315a97efdec25e67dd7f5
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 864a1aba26388276b7e57717b89520fcc77b3f62
    Merge: ab898e4 ad8d9da
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit c0ea54d49cb65b747d7e8fccac75838acabe05db
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 6ee856b61bbb0156dd72d454430cd01a246b5e61
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 4a1183c563835c366ea54a28e1a5761a193b6704
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit 380494bb2417fae1bcc1535ad8b67df7af667619
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit e46b937aeeed45f5dd574b852459bfb416d165fd
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit 1f8780df5e89ee50f349361bb5ea7351a73e0c19
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit c09b621195878300417315a97efdec25e67dd7f5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 864a1aba26388276b7e57717b89520fcc77b3f62
        Merge: ab898e4 ad8d9da
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit c0ea54d49cb65b747d7e8fccac75838acabe05db
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 6ee856b61bbb0156dd72d454430cd01a246b5e61
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 4a1183c563835c366ea54a28e1a5761a193b6704
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit c09b621195878300417315a97efdec25e67dd7f5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 864a1aba26388276b7e57717b89520fcc77b3f62
            Merge: ab898e4 ad8d9da
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit c0ea54d49cb65b747d7e8fccac75838acabe05db
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit c09b621195878300417315a97efdec25e67dd7f5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 864a1aba26388276b7e57717b89520fcc77b3f62
            Merge: ab898e4 ad8d9da
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit c0ea54d49cb65b747d7e8fccac75838acabe05db
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit 7664839d1765e09b06e6cf59c12cb895ef71c40e
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit 05487a4e1f1dd1ab20d087399a47502716929a9b
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit '7b7f6368e8e04cddbd6e7f572f1099b7911cbe04'

        * Squashed commit of the following:

        commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit 7b7f6368e8e04cddbd6e7f572f1099b7911cbe04
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit c09b621195878300417315a97efdec25e67dd7f5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 864a1aba26388276b7e57717b89520fcc77b3f62
            Merge: ab898e4 ad8d9da
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit c0ea54d49cb65b747d7e8fccac75838acabe05db
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit c09b621195878300417315a97efdec25e67dd7f5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 864a1aba26388276b7e57717b89520fcc77b3f62
            Merge: ab898e4 ad8d9da
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit c0ea54d49cb65b747d7e8fccac75838acabe05db
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit 6ee856b61bbb0156dd72d454430cd01a246b5e61
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit 4a1183c563835c366ea54a28e1a5761a193b6704
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless metric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit 'ad8d9da1fb40c446202bf9b0095b02262df2ffc8'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit c09b621195878300417315a97efdec25e67dd7f5
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 864a1aba26388276b7e57717b89520fcc77b3f62
                Merge: ab898e4 ad8d9da
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit c0ea54d49cb65b747d7e8fccac75838acabe05db
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit dbba2fe6447b0dfd4bb89a368f62178f2b253006
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit c09b621195878300417315a97efdec25e67dd7f5
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 864a1aba26388276b7e57717b89520fcc77b3f62
                Merge: ab898e4 ad8d9da
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit ab898e4fd30bf83888125d48b80bc86b01cb5d39
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit c0ea54d49cb65b747d7e8fccac75838acabe05db
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit ad8d9da1fb40c446202bf9b0095b02262df2ffc8
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

                Co…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu

* Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29'

* Squashed commit of the following:

commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0dc9a47afe9a61214f11053dae5641716052f30f
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0dc9a47afe9a61214f11053dae5641716052f30f
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 9d06741f31439e6ac34764612664467239b63253
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 0d4e69f54d996672ab0471531837004f80ba9b10
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47afe9a61214f11053dae5641716052f30f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47afe9a61214f11053dae5641716052f30f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 2b01738ba36ee632712135d38f45ea40f1c1323a
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit 2f61ad5c3da7411eccda597afadcb64d573c5193
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29'

    * Squashed commit of the following:

    commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47afe9a61214f11053dae5641716052f30f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47afe9a61214f11053dae5641716052f30f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 9d06741f31439e6ac34764612664467239b63253
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 0d4e69f54d996672ab0471531837004f80ba9b10
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0dc9a47afe9a61214f11053dae5641716052f30f
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0dc9a47afe9a61214f11053dae5641716052f30f
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0dc9a47afe9a61214f11053dae5641716052f30f
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 0dc9a47afe9a61214f11053dae5641716052f30f
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 9d06741f31439e6ac34764612664467239b63253
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 0d4e69f54d996672ab0471531837004f80ba9b10
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47afe9a61214f11053dae5641716052f30f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47afe9a61214f11053dae5641716052f30f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit 04a4076120c4d337d70992b82bf2b4fa4c700359
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit b3c423a93d944a2621c1fa4192616af048e5b77c
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit 1c5354e09283b03f1c0068d39b82f8bfa73d4184
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29'

    * Squashed commit of the following:

    commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47afe9a61214f11053dae5641716052f30f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 0dc9a47afe9a61214f11053dae5641716052f30f
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 9d06741f31439e6ac34764612664467239b63253
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 0d4e69f54d996672ab0471531837004f80ba9b10
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0dc9a47afe9a61214f11053dae5641716052f30f
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0dc9a47afe9a61214f11053dae5641716052f30f
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit 2b01738ba36ee632712135d38f45ea40f1c1323a
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit 2f61ad5c3da7411eccda597afadcb64d573c5193
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit '1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29'

        * Squashed commit of the following:

        commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit 1c9c7f95a6b03950c05f47216c7dbf4c4d3edd29
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0dc9a47afe9a61214f11053dae5641716052f30f
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 0dc9a47afe9a61214f11053dae5641716052f30f
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit 9d06741f31439e6ac34764612664467239b63253
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit 0d4e69f54d996672ab0471531837004f80ba9b10
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless metric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit '0dc9a47afe9a61214f11053dae5641716052f30f'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 0dc9a47afe9a61214f11053dae5641716052f30f
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 0dc9a47afe9a61214f11053dae5641716052f30f
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

                Co…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu

* Merge commit '708de71d7c634c51ade4443f7a8590dca74561ed'

* Squashed commit of the following:

commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 708de71d7c634c51ade4443f7a8590dca74561ed
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 992be447a9fdf701fc910177653017e3978bf56d
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit e19ec39d72c2781f1f2d174094d3acfb4ada7861
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit af73a51ca7940095310f725544bd3473b67b412c
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit accfaffdc9ba3002757d1ee167063c7aa6a12394
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit '708de71d7c634c51ade4443f7a8590dca74561ed'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 708de71d7c634c51ade4443f7a8590dca74561ed
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit e19ec39d72c2781f1f2d174094d3acfb4ada7861
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 992be447a9fdf701fc910177653017e3978bf56d
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit 708de71d7c634c51ade4443f7a8590dca74561ed
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 992be447a9fdf701fc910177653017e3978bf56d
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit e19ec39d72c2781f1f2d174094d3acfb4ada7861
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit 380494bb2417fae1bcc1535ad8b67df7af667619
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit e46b937aeeed45f5dd574b852459bfb416d165fd
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit f6f7adae7485defcca27deafb2b19b37733233c6
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit '708de71d7c634c51ade4443f7a8590dca74561ed'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 708de71d7c634c51ade4443f7a8590dca74561ed
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit e19ec39d72c2781f1f2d174094d3acfb4ada7861
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 992be447a9fdf701fc910177653017e3978bf56d
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit af73a51ca7940095310f725544bd3473b67b412c
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit accfaffdc9ba3002757d1ee167063c7aa6a12394
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit '708de71d7c634c51ade4443f7a8590dca74561ed'

        * Squashed commit of the following:

        commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit 708de71d7c634c51ade4443f7a8590dca74561ed
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 992be447a9fdf701fc910177653017e3978bf56d
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit e19ec39d72c2781f1f2d174094d3acfb4ada7861
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit 1c5dbd5c7f65394a6395db59e97d148576a3ad20
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless metric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit '5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit dbe09071a986c68e6b2b60cbde501da8d498535f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit 844a47e5d49c71e5297decdf7510d8a1a214f934
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit 992be447a9fdf701fc910177653017e3978bf56d
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit dbe09071a986c68e6b2b60cbde501da8d498535f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit 844a47e5d49c71e5297decdf7510d8a1a214f934
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 5fb3e5d50de23f7f9f7bb10510e21ffb22c02adb
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

                Co…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu

* Merge commit 'c2050a435b47dfba638b6ba6a1600515a9f61b4c'

* Squashed commit of the following:

commit 55411a8236a6a4af45c9d3d73349d9308f1b11dd
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit c2050a435b47dfba638b6ba6a1600515a9f61b4c
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 27ab7369c986607ad08e356e3bd951864c845e22
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 1c11ae4aeecd3305e99f3baaa54d2c5914d6a6b7
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit c37504a11db9763a0cb65e1cfc9081d8e60aa0fc
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit cb7b75e6f96a9b933557c570bea72a12b7800014
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit 'c2050a435b47dfba638b6ba6a1600515a9f61b4c'

    * Squashed commit of the following:

    commit 55411a8236a6a4af45c9d3d73349d9308f1b11dd
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit c2050a435b47dfba638b6ba6a1600515a9f61b4c
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 27ab7369c986607ad08e356e3bd951864c845e22
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 1c11ae4aeecd3305e99f3baaa54d2c5914d6a6b7
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit c2050a435b47dfba638b6ba6a1600515a9f61b4c
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 27ab7369c986607ad08e356e3bd951864c845e22
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 1c11ae4aeecd3305e99f3baaa54d2c5914d6a6b7
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit d887d8a25654322aa62cff6e94b39c262ebc8ae0
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit 96b17d51b831b62da66685444f97188e1af9ad7a
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit b94afc7866a362feb80b7e9a757a6cf2dbd78aa8
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit 'c2050a435b47dfba638b6ba6a1600515a9f61b4c'

    * Squashed commit of the following:

    commit 55411a8236a6a4af45c9d3d73349d9308f1b11dd
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit c2050a435b47dfba638b6ba6a1600515a9f61b4c
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 27ab7369c986607ad08e356e3bd951864c845e22
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 1c11ae4aeecd3305e99f3baaa54d2c5914d6a6b7
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit c37504a11db9763a0cb65e1cfc9081d8e60aa0fc
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit cb7b75e6f96a9b933557c570bea72a12b7800014
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit 'c2050a435b47dfba638b6ba6a1600515a9f61b4c'

        * Squashed commit of the following:

        commit 55411a8236a6a4af45c9d3d73349d9308f1b11dd
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit c2050a435b47dfba638b6ba6a1600515a9f61b4c
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit 27ab7369c986607ad08e356e3bd951864c845e22
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit 1c11ae4aeecd3305e99f3baaa54d2c5914d6a6b7
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless metric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit '4b30564ccba6af8112cd9fedf36a16bb6571b1d9'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 4b30564ccba6af8112cd9fedf36a16bb6571b1d9
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

                Co…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu

* Merge commit '49e625761a6853595641a0a411c96168490dabad'

* Squashed commit of the following:

commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 49e625761a6853595641a0a411c96168490dabad
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit 09d42b879158738f5484f31d514c6b400a418551
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit e8110aacf87bb0450db298b0993164765e0a624f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit e811faca3743a9b0c865144145198cc5eea21393
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * blacl

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 15f168756d8f92f53dea87548efe606d0d1401b5
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit 09d42b879158738f5484f31d514c6b400a418551
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit e8110aacf87bb0450db298b0993164765e0a624f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit e811faca3743a9b0c865144145198cc5eea21393
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * blacl

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit da7a8df0ec859a7e69bf0ace845f00ff3717ac75
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit a853223fa8da0ec1d59040768c896c1526b10dff
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless matric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811faca3743a9b0c865144145198cc5eea21393
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 15f168756d8f92f53dea87548efe606d0d1401b5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811faca3743a9b0c865144145198cc5eea21393
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 6e7cd871ca881e5002bbaa3dd7774d34fce12811
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit efd3510236c5ca6948d65a7150fd7a5925902f3d
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit '49e625761a6853595641a0a411c96168490dabad'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 49e625761a6853595641a0a411c96168490dabad
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless matric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811faca3743a9b0c865144145198cc5eea21393
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 15f168756d8f92f53dea87548efe606d0d1401b5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811faca3743a9b0c865144145198cc5eea21393
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit da7a8df0ec859a7e69bf0ace845f00ff3717ac75
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit a853223fa8da0ec1d59040768c896c1526b10dff
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless matric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit 09d42b879158738f5484f31d514c6b400a418551
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit e8110aacf87bb0450db298b0993164765e0a624f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit e811faca3743a9b0c865144145198cc5eea21393
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to supprot multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * blacl

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 15f168756d8f92f53dea87548efe606d0d1401b5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit 09d42b879158738f5484f31d514c6b400a418551
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit e8110aacf87bb0450db298b0993164765e0a624f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit e811faca3743a9b0c865144145198cc5eea21393
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to supprot multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * blacl

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit 49e625761a6853595641a0a411c96168490dabad
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless matric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit 09d42b879158738f5484f31d514c6b400a418551
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit e8110aacf87bb0450db298b0993164765e0a624f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit e811faca3743a9b0c865144145198cc5eea21393
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * blacl

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 15f168756d8f92f53dea87548efe606d0d1401b5
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit 09d42b879158738f5484f31d514c6b400a418551
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit e8110aacf87bb0450db298b0993164765e0a624f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit e811faca3743a9b0c865144145198cc5eea21393
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to supprot multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * blacl

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit da7a8df0ec859a7e69bf0ace845f00ff3717ac75
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit a853223fa8da0ec1d59040768c896c1526b10dff
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless matric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811faca3743a9b0c865144145198cc5eea21393
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * blacl

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 15f168756d8f92f53dea87548efe606d0d1401b5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811faca3743a9b0c865144145198cc5eea21393
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to supprot multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit 380494bb2417fae1bcc1535ad8b67df7af667619
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit e46b937aeeed45f5dd574b852459bfb416d165fd
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit 7037fd2991af7afe522d9492878cde4b2699bc43
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit '49e625761a6853595641a0a411c96168490dabad'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 49e625761a6853595641a0a411c96168490dabad
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811faca3743a9b0c865144145198cc5eea21393
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 15f168756d8f92f53dea87548efe606d0d1401b5
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit 09d42b879158738f5484f31d514c6b400a418551
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit e8110aacf87bb0450db298b0993164765e0a624f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit e811faca3743a9b0c865144145198cc5eea21393
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit da7a8df0ec859a7e69bf0ace845f00ff3717ac75
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit a853223fa8da0ec1d59040768c896c1526b10dff
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit 09d42b879158738f5484f31d514c6b400a418551
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit e8110aacf87bb0450db298b0993164765e0a624f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit e811faca3743a9b0c865144145198cc5eea21393
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 15f168756d8f92f53dea87548efe606d0d1401b5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit 09d42b879158738f5484f31d514c6b400a418551
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit e8110aacf87bb0450db298b0993164765e0a624f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit e811faca3743a9b0c865144145198cc5eea21393
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit 6e7cd871ca881e5002bbaa3dd7774d34fce12811
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit efd3510236c5ca6948d65a7150fd7a5925902f3d
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit '49e625761a6853595641a0a411c96168490dabad'

        * Squashed commit of the following:

        commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit 49e625761a6853595641a0a411c96168490dabad
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit 09d42b879158738f5484f31d514c6b400a418551
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit e8110aacf87bb0450db298b0993164765e0a624f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit e811faca3743a9b0c865144145198cc5eea21393
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 15f168756d8f92f53dea87548efe606d0d1401b5
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit 09d42b879158738f5484f31d514c6b400a418551
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit e8110aacf87bb0450db298b0993164765e0a624f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit e811faca3743a9b0c865144145198cc5eea21393
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit da7a8df0ec859a7e69bf0ace845f00ff3717ac75
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit a853223fa8da0ec1d59040768c896c1526b10dff
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless metric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit 'e811faca3743a9b0c865144145198cc5eea21393'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit 09d42b879158738f5484f31d514c6b400a418551
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit e8110aacf87bb0450db298b0993164765e0a624f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit e811faca3743a9b0c865144145198cc5eea21393
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit 15f168756d8f92f53dea87548efe606d0d1401b5
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit 290c53c0ea60868d2f0fb31bee1ac8d213b08d36
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 27bc5c84f9d9f2ff56b2adfa69d23894f4027100
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit 09d42b879158738f5484f31d514c6b400a418551
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit e8110aacf87bb0450db298b0993164765e0a624f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit e811faca3743a9b0c865144145198cc5eea21393
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

                Co…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu

* Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed'

* Squashed commit of the following:

commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit 6d570ac1d98a03585c8119ccb362e13ab2172fed
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit fbb7aa57856f800d6c18413318830f4bbc6c8157
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit b8ba33c2a349cb5b479e14af1a2d30f15ad53010
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 7dd84f337cf1ce906dfeb92118e6c2998707a79a
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit a781057ad07b0a60c7ef682f864be598b2436b7c
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed'

    * Squashed commit of the following:

    commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 6d570ac1d98a03585c8119ccb362e13ab2172fed
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit fbb7aa57856f800d6c18413318830f4bbc6c8157
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit b8ba33c2a349cb5b479e14af1a2d30f15ad53010
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit 6d570ac1d98a03585c8119ccb362e13ab2172fed
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit fbb7aa57856f800d6c18413318830f4bbc6c8157
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit b8ba33c2a349cb5b479e14af1a2d30f15ad53010
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit 04a4076120c4d337d70992b82bf2b4fa4c700359
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit b3c423a93d944a2621c1fa4192616af048e5b77c
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit c3b0da62994f646141456b60baaa3ee5713f38fa
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed'

    * Squashed commit of the following:

    commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit 6d570ac1d98a03585c8119ccb362e13ab2172fed
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit fbb7aa57856f800d6c18413318830f4bbc6c8157
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit b8ba33c2a349cb5b479e14af1a2d30f15ad53010
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit 7dd84f337cf1ce906dfeb92118e6c2998707a79a
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit a781057ad07b0a60c7ef682f864be598b2436b7c
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit '6d570ac1d98a03585c8119ccb362e13ab2172fed'

        * Squashed commit of the following:

        commit 09c64b7491cd19d4e6c4a6e1a38254eaa74d0032
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit 6d570ac1d98a03585c8119ccb362e13ab2172fed
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit fbb7aa57856f800d6c18413318830f4bbc6c8157
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit b8ba33c2a349cb5b479e14af1a2d30f15ad53010
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless metric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit 'f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit f92c3d6d10a8b0b7a0b42baa60cb364b99525b4e
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

                Co…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu

* Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43'

* Squashed commit of the following:

commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit d8a4f8ef094e37c987863da971cbc51637b92b43
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 89545d0517eb5891710f2d7191ca7b650723701e
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 992be447a9fdf701fc910177653017e3978bf56d
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 89545d0517eb5891710f2d7191ca7b650723701e
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0517eb5891710f2d7191ca7b650723701e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0517eb5891710f2d7191ca7b650723701e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 7eefb7e3bb827b0e784ed0395e4125c535b6eeef
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit 81d7b9fdf3e662405e0ea358900a4c6981cc502f
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit d8a4f8ef094e37c987863da971cbc51637b92b43
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0517eb5891710f2d7191ca7b650723701e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0517eb5891710f2d7191ca7b650723701e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 89545d0517eb5891710f2d7191ca7b650723701e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 992be447a9fdf701fc910177653017e3978bf56d
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 89545d0517eb5891710f2d7191ca7b650723701e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit d8a4f8ef094e37c987863da971cbc51637b92b43
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless matric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 89545d0517eb5891710f2d7191ca7b650723701e
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit 992be447a9fdf701fc910177653017e3978bf56d
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit dbe09071a986c68e6b2b60cbde501da8d498535f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit 844a47e5d49c71e5297decdf7510d8a1a214f934
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 89545d0517eb5891710f2d7191ca7b650723701e
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0517eb5891710f2d7191ca7b650723701e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0517eb5891710f2d7191ca7b650723701e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit 380494bb2417fae1bcc1535ad8b67df7af667619
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit e46b937aeeed45f5dd574b852459bfb416d165fd
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit ae76855543ee127e79809843378a18aa06d90261
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43'

    * Squashed commit of the following:

    commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit d8a4f8ef094e37c987863da971cbc51637b92b43
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0517eb5891710f2d7191ca7b650723701e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit 992be447a9fdf701fc910177653017e3978bf56d
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit dbe09071a986c68e6b2b60cbde501da8d498535f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit 844a47e5d49c71e5297decdf7510d8a1a214f934
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 89545d0517eb5891710f2d7191ca7b650723701e
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 89545d0517eb5891710f2d7191ca7b650723701e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 992be447a9fdf701fc910177653017e3978bf56d
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 89545d0517eb5891710f2d7191ca7b650723701e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit 7eefb7e3bb827b0e784ed0395e4125c535b6eeef
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit 81d7b9fdf3e662405e0ea358900a4c6981cc502f
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit 'd8a4f8ef094e37c987863da971cbc51637b92b43'

        * Squashed commit of the following:

        commit 96d95b3cb3540cd17bcab31f1a85ad0d04a12f1e
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit d8a4f8ef094e37c987863da971cbc51637b92b43
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 89545d0517eb5891710f2d7191ca7b650723701e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit 992be447a9fdf701fc910177653017e3978bf56d
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit dbe09071a986c68e6b2b60cbde501da8d498535f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit 844a47e5d49c71e5297decdf7510d8a1a214f934
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 89545d0517eb5891710f2d7191ca7b650723701e
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit a2b4a2a27d6f6f712e5214bb3bb55c0a679b9499
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit 47a6675ce97fc0e0732c195258e6c29f3b3ff275
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless metric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit '89545d0517eb5891710f2d7191ca7b650723701e'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit dbe09071a986c68e6b2b60cbde501da8d498535f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit 844a47e5d49c71e5297decdf7510d8a1a214f934
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 89545d0517eb5891710f2d7191ca7b650723701e
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit 992be447a9fdf701fc910177653017e3978bf56d
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit baf78ea27df4dfe5d88bc2abca707e117a4f9661
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit e323545d9f3a5e0f2219618a4b024aea3ff6e353
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit dbe09071a986c68e6b2b60cbde501da8d498535f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit 844a47e5d49c71e5297decdf7510d8a1a214f934
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 89545d0517eb5891710f2d7191ca7b650723701e
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

                Co…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024
* add fuyu

* Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836'

* Squashed commit of the following:

commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a
Author: kcz358 <[email protected]>
Date:   Tue Jan 30 19:39:57 2024 +0800

    Add hallu bench

commit ebe4eb8dffcce06f7be393478d35d76de82a3836
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 0fd45585aecf41e04bb6510cf09c0b829bd0f49d
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit f1258892713f588f8d65826f9141e38048f5ff31
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update hb_doc_to_text function to remove unnecessary line break

* Add Fuyu model and update OtterHD model

* Refactor model response handling and fix image processing bug

* Refactor flatten method to support only getting the first element

* Add support for specifying timezone in datetime string

Update flatten method in OtterHD class

Update get_datetime_str function in utils.py

* Fix condition for checking wandb_args_dict in __main__.py

* Commented out assertions for batch size in Fuyu model

* Add warning message for existing output file

* Fix batch size issue in OtterHD model

* Squashed commit of the following:

commit 6a4b81baa42b29457cbaea42043723c2332ad5ba
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 16:00:22 2024 +0800

    [Datasets] add hallubench (#34)

    * Add hallu bench

    * Fix hall_b gpt eval bugs

    ---------

    Co-authored-by: kcz358 <[email protected]>

commit fab87047e683d9982ea0f544feb3e2fce4e1fbf4
Author: Li Bo <[email protected]>
Date:   Wed Jan 31 14:23:15 2024 +0800

    [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

    * add fuyu

    * Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836'

    * Squashed commit of the following:

    commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit ebe4eb8dffcce06f7be393478d35d76de82a3836
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 0fd45585aecf41e04bb6510cf09c0b829bd0f49d
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit f1258892713f588f8d65826f9141e38048f5ff31
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

commit ebe4eb8dffcce06f7be393478d35d76de82a3836
Author: Pu Fanyi <[email protected]>
Date:   Tue Jan 30 14:52:51 2024 +0800

    scienceqa for full set (#32)

    * Remove unused code and configuration file

    * Remove docvqa.yaml and update vizwizvqa.yaml

    * lint

    * Add dataset_kwargs to vizwizvqa.yaml

    * Add dataset_kwargs to vizwizvqa.yaml

    * textvqa (#27)

    * Update textvqa.yaml and utils.py

    * Fix YAML formatting in textvqa.yaml and remove unused files

    * remove useless metric

    * add textvqa val & test

    * Update progress bar description in evaluator.py

    * Update submission file names in VizWizVQA tasks

    * Update output path to include log samples suffix

    * Update submission file paths in OKVQA and VizWizVQA tasks

    * Refactor llava-in-the-wild.yaml and utils.py

    * Update metric for llava evaluation

    * Refactor logging message in Task class

    * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

    * Fix formatting issues and add progress bar closing statements

    * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

    * Update tqdm progress bar in OtterHD model

    * Squashed commit of the following:

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * Fix error handling in loading YAML config files

    * Squashed commit of the following:

    commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 12:41:40 2024 +0800

        Fix key bugs

    commit eae210c3700a59b7d5cc9de46fcb855f443096aa
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:46:19 2024 +0800

        Black lint

    commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
    Merge: ab898e4 fb209e4
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:45:31 2024 +0800

        Merge branch 'main' into kc/list_tasks_num

    commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:44:23 2024 +0800

        Enable list all tasks num

    commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
    Author: kcz358 <[email protected]>
    Date:   Sun Jan 28 09:41:32 2024 +0800

        Exclude train yaml file in the task list

    commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
    Author: Zhang Peiyuan <[email protected]>
    Date:   Sun Jan 28 02:04:57 2024 +0800

        Add InfoVQA, DocVQA, and QwenVL (#28)

        * add mmme

        * black

        * add model specific prompt and gen kwargs

        * black

        * add yaml config to support multi-model eval

        * print table at the end

        * refactor multi model code

        * add chartqa

        * black

        * add ai2d

        * black

        * update chartqa

        * black

        * update ai2d dataset

        * black

        * add qwenvl

        * add infovqa and docvqa

    * List task #num sorted

    * Update prompt messages for image-related tasks

    * Delete unused task configuration files

    * Remove coco_train.yaml configuration file

    * Update task name in mmmu.yaml

    * Fix error message for missing tasks

    * Add wandb import and integration

    * Update generation kwargs for LMMS tasks

    * Update lmms_eval MME task configuration and utils

    * Update generation_kwargs in lmms_eval tasks

    * Update doc_to_text function in coco and okvqa tasks

    * Add COCO 2017 version

    * Update task name in coco_test2017.yaml

    * Squashed commit of the following:

    commit 0fd45585aecf41e04bb6510cf09c0b829bd0f49d
    Author: Zhang Peiyuan <[email protected]>
    Date:   Mon Jan 29 22:41:33 2024 +0800

        Add/mmmu test (#30)

        * mmmu_test

        * black

    commit f1258892713f588f8d65826f9141e38048f5ff31
    Author: Li Bo <[email protected]>
    Date:   Sun Jan 28 22:19:13 2024 +0800

        [Dataset Check] dataset check and add wandb logging (#29)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        ---------

        Co-authored-by: Fanyi Pu <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Remove scienceqa_img task configuration

    * eval scienceqa with no images

    ---------

    Co-authored-by: Bo Li <[email protected]>
    Co-authored-by: kcz358 <[email protected]>

* Update API configuration and file paths

* Refactor evaluate_by_chatgpt function in utils.py

* Add hallusion_output_vd_model.json to .gitignore

* Add timeout to API request

* Refactor file path generation and remove unnecessary suffix in log samples output names

* Refactor code and add output path handling

* Update lmms-eval API and add new models and datasets

* Refactor directory structure for RefCOCO+ and RefCOCOg datasets

* Fix error logging in get_eval and parse_score functions

* Update .gitignore and mme.yaml

* Squashed commit of the following:

commit 6e6fe00bf9d5fcfd351c164285c569e53f38e280
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:43:28 2024 +0800

    black

commit 938c7729a9176e459531cbd00bb6f8d69691258b
Author: jzhang38 <[email protected]>
Date:   Fri Feb 2 13:42:03 2024 +0800

    adapt qwen to sqa, gqa, ai2d, docvqa

commit 2412a0072cc8840593c90e5bdeff64aa8f375bdc
Author: Li Bo <[email protected]>
Date:   Thu Feb 1 16:20:27 2024 +0800

    [Dataset] fix hallusion benchmark, add saving logic inside aggregate function (#35)

    * add fuyu

    * Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836'

    * Squashed commit of the following:

    commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a
    Author: kcz358 <[email protected]>
    Date:   Tue Jan 30 19:39:57 2024 +0800

        Add hallu bench

    commit ebe4eb8dffcce06f7be393478d35d76de82a3836
    Author: Pu Fanyi <[email protected]>
    Date:   Tue Jan 30 14:52:51 2024 +0800

        scienceqa for full set (#32)

        * Remove unused code and configuration file

        * Remove docvqa.yaml and update vizwizvqa.yaml

        * lint

        * Add dataset_kwargs to vizwizvqa.yaml

        * Add dataset_kwargs to vizwizvqa.yaml

        * textvqa (#27)

        * Update textvqa.yaml and utils.py

        * Fix YAML formatting in textvqa.yaml and remove unused files

        * remove useless metric

        * add textvqa val & test

        * Update progress bar description in evaluator.py

        * Update submission file names in VizWizVQA tasks

        * Update output path to include log samples suffix

        * Update submission file paths in OKVQA and VizWizVQA tasks

        * Refactor llava-in-the-wild.yaml and utils.py

        * Update metric for llava evaluation

        * Refactor logging message in Task class

        * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

        * Fix formatting issues and add progress bar closing statements

        * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

        * Update tqdm progress bar in OtterHD model

        * Squashed commit of the following:

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * Fix error handling in loading YAML config files

        * Squashed commit of the following:

        commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 12:41:40 2024 +0800

            Fix key bugs

        commit eae210c3700a59b7d5cc9de46fcb855f443096aa
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:46:19 2024 +0800

            Black lint

        commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
        Merge: ab898e4 fb209e4
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:45:31 2024 +0800

            Merge branch 'main' into kc/list_tasks_num

        commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:44:23 2024 +0800

            Enable list all tasks num

        commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
        Author: kcz358 <[email protected]>
        Date:   Sun Jan 28 09:41:32 2024 +0800

            Exclude train yaml file in the task list

        commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
        Author: Zhang Peiyuan <[email protected]>
        Date:   Sun Jan 28 02:04:57 2024 +0800

            Add InfoVQA, DocVQA, and QwenVL (#28)

            * add mmme

            * black

            * add model specific prompt and gen kwargs

            * black

            * add yaml config to support multi-model eval

            * print table at the end

            * refactor multi model code

            * add chartqa

            * black

            * add ai2d

            * black

            * update chartqa

            * black

            * update ai2d dataset

            * black

            * add qwenvl

            * add infovqa and docvqa

        * List task #num sorted

        * Update prompt messages for image-related tasks

        * Delete unused task configuration files

        * Remove coco_train.yaml configuration file

        * Update task name in mmmu.yaml

        * Fix error message for missing tasks

        * Add wandb import and integration

        * Update generation kwargs for LMMS tasks

        * Update lmms_eval MME task configuration and utils

        * Update generation_kwargs in lmms_eval tasks

        * Update doc_to_text function in coco and okvqa tasks

        * Add COCO 2017 version

        * Update task name in coco_test2017.yaml

        * Squashed commit of the following:

        commit 0fd45585aecf41e04bb6510cf09c0b829bd0f49d
        Author: Zhang Peiyuan <[email protected]>
        Date:   Mon Jan 29 22:41:33 2024 +0800

            Add/mmmu test (#30)

            * mmmu_test

            * black

        commit f1258892713f588f8d65826f9141e38048f5ff31
        Author: Li Bo <[email protected]>
        Date:   Sun Jan 28 22:19:13 2024 +0800

            [Dataset Check] dataset check and add wandb logging (#29)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            ---------

            Co-authored-by: Fanyi Pu <[email protected]>
            Co-authored-by: kcz358 <[email protected]>

        * Remove scienceqa_img task configuration

        * eval scienceqa with no images

        ---------

        Co-authored-by: Bo Li <[email protected]>
        Co-authored-by: kcz358 <[email protected]>

    * Update hb_doc_to_text function to remove unnecessary line break

    * Add Fuyu model and update OtterHD model

    * Refactor model response handling and fix image processing bug

    * Refactor flatten method to support only getting the first element

    * Add support for specifying timezone in datetime string

    Update flatten method in OtterHD class

    Update get_datetime_str function in utils.py

    * Fix condition for checking wandb_args_dict in __main__.py

    * Commented out assertions for batch size in Fuyu model

    * Add warning message for existing output file

    * Fix batch size issue in OtterHD model

    * Squashed commit of the following:

    commit 6a4b81baa42b29457cbaea42043723c2332ad5ba
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 16:00:22 2024 +0800

        [Datasets] add hallubench (#34)

        * Add hallu bench

        * Fix hall_b gpt eval bugs

        ---------

        Co-authored-by: kcz358 <[email protected]>

    commit fab87047e683d9982ea0f544feb3e2fce4e1fbf4
    Author: Li Bo <[email protected]>
    Date:   Wed Jan 31 14:23:15 2024 +0800

        [Datasets & Models] Fuyu, HalluBench (w/Kaichen, commit 96d95b3) (#33)

        * add fuyu

        * Merge commit 'ebe4eb8dffcce06f7be393478d35d76de82a3836'

        * Squashed commit of the following:

        commit 72ce63c90098fa7a7364f7a1113ce4b3b23b981a
        Author: kcz358 <[email protected]>
        Date:   Tue Jan 30 19:39:57 2024 +0800

            Add hallu bench

        commit ebe4eb8dffcce06f7be393478d35d76de82a3836
        Author: Pu Fanyi <[email protected]>
        Date:   Tue Jan 30 14:52:51 2024 +0800

            scienceqa for full set (#32)

            * Remove unused code and configuration file

            * Remove docvqa.yaml and update vizwizvqa.yaml

            * lint

            * Add dataset_kwargs to vizwizvqa.yaml

            * Add dataset_kwargs to vizwizvqa.yaml

            * textvqa (#27)

            * Update textvqa.yaml and utils.py

            * Fix YAML formatting in textvqa.yaml and remove unused files

            * remove useless metric

            * add textvqa val & test

            * Update progress bar description in evaluator.py

            * Update submission file names in VizWizVQA tasks

            * Update output path to include log samples suffix

            * Update submission file paths in OKVQA and VizWizVQA tasks

            * Refactor llava-in-the-wild.yaml and utils.py

            * Update metric for llava evaluation

            * Refactor logging message in Task class

            * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

            * Fix formatting issues and add progress bar closing statements

            * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

            * Update tqdm progress bar in OtterHD model

            * Squashed commit of the following:

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * Fix error handling in loading YAML config files

            * Squashed commit of the following:

            commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 12:41:40 2024 +0800

                Fix key bugs

            commit eae210c3700a59b7d5cc9de46fcb855f443096aa
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:46:19 2024 +0800

                Black lint

            commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
            Merge: ab898e4 fb209e4
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:45:31 2024 +0800

                Merge branch 'main' into kc/list_tasks_num

            commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:44:23 2024 +0800

                Enable list all tasks num

            commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
            Author: kcz358 <[email protected]>
            Date:   Sun Jan 28 09:41:32 2024 +0800

                Exclude train yaml file in the task list

            commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
            Author: Zhang Peiyuan <[email protected]>
            Date:   Sun Jan 28 02:04:57 2024 +0800

                Add InfoVQA, DocVQA, and QwenVL (#28)

                * add mmme

                * black

                * add model specific prompt and gen kwargs

                * black

                * add yaml config to support multi-model eval

                * print table at the end

                * refactor multi model code

                * add chartqa

                * black

                * add ai2d

                * black

                * update chartqa

                * black

                * update ai2d dataset

                * black

                * add qwenvl

                * add infovqa and docvqa

            * List task #num sorted

            * Update prompt messages for image-related tasks

            * Delete unused task configuration files

            * Remove coco_train.yaml configuration file

            * Update task name in mmmu.yaml

            * Fix error message for missing tasks

            * Add wandb import and integration

            * Update generation kwargs for LMMS tasks

            * Update lmms_eval MME task configuration and utils

            * Update generation_kwargs in lmms_eval tasks

            * Update doc_to_text function in coco and okvqa tasks

            * Add COCO 2017 version

            * Update task name in coco_test2017.yaml

            * Squashed commit of the following:

            commit 0fd45585aecf41e04bb6510cf09c0b829bd0f49d
            Author: Zhang Peiyuan <[email protected]>
            Date:   Mon Jan 29 22:41:33 2024 +0800

                Add/mmmu test (#30)

                * mmmu_test

                * black

            commit f1258892713f588f8d65826f9141e38048f5ff31
            Author: Li Bo <[email protected]>
            Date:   Sun Jan 28 22:19:13 2024 +0800

                [Dataset Check] dataset check and add wandb logging (#29)

                * Remove unused code and configuration file

                * Remove docvqa.yaml and update vizwizvqa.yaml

                * lint

                * Add dataset_kwargs to vizwizvqa.yaml

                * Add dataset_kwargs to vizwizvqa.yaml

                * textvqa (#27)

                * Update textvqa.yaml and utils.py

                * Fix YAML formatting in textvqa.yaml and remove unused files

                * remove useless metric

                * add textvqa val & test

                * Update progress bar description in evaluator.py

                * Update submission file names in VizWizVQA tasks

                * Update output path to include log samples suffix

                * Update submission file paths in OKVQA and VizWizVQA tasks

                * Refactor llava-in-the-wild.yaml and utils.py

                * Update metric for llava evaluation

                * Refactor logging message in Task class

                * Merge commit '5553d106e5ffd84b280b3d5a3c8d47c35e2d310b'

                * Fix formatting issues and add progress bar closing statements

                * Update task from "infovqa_val" to "infovqa_test" in infovqa_test.yaml

                * Update tqdm progress bar in OtterHD model

                * Squashed commit of the following:

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * Fix error handling in loading YAML config files

                * Squashed commit of the following:

                commit fdb0c6785b0c5d6979d10e7ddf75ce9055038db8
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 12:41:40 2024 +0800

                    Fix key bugs

                commit eae210c3700a59b7d5cc9de46fcb855f443096aa
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:46:19 2024 +0800

                    Black lint

                commit 18e4a19e82357352ab25df77b5ae4f1b011d61ae
                Merge: ab898e4 fb209e4
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:45:31 2024 +0800

                    Merge branch 'main' into kc/list_tasks_num

                commit e899be48f55f95172fdf96bd2a98d3b91ff2aaed
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:44:23 2024 +0800

                    Enable list all tasks num

                commit a999fc6889c6986c28ec5d95460a4ab5233e5d4f
                Author: kcz358 <[email protected]>
                Date:   Sun Jan 28 09:41:32 2024 +0800

                    Exclude train yaml file in the task list

                commit 5553d106e5ffd84b280b3d5a3c8d47c35e2d310b
                Author: Zhang Peiyuan <[email protected]>
                Date:   Sun Jan 28 02:04:57 2024 +0800

                    Add InfoVQA, DocVQA, and QwenVL (#28)

                    * add mmme

                    * black

                    * add model specific prompt and gen kwargs

                    * black

                    * add yaml config to support multi-model eval

                    * print table at the end

                    * refactor multi model code

                    * add chartqa

                    * black

                    * add ai2d

                    * black

                    * update chartqa

                    * black

                    * update ai2d dataset

                    * black

                    * add qwenvl

                    * add infovqa and docvqa

                * List task #num sorted

                * Update prompt messages for image-related tasks

                * Delete unused task configuration files

                * Remove coco_train.yaml configuration file

                * Update task name in mmmu.yaml

                * Fix error message for missing tasks

                * Add wandb import and integration

                ---------

                Co…
kangreen0210 pushed a commit to kangreen0210/LIME that referenced this pull request Oct 6, 2024