How do I set one shared offline data directory when running from Python code? #201

Open
charliedream1 opened this issue Nov 13, 2024 · 0 comments
Labels
documentation Improvements or additions to documentation

Comments

@charliedream1

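With the code below, I want to evaluate 10 test sets and point all of them at a single directory for the downloaded data. How do I specify that directory?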

# Copyright (c) Alibaba, Inc. and its affiliates.

"""
1. Installation
EvalScope: pip install evalscope[opencompass]

2. Download dataset to data/ folder
wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-core-20240207.zip
unzip OpenCompassData-core-20240207.zip

3. Deploy model serving
    swift deploy --model_type qwen2-1_5b-instruct

4. Run eval task
"""
from evalscope.backend.opencompass import OpenCompassBackendManager
from evalscope.run import run_task
from evalscope.summarizer import Summarizer


def run_swift_eval():

    # List all datasets
    # e.g.  ['mmlu', 'WSC', 'DRCD', 'chid', 'gsm8k', 'AX_g', 'BoolQ', 'cmnli', 'ARC_e', 'ocnli_fc', 'summedits', 'MultiRC', 'GaokaoBench', 'obqa', 'math', 'agieval', 'hellaswag', 'RTE', 'race', 'ocnli', 'strategyqa', 'triviaqa', 'WiC', 'COPA', 'piqa', 'nq', 'mbpp', 'csl', 'Xsum', 'CB', 'tnews', 'ARC_c', 'afqmc', 'eprstmt', 'ReCoRD', 'bbh', 'CMRC', 'AX_b', 'siqa', 'storycloze', 'humaneval', 'cluewsc', 'winogrande', 'lambada', 'ceval', 'bustm', 'C3', 'lcsts']
    print(
        f"** All datasets from OpenCompass backend: {OpenCompassBackendManager.list_datasets()}"
    )

    # Prepare the config
    """
    Attributes:
        `eval_backend`: Defaults to 'OpenCompass'
        `datasets`: list, refer to `OpenCompassBackendManager.list_datasets()`
        `models`: list of dict, each dict must contain `path` and `openai_api_base`
                `path`: reuse the value of '--model_type' in the command line `swift deploy`
                `openai_api_base`: the base URL of swift model serving
        `work_dir`: str, the directory to save the evaluation results, logs, and summaries. Defaults to 'outputs/default'

        Refer to `opencompass.cli.arguments.ApiModelConfig` for other optional attributes.
    """
    # Option 1: Use dict format
    # Args:
    #   path: The model identifier; for swift this is the `model_type`, e.g. 'llama3-8b-instruct'
    #   is_chat: True for a chat model, False for a base model
    #   key: The api-key of the model API; defaults to 'EMPTY'
    #   openai_api_base: The base URL of the OpenAI-compatible API, i.e. the swift model serving URL
    task_cfg = dict(
        eval_backend="OpenCompass",
        eval_config={
            "datasets": ["winogrande"],
            "models": [
                {
                    "path": "qwen2-7b-instruct",  # Please make sure the model is deployed
                    "openai_api_base": "http://127.0.0.1:8000/v1/chat/completions",
                    "is_chat": True,
                    "batch_size": 16,
                },
            ],
            "work_dir": "outputs/qwen2_eval_result",
            "limit": 10,
        },
    )

    # Option 2: Use yaml file
    # task_cfg = 'examples/tasks/default_eval_swift_openai_api.yaml'

    # Option 3: Use json file
    # task_cfg = 'examples/tasks/default_eval_swift_openai_api.json'

    # Run task
    run_task(task_cfg=task_cfg)

    # [Optional] Get the final report with summarizer
    print(">> Start to get the report with summarizer ...")
    report_list = Summarizer.get_report_from_cfg(task_cfg)
    print(f"\n>>The report list: {report_list}")


if __name__ == "__main__":
    run_swift_eval()
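
One possible approach, sketched under an assumption the issue does not confirm: step 2 of the docstring unzips the datasets into a `data/` folder, which suggests the OpenCompass backend reads them relative to the current working directory. If that holds, all datasets can share one offline directory by switching into the folder that contains `data/` before calling `run_task`. The helpers `working_directory`, `check_data_root`, and `build_task_cfg` are hypothetical names introduced only for this illustration; they are not part of EvalScope or OpenCompass.

```python
import contextlib
import os
from pathlib import Path


@contextlib.contextmanager
def working_directory(path):
    """Temporarily switch the process working directory to `path`."""
    previous = Path.cwd()
    os.chdir(path)
    try:
        yield Path.cwd()
    finally:
        # Always restore the original directory, even if the body raises.
        os.chdir(previous)


def check_data_root(data_root):
    """Return the resolved root, or raise if it has no `data/` subfolder."""
    root = Path(data_root).expanduser().resolve()
    if not (root / "data").is_dir():
        raise FileNotFoundError(f"expected a 'data' folder inside {root}")
    return root


def build_task_cfg(datasets, work_dir):
    """Build the same task_cfg dict as in the script, for several datasets."""
    return dict(
        eval_backend="OpenCompass",
        eval_config={
            "datasets": list(datasets),
            "models": [
                {
                    "path": "qwen2-7b-instruct",
                    "openai_api_base": "http://127.0.0.1:8000/v1/chat/completions",
                    "is_chat": True,
                    "batch_size": 16,
                },
            ],
            # Absolute path, so results do not move when the cwd changes.
            "work_dir": str(Path(work_dir).resolve()),
            "limit": 10,
        },
    )


# Usage sketch (the path is a placeholder; run_task comes from evalscope.run):
# root = check_data_root("/path/to/offline_root")  # folder containing data/
# cfg = build_task_cfg(["winogrande", "gsm8k", "mmlu"], "outputs/qwen2_eval_result")
# with working_directory(root):
#     run_task(task_cfg=cfg)
```

The `work_dir` is resolved to an absolute path before the directory switch so that results still land where the script was launched. Whether the installed backend also supports a dedicated setting for the data location should be checked against its own documentation.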
@Yunnglin Yunnglin added the documentation Improvements or additions to documentation label Nov 26, 2024