Hi, I'm running an evaluation using a script that specifies multiple tasks, but I noticed the following behavior:
- After the first task completes successfully, no results are saved to the output_path.
- If I interrupt the run (e.g., during the second task), restarting the job re-runs the first task from scratch instead of resuming from where it left off.
Is this expected behavior? It seems that results are only written after all tasks finish, rather than being saved incrementally per task.
It would be very helpful if:
- Results were saved immediately after each task was completed.
- The evaluator could skip already-completed tasks on restart (e.g., via a --resume flag or based on existing output files).
The script I use:
#!/bin/bash
MOUNT_DIR="/root"
CONDA_PATH=/root/miniconda3
CONDA_ENV_NAME=psp-lmms-eval
MODEL_DIR="${MOUNT_DIR}/save_models/opensource/VisionThink/VisionThink-Efficient"
MODEL_NAME="VisionThink-Efficient"
MODEL_CLASS="visionthink_vllm_tool"
BATCH_SIZE=1024
GPU_LIST="4,5,6,7" # The model has 28 attention heads, so tensor parallelism works with 4 GPUs but not 8
LOG_SAMPLES_SUFFIX="vllm" # Specify a suffix for the log_samples file name
TASKS="mmbench_en_dev,pope,realworldqa,mme,mathvista_testmini,mathverse_testmini_vision_only,mmvet"
# Count the number of GPUs in GPU_LIST
TENSOR_PARALLEL_SIZE=$(echo $GPU_LIST | awk -F',' '{print NF}')
# shellcheck disable=SC1091
if ! { source "${CONDA_PATH}/bin/activate" && eval "$(conda shell.bash hook)" && conda activate $CONDA_ENV_NAME; }; then
    echo "Failed to activate Conda environment: ${CONDA_ENV_NAME}" >&2
    exit 1
fi
echo "Successfully activated Conda environment: ${CONDA_ENV_NAME}"
LMMS_EVAL_DATASET_CACHE="${MOUNT_DIR}/dataset/opensource/lmms_eval"
VLLM_CACHE_ROOT="${MOUNT_DIR}/save_models/vllm_cache"
PROJECT_DIR="${MOUNT_DIR}/opensource/lmms-eval"
OUTPUT_PATH="${PROJECT_DIR}/eval_outputs/${MODEL_NAME}"
EVAL_MODEL_NAME="Qwen2.5-VL-72B-Instruct"
# EVAL_MODEL_NAME="api_doubao_Doubao-Seed-1.6-250615_nothink"
API_TYPE="utools_api"
CUR_TIME=$(date +%Y%m%d_%H%M%S)
LOG_FILE="${OUTPUT_PATH}/logs/${CUR_TIME}.log"
mkdir -p "$(dirname "$LOG_FILE")"
echo "Log file: $LOG_FILE"
# Environment variable settings
export http_proxy=""
export https_proxy=""
export HF_DATASETS_OFFLINE=1
export HF_HUB_OFFLINE=1
# self._cache_dir = os.path.join(LMMS_EVAL_HOME, "eval_cache", cache_hash)
export LMMS_EVAL_HOME="${PROJECT_DIR}"
export HF_HOME="$LMMS_EVAL_DATASET_CACHE"
export VLLM_CACHE_ROOT="$VLLM_CACHE_ROOT"
export VLLM_WORKER_MULTIPROC_METHOD="spawn"
export HF_TOKEN="$HF_TOKEN"
export LMMS_EVAL_USE_CACHE=True
# Environment variables for the evaluator
export EVAL_MODEL_NAME="$EVAL_MODEL_NAME"
export API_TYPE="$API_TYPE"
CUDA_VISIBLE_DEVICES=$GPU_LIST python -m lmms_eval \
--model "$MODEL_CLASS" \
--model_args "model_version=${MODEL_DIR},tensor_parallel_size=${TENSOR_PARALLEL_SIZE},\
trust_remote_code=True,max_images=2,prompt=tool_call,enable_tool_call=True,\
downsample_image=True,max_token=40960" \
--tasks "${TASKS}" \
--batch_size "${BATCH_SIZE}" \
--log_samples \
--log_samples_suffix "${LOG_SAMPLES_SUFFIX}" \
--cache_requests "true" \
--output_path "${OUTPUT_PATH}" \
--verbosity DEBUG \
--seed 42 | tee "${LOG_FILE}"
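In the meantime, a workaround is to drive each task with its own lmms_eval invocation, so results are written as soon as each task finishes. The sketch below uses a marker-file convention that is entirely my own (not something lmms-eval provides) to skip completed tasks on restart; the actual lmms_eval call is left as a comment since its arguments mirror the script above:

```shell
#!/bin/bash
# Workaround sketch: run one task per invocation and mark each task done,
# so a restart skips tasks that already completed. The .done marker
# directory is an assumption of this sketch, not an lmms-eval feature.
set -euo pipefail

TASKS="mmbench_en_dev,pope,realworldqa"
OUTPUT_PATH="./eval_outputs/VisionThink-Efficient"
DONE_DIR="${OUTPUT_PATH}/.done"
mkdir -p "$DONE_DIR"

# Split the comma-separated task string into a bash array.
IFS=',' read -ra TASK_LIST <<< "$TASKS"
for task in "${TASK_LIST[@]}"; do
    marker="${DONE_DIR}/${task}"
    if [ -f "$marker" ]; then
        echo "Skipping already-completed task: ${task}"
        continue
    fi
    echo "Running task: ${task}"
    # Real invocation would go here, e.g.:
    # python -m lmms_eval --model "$MODEL_CLASS" --tasks "${task}" \
    #     --output_path "${OUTPUT_PATH}" ...
    touch "$marker"  # only reached if the command above succeeded (set -e)
done
```

Because of set -e, a failing task never reaches the touch, so it is retried on the next run while finished tasks are skipped.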
Thanks!