-
Notifications
You must be signed in to change notification settings - Fork 527
Open
Labels
bugSomething isn't workingSomething isn't working
Description
Description of the bug
“ info = {} # Initialize info as empty dict, as StepOutput doesn't explicitly return it
print(f"[Agent._run_single_rollout][{task_idx}][Turn {t+1}] Env Step Result: Reward={reward}, Done={done}, Info={info}")
# Store the reward from this specific step
step_rewards.append(reward)
final_reward = reward # Keep track of the reward from the last executed step
final_env_score = info.get('score', 0.0) # Use .get for safety”
根据这几行代码,final_env_score在获取info分数时候,是不是永远等于0,因为永远把info={}了
Steps To Reproduce
xx
Additional Information
No response
Metadata
Metadata
Assignees
Labels
bugSomething isn't workingSomething isn't working