Error when loading pretrained weights #9

Open
zhhiyuan opened this issue Jun 6, 2024 · 0 comments
zhhiyuan commented Jun 6, 2024

Thank you for your excellent work. When evaluating your pretrained model on the HM3D dataset with the script objnav-eval-v2-hm3d.sh, I got an error about a missing configuration file:

Traceback (most recent call last):
File "/home/mdisk1/heqisheng/embody/navigation/zson/zson/run.py", line 90, in <module>
main()
File "/home/mdisk1/heqisheng/embody/navigation/zson/zson/run.py", line 38, in main
run_exp(**vars(args))
File "/home/mdisk1/heqisheng/embody/navigation/zson/zson/run.py", line 85, in run_exp
config = get_config(exp_config, opts)
File "/home/mdisk1/heqisheng/embody/navigation/zson/zson/zson/config.py", line 259, in get_config
config.TASK_CONFIG = get_task_config(config.BASE_TASK_CONFIG_PATH)
File "/home/mdisk1/heqisheng/embody/navigation/zson/zson/zson/config.py", line 155, in get_task_config
config.merge_from_file(config_path)
File "/home/zhangzhiyuan/miniconda3/envs/zson/lib/python3.7/site-packages/yacs/config.py", line 211, in merge_from_file
with open(cfg_filename, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'configs/tasks/pointnav.yaml'
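
The path in this error is relative, so Python resolves it against the directory the script is launched from. A minimal sketch for checking where the file actually exists is shown here. The helper name `find_config` and the candidate roots are hypothetical and not part of the zson code.

```python
from pathlib import Path


def find_config(relative_path, roots):
    """Return the roots (as Path objects) under which relative_path is a file."""
    return [Path(root) for root in roots if (Path(root) / relative_path).is_file()]


if __name__ == "__main__":
    # Hypothetical candidates: the current directory and its parent.
    candidates = [Path.cwd(), Path.cwd().parent]
    hits = find_config("configs/tasks/pointnav.yaml", candidates)
    print("found under:", [str(h) for h in hits] or "none of the candidates")
```

If no candidate contains the file, it may simply be absent from the checkout rather than being a working-directory problem.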

After changing the configuration file path to configs/tasks/objectnav_v1.yaml, loading the checkpoint failed because the shapes of the weights in the checkpoint do not match the shapes in the model:

Traceback (most recent call last):
File "/home/mdisk1/heqisheng/embody/navigation/zson/zson/run.py", line 90, in <module>
main()
File "/home/mdisk1/heqisheng/embody/navigation/zson/zson/run.py", line 38, in main
run_exp(**vars(args))
File "/home/mdisk1/heqisheng/embody/navigation/zson/zson/run.py", line 86, in run_exp
execute_exp(config, run_type)
File "/home/mdisk1/heqisheng/embody/navigation/zson/zson/run.py", line 71, in execute_exp
trainer.eval()
File "/home/mdisk1/heqisheng/embody/navigation/zson/habitat-lab-challenge-2022/habitat_baselines/common/base_trainer.py", line 112, in eval
checkpoint_index=ckpt_idx,
File "/home/mdisk1/heqisheng/embody/navigation/zson/zson/zson/trainer.py", line 179, in _eval_checkpoint
msg = self.agent.load_state_dict(ckpt_dict["state_dict"], strict=False)
File "/home/zhangzhiyuan/miniconda3/envs/zson/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1672, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for ZSON_PPO:
size mismatch for actor_critic.net.state_encoder.rnn.weight_ih_l0: copying a param with shape torch.Size([2048, 1568]) from checkpoint, the shape in current model is torch.Size([1536, 1568]).
size mismatch for actor_critic.net.state_encoder.rnn.weight_hh_l0: copying a param with shape torch.Size([2048, 512]) from checkpoint, the shape in current model is torch.Size([1536, 512]).
size mismatch for actor_critic.net.state_encoder.rnn.bias_ih_l0: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1536]).
size mismatch for actor_critic.net.state_encoder.rnn.bias_hh_l0: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1536]).
size mismatch for actor_critic.net.state_encoder.rnn.weight_ih_l1: copying a param with shape torch.Size([2048, 512]) from checkpoint, the shape in current model is torch.Size([1536, 512]).
size mismatch for actor_critic.net.state_encoder.rnn.weight_hh_l1: copying a param with shape torch.Size([2048, 512]) from checkpoint, the shape in current model is torch.Size([1536, 512]).
size mismatch for actor_critic.net.state_encoder.rnn.bias_ih_l1: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1536]).
size mismatch for actor_critic.net.state_encoder.rnn.bias_hh_l1: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1536]).
Exception ignored in: <function VectorEnv.__del__ at 0x7fa9f2275050>
Traceback (most recent call last):
File "/home/mdisk1/heqisheng/embody/navigation/zson/habitat-lab-challenge-2022/habitat/core/vector_env.py", line 592, in __del__
self.close()
File "/home/mdisk1/heqisheng/embody/navigation/zson/habitat-lab-challenge-2022/habitat/core/vector_env.py", line 463, in close
write_fn((CLOSE_COMMAND, None))
File "/home/mdisk1/heqisheng/embody/navigation/zson/habitat-lab-challenge-2022/habitat/core/vector_env.py", line 118, in __call__
self.write_fn(data)
File "/home/mdisk1/heqisheng/embody/navigation/zson/habitat-lab-challenge-2022/habitat/utils/pickle5_multiprocessing.py", line 62, in send
self.send_bytes(buf.getvalue())
File "/home/zhangzhiyuan/miniconda3/envs/zson/lib/python3.7/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/home/zhangzhiyuan/miniconda3/envs/zson/lib/python3.7/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header + buf)
File "/home/zhangzhiyuan/miniconda3/envs/zson/lib/python3.7/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
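
One observation about the shapes in the log: in PyTorch, the first dimension of an RNN layer's `weight_ih` and `weight_hh` is the number of gates times the hidden size (4 for `nn.LSTM`, 3 for `nn.GRU`, 1 for `nn.RNN`). With a hidden size of 512 (the second dimension of `weight_hh_l0`), 2048 rows would correspond to an LSTM and 1536 rows to a GRU. The sketch below does that arithmetic. The helper `infer_rnn_type` is hypothetical, and which setting selects the RNN type in this repository is not shown in the logs.

```python
def infer_rnn_type(weight_rows, hidden_size):
    """Guess the PyTorch RNN type from the row count of a weight_ih/weight_hh tensor.

    Returns "LSTM", "GRU", "RNN", or None if the shape fits none of them.
    """
    gates, remainder = divmod(weight_rows, hidden_size)
    if remainder != 0:
        return None
    return {4: "LSTM", 3: "GRU", 1: "RNN"}.get(gates)


HIDDEN_SIZE = 512  # second dimension of weight_hh_l0 in the error message

print("checkpoint:", infer_rnn_type(2048, HIDDEN_SIZE))     # checkpoint: LSTM
print("current model:", infer_rnn_type(1536, HIDDEN_SIZE))  # current model: GRU
```

If this reading is right, the checkpoint and the evaluation config disagree on the state encoder's RNN type, which would fit with the task config having been swapped by hand. The trailing BrokenPipeError appears to be a side effect of the environments shutting down after the crash, not a separate problem.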

Could it be that I have made a mistake in some settings? Could you give me some advice? Thank you.
