Reproduce the model #4
Comments
Hi, could you share the complete logs? I found some mismatches in the logs provided so far. The executed command specifies training in codesign mode, but the later logs seem to look for checkpoints in folders with the "fixseq" suffix. Also, messages like "Training Autoencoder with config xxx" or "Using Autoencoder checkpoint" did not appear, which makes it hard to track the stages.
OK, thank you so much!
Hi, it looks like you need to either use a GPU with more memory (at least 24 GB) or reduce the dynamic batch size here.
Thank you very much for your answer!
Hello, thank you for your great contribution.
I am a novice in AI modeling. When reproducing the model, I encountered the following error message. Can I ask how to solve it?
Command executed: GPU=0 bash scripts/run_exp_pipe.sh pepbench_codesign configs/pepbench/autoencoder/train_codesign.yaml configs/pepbench/ldm/train_codesign.yaml configs/pepbench/ldm/setup_latent_guidance.yaml configs/pepbench/test_codesign.yaml
Error message:
100%|██████████| 170/170 [01:25<00:00, 1.98it/s, loss=0, version=0]
0%| | 0/4157 [00:00<?, ?it/s]
100%|██████████| 4157/4157 [00:00<00:00, 1097138.29it/s]
2025-02-21 21:34:14::INFO::validating ...
0%| | 0/6 [00:00<?, ?it/s]
0%| | 0/6 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "/home/dengxj/yanlin/deeplearning/PepGLAD/train.py", line 78, in <module>
    main(args, opt_args)
  File "/home/dengxj/yanlin/deeplearning/PepGLAD/train.py", line 71, in main
    trainer.train(args.gpus, args.local_rank)
  File "/home/dengxj/yanlin/deeplearning/PepGLAD/trainer/abs_trainer.py", line 253, in train
    self._valid_epoch(device)
  File "/home/dengxj/yanlin/deeplearning/PepGLAD/trainer/abs_trainer.py", line 155, in _valid_epoch
    metric = self.valid_step(batch, self.valid_global_step)
  File "/home/dengxj/yanlin/deeplearning/PepGLAD/trainer/ldm_trainer.py", line 52, in valid_step
    loss, loss_dict = self.model(**batch)
ValueError: not enough values to unpack (expected 2, got 1)
cat: ./exps/pepbench_fixseq/LDM/version_0/checkpoint/topk_map.txt: No such file or directory
usage: setup_latent_guidance.py [-h] --config CONFIG --ckpt CKPT [--gpu GPU]
setup_latent_guidance.py: error: argument --ckpt: expected one argument
usage: generate.py [-h] --config CONFIG --ckpt CKPT [--save_dir SAVE_DIR]
[--gpu GPU] [--n_cpu N_CPU]
generate.py: error: argument --ckpt: expected one argument
Traceback (most recent call last):
  File "/home/dengxj/yanlin/deeplearning/PepGLAD/cal_metrics.py", line 228, in <module>
    main(parse())
  File "/home/dengxj/yanlin/deeplearning/PepGLAD/cal_metrics.py", line 153, in main
    with open(args.results, 'r') as fin:
FileNotFoundError: [Errno 2] No such file or directory: './exps/pepbench_fixseq/LDM/version_0/results/results.jsonl'
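For anyone tracing the output above, here is a minimal sketch of how the first two kinds of error can arise. It is not PepGLAD code. The helper names (`unpack_pair`, `try_unpack`, `ckpt_parser`, `parse_exit_code`) are hypothetical, the parser only mimics the usage line printed by setup_latent_guidance.py, and the `--gpu` type and default are assumptions. The sketch shows that unpacking a one-element result into two names raises the same ValueError, and that passing `--ckpt` with no value (which is likely what happens when the `cat` of the missing topk_map.txt returns nothing) produces the "expected one argument" error.

```python
import argparse


def unpack_pair(result):
    """Unpack a (loss, loss_dict) pair, as valid_step does in the traceback."""
    loss, loss_dict = result
    return loss, loss_dict


def try_unpack(result):
    """Return the ValueError message raised by unpacking, or None on success."""
    try:
        unpack_pair(result)
    except ValueError as err:
        return str(err)
    return None


def ckpt_parser():
    """Hypothetical parser that mirrors the printed usage line only."""
    parser = argparse.ArgumentParser(prog="setup_latent_guidance.py")
    parser.add_argument("--config", required=True)
    parser.add_argument("--ckpt", required=True)
    # Assumption: the real script's --gpu type and default are not known.
    parser.add_argument("--gpu", type=int, default=0)
    return parser


def parse_exit_code(argv):
    """Return argparse's exit code for argv (0 if parsing succeeds)."""
    try:
        ckpt_parser().parse_args(argv)
    except SystemExit as exc:
        return exc.code
    return 0


if __name__ == "__main__":
    # A callable that returns a single value where two are expected.
    print(try_unpack((0.0,)))
    # "--ckpt" followed by nothing, as when a shell substitution is empty.
    print(parse_exit_code(["--config", "c.yaml", "--ckpt"]))
```

If this reading is right, the `--ckpt` errors and the missing results.jsonl are knock-on effects of the training run stopping during validation, so the ValueError is the one to resolve first.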