
train error in ema #5

Open
simplew2011 opened this issue Oct 20, 2023 · 3 comments

Comments

@simplew2011

(python3.8) (base) wzp@vastai-NF5468M6:~/code/model_check/GAN/stylegan2-ada-lightning$ python trainer/train_stylegan.py wandb_main=True                                                                                                 
trainer/train_stylegan.py:226: UserWarning: 
The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
  @hydra.main(config_path='../config', config_name='stylegan2')
/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
  ret = run_job(
Global seed set to 394
wandb: (1) Create a W&B account
wandb: (2) Use an existing W&B account
wandb: (3) Don't visualize my results
wandb: Enter your choice: 3
wandb: You chose "Don't visualize my results"
wandb: WARNING `resume` will be ignored since W&B syncing is set to `offline`. Starting a new run with run id 20100248_StyleGAN2_fast_dev.
wandb: Tracking run with wandb version 0.15.12
wandb: W&B syncing is set to `offline` in this directory.  
wandb: Run `wandb online` or set WANDB_MODE=online to enable cloud syncing.
batch_size = 16 / 16
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(limit_val_batches=1.0)` was configured so 100% of the batches will be used..
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name                | Type              | Params
----------------------------------------------------------
0 | G                   | Generator         | 19.9 M
1 | D                   | Discriminator     | 21.5 M
2 | augment_pipe        | AugmentPipe       | 32    
3 | path_length_penalty | PathLengthPenalty | 1     
----------------------------------------------------------
41.5 M    Trainable params
129       Non-trainable params
41.5 M    Total params
165.819   Total estimated model params size (MB)
Sanity Checking: 0it [00:00, ?it/s]/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:480: PossibleUserWarning: Your `val_dataloader`'s sampler has shuffling enabled, it is strongly recommended that you turn shuffling off for val/test dataloaders.
  rank_zero_warn(
Sanity Checking DataLoader 0: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 209.51it/s]
Error executing job with overrides: ['wandb_main=True']
Traceback (most recent call last):
  File "trainer/train_stylegan.py", line 230, in main
    trainer.fit(model,
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 529, in fit
    call._call_and_handle_interrupt(
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 42, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 568, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 973, in _run
    results = self._run_stage()
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1014, in _run_stage
    self._run_sanity_check()
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1043, in _run_sanity_check
    val_loop.run()
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/loops/utilities.py", line 177, in _decorator
    return loop_run(self, *args, **kwargs)
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/loops/evaluation_loop.py", line 122, in run
    return self.on_run_end()
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/loops/evaluation_loop.py", line 244, in on_run_end
    self._on_evaluation_epoch_end()
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/loops/evaluation_loop.py", line 326, in _on_evaluation_epoch_end
    call._call_lightning_module_hook(trainer, hook_name)
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 144, in _call_lightning_module_hook
    output = fn(*args, **kwargs)
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/lightning_utilities/core/rank_zero.py", line 32, in wrapped_fn
    return fn(*args, **kwargs)
  File "trainer/train_stylegan.py", line 160, in on_validation_epoch_end
    self.ema.copy_to([p for p in self.G.parameters() if p.requires_grad])
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/torch_ema/ema.py", line 135, in copy_to
    parameters = self._get_parameters(parameters)
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/torch_ema/ema.py", line 83, in _get_parameters
    raise ValueError(
ValueError: Number of parameters passed as argument is different from number of shadow parameters maintained by this ExponentialMovingAverage

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
wandb: Waiting for W&B process to finish... (failed 1).
wandb: You can sync this run to the cloud by running:
wandb: wandb sync ./wandb/offline-run-20231020_024844-20100248_StyleGAN2_fast_dev
wandb: Find logs at: ./wandb/offline-run-20231020_024844-20100248_StyleGAN2_fast_dev/logs
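
For context, the ValueError comes from torch_ema's length check: copy_to compares the number of parameters it is handed against the number of shadow parameters stored when the ExponentialMovingAverage was constructed. If the requires_grad flags on the generator change between construction and validation (freezing G while D trains is a common pattern in GAN loops), the filtered list [p for p in self.G.parameters() if p.requires_grad] shrinks and the check fails. A minimal sketch that reproduces the same error, assuming torch_ema 0.3's public API (ExponentialMovingAverage(parameters, decay) and copy_to(parameters)):

import torch
from torch_ema import ExponentialMovingAverage

net = torch.nn.Linear(4, 4)                       # two parameters: weight and bias
ema = ExponentialMovingAverage(net.parameters(), decay=0.995)

# If a flag flips after the EMA is built, the requires_grad-filtered list
# no longer matches the shadow parameters and copy_to raises the same error.
net.bias.requires_grad_(False)
trainable = [p for p in net.parameters() if p.requires_grad]    # one parameter
ema.copy_to(trainable)
# ValueError: Number of parameters passed as argument is different from
# number of shadow parameters maintained by this ExponentialMovingAverage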
@simplew2011
Author

(python3.8) (base) wzp@vastai-NF5468M6:~/code/model_check/GAN/stylegan2-ada-lightning$ pip list
Package                Version
---------------------- ------------
aiohttp                3.8.6
aiosignal              1.3.1
antlr4-python3-runtime 4.9.3
appdirs                1.4.4
async-timeout          4.0.3
attrs                  23.1.0
ballpark               1.4.0
certifi                2022.12.7
charset-normalizer     2.1.1
click                  8.1.7
cmake                  3.25.0
docker-pycreds         0.4.0
filelock               3.9.0
frozenlist             1.4.0
fsspec                 2023.9.2
gitdb                  4.0.10
GitPython              3.1.40
huggingface-hub        0.18.0
hydra-core             1.3.2
idna                   3.4
importlib-resources    6.1.0
Jinja2                 3.1.2
lightning-utilities    0.9.0
lit                    15.0.7
MarkupSafe             2.1.2
mpmath                 1.3.0
multidict              6.0.4
networkx               3.0
nicefid                2.1.1
numpy                  1.24.1
omegaconf              2.3.0
packaging              23.2
pathtools              0.1.2
Pillow                 9.3.0
pip                    23.3
protobuf               4.24.4
psutil                 5.9.6
pytorch-lightning      2.0.6
PyYAML                 6.0.1
requests               2.28.1
resize-right           0.0.2
safetensors            0.4.0
scipy                  1.10.1
sentry-sdk             1.32.0
setproctitle           1.3.3
setuptools             68.0.0
six                    1.16.0
smmap                  5.0.1
sympy                  1.12
timm                   0.9.7
torch                  2.0.0+cu118
torch-ema              0.3
torchaudio             2.0.1+cu118
torchmetrics           1.2.0
torchvision            0.15.1+cu118
tqdm                   4.66.1
triton                 2.0.0
typing_extensions      4.4.0
urllib3                1.26.13
wandb                  0.15.12
wheel                  0.41.2
yarl                   1.9.2
zipp                   3.17.0
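
Given torch-ema 0.3 from the list above, one possible workaround (a sketch, not a verified fix for this repo): that version keeps weak references to the parameters passed at construction, so copy_to() can be called with no arguments and the shadow weights are copied into the originally registered tensors regardless of their current requires_grad state. A self-contained sketch of the same idea:

import torch
from torch_ema import ExponentialMovingAverage

net = torch.nn.Linear(4, 4)
ema = ExponentialMovingAverage(net.parameters(), decay=0.995)

net.bias.requires_grad_(False)   # flags may change during training
ema.copy_to()                    # falls back to the parameters captured at
                                 # construction time, so the length check passes

Applied to trainer/train_stylegan.py, the hypothetical change would be to replace self.ema.copy_to([p for p in self.G.parameters() if p.requires_grad]) with self.ema.copy_to(), or to construct and consume the EMA with the same unfiltered self.G.parameters() list.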

@Suvi-dha

same error

@zaidao2023

same error
