
train error in ema #5

Open
simplew2011 opened this issue Oct 20, 2023 · 3 comments

Comments

@simplew2011

(python3.8) (base) wzp@vastai-NF5468M6:~/code/model_check/GAN/stylegan2-ada-lightning$ python trainer/train_stylegan.py wandb_main=True                                                                                                 
trainer/train_stylegan.py:226: UserWarning: 
The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
  @hydra.main(config_path='../config', config_name='stylegan2')
/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
  ret = run_job(
Global seed set to 394
wandb: (1) Create a W&B account
wandb: (2) Use an existing W&B account
wandb: (3) Don't visualize my results
wandb: Enter your choice: 3
wandb: You chose "Don't visualize my results"
wandb: WARNING `resume` will be ignored since W&B syncing is set to `offline`. Starting a new run with run id 20100248_StyleGAN2_fast_dev.
wandb: Tracking run with wandb version 0.15.12
wandb: W&B syncing is set to `offline` in this directory.  
wandb: Run `wandb online` or set WANDB_MODE=online to enable cloud syncing.
batch_size = 16 / 16
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(limit_val_batches=1.0)` was configured so 100% of the batches will be used..
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name                | Type              | Params
----------------------------------------------------------
0 | G                   | Generator         | 19.9 M
1 | D                   | Discriminator     | 21.5 M
2 | augment_pipe        | AugmentPipe       | 32    
3 | path_length_penalty | PathLengthPenalty | 1     
----------------------------------------------------------
41.5 M    Trainable params
129       Non-trainable params
41.5 M    Total params
165.819   Total estimated model params size (MB)
Sanity Checking: 0it [00:00, ?it/s]/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:480: PossibleUserWarning: Your `val_dataloader`'s sampler has shuffling enabled, it is strongly recommended that you turn shuffling off for val/test dataloaders.
  rank_zero_warn(
Sanity Checking DataLoader 0: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 209.51it/s]
Error executing job with overrides: ['wandb_main=True']
Traceback (most recent call last):
  File "trainer/train_stylegan.py", line 230, in main
    trainer.fit(model,
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 529, in fit
    call._call_and_handle_interrupt(
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 42, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 568, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 973, in _run
    results = self._run_stage()
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1014, in _run_stage
    self._run_sanity_check()
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1043, in _run_sanity_check
    val_loop.run()
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/loops/utilities.py", line 177, in _decorator
    return loop_run(self, *args, **kwargs)
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/loops/evaluation_loop.py", line 122, in run
    return self.on_run_end()
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/loops/evaluation_loop.py", line 244, in on_run_end
    self._on_evaluation_epoch_end()
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/loops/evaluation_loop.py", line 326, in _on_evaluation_epoch_end
    call._call_lightning_module_hook(trainer, hook_name)
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 144, in _call_lightning_module_hook
    output = fn(*args, **kwargs)
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/lightning_utilities/core/rank_zero.py", line 32, in wrapped_fn
    return fn(*args, **kwargs)
  File "trainer/train_stylegan.py", line 160, in on_validation_epoch_end
    self.ema.copy_to([p for p in self.G.parameters() if p.requires_grad])
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/torch_ema/ema.py", line 135, in copy_to
    parameters = self._get_parameters(parameters)
  File "/home/wzp/anaconda3/envs/python3.8/lib/python3.8/site-packages/torch_ema/ema.py", line 83, in _get_parameters
    raise ValueError(
ValueError: Number of parameters passed as argument is different from number of shadow parameters maintained by this ExponentialMovingAverage

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
wandb: Waiting for W&B process to finish... (failed 1).
wandb: You can sync this run to the cloud by running:
wandb: wandb sync ./wandb/offline-run-20231020_024844-20100248_StyleGAN2_fast_dev
wandb: Find logs at: ./wandb/offline-run-20231020_024844-20100248_StyleGAN2_fast_dev/logs
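
For context, the ValueError comes from torch_ema's length check: copy_to compares the number of parameters it is handed against the number of shadow parameters stored when the ExponentialMovingAverage was constructed. If the requires_grad flags on the generator change between construction and validation (freezing G while D trains is a common pattern in GAN loops), the filtered list [p for p in self.G.parameters() if p.requires_grad] shrinks and the check fails. A minimal sketch that reproduces the same error, assuming torch_ema 0.3's public API (ExponentialMovingAverage(parameters, decay) and copy_to(parameters)):

import torch
from torch_ema import ExponentialMovingAverage

net = torch.nn.Linear(4, 4)                       # two parameters: weight and bias
ema = ExponentialMovingAverage(net.parameters(), decay=0.995)

# If a flag flips after the EMA is built, the requires_grad-filtered list
# no longer matches the shadow parameters and copy_to raises the same error.
net.bias.requires_grad_(False)
trainable = [p for p in net.parameters() if p.requires_grad]    # one parameter
ema.copy_to(trainable)
# ValueError: Number of parameters passed as argument is different from
# number of shadow parameters maintained by this ExponentialMovingAverage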
@simplew2011
Author

(python3.8) (base) wzp@vastai-NF5468M6:~/code/model_check/GAN/stylegan2-ada-lightning$ pip list
Package                Version
---------------------- ------------
aiohttp                3.8.6
aiosignal              1.3.1
antlr4-python3-runtime 4.9.3
appdirs                1.4.4
async-timeout          4.0.3
attrs                  23.1.0
ballpark               1.4.0
certifi                2022.12.7
charset-normalizer     2.1.1
click                  8.1.7
cmake                  3.25.0
docker-pycreds         0.4.0
filelock               3.9.0
frozenlist             1.4.0
fsspec                 2023.9.2
gitdb                  4.0.10
GitPython              3.1.40
huggingface-hub        0.18.0
hydra-core             1.3.2
idna                   3.4
importlib-resources    6.1.0
Jinja2                 3.1.2
lightning-utilities    0.9.0
lit                    15.0.7
MarkupSafe             2.1.2
mpmath                 1.3.0
multidict              6.0.4
networkx               3.0
nicefid                2.1.1
numpy                  1.24.1
omegaconf              2.3.0
packaging              23.2
pathtools              0.1.2
Pillow                 9.3.0
pip                    23.3
protobuf               4.24.4
psutil                 5.9.6
pytorch-lightning      2.0.6
PyYAML                 6.0.1
requests               2.28.1
resize-right           0.0.2
safetensors            0.4.0
scipy                  1.10.1
sentry-sdk             1.32.0
setproctitle           1.3.3
setuptools             68.0.0
six                    1.16.0
smmap                  5.0.1
sympy                  1.12
timm                   0.9.7
torch                  2.0.0+cu118
torch-ema              0.3
torchaudio             2.0.1+cu118
torchmetrics           1.2.0
torchvision            0.15.1+cu118
tqdm                   4.66.1
triton                 2.0.0
typing_extensions      4.4.0
urllib3                1.26.13
wandb                  0.15.12
wheel                  0.41.2
yarl                   1.9.2
zipp                   3.17.0
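
Given torch-ema 0.3 from the list above, one possible workaround (a sketch, not a verified fix for this repo): that version keeps weak references to the parameters passed at construction, so copy_to() can be called with no arguments and the shadow weights are copied into the originally registered tensors regardless of their current requires_grad state. A self-contained sketch of the same idea:

import torch
from torch_ema import ExponentialMovingAverage

net = torch.nn.Linear(4, 4)
ema = ExponentialMovingAverage(net.parameters(), decay=0.995)

net.bias.requires_grad_(False)   # flags may change during training
ema.copy_to()                    # falls back to the parameters captured at
                                 # construction time, so the length check passes

Applied to trainer/train_stylegan.py, the hypothetical change would be to replace self.ema.copy_to([p for p in self.G.parameters() if p.requires_grad]) with self.ema.copy_to(), or to construct and consume the EMA with the same unfiltered self.G.parameters() list.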

@Suvi-dha

same error

@zaidao2023

same error
