model.train() Issue #79

oussama-sil · 2024-04-26T13:17:00Z

I'm trying to fine-tune a GPT-2 base model using the Adapter from OpenDelta. During training I hit this error: "element 0 of tensors does not require grad and does not have a grad_fn". It is raised by out.loss.backward(), and only when model.train() has been called beforehand. Any suggestions on how to resolve this?
Code:

    model = GPT2LMHeadModel.from_pretrained('gpt2', device_map=device)
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    tokenizer.add_tokens(['<p>'])
    model.resize_token_embeddings(len(tokenizer))  # Resizing the embedding layer
    model.gradient_checkpointing_enable()

    delta_model = AdapterModel(model, bottleneck_dim=32)
    delta_model.freeze_module(exclude=["deltas"])
    delta_model.log()

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    optimizer.zero_grad()
    model.train()  # Causing the error

    text = "Random str"
    input_ids = tokenizer(text, return_tensors='pt')
    out = model(input_ids['input_ids'].to(device),
                attention_mask=input_ids['attention_mask'].to(device),
                labels=input_ids['input_ids'].to(device))
    out.loss.backward()
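A possible explanation, not confirmed in this thread: in transformers, gradient checkpointing is only active in training mode, which would explain why the error appears only after model.train(). With reentrant checkpointing, a checkpointed block whose inputs do not require grad returns outputs with no grad_fn, and that is the situation when the embeddings are frozen and only the adapters are trainable. The sketch below reproduces that mechanism in plain PyTorch. The two Linear layers are hypothetical stand-ins for the frozen embeddings and a trainable adapter; they are not OpenDelta or GPT-2 code.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

torch.manual_seed(0)

# Hypothetical stand-in for the frozen embedding layer.
frozen = nn.Linear(4, 4)
for p in frozen.parameters():
    p.requires_grad_(False)

# Hypothetical stand-in for a trainable adapter inside a checkpointed block.
adapter = nn.Linear(4, 4)


def run(make_input_require_grad: bool) -> torch.Tensor:
    # Output of a frozen layer: does not require grad by default.
    hidden = frozen(torch.randn(2, 4))
    if make_input_require_grad:
        hidden.requires_grad_(True)
    # Reentrant checkpointing only tracks gradients if an input requires grad.
    out = checkpoint(adapter, hidden, use_reentrant=True)
    return out.sum()


# Case 1: no input requires grad, so the loss is detached from the graph
# and calling backward() on it would raise the error from this issue.
loss_broken = run(make_input_require_grad=False)
broken_requires_grad = loss_broken.requires_grad

# Case 2: the block input requires grad, so backward() reaches the adapter.
loss_ok = run(make_input_require_grad=True)
loss_ok.backward()
adapter_has_grad = adapter.weight.grad is not None
```

If this is the cause here, calling model.enable_input_require_grads() (a transformers PreTrainedModel method) before the forward pass may be worth trying, as may disabling gradient checkpointing to confirm the diagnosis.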
