I have followed all the steps so far up to the first training of `model_0`. However, my results are poor and unchanging. For example: `Train loss: 2.31847 | Test loss: 2.31906, Test acc: 10.85%`. These values never move between epochs. What can I do to fix it? I've been debugging for some time now.
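One clue worth checking first: a loss stuck near 2.303 on a 10-class problem (like FashionMNIST) is exactly `-ln(1/10)`, i.e. the model is outputting a uniform guess and its weights are likely never being updated. A minimal sanity check (using stand-in names, not your actual `model_0`/`optimizer`) is to snapshot a weight tensor, run one training step, and confirm it changed:

```python
import math
import torch
from torch import nn

# A flat loss of ~2.3026 on 10 classes equals -ln(1/10): uniform guessing.
print(math.log(10))  # ≈ 2.302585

# Hypothetical stand-ins for model_0 / optimizer / loss_fn from the post.
model = nn.Linear(784, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# Snapshot a parameter, run one full train step, then compare.
before = model.weight.detach().clone()
X, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
loss = loss_fn(model(X), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

changed = not torch.equal(before, model.weight.detach())
print(f"weights changed after one step: {changed}")  # should be True
```

If the equivalent check on your own model prints `False`, the optimizer is not actually updating the parameters it should be.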
```python
# Import tqdm for progress bar
from tqdm.auto import tqdm

# Set the seed and start the timer
torch.manual_seed(42)
train_time_start_on_cpu = timer()

# Set the number of epochs
epochs = 3

# Training and testing loop
for epoch in tqdm(range(epochs)):
    print(f"Epoch: {epoch}\n-------")

    ### Training
    train_loss = 0
    # Loop through training batches
    for batch, (X, y) in enumerate(train_dataloader):
        model_0.train()

        # 1. Forward pass
        y_pred = model_0(X)

        # 2. Calculate loss (per batch)
        loss = loss_fn(y_pred, y)
        train_loss += loss  # accumulate the loss over the epoch

        # 3. Optimizer zero grad
        optimizer.zero_grad()

        # 4. Loss backward
        loss.backward()

        # 5. Optimizer step
        optimizer.step()

        # Print out how many samples have been seen
        if batch % 400 == 0:
            print(f"Looked at {batch * len(X)} / {len(train_dataloader.dataset)} samples")

    # Divide total train loss by length of train dataloader (average loss per batch per epoch)
    train_loss /= len(train_dataloader)

    ### Testing
    # Setup variables for accumulating loss and accuracy
    test_loss, test_acc = 0, 0
    model_0.eval()
    with torch.inference_mode():
        for X, y in test_dataloader:
            test_pred = model_0(X)
            test_loss += loss_fn(test_pred, y)  # accumulate the loss over the epoch
            test_acc += accuracy_fn(y_true=y, y_pred=test_pred.argmax(dim=1))

        # Calculations on test metrics happen inside torch.inference_mode()
        # Divide total test loss by length of test dataloader (average per batch)
        test_loss /= len(test_dataloader)
        # Divide total accuracy by length of test dataloader (average per batch)
        test_acc /= len(test_dataloader)

    print(f"\nTrain loss: {train_loss:.5f} | Test loss: {test_loss:.5f}, Test acc: {test_acc:.2f}%\n")

# Calculate training time
train_time_end_on_cpu = timer()
total_train_time_model_0 = print_train_time(start=train_time_start_on_cpu,
                                            end=train_time_end_on_cpu,
                                            device=str(next(model_0.parameters()).device))
```
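The loop itself follows the usual forward → loss → `zero_grad` → `backward` → `step` order, so a constant loss usually points at something outside it. One classic cause worth ruling out (sketched below with hypothetical names, since the model/optimizer code isn't shown): the optimizer was built from one model instance, but a *different* (e.g. re-created) instance is the one being trained, so `optimizer.step()` never touches the weights used in the forward pass:

```python
import torch
from torch import nn

torch.manual_seed(42)

# Hypothetical reproduction of a classic "loss never moves" cause:
# the optimizer is bound to one model instance...
old_model = nn.Linear(20, 10)
optimizer = torch.optim.SGD(old_model.parameters(), lr=0.1)

# ...but a *new* instance is created afterwards and used for training.
model = nn.Linear(20, 10)
loss_fn = nn.CrossEntropyLoss()

X, y = torch.randn(64, 20), torch.randint(0, 10, (64,))
losses = []
for _ in range(50):
    loss = loss_fn(model(X), y)
    losses.append(loss.item())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()  # updates old_model's weights, not model's

# The trained-on model never changes, so the loss is identical every step.
print(f"first: {losses[0]:.5f}, last: {losses[-1]:.5f}")
```

If re-running your model-creation cell after the optimizer cell (a common notebook slip) matches your situation, re-create the optimizer so it points at the current `model_0.parameters()`.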
My Model Class:
My optimizer & loss function:
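(The optimizer and loss-function code isn't shown above. Purely for reference, a working setup for a 10-class image model in this style typically looks like the sketch below; all names and the learning rate are illustrative, not the poster's actual code. Note that `lr=0`, or forgetting the parentheses on `parameters()`, are common causes of a loss that never moves.)

```python
import torch
from torch import nn

# Hypothetical setup for illustration -- not the poster's actual code.
model_0 = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
loss_fn = nn.CrossEntropyLoss()
# Make sure parameters() is *called* and lr is nonzero.
optimizer = torch.optim.SGD(params=model_0.parameters(), lr=0.1)

# Quick smoke test: loss should drop over a few steps, even on random data.
X, y = torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,))
start = loss_fn(model_0(X), y).item()
for _ in range(20):
    loss = loss_fn(model_0(X), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
print(loss.item() < start)  # should print True
```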
Training Code: