Description:
In the current experiment-tracking code in 07. PyTorch Experiment Tracking, writer.close() is misplaced: the SummaryWriter is closed inside the epoch loop, so logging stops prematurely and the logs are incomplete whenever training spans multiple epochs. writer.close() should be called only once, after all epochs have finished, not after each epoch.
Code Reference:
### New: Use the writer parameter to track experiments ###
# See if there's a writer, if so, log to it
if writer:
    # Add results to SummaryWriter
    writer.add_scalars(main_tag="Loss",
                       tag_scalar_dict={"train_loss": train_loss,
                                        "test_loss": test_loss},
                       global_step=epoch)
    writer.add_scalars(main_tag="Accuracy",
                       tag_scalar_dict={"train_acc": train_acc,
                                        "test_acc": test_acc},
                       global_step=epoch)

    # Close the writer
    writer.close()  # This line causes the issue: it runs on every epoch
else:
    pass
### End new ###
Proposed Solution:
Move the writer.close() statement outside the training loop so that it is called only once, after all epochs have completed.
Expected Behavior:
The SummaryWriter should continue logging across all epochs.
Only after the full training process is complete, should the writer be closed.
Steps to Reproduce:
1. Implement the current code where writer.close() is inside the loop.
2. Run a training process for multiple epochs.
3. Notice that logging stops after the first epoch because the writer is closed too early.
Suggested Fix:
# After the training loop, close the writer once
if writer:
    writer.close()
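The corrected control flow can be demonstrated with a short, self-contained sketch. Note that StubWriter below is a hypothetical stand-in for torch.utils.tensorboard.SummaryWriter (used so the example runs without TensorBoard installed), and the train() loop with placeholder metrics is illustrative, not the notebook's actual training function:

```python
# StubWriter: hypothetical stand-in mimicking the SummaryWriter methods
# used here (add_scalars, close), so the fix can be shown in isolation.
class StubWriter:
    def __init__(self):
        self.closed = False
        self.logged_steps = []

    def add_scalars(self, main_tag, tag_scalar_dict, global_step):
        if self.closed:
            # A closed writer can no longer log -- this is the reported bug
            raise RuntimeError("writer is closed; logging stopped early")
        self.logged_steps.append(global_step)

    def close(self):
        self.closed = True


def train(epochs, writer=None):
    for epoch in range(epochs):
        train_loss, test_loss = 0.1, 0.2  # placeholder metrics

        # Log every epoch while the writer is still open
        if writer:
            writer.add_scalars(main_tag="Loss",
                               tag_scalar_dict={"train_loss": train_loss,
                                                "test_loss": test_loss},
                               global_step=epoch)

    # Close the writer once, after ALL epochs have finished
    if writer:
        writer.close()


writer = StubWriter()
train(epochs=3, writer=writer)
print(writer.logged_steps)  # all 3 epochs logged before the writer closed
```

With close() inside the loop, the second epoch's add_scalars call would hit an already-closed writer; moving it after the loop lets every epoch log.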
Let me know if you need any further clarifications!