Incorrect Placement of writer.close() in PyTorch Experiment Tracking #1124

Open
@Maysixi

Description:

The experiment-tracking code in 07. PyTorch Experiment Tracking calls writer.close() in the wrong place. The call sits inside the epoch loop, so the SummaryWriter is closed at the end of every epoch instead of once at the end of training. This causes logging to stop prematurely and leaves the logs incomplete when training spans multiple epochs. writer.close() should be called only after all epochs have finished.

Code Reference:

### New: Use the writer parameter to track experiments ###
# See if there's a writer, if so, log to it
if writer:
    # Add results to SummaryWriter
    writer.add_scalars(main_tag="Loss", 
                       tag_scalar_dict={"train_loss": train_loss,
                                        "test_loss": test_loss},
                       global_step=epoch)
    writer.add_scalars(main_tag="Accuracy", 
                       tag_scalar_dict={"train_acc": train_acc,
                                        "test_acc": test_acc}, 
                       global_step=epoch)

    # Close the writer
    writer.close()  # This line causes the issue
else:
    pass
### End new ###

Proposed Solution:

Move the writer.close() statement outside the training loop, so that it is only called once after all epochs have been completed.

Expected Behavior:

The SummaryWriter should continue logging across all epochs.
The writer should be closed only after the full training process is complete.

Steps to Reproduce:

Implement the current code, where writer.close() is inside the epoch loop.
Run a training process for multiple epochs.
Notice that logging stops after the first epoch because the writer is closed too early.

Suggested Fix:

# After training loop, close the writer
if writer:
    writer.close()
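
For context, here is a minimal, self-contained sketch of where the close call would sit relative to the epoch loop. It is not the course's actual train() function: RecordingWriter is a hypothetical stand-in that only mimics the add_scalars() and close() calls used above, the metrics are placeholder numbers, and the try/finally is an extra assumption so the writer is also closed if training raises.

```python
from typing import Dict, List, Optional, Tuple


class RecordingWriter:
    """Hypothetical stand-in that mimics the two SummaryWriter calls used here."""

    def __init__(self) -> None:
        self.records: List[Tuple[str, Dict[str, float], Optional[int]]] = []
        self.close_calls = 0

    def add_scalars(self, main_tag, tag_scalar_dict, global_step=None):
        # Store what would have been logged so it can be inspected later
        self.records.append((main_tag, dict(tag_scalar_dict), global_step))

    def close(self):
        self.close_calls += 1


def train(epochs: int, writer=None) -> Dict[str, List[float]]:
    """Simplified loop: log every epoch, close the writer once at the end."""
    results: Dict[str, List[float]] = {"train_loss": [], "test_loss": []}
    try:
        for epoch in range(epochs):
            # Placeholder metrics; a real loop would run the train/test steps here
            train_loss = 1.0 / (epoch + 1)
            test_loss = 1.2 / (epoch + 1)
            results["train_loss"].append(train_loss)
            results["test_loss"].append(test_loss)

            if writer is not None:
                writer.add_scalars(main_tag="Loss",
                                   tag_scalar_dict={"train_loss": train_loss,
                                                    "test_loss": test_loss},
                                   global_step=epoch)
                # No writer.close() here: the writer stays open between epochs
    finally:
        # Runs once, after the last epoch (or if an exception interrupts training)
        if writer is not None:
            writer.close()
    return results


if __name__ == "__main__":
    demo_writer = RecordingWriter()
    train(epochs=3, writer=demo_writer)
    print(len(demo_writer.records), demo_writer.close_calls)
```

With the real torch.utils.tensorboard.SummaryWriter, the same structure should apply by passing the writer instance in place of RecordingWriter.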

Let me know if you need any further clarifications!
