03 - Model v2 does not learn properly #1168

Closed
oabaig opened this issue Jan 15, 2025 · 4 comments

oabaig commented Jan 15, 2025

I keep getting a train accuracy of 10.00% and a test accuracy of 9.99%. I also tried the course's Google Colab notebook and got exactly the same numbers. I'm not sure what is going on, but something may have changed in the model or in PyTorch that affects the results.

Here is my current model v2:

class FashionMNISTModelV2(nn.Module):
    def __init__(self, input_shape: int, hidden_units: int, output_shape: int):
        super().__init__()
        self.conv_block_1 = nn.Sequential(
            nn.Conv2d(in_channels=input_shape, out_channels=hidden_units, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(in_channels=hidden_units, out_channels=hidden_units, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        self.conv_block_2 = nn.Sequential(
            nn.Conv2d(in_channels=hidden_units, out_channels=hidden_units, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(in_channels=hidden_units, out_channels=hidden_units, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_features=hidden_units*7*7, out_features=output_shape)
        )
    
    def forward(self, x):
        x = self.conv_block_1(x)
        # print(f"Output shape of conv block 1 {x.shape}")
        x = self.conv_block_2(x)
        # print(f"Output shape of conv block 2 {x.shape}")
        x = self.classifier(x)
        # print(f"Output shape of classifier {x.shape}")
        return x

To reiterate, I also ran the Google Colab notebook created by @mrdbourke and it gives me the same result.
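One thing worth ruling out with a model like this is a shape mismatch between the conv blocks and the `hidden_units*7*7` input of the classifier, which is what the commented-out `print` calls in `forward` are for. The sketch below is not the course code; it rebuilds the two conv blocks in a hypothetical helper `conv_block` and pushes a dummy tensor through them, assuming a single 28x28 grayscale input and `hidden_units=10`.

```python
import torch
from torch import nn

hidden_units = 10  # assumed value, matching the model_2 setup used in the course

def conv_block(in_channels: int, out_channels: int) -> nn.Sequential:
    # Hypothetical helper mirroring conv_block_1 / conv_block_2:
    # two 3x3 convs (padding=1 keeps height and width) and a 2x2 max pool
    # (halves height and width).
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=1, padding=1),
        nn.ReLU(),
        nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(kernel_size=2, stride=2),
    )

block_1 = conv_block(1, hidden_units)
block_2 = conv_block(hidden_units, hidden_units)

dummy = torch.randn(1, 1, 28, 28)  # assumed input: one 28x28 grayscale image
out_1 = block_1(dummy)
out_2 = block_2(out_1)
flat_features = out_2.flatten(start_dim=1).shape[1]

print(out_1.shape)  # torch.Size([1, 10, 14, 14])
print(out_2.shape)  # torch.Size([1, 10, 7, 7])
print(flat_features == hidden_units * 7 * 7)  # True
```

Since 28 is halved twice (28 to 14 to 7), the `7*7` factor is consistent with the model as posted, so the architecture itself does not appear to be the source of the 10% accuracy.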

@oabaig oabaig closed this as completed Jan 15, 2025

oabaig commented Jan 15, 2025

I closed this issue by mistake because I thought I had found the cause, but the problem persists.

This is the result of running block 7.4 of the course notebook 03_pytorch_computer_vision.ipynb:

image

@oabaig oabaig reopened this Jan 15, 2025

technOslerphile commented Jan 20, 2025

I had the same issue as @oabaig.

I solved it by making sure the loss function and optimizer are instantiated AFTER model_2. Because there was so much code, I had two model_2 instantiations in my notebook, one before the loss function and optimizer and one after. That most likely left the optimizer holding the parameters of the first instance while the training loop ran on the second, so the model being trained was never updated. Now I instantiate model_2 only once, before the loss function and optimizer (as in the lecture).
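The ordering problem described above can be reproduced in a few lines. This is a minimal sketch, not the course notebook: `make_model` is a hypothetical tiny stand-in for `FashionMNISTModelV2`, the batch is random data, and `one_step` is an illustrative helper that reports whether a single optimizer step changed the model's weights.

```python
import torch
from torch import nn

torch.manual_seed(42)

def make_model() -> nn.Module:
    # Hypothetical stand-in for FashionMNISTModelV2, kept tiny for illustration.
    return nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

# Buggy order: the optimizer is created for the first instance...
model_2 = make_model()
stale_optimizer = torch.optim.SGD(params=model_2.parameters(), lr=0.1)
# ...and then model_2 is instantiated again (e.g. by re-running an earlier cell).
model_2 = make_model()

loss_fn = nn.CrossEntropyLoss()
X = torch.randn(32, 1, 28, 28)    # fake image batch
y = torch.randint(0, 10, (32,))   # fake labels for 10 classes

def one_step(model: nn.Module, optimizer: torch.optim.Optimizer) -> bool:
    """Run one training step; return True if any weight of `model` changed."""
    before = [p.detach().clone() for p in model.parameters()]
    loss = loss_fn(model(X), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return any(not torch.equal(b, p.detach())
               for b, p in zip(before, model.parameters()))

# The stale optimizer only knows the parameters of the discarded first instance.
stale_changed = one_step(model_2, stale_optimizer)

# Fixed order: the optimizer is created after the model it should train.
fresh_optimizer = torch.optim.SGD(params=model_2.parameters(), lr=0.1)
fresh_changed = one_step(model_2, fresh_optimizer)

print(stale_changed, fresh_changed)  # False True
```

With the stale optimizer the weights never move, so the model keeps producing the predictions of its random initialization. That would be consistent with accuracy stuck near 10% on a 10-class dataset.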

import torch
from torch import nn

class FashionMNISTModelV2(nn.Module):
    def __init__(self, input_shape: int, hidden_units: int, output_shape: int):
        super().__init__()
        self.block_1 = nn.Sequential(
            nn.Conv2d(in_channels=input_shape,
                      out_channels=hidden_units,
                      kernel_size=3, # how big is the square that's going over the image?
                      stride=1, # default
                      padding=1),# options = "valid" (no padding) or "same" (output has same shape as input) or int for specific number
            nn.ReLU(),
            nn.Conv2d(in_channels=hidden_units,
                      out_channels=hidden_units,
                      kernel_size=3,
                      stride=1,
                      padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2,
                         stride=2) # default stride value is same as kernel_size
        )
        self.block_2 = nn.Sequential(
            nn.Conv2d(hidden_units, hidden_units, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(hidden_units, hidden_units, 3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            # Where did this in_features shape come from?
            # It's because each layer of our network compresses and changes the shape of our input data.
            nn.Linear(in_features=hidden_units*7*7,
                      out_features=output_shape)
        )

    def forward(self, x: torch.Tensor):
        x = self.block_1(x)
        #print(x.shape)
        x = self.block_2(x)
        #print(x.shape)
        x = self.classifier(x)
        #print(x.shape)
        return x

torch.manual_seed(42)
model_2 = FashionMNISTModelV2(input_shape=1, # number of color channels in the input image
    hidden_units=10,
    output_shape=len(class_names)).to(device)
model_2

# Setup loss and optimizer (after model_2 has been instantiated)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(params=model_2.parameters(),
                             lr=0.1)

torch.manual_seed(42)

# Measure time
from timeit import default_timer as timer
from tqdm.auto import tqdm # progress bar for the epoch loop below
train_time_start_model_2 = timer()

# Train and test model
epochs = 3
for epoch in tqdm(range(epochs)):
    print(f"Epoch: {epoch}\n---------")
    train_step(data_loader=train_dataloader,
        model=model_2,
        loss_fn=loss_fn,
        optimizer=optimizer,
        accuracy_fn=accuracy_fn,
        device=device
    )
    test_step(data_loader=test_dataloader,
        model=model_2,
        loss_fn=loss_fn,
        accuracy_fn=accuracy_fn,
        device=device
    )

train_time_end_model_2 = timer()
total_train_time_model_2 = print_train_time(start=train_time_start_model_2,
                                           end=train_time_end_model_2,
                                           device=device)

Here is the output, which looks fine to me:

Image
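A cheap guard against this kind of mismatch is to check, right before training, that the optimizer actually holds the parameters of the model being trained. The helper below, `optimizer_tracks_model`, is hypothetical (it is not part of PyTorch or the course code); it only relies on `optimizer.param_groups` and `model.parameters()`, and the two `nn.Linear` models are placeholders.

```python
import torch
from torch import nn

def optimizer_tracks_model(model: nn.Module,
                           optimizer: torch.optim.Optimizer) -> bool:
    """Return True if every trainable parameter of `model` is held by `optimizer`."""
    tracked = {id(p) for group in optimizer.param_groups for p in group["params"]}
    return all(id(p) in tracked for p in model.parameters() if p.requires_grad)

model_a = nn.Linear(4, 2)                      # placeholder model
optimizer = torch.optim.SGD(model_a.parameters(), lr=0.1)
model_b = nn.Linear(4, 2)                      # second instance, created after the optimizer

ok_a = optimizer_tracks_model(model_a, optimizer)
ok_b = optimizer_tracks_model(model_b, optimizer)
print(ok_a, ok_b)  # True False
```

In a notebook, something like `assert optimizer_tracks_model(model_2, optimizer)` placed just before the training loop would fail immediately if a cell had re-created `model_2` after the optimizer was built.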

oabaig commented Jan 20, 2025

Great, I'll try this out when I get the chance!

oabaig commented Jan 21, 2025

I can confirm that this works on my end.

@oabaig oabaig closed this as completed Jan 21, 2025