Deep Learning Notebook — CIFAR-10 with ResNet, Transfer Learning & SupCon

Train ResNet18 from scratch on CIFAR-10, apply transfer learning with frozen ImageNet weights, implement Supervised Contrastive Learning, and analyze with confusion matrices.

Overview

This chapter implements and trains ResNet-based classifiers on CIFAR-10, explores transfer learning with frozen ImageNet features, and applies supervised contrastive learning with a linear evaluation protocol.

You Will Learn

  • Training ResNet18 from scratch on CIFAR-10 with data augmentation
  • Adapting a pretrained ResNet to CIFAR-10 via feature extraction and fine-tuning
  • Implementing supervised contrastive loss in PyTorch
  • Comparing confusion matrices across different training regimes

Main Content

Training ResNet18 from Scratch

You define data pipelines for CIFAR-10 with standard augmentations (random crop with padding, horizontal flip, normalisation) and train ResNet18 using SGD with momentum and a learning rate schedule. Tracking training and test accuracy across epochs reveals typical deep learning behaviour: rapid initial improvements, slower later gains, and potential overfitting if training continues too long without regularisation.

Transfer Learning on CIFAR-10

You compare training from scratch with using an ImageNet-pretrained ResNet18 as a fixed feature extractor (only training the final linear layer) and with full fine-tuning. Feature extraction is fast and competitive; fine-tuning can further improve performance when training data is sufficient, illustrating the power and flexibility of transfer learning.

Supervised Contrastive Training and Linear Evaluation

You implement supervised contrastive loss by drawing multiple augmented views of each image in a batch, encoding them, and computing pairwise similarities in the embedding space. The loss pulls together embeddings that share a label and pushes apart embeddings of different classes. After training the encoder, you freeze it and train a linear classifier on top of the embeddings, comparing performance to standard cross-entropy baselines.
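A compact SupCon loss in the spirit of Khosla et al. can be written as below. It assumes all augmented views are stacked into one batch with labels repeated accordingly; the function name and the temperature default are illustrative.

```python
import torch
import torch.nn.functional as F

def supcon_loss(z: torch.Tensor, labels: torch.Tensor,
                temperature: float = 0.1) -> torch.Tensor:
    """Supervised contrastive loss over a batch of embeddings.

    z: (N, D) embeddings, all augmented views stacked together.
    labels: (N,) class labels; views of the same image share a label.
    """
    z = F.normalize(z, dim=1)                       # cosine similarities below
    sim = z @ z.T / temperature                     # (N, N) pairwise logits
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf")) # exclude self-similarity
    # Positives: same label, excluding the anchor itself.
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # Average log-probability over each anchor's positives, then over anchors.
    pos_log_prob = torch.where(pos_mask, log_prob, torch.zeros_like(log_prob))
    mean_log_prob_pos = pos_log_prob.sum(1) / pos_mask.sum(1).clamp(min=1)
    return -mean_log_prob_pos.mean()
```

Well-clustered embeddings (same-class views close, different classes far) give a low loss; mixed-up embeddings give a high one.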

Analysing Confusion Matrices

For each training regime (from scratch, feature extraction, fine-tuning, SupCon + linear head) you compute confusion matrices. These reveal which class pairs (e.g., cat vs dog, truck vs automobile) are systematically confused and how different approaches shift these patterns. This analysis complements scalar metrics like accuracy and can guide targeted data augmentation or model changes.
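A confusion matrix can be accumulated directly in PyTorch (the helper name is ours); rows index the true class, columns the predicted class:

```python
import torch

def confusion_matrix(preds: torch.Tensor, targets: torch.Tensor,
                     num_classes: int) -> torch.Tensor:
    """Count (true, predicted) pairs; rows are true classes, columns predictions."""
    # Flatten each (true, pred) pair into a single index, then count occurrences.
    idx = targets * num_classes + preds
    return torch.bincount(idx, minlength=num_classes ** 2).reshape(
        num_classes, num_classes
    )
```

Per-class accuracy is then `cm.diag() / cm.sum(1)`, and off-diagonal entries such as `cm[cat, dog]` quantify exactly the systematic confusions discussed above; comparing these entries across training regimes shows which pairs each approach improves.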

Examples

Two-Head SupCon Model Sketch

Encoder + projection head for supervised contrastive learning.

import torch.nn as nn
import torch.nn.functional as F

class SupConModel(nn.Module):
    """Encoder plus projection head; forward returns L2-normalised embeddings."""

    def __init__(self, base_encoder: nn.Module, emb_dim: int = 128):
        super().__init__()
        # e.g. a ResNet18 with its final fc replaced by nn.Identity(),
        # so the encoder outputs 512-d pooled features.
        self.encoder = base_encoder
        self.projection = nn.Sequential(
            nn.Linear(512, 512), nn.ReLU(inplace=True), nn.Linear(512, emb_dim)
        )

    def forward(self, x):
        feats = self.encoder(x)       # (N, 512) backbone features
        z = self.projection(feats)    # (N, emb_dim) projected embeddings
        return F.normalize(z, dim=1)  # unit norm, ready for the contrastive loss

Common Mistakes

Neglecting normalisation and augmentation consistent with pretrained weights

Why: Pretrained models expect inputs preprocessed in specific ways; mismatches degrade transfer performance.

Fix: Use the normalisation parameters and preprocessing pipeline recommended for the pretrained model.

Implementing contrastive loss without careful temperature scaling or normalisation

Why: Unnormalised embeddings and poor temperature choices can slow training and lead to collapsed representations.

Fix: L2-normalise embeddings and tune the temperature hyperparameter based on validation metrics.
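Both fixes are one-liners to check in isolation; the similarity values and temperatures below are made up purely to show the effect:

```python
import torch
import torch.nn.functional as F

# L2-normalisation: every embedding ends up on the unit sphere.
z = F.normalize(torch.randn(4, 128), dim=1)

# Temperature controls how peaked the similarity distribution is.
sims = torch.tensor([2.0, 1.0, 0.5])     # example anchor-to-candidate similarities
sharp = F.softmax(sims / 0.07, dim=0)    # low temperature: near one-hot
soft = F.softmax(sims / 1.0, dim=0)      # high temperature: much flatter
```

A temperature that is too high blurs the distinction between positives and negatives; one that is too low makes gradients concentrate on a few hard pairs, so it is worth sweeping on a validation set.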

Mini Exercises

1. Compare training time and accuracy for (a) ResNet18 from scratch, (b) feature extraction with frozen backbone, and (c) fine-tuning the whole network on CIFAR-10.

2. Visualise embeddings from a SupCon-trained encoder using t-SNE or UMAP. How do class clusters compare to embeddings from a cross-entropy-trained model?

Further Reading