Brushing up on my PyTorch skills every week. Starting from scratch. Not in a hurry. The goal is to follow along with TorchLeet and work up to karpathy/nanoGPT or karpathy/nanochat. Previously,

  1. PyTorch Fundamentals - Week 4
  2. PyTorch Fundamentals - Week 1, 2, & 3

Now, a summary of week 4.

  • Linear Regression through a custom DNN with a non-linearity
    • An nn.Module subclass that used an nn.Sequential for the dense layers.
        dense = nn.Sequential(
            nn.Linear(input_dim, dense_dims[0]),
            nn.ReLU(),
            nn.Linear(*dense_dims),  # unpacks to nn.Linear(dense_dims[0], dense_dims[1])
            nn.ReLU(),
            nn.Linear(dense_dims[1], output_dim),
        )
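The snippet above shows only the Sequential; a minimal sketch of what the full nn.Module subclass could look like (class name and dimensions are my assumptions, not the original code):

```python
import torch
import torch.nn as nn

class DenseRegressor(nn.Module):
    """Hypothetical wrapper module around the Sequential above."""

    def __init__(self, input_dim, dense_dims, output_dim):
        super().__init__()
        self.dense = nn.Sequential(
            nn.Linear(input_dim, dense_dims[0]),
            nn.ReLU(),
            nn.Linear(*dense_dims),  # hidden1 -> hidden2
            nn.ReLU(),
            nn.Linear(dense_dims[1], output_dim),
        )

    def forward(self, x):
        return self.dense(x)

model = DenseRegressor(input_dim=1, dense_dims=(3, 7), output_dim=1)
out = model(torch.randn(5, 1))
print(out.shape)  # torch.Size([5, 1])
```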
      
  • Learnt about nn.Sequential vs nn.Module.
    • nn.Sequential is a subclass of nn.Module.
    • It’s a convenient way to chain modules.
    • Think: replacing self.l1(self.l2(self.l3(x))) with self.layer123(x), where layer123 is an nn.Sequential of l3, l2, and l1 — in that order, since l3 runs first.
    • Also read: What is difference between nn.Module and nn.Sequential
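The equivalence is easy to check directly (layer names and sizes below are made up for illustration):

```python
import torch
import torch.nn as nn

x = torch.randn(4, 10)

# Three standalone layers, applied innermost-first: l3, then l2, then l1.
l3, l2, l1 = nn.Linear(10, 8), nn.Linear(8, 6), nn.Linear(6, 2)
out_chained = l1(l2(l3(x)))

# The same computation as one Sequential; note the Sequential lists the
# modules in execution order, not in the order they appear in the chained call.
layer123 = nn.Sequential(l3, l2, l1)
out_seq = layer123(x)

assert torch.allclose(out_chained, out_seq)
```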
  • Tried three different dense-layer hidden-unit configs with each of the loss functions below:
    1. nn.MSELoss()
    2. HuberLoss() - created last week
    3. nn.L1Loss()

    Dense layer hidden unit variants: (3, 7), (4, 8), (5, 9).
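The custom HuberLoss from last week isn't reproduced here; a minimal sketch of what such a module might look like (my re-implementation — PyTorch also ships a built-in nn.HuberLoss):

```python
import torch
import torch.nn as nn

class HuberLoss(nn.Module):
    """Quadratic below delta, linear above — less outlier-sensitive than MSE."""

    def __init__(self, delta=1.0):
        super().__init__()
        self.delta = delta

    def forward(self, pred, target):
        err = pred - target
        abs_err = err.abs()
        quad = 0.5 * err ** 2
        lin = self.delta * (abs_err - 0.5 * self.delta)
        return torch.where(abs_err <= self.delta, quad, lin).mean()

loss_fn = HuberLoss(delta=1.0)
loss = loss_fn(torch.tensor([0.5, 3.0]), torch.tensor([0.0, 0.0]))
# errors 0.5 (quadratic: 0.125) and 3.0 (linear: 2.5) -> mean 1.3125
```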

  • Huber Loss blew MSE Loss and L1 Loss out of the water. (Figure from TensorBoard.)
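The loss-function-by-config comparison can be sketched as a small grid loop. Everything below (toy data, optimizer, learning rate, step count) is my assumption, and the built-in nn.HuberLoss stands in for the custom one:

```python
import itertools
import torch
import torch.nn as nn

def make_model(input_dim, dense_dims, output_dim):
    # dense_dims is a (hidden1, hidden2) pair, e.g. (3, 7)
    return nn.Sequential(
        nn.Linear(input_dim, dense_dims[0]),
        nn.ReLU(),
        nn.Linear(*dense_dims),
        nn.ReLU(),
        nn.Linear(dense_dims[1], output_dim),
    )

# toy regression data: y = 3x + 2 + noise
X = torch.linspace(-1, 1, 128).unsqueeze(1)
y = 3 * X + 2 + 0.1 * torch.randn_like(X)

losses = {
    "mse": nn.MSELoss(),
    "huber": nn.HuberLoss(),  # stand-in for last week's custom HuberLoss
    "l1": nn.L1Loss(),
}
variants = [(3, 7), (4, 8), (5, 9)]

for (name, loss_fn), dims in itertools.product(losses.items(), variants):
    model = make_model(1, dims, 1)
    opt = torch.optim.SGD(model.parameters(), lr=0.05)
    for _ in range(200):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
    print(f"{name} {dims}: final loss {loss.item():.4f}")
```

In a real run each combination would log to its own TensorBoard run directory so the curves can be compared side by side.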
  • TensorBoard at work
    • I was running a remote training pipeline with TensorBoard logging. This pipeline dumps the logs to S3.
    • The pipeline was failing with the following error: File system scheme 's3' not implemented.
    • Solution:
      • After multiple [add + commit + push + run job] loops, found part of the solution: pip install tensorflow-io. But that alone isn't enough.
      • The very last comment on the TensorBoard GitHub issue gives the final trick: add import tensorflow_io to the pipeline even if it isn't used anywhere.
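Putting the two pieces together, a sketch of the fix (the bucket path is a placeholder; my understanding is that importing tensorflow_io registers the S3 filesystem plugin with TensorFlow's file layer, which TensorBoard uses under the hood):

```python
# Side-effect import only: registers the s3:// filesystem plugin.
# Requires: pip install tensorflow-io
try:
    import tensorflow_io  # noqa: F401
except ImportError:
    tensorflow_io = None  # plugin missing; fall back to a local log dir

from torch.utils.tensorboard import SummaryWriter

# "s3://my-bucket/runs/exp1" is a hypothetical path for illustration.
log_dir = "s3://my-bucket/runs/exp1" if tensorflow_io else "./runs/exp1"
writer = SummaryWriter(log_dir=log_dir)
writer.add_scalar("loss/train", 0.5, global_step=0)
writer.close()
```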