PyTorch Fundamentals - Weeks 1, 2, & 3
Brushing up on my PyTorch skills every week. Starting from scratch. Not in a hurry. The goal is to follow along with TorchLeet and work up to karpathy/nanoGPT or karpathy/nanochat. A summary of the first three weeks.
Week 1
- Create a Linear Regression model.
  - `torch.nn.Linear` to define a learnable model. `forward()` for the forward pass. `model.parameters()` contains all the learned weights and is also what gets passed to the optimizer (`SGD`, `Adam`, etc.).
  - Use `torch.no_grad()` during inference.
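A minimal sketch of the whole loop, with toy data and hyperparameters of my own (not from the TorchLeet exercise):

```python
import torch
import torch.nn as nn

model = nn.Linear(1, 1)  # one input feature -> one output
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Toy data: y = 2x + 1 plus a little noise.
X = torch.randn(100, 1)
y = 2 * X + 1 + 0.1 * torch.randn(100, 1)

for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)  # forward pass + loss
    loss.backward()              # compute gradients
    optimizer.step()             # update the weights

# Inference: gradients aren't needed, so wrap it in no_grad().
with torch.no_grad():
    print(model(torch.tensor([[3.0]])))  # should be close to 7
```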
- Send the training logs to TensorBoard. (not a part of the TorchLeet repo)
  - `SummaryWriter` from `torch.utils.tensorboard`. TensorFlow can directly use a callback inside the `fit` function to push all the relevant logs; the `SummaryWriter` instead gives fine-grained control to log anything. `add_scalar()` to log the training loss. `add_graph()` to log the graph itself.
  - Load the Jupyter tensorboard extension so that we don't have to leave the notebook to look at the logs and pretty plots: `%load_ext tensorboard`
  - Load the tensorboard UI inside the notebook: `%tensorboard --logdir PATH_TO_LOG_DIR`
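Roughly what the logging code looks like; the log directory and tag names here are my own choices:

```python
import torch
import torch.nn as nn
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/linear_regression")

model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()
X = torch.randn(100, 1)
y = 2 * X + 1

for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
    writer.add_scalar("Loss/train", loss.item(), epoch)  # one scalar per epoch

writer.add_graph(model, X)  # log the model graph with a sample input
writer.close()
```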
Week 2
- Create a Dataset
  - `Dataset` class from `torch.utils.data`.
  - Create a subclass of `Dataset` for my specific dataset. Added `data`, `X` and `y` attributes to the class.
  - Since we will iterate through the rows of this dataset, defined the `__len__` and `__getitem__` functions. These overloaded functions enable code like `len(dataset)` and `dataset[i]`, respectively.
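A sketch of such a subclass; the class name and toy data are my own:

```python
import torch
from torch.utils.data import Dataset

class RegressionDataset(Dataset):
    def __init__(self, X, y):
        self.data = (X, y)  # keep the raw tensors around
        self.X = X
        self.y = y

    def __len__(self):
        return len(self.X)           # enables len(dataset)

    def __getitem__(self, i):
        return self.X[i], self.y[i]  # enables dataset[i]

X = torch.randn(100, 1, dtype=torch.float32)
y = 2 * X + 1
dataset = RegressionDataset(X, y)
print(len(dataset), dataset[0])
```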
- DataLoader
  - `Dataset` only defines the dataset. `DataLoader` from `torch.utils.data` creates an iterator over it. It also brings other capabilities like batching and shuffling. E.g.: `dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)`
  - We can run a `for` loop on this `dataloader` now.
  - Good intro to the topic from PyTorch - Datasets and DataLoaders.
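A short sketch of the iteration; I'm using the built-in `TensorDataset` as a stand-in for the custom class above:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

X = torch.randn(100, 1)
y = 2 * X + 1
dataset = TensorDataset(X, y)  # stand-in for the custom Dataset above

dataloader = DataLoader(dataset, batch_size=16, shuffle=True)

for X_batch, y_batch in dataloader:
    # each iteration yields one shuffled batch of 16 rows
    print(X_batch.shape, y_batch.shape)
    break
```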
- Trained the Linear Regression model using the dataloader. Faced some issues due to `dtype` mismatches - used `torch.float32` everywhere to fix it (a minimal illustration after this list).
- The exercise only asked for a single-column dataset. Played around with a dataset with multiple columns.
- Used TensorBoard for all the logging.
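On the `dtype` issue: it typically shows up when data arrives as float64 (NumPy's default) while `nn.Linear` weights are float32. A minimal illustration of the fix, assuming that was the cause here:

```python
import numpy as np
import torch

X = torch.from_numpy(np.array([[1.0], [2.0]]))  # float64 by default
# nn.Linear weights are float32, so mixing the two raises a dtype error.
X = X.to(torch.float32)  # cast everything to float32 up front
print(X.dtype)  # torch.float32
```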
Week 3
- Two types of activation functions – with learnable parameters and without.
- An activation function with learnable parameters requires an `nn.Module` subclass. The subclass is what handles the gradient calculations through the `forward` and `backward` passes and holds the final trained weights.
- Created a custom activation without learnable parameters: \(\tanh(x) + x\).
- Updated the Linear Regression class to have the final output go through \(\tanh(x) + x\) using `torch.tanh()`: `return self.custom_activation(self.linear(x))` (sketch below).
- This SO answer talks about how to write a custom activation function in different scenarios: non-learnable, learnable, learnable with PyTorch functions, and learnable without PyTorch functions.
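Roughly how the updated class might look; the class name and sizes are my own:

```python
import torch
import torch.nn as nn

class LinearRegression(nn.Module):
    def __init__(self, in_features=1):
        super().__init__()
        self.linear = nn.Linear(in_features, 1)

    def custom_activation(self, x):
        return torch.tanh(x) + x  # non-learnable: tanh(x) + x

    def forward(self, x):
        return self.custom_activation(self.linear(x))

model = LinearRegression()
print(model(torch.randn(4, 1)))
```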
- Also learned about `torch.nn.Parameter` and the (now deprecated) `torch.autograd.Variable` (sketch below).
- Kept using TensorBoard for all the logging.
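And for the learnable case, a quick sketch of how `nn.Parameter` registers a weight with autograd; the scaled activation \(\tanh(a \cdot x) + x\) is my own toy example:

```python
import torch
import torch.nn as nn

class LearnableTanh(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.Parameter registers the tensor so it appears in
        # model.parameters() and receives gradients during backward().
        self.a = nn.Parameter(torch.ones(1))

    def forward(self, x):
        return torch.tanh(self.a * x) + x

act = LearnableTanh()
print(list(act.parameters()))  # contains the learnable scale `a`
```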
Next 2 weeks: Custom Loss Function (Huber Loss) and Deep Neural Network