<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://trigonaminima.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://trigonaminima.github.io/" rel="alternate" type="text/html" /><updated>2026-03-08T11:29:02+00:00</updated><id>https://trigonaminima.github.io/feed.xml</id><title type="html">Playground</title><subtitle></subtitle><author><name>Shivam Rana</name></author><entry><title type="html">Move US Stocks from INDMoney to Interactive Brokers</title><link href="https://trigonaminima.github.io/2026/03/indmoney-to-ibkr/" rel="alternate" type="text/html" title="Move US Stocks from INDMoney to Interactive Brokers" /><published>2026-03-08T00:00:00+00:00</published><updated>2026-03-08T00:00:00+00:00</updated><id>https://trigonaminima.github.io/2026/03/indmoney-to-ibkr</id><content type="html" xml:base="https://trigonaminima.github.io/2026/03/indmoney-to-ibkr/"><![CDATA[<p>I wanted to move my US stocks (securities) from INDMoney to Interactive Brokers. I didn’t find any clear steps documented anywhere. It took me multiple attempts to finally be able to move over my portfolio.</p>

<p>This post is divided into two parts: steps followed and FAQs.</p>

<h2 id="steps">Steps</h2>

<p>Absolute Pre-requisites</p>

<ol>
  <li>Have an active account on IBKR. That means the account is verified and ready to buy stocks.</li>
  <li>Make your fractional shares whole on INDMoney. That means either reducing 1.6 units to 1 unit or topping up to 2 units.</li>
  <li>Keep at least <em>USD 70</em> in the INDMoney wallet.</li>
</ol>

<p>Now, here are the steps to transfer:</p>

<ol>
  <li>Log into IBKR.</li>
  <li>In the menu, go to <code class="language-plaintext highlighter-rouge">Transfers (Deposit, Withdraw and Transfer History)</code> (or whatever they are calling it now).</li>
  <li>Go to <code class="language-plaintext highlighter-rouge">Transfer Positions</code>.</li>
  <li>Select <code class="language-plaintext highlighter-rouge">Incoming</code>.</li>
  <li>Select the region as <code class="language-plaintext highlighter-rouge">United States</code>, since DriveWealth, the broker used by INDMoney, is based in the US.</li>
  <li>Select <code class="language-plaintext highlighter-rouge">ACATS</code> as the transfer method (usually the first option and the most convenient).</li>
  <li>Choose <code class="language-plaintext highlighter-rouge">DriveWealth</code> in the broker dropdown.</li>
  <li>My account number looked like: <code class="language-plaintext highlighter-rouge">IFSC-XXX-&lt;Account number on the INDMoney app&gt;</code> (17 characters). The account number is the crucial part; one of my requests failed because of it. INDMoney doesn’t make it easy to find your account number. DON’T use the one shown on the app. Follow: <code class="language-plaintext highlighter-rouge">US Stocks Tab</code> → <code class="language-plaintext highlighter-rouge">Manage</code> → <code class="language-plaintext highlighter-rouge">US Stocks Reports</code> → <code class="language-plaintext highlighter-rouge">US Trades Report</code> to find your account number.</li>
  <li>Account Title and Tax Identification Number are pre-selected. Choose your Account Type. Mine was <code class="language-plaintext highlighter-rouge">Individual</code>.</li>
  <li>Decide between a <code class="language-plaintext highlighter-rouge">FULL ACATS</code> and a <code class="language-plaintext highlighter-rouge">PARTIAL ACATS</code> transfer. I did FULL ACATS. You can’t have any fractional shares for a full transfer. Refer to the FAQs for more details.</li>
  <li>Give IBKR authorisation to take appropriate actions when the positions you are transferring are not products IBKR supports. IBKR details what you are authorising them to do. I selected <code class="language-plaintext highlighter-rouge">Yes</code> for everything.</li>
  <li>Verify your details. Sign and submit.</li>
</ol>

<p>IBKR will validate and confirm the request, submit it to DriveWealth, and receive the stocks. You will get an email from IBKR about the completion or failure.</p>

<h2 id="faqs">FAQs</h2>

<p><strong>Q:</strong> What are the tax implications of the transfer process?<br />
<strong>A:</strong> My research told me that transferring stocks incurs <em>no capital gains tax</em>, as it’s an in-kind broker-to-broker transfer, not a sale. Verify with your CA. Regardless, maintain your transaction records from the earlier platform for tax filing purposes.</p>

<p><strong>Q:</strong> Who is INDMoney’s broker?<br />
<strong>A:</strong> DriveWealth</p>

<p><strong>Q:</strong> Which region should I select for the transfer?<br />
<strong>A:</strong> DriveWealth is a US broker, so select United States of America.</p>

<p><strong>Q:</strong> What is incoming ACATS and outgoing ACATS?<br />
<strong>A:</strong> Since I wanted to get the securities out of DriveWealth, it is going to be an <em>outgoing ACATS</em> for DriveWealth and <em>incoming ACATS</em> for Interactive Brokers.</p>

<p><strong>Q:</strong> Who initiates the ACATS (Automated Customer Account Transfer Service) transfer request?<br />
<strong>A:</strong> Always the receiving broker. So, there is no need to interact with INDMoney unless there is some snag in the process and you need help.</p>

<p><strong>Q:</strong> Is DriveWealth (INDMoney’s underlying broker) ACATS transfer enabled?<br />
<strong>A:</strong> Yes, I confirmed this with INDMoney.</p>

<p><strong>Q:</strong> Does DriveWealth charge a fee for the (outbound) transfer?<br />
<strong>A:</strong> Yes, I was charged <em>USD 65</em>. I confirmed this with an INDMoney customer support request. Their explanation:</p>

<blockquote>
  <p>For outgoing ACATS transfers, a fee is typically applied by your U.S. broker. This charge is levied by the broker currently holding your assets to facilitate the transfer out of their system.</p>
</blockquote>

<p><strong>Q:</strong> How do I pay for the DriveWealth fee?<br />
<strong>A:</strong> Keep at least USD 70 in the INDMoney wallet.</p>

<p><strong>Q:</strong> Does IBKR charge a fee for the (inbound) transfer?<br />
<strong>A:</strong> I was <em>not</em> charged anything.</p>

<p><strong>Q:</strong> Is there a penalty for transfer failures due to any errors?<br />
<strong>A:</strong> I was <em>not</em> charged anything. I transferred successfully on my third attempt.</p>

<p><strong>Q:</strong> How long does it take to complete the transfer after submission?<br />
<strong>A:</strong> It took 4 working days from submission to completion for me. Under standard conditions, ACATS transfers to IBKR complete in about 4–8 business days after submission.</p>

<blockquote>
  <p>Gemini says that some industry sources note 3–5 business days in smooth cases. </p>
</blockquote>

<p><strong>Q:</strong> Can I trade on INDMoney during my transfer process?<br />
<strong>A:</strong> I avoided it after I raised my transfer request. INDMoney’s response:</p>

<blockquote>
  <p>During the outgoing ACATS transfer, trading activity (including deposits, buys, and sells) will be temporarily restricted to ensure the process completes without errors.</p>
</blockquote>

<p><strong>Q:</strong> Can I trade on IBKR during my transfer process?<br />
<strong>A:</strong> Yes.</p>

<p><strong>Q:</strong> How does it work if the residency status has changed - INDMoney app having residency A and IBKR having residency B?<br />
<strong>A:</strong> Many people on this <a href="https://www.reddit.com/r/INDmoneyApp/comments/1o1wxdj/how_to_handle_us_stock_investments_after_becoming/">reddit thread</a> seemed to think it would work. I tried, and it worked for me.</p>

<p><strong>Q:</strong> My address is different on INDMoney and IBKR. Will it work?<br />
<strong>A:</strong> It worked for me. According to Gemini, the ACATS system relies on an <strong>exact match</strong> of key identifying information between the delivering account (INDmoney’s partner broker) and the receiving account (IBKR) to prevent unauthorized transfers. The key matching criteria are:</p>

<ul>
  <li>Account Title/Registration: Your name must be identical on both accounts.</li>
  <li>Account Type: (e.g., Individual, Joint, etc.) must match.</li>
  <li>Tax ID: (e.g., Social Security Number, or other tax identification used) must match.</li>
</ul>

<p>I supplied all three, so I guess that’s enough. I think it would have mattered if my country of residence had been the US, because then my Tax ID would have changed.</p>

<p><strong>Q:</strong> How do I track my transfer?<br />
<strong>A:</strong> INDMoney doesn’t tell you anything about the transfer or its status. You will not even get a notification/email after the transfer is complete. The only place to track the transfer is on IBKR: follow the same steps that you followed to initiate the transfer, and IBKR will guide you to the status window.</p>

<p><strong>Q:</strong> How are fractional shares handled?<br />
<strong>A:</strong> This took a lot of research. Everyone online mentioned that you can’t transfer fractional shares. INDMoney also confirmed this:</p>

<blockquote>
  <p>Please note that fractional shares cannot be transferred. These may be liquidated as part of the transfer process.</p>
</blockquote>

<p>So this was clear. What was not clear was: do I need to make them whole, or will it only transfer the complete units and leave the fractional units in INDMoney? The latter part of INDMoney’s response gave the impression that it would handle the liquidation on its own. This assumption was incorrect. My request was rejected because of fractional shares.</p>

<p>So, as I mentioned in the pre-requisites, make your fractional shares whole or exit entirely. That is, turn 3.55 units of NVDA into 3 or 4 or 0 units of NVDA.</p>

<p><strong>Q:</strong> Can I use partial transfer to avoid selling my fractional shares?<br />
<strong>A:</strong> I was always aiming for a full transfer. When the request was rejected due to fractional shares, I contemplated using a partial transfer to avoid exiting. I was avoiding selling for two reasons:</p>

<ul>
  <li>I wasn’t sure if selling would lead to any tax implications;</li>
  <li>Some of my stocks were bought at very attractive prices. I wanted to protect my average price.</li>
</ul>

<p>After some thought, I decided to make them whole. Exited some of the fractional shares and bought some of the others (when there was a dip).</p>

<p><strong>Q:</strong> Does the ACATS transfer also take care of transferring my buy/sell history, average price, and other associated history?<br />
<strong>A:</strong> I’m not sure what is supposed to be transferred. All I got was my average price. All the past buy/sell history was gone, and with it my CAGR-type metrics.</p>

<p><strong>Q:</strong> What resources did you refer to understand the process?<br />
<strong>A:</strong> Here are some links:</p>

<ul>
  <li><a href="https://paasa.com/blog/transfer-indmoney-to-paasa">Paasa Blog</a></li>
  <li><a href="https://support.vestedfinance.com/portal/en/kb/articles/how-do-i-migrate-my-us-brokerage-account-from-indmoney-stockal-to-vested-31-1-2024">Vested support</a></li>
  <li><a href="https://groww.in/blog/how-to-transfer-us-stocks-from-groww-to-external-platform">Groww blog</a></li>
  <li><a href="https://x.com/thetrickytrade/status/1749719803777147123">X/Twitter: 🚛 How to transfer your US Holdings out of Groww?</a></li>
  <li><a href="https://www.reddit.com/r/INDmoneyApp/comments/1meyz4u/has_anyone_successfully_transferred_their/">Reddit Question 1</a></li>
  <li><a href="https://www.reddit.com/r/INDmoneyApp/comments/1o1wxdj/how_to_handle_us_stock_investments_after_becoming/">Reddit discussion: how to handle INDMoney US investments after becoming NRI</a></li>
</ul>

<p><br /></p>]]></content><author><name>Shivam Rana</name></author><category term="Investing" /><summary type="html"><![CDATA[I wanted to move my US stocks (securities) from INDMoney to Interactive Brokers. I didn’t find any clear steps documented anywhere. It took me multiple attempts to finally be able to move over my portfolio.]]></summary></entry><entry><title type="html">PyTorch Fundamentals - Week 5</title><link href="https://trigonaminima.github.io/2025/11/pytorch-fundamentals3/" rel="alternate" type="text/html" title="PyTorch Fundamentals - Week 5" /><published>2025-11-29T00:00:00+00:00</published><updated>2025-11-29T00:00:00+00:00</updated><id>https://trigonaminima.github.io/2025/11/pytorch-fundamentals3</id><content type="html" xml:base="https://trigonaminima.github.io/2025/11/pytorch-fundamentals3/"><![CDATA[<p>Brushing up on my PyTorch skills every week. Starting from scratch. Not in a hurry. The goal is to follow along <a href="https://github.com/Exorust/TorchLeet">TorchLeet</a> and go up to <a href="https://github.com/karpathy/nanoGPT">karpathy/nanoGPT</a> or <a href="https://github.com/karpathy/nanochat">karpathy/nanochat</a>. Previously,</p>

<ol>
  <li><a href="/2025/11/pytorch-fundamentals2/">PyTorch Fundamentals - Week 4</a></li>
  <li><a href="/2025/11/pytorch-fundamentals/">PyTorch Fundamentals - Week 1, 2, &amp; 3</a></li>
</ol>

<p>Now, a summary of week 5.</p>

<ul>
  <li>Linear Regression through a custom DNN with a non-linearity
    <ul>
      <li>An <code class="language-plaintext highlighter-rouge">nn.Module</code> subclass that used an <code class="language-plaintext highlighter-rouge">nn.Sequential</code> for the dense linear layers.
        <div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="n">dense</span> <span class="o">=</span> <span class="n">nn</span><span class="p">.</span><span class="n">Sequential</span><span class="p">(</span>
      <span class="n">nn</span><span class="p">.</span><span class="n">Linear</span><span class="p">(</span><span class="n">input_dim</span><span class="p">,</span> <span class="n">dense_dims</span><span class="p">[</span><span class="mi">0</span><span class="p">]),</span>
      <span class="n">nn</span><span class="p">.</span><span class="n">ReLU</span><span class="p">(),</span>
      <span class="n">nn</span><span class="p">.</span><span class="n">Linear</span><span class="p">(</span><span class="o">*</span><span class="n">dense_dims</span><span class="p">),</span>
      <span class="n">nn</span><span class="p">.</span><span class="n">ReLU</span><span class="p">(),</span>
      <span class="n">nn</span><span class="p">.</span><span class="n">Linear</span><span class="p">(</span><span class="n">dense_dims</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="n">output_dim</span><span class="p">),</span>
  <span class="p">)</span>
</code></pre></div>        </div>
      </li>
    </ul>
  </li>
  <li>Learnt about <a href="https://docs.pytorch.org/docs/stable/generated/torch.nn.Sequential.html"><code class="language-plaintext highlighter-rouge">nn.Sequential</code></a> vs <code class="language-plaintext highlighter-rouge">nn.Module</code>.
    <ul>
      <li><code class="language-plaintext highlighter-rouge">nn.Sequential</code> is a subclass of <code class="language-plaintext highlighter-rouge">nn.Module</code>.</li>
      <li>It’s a convenient way to chain functions.</li>
      <li>Think: replacing <code class="language-plaintext highlighter-rouge">self.l3(self.l2(self.l1(x)))</code> with <code class="language-plaintext highlighter-rouge">self.layer123(x)</code> where <code class="language-plaintext highlighter-rouge">layer123</code> is an <code class="language-plaintext highlighter-rouge">nn.Sequential</code> of <code class="language-plaintext highlighter-rouge">l1</code>, <code class="language-plaintext highlighter-rouge">l2</code>, and <code class="language-plaintext highlighter-rouge">l3</code>.</li>
      <li>Also read: <a href="https://stackoverflow.com/q/68606661/2650427">What is difference between nn.Module and nn.Sequential</a></li>
    </ul>
  </li>
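  <li>To make the equivalence concrete, here is a minimal sketch (layer sizes and the class name are my own, not from the exercise) showing the same stack written both ways:
    <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
from torch import nn

# The same two-layer MLP, written both ways.
seq = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))

class Explicit(nn.Module):  # hypothetical name
    def __init__(self):
        super().__init__()
        self.l1 = nn.Linear(4, 8)
        self.l2 = nn.ReLU()
        self.l3 = nn.Linear(8, 1)

    def forward(self, x):
        # Layers applied in the same order as seq(x).
        return self.l3(self.l2(self.l1(x)))

x = torch.randn(2, 4)
print(seq(x).shape, Explicit()(x).shape)  # both: torch.Size([2, 1])
</code></pre></div>    </div>
  </li>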
  <li>Tried three different dense-layer hidden-unit configs with each of the below loss functions:
    <ol>
      <li><code class="language-plaintext highlighter-rouge">nn.MSELoss()</code></li>
      <li><code class="language-plaintext highlighter-rouge">HuberLoss()</code> - created <a href="/2025/11/pytorch-fundamentals2/">last week</a></li>
      <li><code class="language-plaintext highlighter-rouge">nn.L1Loss()</code></li>
    </ol>

    <p>Dense layer hidden unit variants: <code class="language-plaintext highlighter-rouge">(3, 7)</code>, <code class="language-plaintext highlighter-rouge">(4, 8)</code>, <code class="language-plaintext highlighter-rouge">(5, 9)</code>.</p>
  </li>
  <li>Huber Loss blew MSE Loss and L1 Loss out of the water. (Figure from TensorBoard.)
    <figure class="image">
  <img src="https://trigonaminima.github.io/assets/2025-11/dnn_loss.png" alt="" style="text-align: center; margin: auto" width="300" />
  <!-- <figcaption style="text-align: center">Figure 1:</figcaption> -->
  </figure>
  </li>
  <li>TensorBoard at work
    <ul>
      <li>I was running a remote training pipeline with TensorBoard logging. This pipeline dumps the logs on S3.</li>
      <li>The pipeline was failing with the following error: <code class="language-plaintext highlighter-rouge">File system scheme 's3' not implemented</code>.</li>
      <li>Solution:
        <ul>
          <li>After multiple loops of [add + commit + push + job run] found the <a href="https://stackoverflow.com/a/71628326/2650427">solution</a>: <code class="language-plaintext highlighter-rouge">pip install tensorflow-io</code>. That alone is not the full solution.</li>
          <li>The <a href="https://github.com/tensorflow/tensorboard/issues/5480#issuecomment-2251363802">very last comment</a> on the <a href="https://github.com/tensorflow/tensorboard/issues/5480">tensorboard github issue</a> gives the final trick: Add <code class="language-plaintext highlighter-rouge">import tensorflow_io</code> to the pipeline even if not using it anywhere.</li>
        </ul>
      </li>
    </ul>
  </li>
</ul>
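<p>The <code class="language-plaintext highlighter-rouge">nn.Module</code> wrapper from the first bullet might look like this. A minimal sketch; the class name and default dims (the <code class="language-plaintext highlighter-rouge">(3, 7)</code> variant) are my own choices, not from the exercise.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
from torch import nn

class DenseRegressor(nn.Module):  # hypothetical name
    def __init__(self, input_dim=1, dense_dims=(3, 7), output_dim=1):
        super().__init__()
        self.dense = nn.Sequential(
            nn.Linear(input_dim, dense_dims[0]),
            nn.ReLU(),
            nn.Linear(*dense_dims),
            nn.ReLU(),
            nn.Linear(dense_dims[1], output_dim),
        )

    def forward(self, x):
        return self.dense(x)

model = DenseRegressor()
print(model(torch.randn(5, 1)).shape)  # torch.Size([5, 1])
</code></pre></div></div>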

<p><br /></p>]]></content><author><name>Shivam Rana</name></author><category term="DL" /><summary type="html"><![CDATA[Brushing up on my PyTorch skills every week. Starting from scratch. Not in a hurry. The goal is to follow along TorchLeet and go up to karpathy/nanoGPT or karpathy/nanochat. Previously,]]></summary></entry><entry><title type="html">PyTorch Fundamentals - Week 4</title><link href="https://trigonaminima.github.io/2025/11/pytorch-fundamentals2/" rel="alternate" type="text/html" title="PyTorch Fundamentals - Week 4" /><published>2025-11-22T00:00:00+00:00</published><updated>2025-11-22T00:00:00+00:00</updated><id>https://trigonaminima.github.io/2025/11/pytorch-fundamentals2</id><content type="html" xml:base="https://trigonaminima.github.io/2025/11/pytorch-fundamentals2/"><![CDATA[<p>Brushing up on my PyTorch skills every week. Starting from scratch. Not in a hurry. The goal is to follow along <a href="https://github.com/Exorust/TorchLeet">TorchLeet</a> and go up to <a href="https://github.com/karpathy/nanoGPT">karpathy/nanoGPT</a> or <a href="https://github.com/karpathy/nanochat">karpathy/nanochat</a>. Previously,</p>

<ol>
  <li><a href="/2025/11/pytorch-fundamentals/">PyTorch Fundamentals - Week 1, 2, &amp; 3</a></li>
</ol>

<p>Now, a summary of week 4.</p>

<ul>
  <li>Custom Loss function: Huber Loss
    <ul>
      <li>A <code class="language-plaintext highlighter-rouge">torch.nn.Module</code> subclass with a <code class="language-plaintext highlighter-rouge">forward()</code> method to compute the loss.</li>
      <li>
        <p>Huber Loss is defined as:</p>

\[L_{\delta}(y, \hat{y}) =
      \begin{cases}
      \frac{1}{2}(y - \hat{y})^2 &amp; \text{for } |y - \hat{y}| \leq \delta, \\
      \delta \cdot (|y - \hat{y}| - \frac{1}{2} \delta) &amp; \text{for } |y - \hat{y}| &gt; \delta,
      \end{cases}\]

        <p>where:</p>
        <ul>
          <li>\(y\) is the true value,</li>
          <li>\(\hat{y}\) is the predicted value,</li>
          <li>\(\delta\) is a threshold parameter that controls the transition between L1 and L2 loss.</li>
        </ul>
      </li>
      <li>More details about the Huber Loss</li>
    </ul>
  </li>
  <li>Some custom losses in Keras and PyTorch: <a href="https://www.kaggle.com/code/bigironsphere/loss-function-library-keras-pytorch/notebook">Loss Function Library - Keras &amp; PyTorch</a></li>
  <li>Used the Linear Regression model to test the custom loss.</li>
  <li>Error 1: <code class="language-plaintext highlighter-rouge">RuntimeError: grad can be implicitly created only for scalar outputs</code>
    <ul>
      <li>Reason: the <code class="language-plaintext highlighter-rouge">forward()</code> function was returning a tensor with length &gt; 1. Got the hint from <a href="https://discuss.pytorch.org/t/loss-backward-raises-error-grad-can-be-implicitly-created-only-for-scalar-outputs/12152">PyTorch forums</a>.</li>
      <li>Fix: returned <code class="language-plaintext highlighter-rouge">loss.mean()</code> instead of <code class="language-plaintext highlighter-rouge">loss</code>.</li>
    </ul>
  </li>
  <li>Error 2: All the losses were <code class="language-plaintext highlighter-rouge">nan</code>: this was a genuine bug in my code.</li>
  <li>Implementation approach 1: Use masks (my approach)
    <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="k">def</span> <span class="nf">forward</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">y_pred</span><span class="p">,</span> <span class="n">y_true</span><span class="p">):</span>
      <span class="n">error</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="nb">abs</span><span class="p">(</span><span class="n">y_true</span> <span class="o">-</span> <span class="n">y_pred</span><span class="p">)</span>

      <span class="n">flag1</span> <span class="o">=</span> <span class="n">error</span> <span class="o">&lt;=</span> <span class="bp">self</span><span class="p">.</span><span class="n">d</span>
      <span class="n">flag2</span> <span class="o">=</span> <span class="o">~</span><span class="n">flag1</span>

      <span class="n">l2_loss</span> <span class="o">=</span> <span class="mf">0.5</span> <span class="o">*</span> <span class="n">error</span><span class="o">**</span><span class="mi">2</span> <span class="o">*</span> <span class="n">flag1</span>
      <span class="n">l1_loss</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">d</span> <span class="o">*</span> <span class="p">(</span><span class="n">error</span> <span class="o">-</span> <span class="mf">0.5</span> <span class="o">*</span> <span class="bp">self</span><span class="p">.</span><span class="n">d</span><span class="p">)</span> <span class="o">*</span> <span class="n">flag2</span>
      <span class="n">loss</span> <span class="o">=</span> <span class="n">l2_loss</span> <span class="o">+</span> <span class="n">l1_loss</span>
      <span class="k">return</span> <span class="n">loss</span><span class="p">.</span><span class="n">mean</span><span class="p">()</span>
</code></pre></div>    </div>
  </li>
  <li>Implementation approach 2: Use <a href="https://docs.pytorch.org/docs/stable/generated/torch.where.html"><code class="language-plaintext highlighter-rouge">torch.where()</code></a> (solution provided in <a href="https://github.com/Exorust/TorchLeet/blob/main/torch/basic/custom-loss/custom-loss_SOLN.ipynb">TorchLeet</a>)
    <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="k">def</span> <span class="nf">forward</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">y_pred</span><span class="p">,</span> <span class="n">y_true</span><span class="p">):</span>
      <span class="n">error</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="nb">abs</span><span class="p">(</span><span class="n">y_true</span> <span class="o">-</span> <span class="n">y_pred</span><span class="p">)</span>

      <span class="n">condition</span> <span class="o">=</span> <span class="n">error</span> <span class="o">&lt;=</span> <span class="bp">self</span><span class="p">.</span><span class="n">d</span>
      <span class="n">loss</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">where</span><span class="p">(</span><span class="n">condition</span><span class="p">,</span> <span class="mf">0.5</span> <span class="o">*</span> <span class="n">error</span><span class="o">**</span><span class="mi">2</span><span class="p">,</span> <span class="bp">self</span><span class="p">.</span><span class="n">d</span> <span class="o">*</span> <span class="p">(</span><span class="n">error</span> <span class="o">-</span> <span class="mf">0.5</span> <span class="o">*</span> <span class="bp">self</span><span class="p">.</span><span class="n">d</span><span class="p">))</span>
      <span class="k">return</span> <span class="n">loss</span><span class="p">.</span><span class="n">mean</span><span class="p">()</span>
</code></pre></div>    </div>
  </li>
  <li>Turns out, <code class="language-plaintext highlighter-rouge">torch.where()</code> is the most optimised way of doing this. It is vectorised and GPU-friendly. It is also a cleaner implementation of the same logic. Masking will require extra memory and extra operations (two multiplications, and one addition).</li>
  <li>Used tensorboard to visualise the training results.</li>
  <li>Read up more on <a href="https://docs.pytorch.org/docs/stable/generated/torch.optim.Optimizer.zero_grad.html#torch.optim.Optimizer.zero_grad"><code class="language-plaintext highlighter-rouge">optimizer.zero_grad()</code></a>.
    <ul>
      <li>PyTorch accumulates gradients by default. The <code class="language-plaintext highlighter-rouge">loss.backward()</code> will add to the previous gradients (can be accessed by <code class="language-plaintext highlighter-rouge">weight.grad</code>).</li>
      <li>If we don’t reset the gradients using <code class="language-plaintext highlighter-rouge">zero_grad()</code>, the new gradient will be a combination of the old and the newly-computed gradient. Since the old gradient was already used to update the model in the last iteration, the combined gradient will point in a different direction than the minimum (or maximum.) [<a href="https://stackoverflow.com/q/48001598/2650427">ref</a>]</li>
    </ul>
  </li>
  <li><strong>Q:</strong> When should we <em>skip</em> <code class="language-plaintext highlighter-rouge">zero_grad()</code>? <strong>A:</strong> When we want gradient accumulation on purpose.</li>
  <li>
    <p><strong>Q:</strong> When do we want gradient accumulation on purpose? <strong>A:</strong> In the following scenarios:</p>

    <ol>
      <li>Large batch size with limited GPU memory. Split the batch into mini-batches, accumulate gradients over all the mini-batches, and then run <code class="language-plaintext highlighter-rouge">optimizer.step()</code>. Used during training on smaller GPUs.</li>
      <li>Multiple loss components before a single update. Useful for multi-task learning. Losses that require multiple passes.</li>
      <li>Parallel training. When the model is split across devices, accumulate the gradients across the micro-batches and then update the parameters once.</li>
      <li>Training with noisy gradients. Accumulate over multiple steps with noisy gradients to smooth the gradients before updating.</li>
    </ol>
  </li>
</ul>]]></content><author><name>Shivam Rana</name></author><category term="DL" /><summary type="html"><![CDATA[Brushing up on my PyTorch skills every week. Starting from scratch. Not in a hurry. The goal is to follow along TorchLeet and go up to karpathy/nanoGPT or karpathy/nanochat. Previously,]]></summary></entry><entry><title type="html">PyTorch Fundamentals - Week 1, 2, &amp;amp; 3</title><link href="https://trigonaminima.github.io/2025/11/pytorch-fundamentals/" rel="alternate" type="text/html" title="PyTorch Fundamentals - Week 1, 2, &amp;amp; 3" /><published>2025-11-10T00:00:00+00:00</published><updated>2025-11-10T00:00:00+00:00</updated><id>https://trigonaminima.github.io/2025/11/pytorch-fundamentals</id><content type="html" xml:base="https://trigonaminima.github.io/2025/11/pytorch-fundamentals/"><![CDATA[<p>Brushing up on my PyTorch skills every week. Starting from scratch. Not in a hurry. The goal is to follow along <a href="https://github.com/Exorust/TorchLeet">TorchLeet</a> and go up to <a href="https://github.com/karpathy/nanoGPT">karpathy/nanoGPT</a> or <a href="https://github.com/karpathy/nanochat">karpathy/nanochat</a>. Summary of the 1st three weeks.</p>

<h3 id="week-1">Week 1</h3>

<ul>
  <li>Create a Linear Regression model.
    <ul>
      <li><code class="language-plaintext highlighter-rouge">torch.nn.Linear</code> to define a learnable model.</li>
      <li><code class="language-plaintext highlighter-rouge">forward()</code> for forward pass.</li>
      <li><code class="language-plaintext highlighter-rouge">model.parameters()</code> returns all the learnable weights and is also passed to the optimizer (<code class="language-plaintext highlighter-rouge">SGD</code>, <code class="language-plaintext highlighter-rouge">Adam</code>, etc.)</li>
      <li>Use <code class="language-plaintext highlighter-rouge">torch.no_grad()</code> during inference.</li>
    </ul>
  </li>
  <li>Log the training logs to TensorBoard. (not a part of the TorchLeet repo)
    <ul>
      <li><code class="language-plaintext highlighter-rouge">SummaryWriter</code> from <code class="language-plaintext highlighter-rouge">torch.utils.tensorboard</code>. TensorFlow can directly use a callback inside the fit function to push all the relevant logs; the <code class="language-plaintext highlighter-rouge">SummaryWriter</code> gives fine-grained control to log anything.</li>
      <li><code class="language-plaintext highlighter-rouge">add_scalar()</code> to log the training loss.</li>
      <li><code class="language-plaintext highlighter-rouge">add_graph()</code> to log the graph itself.</li>
      <li>
        <p>Load the Jupyter tensorboard extension so that we don’t have to leave the notebook to look at the logs and pretty plots.</p>

        <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  %load_ext tensorboard
</code></pre></div>        </div>
      </li>
      <li>
        <p>Load the tensorboard UI inside the notebook.</p>

        <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  %tensorboard <span class="nt">--logdir</span> PATH_TO_LOG_DIR
</code></pre></div>        </div>
      </li>
    </ul>
  </li>
</ul>
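<p>Putting these pieces together, a minimal end-to-end sketch (the synthetic data and hyper-parameters are my own, not from the exercise):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
from torch import nn

# Synthetic data: y = 3x + 2 plus a little noise.
X = torch.randn(100, 1)
y = 3 * X + 2 + 0.1 * torch.randn(100, 1)

model = nn.Linear(1, 1)  # learnable weight and bias
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()

with torch.no_grad():  # no autograd bookkeeping at inference time
    pred = model(torch.tensor([[1.0]]))

print(model.weight.item(), model.bias.item())  # close to 3 and 2
</code></pre></div></div>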

<h3 id="week-2">Week 2</h3>

<ul>
  <li>Create a Dataset
    <ul>
      <li><code class="language-plaintext highlighter-rouge">Dataset</code> class from <code class="language-plaintext highlighter-rouge">torch.utils.data</code>.</li>
      <li>Create a subclass of <code class="language-plaintext highlighter-rouge">Dataset</code> for my specific dataset. Added <code class="language-plaintext highlighter-rouge">data</code>, <code class="language-plaintext highlighter-rouge">X</code> and <code class="language-plaintext highlighter-rouge">y</code> attributes to the class.</li>
      <li>Since we will iterate through the rows of this dataset, defined <code class="language-plaintext highlighter-rouge">__len__</code> and <code class="language-plaintext highlighter-rouge">__getitem__</code> functions. These overloaded functions enable code like <code class="language-plaintext highlighter-rouge">len(dataset)</code> and <code class="language-plaintext highlighter-rouge">dataset[i]</code>, respectively.</li>
    </ul>
  </li>
  <li>DataLoader
    <ul>
      <li>
        <p><code class="language-plaintext highlighter-rouge">Dataset</code> only defines the dataset. <code class="language-plaintext highlighter-rouge">DataLoader</code> from <code class="language-plaintext highlighter-rouge">torch.utils.data</code> creates an iterator over it. It also brings other capabilities like batching and shuffling. Eg:</p>

        <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="n">dataloader</span> <span class="o">=</span> <span class="n">DataLoader</span><span class="p">(</span><span class="n">dataset</span><span class="p">,</span> <span class="n">batch_size</span><span class="o">=</span><span class="n">batch_size</span><span class="p">,</span> <span class="n">shuffle</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
</code></pre></div>        </div>
      </li>
      <li>
        <p>We can run a <code class="language-plaintext highlighter-rouge">for</code> loop on this <code class="language-plaintext highlighter-rouge">dataloader</code> now.</p>
      </li>
    </ul>
  </li>
  <li>Good intro to the topic from PyTorch - <a href="https://docs.pytorch.org/tutorials/beginner/basics/data_tutorial.html">Datasets and DataLoaders</a>.</li>
  <li>Trained the Linear Regression model using the dataloader. Faced some issues due to <code class="language-plaintext highlighter-rouge">dtype</code> mismatches - used <code class="language-plaintext highlighter-rouge">torch.float32</code> everywhere to fix it.</li>
  <li>The exercise only asked for a single column dataset. Played around with a dataset with multiple columns.</li>
  <li>Used Tensorboard for all the logging.</li>
</ul>
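<p>The <code class="language-plaintext highlighter-rouge">Dataset</code> subclass and <code class="language-plaintext highlighter-rouge">DataLoader</code> usage above can be sketched roughly as follows. The class name and toy data are illustrative, not taken from the exercise:</p>

```python
import torch
from torch.utils.data import Dataset, DataLoader

class MyDataset(Dataset):
    """Wraps a feature tensor X and a target tensor y (illustrative names)."""
    def __init__(self, X, y):
        # keep everything in float32 to avoid dtype-mismatch errors later
        self.X = torch.as_tensor(X, dtype=torch.float32)
        self.y = torch.as_tensor(y, dtype=torch.float32)

    def __len__(self):
        # enables len(dataset)
        return self.X.shape[0]

    def __getitem__(self, i):
        # enables dataset[i]
        return self.X[i], self.y[i]

dataset = MyDataset([[1.0], [2.0], [3.0], [4.0]], [2.0, 4.0, 6.0, 8.0])
dataloader = DataLoader(dataset, batch_size=2, shuffle=True)

# the dataloader is directly iterable in a training loop
for xb, yb in dataloader:
    print(xb.shape, yb.shape)
```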

<h3 id="week-3">Week 3</h3>

<ul>
  <li>Two types of activation functions – with learnable parameters and without.</li>
  <li>An activation function with learnable parameters requires an <code class="language-plaintext highlighter-rouge">nn.Module</code> subclass, so that PyTorch can track the parameters, run the <code class="language-plaintext highlighter-rouge">forward</code> pass, compute gradients in the backward pass, and produce the final trained weights.</li>
  <li>Created a custom activation <em>without</em> learnable parameters: \(\text{tanh}(x) + x\).</li>
  <li>
    <p>Updated the Linear Regression class to have the final output go through \(\text{tanh}(x) + x\) using <code class="language-plaintext highlighter-rouge">torch.tanh()</code>.</p>

    <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="k">return</span> <span class="bp">self</span><span class="p">.</span><span class="n">custom_activation</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">linear</span><span class="o">*</span><span class="p">(</span><span class="n">x</span><span class="p">))</span>
</code></pre></div>    </div>
  </li>
  <li>This <a href="https://stackoverflow.com/a/57013056/2650427">SO answer</a> talks about how to write a custom activation function in different scenarios: non-learnable, learnable, learnable with PyTorch functions, and learnable without PyTorch functions.</li>
  <li>Also learned about <code class="language-plaintext highlighter-rouge">torch.nn.Parameter</code> and <code class="language-plaintext highlighter-rouge">torch.nn.Variable</code>.</li>
  <li>Kept using Tensorboard for all the logging.</li>
</ul>
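<p>The non-learnable custom activation from Week 3 can be sketched like this. The class layout and names are illustrative, not taken from the original notebook:</p>

```python
import torch
import torch.nn as nn

def custom_activation(x):
    # tanh(x) + x has no learnable parameters, so a plain function is enough
    return torch.tanh(x) + x

class LinearRegression(nn.Module):
    """Linear layer whose output goes through the custom activation."""
    def __init__(self, in_features=1, out_features=1):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.custom_activation = custom_activation

    def forward(self, x):
        return self.custom_activation(self.linear(x))

model = LinearRegression()
out = model(torch.zeros(3, 1))
print(out.shape)
```

<p>Since the activation is a pure function of its input, autograd differentiates through it automatically; only a learnable activation would need its own <code class="language-plaintext highlighter-rouge">nn.Module</code> with registered parameters.</p>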

<p><br /></p>

<p>Next 2 weeks: Custom Loss Function (Huber Loss) and Deep Neural Network</p>

<hr />]]></content><author><name>Shivam Rana</name></author><category term="DL" /><summary type="html"><![CDATA[Brushing up on my PyTorch skills every week. Starting from scratch. Not in a hurry. The goal is to follow along TorchLeet and go up to karpathy/nanoGPT or karpathy/nanochat. Summary of the 1st three weeks.]]></summary></entry><entry><title type="html">Life Logging: Calls</title><link href="https://trigonaminima.github.io/2025/09/life-logging-calls-tasker/" rel="alternate" type="text/html" title="Life Logging: Calls" /><published>2025-09-28T00:00:00+00:00</published><updated>2025-09-28T00:00:00+00:00</updated><id>https://trigonaminima.github.io/2025/09/life-logging-calls-tasker</id><content type="html" xml:base="https://trigonaminima.github.io/2025/09/life-logging-calls-tasker/"><![CDATA[<p>I am building a comprehensive set of tools to do life logging. General idea is:</p>

<ul>
  <li>Push everything to a sink; and</li>
  <li>Visualise the data in this sink.</li>
</ul>

<p>Objective is to do weekly reviews and take interventions if things are not BAU. Long term vision is to eventually have enough signals to give me a comprehensive understanding of myself (physical, mental, social, financial, etc).</p>

<p>This post is about logging calls using <a href="https://tasker.joaoapps.com/">Tasker for Android</a>.</p>

<h2 id="logging-phone-calls">Logging Phone Calls</h2>

<p>Steps:</p>

<ol>
  <li>Trigger on the event “Phone Idle” (the phone returns to an idle state after an incoming, outgoing, or missed call)</li>
  <li>Read the data provider <code class="language-plaintext highlighter-rouge">content://call_log/calls</code> to get the most recent call details</li>
  <li>Format the details for my use</li>
  <li>Push to an endpoint that saves this data in a table.</li>
</ol>

<p>This is what the logs look like:</p>

<blockquote>
  <p>#call(40) +type[miss] @num[Friend Number] @name[Friend Name] +mode[phone] +add[My Location]</p>
</blockquote>

<blockquote>
  <p>#call(40) +type[in] @num[Friend Number] @name[Friend Name] +mode[phone] +add[My Location]</p>
</blockquote>

<blockquote>
  <p>#call(84) +type[out] @num[Unsaved Caller’s Number] @name[&lt;null&gt;] +mode[phone] +add[My Location]</p>
</blockquote>

<details closed="">
<summary>Tasker profile to log phone calls</summary>

<figure class="highlight"><pre><code class="language-shell" data-lang="shell"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
</pre></td><td class="code"><pre>Profile: Log Phone Calls
        Event: Phone Idle

    Enter Task: Call Logs

    A1: Variable Set <span class="o">[</span>
        Name: %call_log_cols
        To: <span class="nb">date</span>, geocoded_location, countryiso, <span class="nb">type</span>, number, name, duration
        Structure Output <span class="o">(</span>JSON, etc<span class="o">)</span>: On <span class="o">]</span>

    A2: SQL Query <span class="o">[</span>
        Mode: URI Formatted
        File: content://call_log/calls
        Columns: %call_log_cols
        Order By: <span class="nb">date </span>desc
        Output Column Divider: ,
        Variable Array: %call_logs

    A3: Multiple Variables Set <span class="o">[</span>
        Names: %qs_ts, %call_geocoded_location, %call_countryiso, %call_type, %call_number, %caller_name, %call_duration
        Variable Names Splitter: ,
        Values: %call_logs<span class="o">(</span>1<span class="o">)</span>
        Values Splitter: ,
        Structure Output <span class="o">(</span>JSON, etc<span class="o">)</span>: On <span class="o">]</span>
        Use Global Namespace: On <span class="o">]</span>

    A4: Variable Clear <span class="o">[</span>
        Name: %call_logs <span class="o">]</span>

    A5: If <span class="o">[</span> %qs_ts neq %LAST_CALLTS <span class="o">]</span>

        A15: If <span class="o">[</span> %call_type eq 1 <span class="o">]</span>

            A16: Variable Set <span class="o">[</span>
                Name: %call_type
                To: <span class="k">in
                </span>Structure Output <span class="o">(</span>JSON, etc<span class="o">)</span>: On <span class="o">]</span>

        A17: Else
            If  <span class="o">[</span> %call_type eq 2 <span class="o">]</span>

            A18: Variable Set <span class="o">[</span>
                Name: %call_type
                To: out
                Structure Output <span class="o">(</span>JSON, etc<span class="o">)</span>: On <span class="o">]</span>

        A19: Else
            If  <span class="o">[</span> %call_type eq 3 <span class="o">]</span>

            A20: Variable Set <span class="o">[</span>
                Name: %call_type
                To: miss
                Structure Output <span class="o">(</span>JSON, etc<span class="o">)</span>: On <span class="o">]</span>

        A21: End If

        A22: Variable Set <span class="o">[</span>
            Name: %qs_note
            To: <span class="c">#call(%call_duration) +type[%call_type] @num[%call_number] @name[%caller_name] +mode[phone]</span>
            Structure Output <span class="o">(</span>JSON, etc<span class="o">)</span>: On <span class="o">]</span>

        A23: Perform Task <span class="o">[</span>
            Name: Commons: POST Note &amp; Location
            Priority: %priority
            Local Variable Passthrough: On
            Limit Passthrough To: %qs_note, %qs_ts
            Structure Output <span class="o">(</span>JSON, etc<span class="o">)</span>: On
            Continue Task After Error:On <span class="o">]</span>

        A24: Variable Set <span class="o">[</span>
            Name: %LAST_CALLTS
            To: %qs_ts
            Structure Output <span class="o">(</span>JSON, etc<span class="o">)</span>: On <span class="o">]</span>

    A25: End If
</pre></td></tr></tbody></table></code></pre></figure>

</details>

<!-- <br> -->

<h2 id="logging-whatsapp-calls">Logging WhatsApp Calls</h2>

<figure class="highlight-placeholder"></figure><p>Unlike normal phone calls, WhatsApp calls can’t be read from a data provider, and they don’t appear in the phone’s call log either. WhatsApp also doesn’t support exporting its call logs. The only method left was to read WhatsApp’s notifications to get the details. Here are the steps:</p>

<ol>
  <li>Every time WhatsApp gives a notification, run the next set of steps.</li>
  <li>If the notification is for an (audio/video) call, extract the details into the relevant variables.</li>
  <li>Push to an endpoint that saves this data in a table.</li>
</ol>

<p>Caveats:</p>

<ol>
  <li>WhatsApp generates separate call-related notifications:
    <ul>
      <li>incoming audio/video call</li>
      <li>missed audio/video call (after an incoming call is missed)</li>
      <li>if multiple calls have piled up, a separate notification of “2+ missed calls from …”</li>
      <li>an outgoing call just says “calling…”, so there is no audio/video label.</li>
    </ul>
  </li>
  <li>Since this is just a call notification (incoming, outgoing), there is no call duration available.</li>
</ol>

<p>This is what the logs look like:</p>

<blockquote>
  <p>#call(-1) +type[miss] @num[null] @name[Friend Name] +mode[whatsapp-video] +add[My Location]</p>
</blockquote>

<blockquote>
  <p>#call(-1) +type[in] @num[null] @name[Friend Name] +mode[whatsapp-video] +add[My Location]</p>
</blockquote>

<blockquote>
  <p>#call(-1) +type[out] @num[null] @name[Friend Name] +mode[whatsapp-any] +add[My Location]</p>
</blockquote>

<details closed="">
<summary>Tasker profile to log WA calls</summary>

<figure class="highlight"><pre><code class="language-shell" data-lang="shell"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
</pre></td><td class="code"><pre>Profile: Log WhatsApp Calls
    	Event: Notification <span class="o">[</span> Owner Application:WhatsApp Title:<span class="k">*</span> Text:<span class="k">*</span> Subtext:<span class="k">*</span> Messages:<span class="k">*</span> Other Text:<span class="k">*</span> Cat:<span class="k">*</span> New Only:On <span class="o">]</span>

    Enter Task: WhatsApp Call Logs

    A3: If <span class="o">[</span> %evtprm7 eq call <span class="o">]</span>

        A4: Multiple Variables Set <span class="o">[</span>
             Names: %qs_ts, %call_geocoded_location, %call_countryiso, %call_type_str, %call_number, %caller_name, %call_duration, %call_type,%call_mode
             Variable Names Splitter: ,
             Values: %TIMEMS,,,%evtprm3,null,%evtprm2,-1,null,any
             Values Splitter: ,
             Structure Output <span class="o">(</span>JSON, etc<span class="o">)</span>: On <span class="o">]</span>

        A5: If <span class="o">[</span> %call_type_str ~R .<span class="k">*</span>Calling.<span class="k">*</span> <span class="o">]</span>

            A6: Variable Set <span class="o">[</span>
                 Name: %call_type
                 To: out
                 Structure Output <span class="o">(</span>JSON, etc<span class="o">)</span>: On <span class="o">]</span>

        A7: Else
            If  <span class="o">[</span> %call_type_str ~R .<span class="k">*</span>Incoming.<span class="k">*</span> <span class="o">]</span>

            A8: Variable Set <span class="o">[</span>
                 Name: %call_type
                 To: <span class="k">in
                 </span>Structure Output <span class="o">(</span>JSON, etc<span class="o">)</span>: On <span class="o">]</span>

        A9: Else
            If  <span class="o">[</span> %call_type_str ~R .<span class="k">*</span>Missed.<span class="k">*</span> <span class="o">]</span>

            A10: Variable Set <span class="o">[</span>
                  Name: %call_type
                  To: miss
                  Structure Output <span class="o">(</span>JSON, etc<span class="o">)</span>: On <span class="o">]</span>

        A11: End If

        A12: If <span class="o">[</span> %call_type_str ~R .<span class="k">*</span>voice.<span class="k">*</span> <span class="o">]</span>

            A13: Variable Set <span class="o">[</span>
                  Name: %call_mode
                  To: voice
                  Structure Output <span class="o">(</span>JSON, etc<span class="o">)</span>: On <span class="o">]</span>

        A14: Else
            If  <span class="o">[</span> %call_type_str ~R .<span class="k">*</span>video.<span class="k">*</span> <span class="o">]</span>

            A15: Variable Set <span class="o">[</span>
                  Name: %call_mode
                  To: video
                  Structure Output <span class="o">(</span>JSON, etc<span class="o">)</span>: On <span class="o">]</span>

        A16: End If

        A17: Variable Set <span class="o">[</span>
              Name: %qs_note
              To: <span class="c">#call(%call_duration) +type[%call_type] @num[%call_number] @name[%caller_name] +mode[whatsapp-%call_mode]</span>
              Structure Output <span class="o">(</span>JSON, etc<span class="o">)</span>: On <span class="o">]</span>

        A18: Flash <span class="o">[</span>
              Text: %qs_note
              Continue Task Immediately: On
              Dismiss On Click: On <span class="o">]</span>

        A19: Perform Task <span class="o">[</span>
              Name: Commons: POST Note &amp; Location
              Priority: %priority
              Local Variable Passthrough: On
              Limit Passthrough To: %qs_note, %qs_ts
              Structure Output <span class="o">(</span>JSON, etc<span class="o">)</span>: On
              Continue Task After Error:On <span class="o">]</span>

        A20: Write File <span class="o">[</span>
              File: Download/wa_calls.txt
              Text: %qs_ts, %call_geocoded_location, %call_countryiso, %call_type_str, %call_number, %caller_name, %call_duration, %call_type,%call_mode
             %qs_note

              Append: On
              Add Newline: On <span class="o">]</span>

    A21: End If
</pre></td></tr></tbody></table></code></pre></figure>

</details>

<p><br /></p>

<p>Bye.</p>]]></content><author><name>Shivam Rana</name></author><category term="Quantified-self" /><summary type="html"><![CDATA[I am building a comprehensive set of tools to do life logging. General idea is:]]></summary></entry><entry><title type="html">[Mini] Life Logging</title><link href="https://trigonaminima.github.io/2025/09/life-logging/" rel="alternate" type="text/html" title="[Mini] Life Logging" /><published>2025-09-27T00:00:00+00:00</published><updated>2025-09-27T00:00:00+00:00</updated><id>https://trigonaminima.github.io/2025/09/life-logging</id><content type="html" xml:base="https://trigonaminima.github.io/2025/09/life-logging/"><![CDATA[<p>I am building a comprehensive set of tools to do life logging. General idea is:</p>

<ul>
  <li>Push everything to a sink; and</li>
  <li>Visualise the data in this sink.</li>
</ul>

<p>Objective is to do weekly reviews and take interventions if things are not BAU. Long term vision is to eventually have enough signals to give me a comprehensive understanding of myself (physical, mental, social, financial, etc).</p>

<p>Current progress in reverse chronology:</p>

<ul>
  <li>2025: <a href="/2025/09/life-logging-calls-tasker/">Logging phone calls</a></li>
</ul>

<p>I have tried this multiple times in various formats over the years. Here are my previous efforts in reverse chronology:</p>

<ul>
  <li>2023: <a href="/2023/06/google-fit-data/">Google Fit data sync and analysis</a> – my most successful attempt and still in-use. Pushed me to keep things simple.</li>
  <li>2021: <a href="/2021/08/flutter_app_3/">Flutter app to do the logging 3</a> – couldn’t manage building this along with work.</li>
  <li>2021: <a href="/2021/08/flutter_app_2/">Flutter app to do the logging 2</a></li>
  <li>2021: <a href="/2021/07/flutter_app_1/">Flutter app to do the logging 1</a></li>
  <li>2018: <a href="/2018/04/chatting-up-2/">Analysis of my WA chats 2</a> – good analysis on my chatting habits, but nothing new. Led me to work on some solo research projects.</li>
  <li>2016: <a href="/2016/06/chatting-up/">Analysis of my WA chats</a></li>
  <li>2014: <a href="/2014/11/gamification-of-life/">Gamification of Life</a> – too much information to manage, eventually started feeling like a chore.</li>
</ul>

<p>Bye.</p>]]></content><author><name>Shivam Rana</name></author><category term="Quantified-self" /><summary type="html"><![CDATA[I am building a comprehensive set of tools to do life logging. General idea is:]]></summary></entry><entry><title type="html">Consolidated Recommendation Systems</title><link href="https://trigonaminima.github.io/2025/02/consolidated-recsys/" rel="alternate" type="text/html" title="Consolidated Recommendation Systems" /><published>2025-02-13T00:00:00+00:00</published><updated>2025-02-13T00:00:00+00:00</updated><id>https://trigonaminima.github.io/2025/02/consolidated-recsys</id><content type="html" xml:base="https://trigonaminima.github.io/2025/02/consolidated-recsys/"><![CDATA[<p>This post is a quick summary of <a href="https://netflixtechblog.medium.com/lessons-learnt-from-consolidating-ml-models-in-a-large-scale-recommendation-system-870c5ea5eb4a">Lessons Learnt From Consolidating ML Models in a Large Scale Recommendation System</a>. I have also added a few questions I got while reading it. I end the post with what we do at work to deal with this.</p>

<h2 id="summary">Summary</h2>

<ul>
  <li>Recommendation System: candidate gen + ranking.</li>
  <li>
    <p>A typical ranking model pipeline:</p>

    <ol>
      <li>Label prep</li>
      <li>Feature prep</li>
      <li>Model training</li>
      <li>Model evaluation</li>
      <li>Model deployment (with inference contract)</li>
    </ol>
  </li>
  <li>Each recommendation use case (e.g.: discover page, notifications, related items, category exploration, search) will have a version of the above pipeline.</li>
  <li>
    <p>As use cases increase, the team will need to maintain multiple such pipelines. Maintaining multiple pipelines is time-consuming and increases the points of failure.</p>

    <figure class="image">
  <img src="https://trigonaminima.github.io/assets/2025-02/consolidated_recsys_neflix_1.webp" alt="" style="text-align: center; margin: auto" />
  <figcaption style="text-align: center">Figure 1: Figure from the Netflix blog linked at the start.</figcaption>
  </figure>
  </li>
  <li>Since the pipelines share the same components, we can consolidate them.</li>
  <li>
    <p>Consolidated pipeline:</p>

    <ol>
      <li>Label prep for each use case separately</li>
      <li>Stratified union of all the prepared labels</li>
      <li>Feature prep (separate categorical feature representing the use case)</li>
      <li>Model training</li>
      <li>Model evaluation</li>
      <li>Model deployment (with inference contract)</li>
    </ol>

    <figure class="image">
  <img src="https://trigonaminima.github.io/assets/2025-02/consolidated_recsys_neflix_2.webp" alt="" style="text-align: center; margin: auto" width="100" />
  <figcaption style="text-align: center">Figure 2: Figure from the Netflix blog linked at the start.</figcaption>
  </figure>
  </li>
  <li>
    <p>Label prep for each use case separately</p>

    <ol>
      <li>Each use case will have different ways of generating the labels.</li>
      <li>Use case context details are added as separate features.
        <ul>
          <li>Search context: search query, region</li>
          <li>Similar items context: source item</li>
        </ul>
      </li>
      <li>When the use case is search, context features specific to the similar item use case will be filled with default values.</li>
    </ol>
  </li>
  <li>
    <p>Union of all the prepared labels</p>

    <ol>
      <li>Final labelled set: a% samples from use case-1 labels + b% samples from use case-2 labels + … + z% samples from use case-n labels</li>
      <li>The proportions [a, b, …, z] come from stratification</li>
      <li>Q: How is this stratification done? Platform traffic across different use cases?</li>
      <li>Q: What are the results when these proportions are business-driven? Eg: contribution to revenue.</li>
    </ol>
  </li>
  <li>
    <p>Feature prep</p>

    <ol>
      <li>All use case specific features added to the data.</li>
      <li>If a feature is only used for use case 1 then it will contain default value for all the other use cases.</li>
      <li>Add a new categorical feature <code class="language-plaintext highlighter-rouge">task_type</code> to the features to inform the model about the target reco task.</li>
    </ol>
  </li>
  <li>Model training happens as usual: feature vector and labels. Architecture remains the same. Optimisation remains the same.</li>
  <li>
    <p>Model evaluation</p>

    <ol>
      <li>Check the appropriate eval metrics to check the model.</li>
      <li>Q: How do we judge if the model performed well for all the use cases?</li>
      <li>Q: Will it require a separate evaluation set for each use case?</li>
      <li>Q: Can there be a 2nd order Simpson’s paradox here: the consolidated model performs well, but when tried for individual use cases, its performance is low? My hunch: no.</li>
    </ol>
  </li>
  <li>
    <p>Model deployment (with inference contract)</p>

    <ol>
      <li>Deploy the same model in the respective environment made for each use case. That env will have all the specific network-related knobs: batch size, throughput, latency, caching policy, parallelism, etc.</li>
      <li>Generic API contract to support the heterogeneous context (search query for search, source item for related items use case.)</li>
    </ol>
  </li>
  <li>
    <p>Caveats</p>

    <ol>
      <li>The consolidated use cases should be related (eg: ranking for movies in the search and discover page)</li>
      <li>One definition of related can be: ranking the same entities.</li>
    </ol>
  </li>
  <li>
    <p>Advantages</p>

    <ol>
      <li>Reduces maintenance costs (less code; fewer deployments)</li>
      <li>Quick model iterations to all the use cases
        <ul>
          <li>Updates (new features, architecture, etc) for one use case can be applied to other use cases.</li>
          <li>If consolidated tasks are related, then new features don’t cause regression in practice.</li>
        </ul>
      </li>
      <li>Can be extended to any related use case from offline and online POV.</li>
      <li>Cross-learning: the model potentially gains more (hidden) learning from the other tasks. Eg: having search data gives more data to the model learning for related-items task.
        <ul>
          <li>Q: Is this happening? How can we verify this? One way: Train an independent model on the use-case specific data and compare its performance with the consolidated model’s performance on the same task.</li>
        </ul>
      </li>
    </ol>
  </li>
  <li>I was confused about what to call this learning paradigm. <a href="https://en.wikipedia.org/wiki/Multi-task_learning">Wikipedia</a> says that it is multi-task learning.</li>
</ul>
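<p>Steps 2 and 3 of the consolidated pipeline above (the stratified union plus the <code class="language-plaintext highlighter-rouge">task_type</code> feature) can be sketched like this. The use-case names, proportions, and context columns are made up for illustration, not taken from the Netflix blog:</p>

```python
import pandas as pd

def stratified_union(labelled_sets, proportions, n_total, seed=42):
    """Union per-use-case label sets in given proportions, tagging each row
    with a task_type feature (illustrative helper, not from the blog)."""
    parts = []
    for task, df in labelled_sets.items():
        n = int(proportions[task] * n_total)
        sample = df.sample(n=min(n, len(df)), random_state=seed).copy()
        sample["task_type"] = task  # tells the model which reco task this row is
        parts.append(sample)
    return pd.concat(parts, ignore_index=True)

# toy per-use-case labelled sets, each with its own context feature
search = pd.DataFrame({"item_id": range(100), "label": [1, 0] * 50,
                       "search_query": "q"})       # search-specific context
similar = pd.DataFrame({"item_id": range(50), "label": [0, 1] * 25,
                        "source_item": 7})         # similar-items context
train = stratified_union({"search": search, "similar_items": similar},
                         {"search": 0.6, "similar_items": 0.4}, n_total=100)
# context features missing for a use case get default values
train = train.fillna({"search_query": "", "source_item": -1})
print(train["task_type"].value_counts())
```

<p>Each row carries every use case’s context columns; the ones that don’t apply are filled with defaults, which is what lets a single model serve all the tasks.</p>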

<h2 id="practice-at-my-work">Practice at my work</h2>

<ul>
  <li>The models are not merged across different tasks like relevance and search.</li>
  <li>Within relevance ranking tasks (discover, similar items, category exploration), we have a common base ranker model.</li>
  <li>On top of that, we have different heuristics to make it better for that particular section.</li>
  <li>Advantages:
    <ul>
      <li>There is only one main model for all related tasks.</li>
      <li>Keeps the heuristics logic simple and, thus, easy to maintain.</li>
    </ul>
  </li>
  <li>Challenges
    <ul>
      <li>Heuristics are crude/manual/semi-automated → we may be leaving some gains on the table. There are bandit-based approaches to automating it, though.</li>
      <li>It loses out on cross-learning opportunities.</li>
    </ul>
  </li>
</ul>]]></content><author><name>Shivam Rana</name></author><category term="RecSys" /><summary type="html"><![CDATA[This post is a quick summary of Lessons Learnt From Consolidating ML Models in a Large Scale Recommendation System. I have also added a few questions I got while reading it. I end the post with what we do at work to deal with this.]]></summary></entry><entry><title type="html">Document Your Progress at Work</title><link href="https://trigonaminima.github.io/2025/01/document-your-progress/" rel="alternate" type="text/html" title="Document Your Progress at Work" /><published>2025-01-13T00:00:00+00:00</published><updated>2025-01-13T00:00:00+00:00</updated><id>https://trigonaminima.github.io/2025/01/document-your-progress</id><content type="html" xml:base="https://trigonaminima.github.io/2025/01/document-your-progress/"><![CDATA[<p><strong>How can you ensure that your contributions are also recognized?</strong></p>

<p>A common challenge, especially in larger organizations, is that your manager may not always be fully aware of the specifics of your work, and your manager’s manager likely has even less visibility. It isn’t due to a lack of interest but rather the sheer volume of responsibilities and information they handle. Additionally, even for you, it’s hard to remember all the details beyond the highlights. I find a proactive strategy essential for such scenarios: sending <strong>regular progress digests.</strong></p>

<p>These digests are concise, structured email updates that you send periodically to both your direct manager and their manager. The aim is to offer a clear snapshot of your activities, their impact, and your forthcoming plans. See it as a method to keep your supervisors well-informed, especially when you lack regular direct interactions.</p>

<p>That’s it. That is the idea. You can be creative and apply it however you want. However you decide to do it, you will see gains.</p>

<p>In the next section, I list the <strong>key points</strong> I usually consider in my snapshots.</p>

<h2 id="key-elements-of-an-effective-progress-digest">Key Elements of an Effective Progress Digest</h2>

<p>To ensure your digests are both informative and impactful, here’s what you can include:</p>

<ul>
  <li><strong>Specific Task Details</strong>: Provide project specifics and relevant links to the completed/picked coding tasks. It entails a 1-sentence project description, PR links, JIRA tickets and other code artefacts.</li>
  <li><strong>Data Science Related</strong>: If applicable, detail the models you’ve trained and deployed. Any A/B experiments launched and test results of the ones that concluded. Also, share the project solutioning doc here.</li>
  <li><strong>Documentation Efforts</strong>: Highlight any documentation you’ve created or maintained. You can also merge this with other points.</li>
  <li><strong>Impact and Results</strong>: Clearly articulate the outcomes of your tasks and their value to the team and company.</li>
  <li><strong>Initiatives and Discussions</strong>: Share any new ideas you’ve put forward or discussions you’ve initiated.</li>
  <li><strong>Future Plans</strong>: Outline your planned next steps.</li>
</ul>

<h2 id="benefits">Benefits</h2>

<p>The effort invested in creating these digests yields substantial career benefits:</p>

<ul>
  <li><strong>Enhances Diligence</strong>: Summarizing your work makes you more conscious of your efforts.</li>
  <li><strong>Boosts Positive Perception</strong>: You are perceived as a proactive and accomplished individual.</li>
  <li><strong>Creates a Performance Record</strong>: These digests serve as a valuable record of your work, especially useful during performance reviews.</li>
  <li><strong>Ensures Visibility</strong>: Even if managers don’t respond directly to each email, they will read them, which ensures they are aware of your work and its progress.</li>
  <li><strong>Effective at Any Stage:</strong> While this practice is advantageous when starting a new job (or joining a new team), I have found it beneficial at any stage.</li>
</ul>

<h2 id="conclusion">Conclusion</h2>

<p>Actively managing your visibility is key to long-term career growth. Sending out regular progress digests ensures that your work is recognized. You also establish a record of your accomplishments and demonstrate your value. This practice requires regular work but has good returns.</p>

<p>PS: I learned this trick on a tech podcast many years ago. If anyone knows which podcast or episode, please share it with me, and I will link it here.</p>

<p><strong>Update: 14th Jan</strong></p>

<p>PS: A related idea of <a href="https://jvns.ca/blog/brag-documents/">brag documents</a> explained beautifully by <a href="https://jvns.ca/">Julia Evans</a>. Shared on this <a href="https://news.ycombinator.com/item?id=42695837">HN comment</a>.</p>]]></content><author><name>Shivam Rana</name></author><summary type="html"><![CDATA[How can you ensure that your contributions are also recognized?]]></summary></entry><entry><title type="html">Confidence Intervals and Coverage</title><link href="https://trigonaminima.github.io/2024/09/confidence-intervals-and-coverage/" rel="alternate" type="text/html" title="Confidence Intervals and Coverage" /><published>2024-09-15T00:00:00+00:00</published><updated>2024-09-15T00:00:00+00:00</updated><id>https://trigonaminima.github.io/2024/09/confidence-intervals-and-coverage</id><content type="html" xml:base="https://trigonaminima.github.io/2024/09/confidence-intervals-and-coverage/"><![CDATA[<h2 id="confidence-interval-ci">Confidence Interval (CI)</h2>

<ul>
  <li>CI is an interval.</li>
  <li>An interval which is expected to contain the parameter being estimated (eg: population mean.)</li>
  <li>Typical confidence levels are 95% and 99%.</li>
  <li>The confidence level of a confidence interval is also called the nominal coverage (probability).</li>
  <li>CI with 95% confidence: random interval which contains the parameter to be estimated 95% of the time.</li>
  <li>Two ways to mention a confidence level of 95%:
    <ul>
      <li>Confidence interval with \(\gamma = 0.95\); 95% confidence</li>
      <li>Confidence interval with \(\alpha = 0.05\); 95% confidence: \(1-\alpha = 0.95\)</li>
    </ul>
  </li>
  <li>
    <p>Mathematical representation</p>

\[P(u(X)&lt;\theta &lt;v(X))=\gamma\]

    <ul>
      <li>\(\theta\) is the parameter to be estimated (eg: population mean or median).</li>
      <li>\(X\) is a random variable from a probability distribution with parameter \(\theta\)</li>
      <li>\(u(X)\) and \(v(X)\) are random variables containing parameter \(\theta\) with probability \(\gamma\)</li>
      <li>Confidence level \(\gamma\) &lt; 1 (but close to 1). eg: 0.95</li>
    </ul>
  </li>
  <li>
    <p>Mathematical representation in case of normal distribution:</p>

\[\text{CI} = \bar{x} \pm z^* \left(\frac{\sigma}{\sqrt{n}}\right)\]

    <p>Where:</p>
    <ul>
      <li>\(\bar{x}\) is the sample mean.</li>
      <li>\(z^*\) is the critical value corresponding to the desired confidence level</li>
      <li>\(\sigma\) is the population standard deviation.</li>
      <li>\(n\) is the sample size.</li>
      <li>The quantity \(\displaystyle {\sigma }_{\bar {x}}={\frac {\sigma }{\sqrt {n}}}\) is also called the <a href="https://en.wikipedia.org/wiki/Standard_error">standard error of the mean</a>.</li>
      <li>The critical value for a 95% confidence level corresponds to the 97.5th percentile of the distribution. Reason: the probability of \(\theta\) lying outside the interval is 5%, split as 2.5% in each tail (if symmetric). So the range runs from the 2.5th to the 97.5th percentile.</li>
    </ul>
  </li>
  <li>We can calculate the critical value \(z^*\) as follows:
    <ul>
      <li>If the sample size is small (&lt; 30) or we do not know the std dev, then we use the t-statistic (Student’s t-distribution.) The t-distribution is wider and has heavier tails than the normal distribution, reflecting the increased uncertainty in small samples. Thus, it accounts for the extra variability.</li>
      <li>If the sample size is large enough to make CLT valid, we use normal distribution (Z-distribution) –&gt; z-score. Eg: a z-score of 1.96 for a 95% confidence level.</li>
      <li>As the sample size increases, both methods converge.</li>
      <li>When in doubt, it is safer to use the t-statistic, since it converges to the z-score as the sample size grows.</li>
    </ul>
  </li>
  <li>Ref:
    <ul>
      <li><a href="https://en.wikipedia.org/wiki/Confidence_interval">Confidence interval</a> wiki</li>
      <li><a href="https://en.wikipedia.org/wiki/Confidence_interval#Interpretation">Interpretation</a></li>
      <li><a href="https://en.wikipedia.org/wiki/Confidence_interval#Common_misunderstandings">Common misunderstandings</a></li>
    </ul>
  </li>
</ul>
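<p>As a quick sketch of the last point, we can compute the critical values directly with <code class="language-plaintext highlighter-rouge">scipy.stats</code>: the t-critical value is larger for small samples and converges to the z-critical value as the degrees of freedom grow.</p>

```python
import scipy.stats as st

# z-critical value for a 95% confidence level: 97.5th percentile
z_star = st.norm.ppf(0.975)
print(round(z_star, 3))  # ~1.96

# t-critical values shrink toward z* as the sample size (df + 1) grows
for df in (5, 30, 1000):
    print(df, round(st.t.ppf(0.975, df=df), 3))
```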

<h2 id="ci-width">CI Width</h2>

<ul>
  <li>At a fixed confidence level, a narrower CI means a more precise estimate.</li>
  <li>Factors that impact the width of CI are sample size, variance/standard deviation, and confidence level.
    <ul>
      <li>Sample size high –&gt; narrow CI</li>
      <li>High variance/standard dev –&gt; wider CI</li>
      <li>Higher confidence level –&gt; wider CI (the interval must widen to contain the parameter more often)</li>
    </ul>
  </li>
</ul>
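<p>A small numeric sketch of the sample-size effect, assuming a normal population with known \(\sigma = 1\): since the width is \(2 z^* \sigma / \sqrt{n}\), quadrupling the sample size halves the width.</p>

```python
import numpy as np
import scipy.stats as st

sigma = 1.0                   # population std dev (assumed known here)
z_star = st.norm.ppf(0.975)   # critical value for 95% confidence

# width = 2 * z* * sigma / sqrt(n): quadrupling n halves the width
widths = {n: 2 * z_star * sigma / np.sqrt(n) for n in (25, 100, 400)}
for n, w in widths.items():
    print(n, round(w, 3))
```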

<h2 id="coverage">Coverage</h2>

<ul>
  <li>Coverage (probability): the probability that a confidence interval will include the true value (eg: population mean.)</li>
  <li>The proportion of CIs (at a particular confidence level) that contain the true value (eg: population mean.)</li>
  <li>95% CI coverage: For example, if you calculate a 95% confidence interval for a population mean, you are saying that if you were to take many samples and calculate a confidence interval from each one, approximately 95% of those intervals would contain the true population mean.</li>
  <li>Probability matching: if coverage probability is the same as nominal coverage probability.
<img src="https://trigonaminima.github.io/assets/2024-09/coverage_probability.png" alt="" width="500" style="text-align: center; margin: auto" />
    <ul>
      <li>Nominal coverage = 50%</li>
      <li>Coverage = 10/20 = 50% (blue CIs contain the true mean)</li>
      <li>Probability matching since coverage is the same as nominal coverage.</li>
      <li><a href="https://en.wikipedia.org/wiki/File:Normal_distribution_50%25_CI_illustration.svg">Image ref</a></li>
    </ul>
  </li>
  <li>Ref:
    <ul>
      <li><a href="https://en.wikipedia.org/wiki/Coverage_probability">Coverage probability</a> wiki</li>
      <li><a href="https://en.wikipedia.org/wiki/Neyman_construction">Confidence interval construction</a></li>
    </ul>
  </li>
</ul>
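<p>The coverage definition above can be checked empirically: draw many samples from a known population, build a 95% CI from each, and count how often the true mean lands inside. A sketch (the exact fraction will vary with the seed, but it should sit close to the nominal 95%):</p>

```python
import numpy as np
import scipy.stats as st

rng = np.random.default_rng(42)
true_mean, n, trials = 0.0, 30, 2000

hits = 0
for _ in range(trials):
    sample = rng.standard_normal(n)
    lo, hi = st.t.interval(
        confidence=0.95, df=n - 1, loc=np.mean(sample), scale=st.sem(sample)
    )
    hits += lo <= true_mean <= hi

print(hits / trials)  # close to the nominal 0.95
```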

<h2 id="implementation-and-explorations">Implementation and Explorations</h2>

<p>Now, we will go through the above concepts in code.</p>

<details close="">
<summary>Common imports</summary>

<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="code"><pre><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">scipy.stats</span> <span class="k">as</span> <span class="n">st</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
</pre></td></tr></tbody></table></code></pre></figure>

</details>
<p><br /></p>

<h3 id="compute-ci">Compute CI</h3>

<p>We implement the CI computation with both the t-distribution and the standard normal distribution to obtain the critical value.</p>

<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
</pre></td><td class="code"><pre><span class="k">def</span> <span class="nf">confidence_interval_t</span><span class="p">(</span><span class="n">sample</span><span class="p">,</span> <span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">):</span>
    <span class="s">"""
    Calculate the confidence interval for the mean of a sample using the t-distribution.

    This function is appropriate when the population standard deviation is unknown and
    the sample size is small (n &lt; 30), although it works for any sample size.

    Parameters:
    sample (numpy.ndarray): The sample data as a NumPy array.
    confidence (float): The desired confidence level (default
    is 0.95 for a 95% confidence interval).

    Returns:
    tuple: Lower and upper bounds of the confidence interval.
    """</span>
    <span class="c1"># Ensure the sample is a NumPy array
</span>    <span class="n">sample</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span>

    <span class="n">sample_mean</span> <span class="o">=</span> <span class="n">sample</span><span class="p">.</span><span class="n">mean</span><span class="p">()</span>
    <span class="c1"># Use Bessel's correction (ddof=1) for sample standard deviation
</span>    <span class="n">sample_std</span> <span class="o">=</span> <span class="n">sample</span><span class="p">.</span><span class="n">std</span><span class="p">(</span><span class="n">ddof</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
    <span class="n">sample_size</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span>
    <span class="n">standard_error</span> <span class="o">=</span> <span class="n">sample_std</span> <span class="o">/</span> <span class="n">np</span><span class="p">.</span><span class="n">sqrt</span><span class="p">(</span><span class="n">sample_size</span><span class="p">)</span>

    <span class="c1"># Determine the critical value for the specified confidence level
</span>    <span class="n">critical_value</span> <span class="o">=</span> <span class="n">st</span><span class="p">.</span><span class="n">t</span><span class="p">.</span><span class="n">ppf</span><span class="p">((</span><span class="mi">1</span> <span class="o">+</span> <span class="n">confidence</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span><span class="p">,</span> <span class="n">df</span><span class="o">=</span><span class="n">sample_size</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span>
    <span class="n">margin_of_error</span> <span class="o">=</span> <span class="n">critical_value</span> <span class="o">*</span> <span class="n">standard_error</span>

    <span class="n">lower_bound</span> <span class="o">=</span> <span class="n">sample_mean</span> <span class="o">-</span> <span class="n">margin_of_error</span>
    <span class="n">upper_bound</span> <span class="o">=</span> <span class="n">sample_mean</span> <span class="o">+</span> <span class="n">margin_of_error</span>

    <span class="k">return</span> <span class="n">lower_bound</span><span class="p">,</span> <span class="n">upper_bound</span>
</pre></td></tr></tbody></table></code></pre></figure>

<p>We use <a href="https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.t.html"><code class="language-plaintext highlighter-rouge">stats.t.ppf</code></a> to get the critical value using the t-distribution. We can replace that with <a href="https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html"><code class="language-plaintext highlighter-rouge">stats.norm.ppf</code></a> for the z-score.</p>

<details close="">
<summary>Confidence interval using standard normal distribution</summary>

<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
</pre></td><td class="code"><pre><span class="k">def</span> <span class="nf">confidence_interval_norm</span><span class="p">(</span><span class="n">sample</span><span class="p">,</span> <span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">):</span>
    <span class="s">"""
    Calculate the confidence interval for the mean of a sample using the normal
    distribution (Z-distribution).

    This function is appropriate when the population standard deviation is
    known, or when the sample size is large (n &gt;= 30), allowing the
    Central Limit Theorem to approximate the sample mean's distribution as normal.

    Parameters:
    sample (numpy.ndarray): The sample data as a NumPy array.
    confidence (float): The desired confidence level (default
    is 0.95 for a 95% confidence interval).

    Returns:
    tuple: Lower and upper bounds of the confidence interval.
    """</span>
    <span class="c1"># Ensure the sample is a NumPy array
</span>    <span class="n">sample</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span>

    <span class="n">sample_mean</span> <span class="o">=</span> <span class="n">sample</span><span class="p">.</span><span class="n">mean</span><span class="p">()</span>
    <span class="c1"># Use Bessel's correction (ddof=1) for sample standard deviation
</span>    <span class="n">sample_std</span> <span class="o">=</span> <span class="n">sample</span><span class="p">.</span><span class="n">std</span><span class="p">(</span><span class="n">ddof</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
    <span class="n">sample_size</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span>
    <span class="n">standard_error</span> <span class="o">=</span> <span class="n">sample_std</span> <span class="o">/</span> <span class="n">np</span><span class="p">.</span><span class="n">sqrt</span><span class="p">(</span><span class="n">sample_size</span><span class="p">)</span>

    <span class="c1"># Determine the critical value for the specified confidence level
</span>    <span class="n">critical_value</span> <span class="o">=</span> <span class="n">st</span><span class="p">.</span><span class="n">norm</span><span class="p">.</span><span class="n">ppf</span><span class="p">((</span><span class="mi">1</span> <span class="o">+</span> <span class="n">confidence</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span><span class="p">)</span>
    <span class="n">margin_of_error</span> <span class="o">=</span> <span class="n">critical_value</span> <span class="o">*</span> <span class="n">standard_error</span>

    <span class="n">lower_bound</span> <span class="o">=</span> <span class="n">sample_mean</span> <span class="o">-</span> <span class="n">margin_of_error</span>
    <span class="n">upper_bound</span> <span class="o">=</span> <span class="n">sample_mean</span> <span class="o">+</span> <span class="n">margin_of_error</span>

    <span class="k">return</span> <span class="n">lower_bound</span><span class="p">,</span> <span class="n">upper_bound</span>
</pre></td></tr></tbody></table></code></pre></figure>

</details>
<p><br />
Let’s compare the results with scipy implementations.</p>

<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
</pre></td><td class="code"><pre><span class="n">sample</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">random_sample</span><span class="p">(</span><span class="mi">10</span><span class="p">)</span>

<span class="k">print</span><span class="p">(</span><span class="s">"defined functions:"</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"tstat:</span><span class="se">\t</span><span class="s">"</span><span class="p">,</span> <span class="n">confidence_interval_t</span><span class="p">(</span><span class="n">sample</span><span class="p">))</span>
<span class="k">print</span><span class="p">(</span><span class="s">"norm:</span><span class="se">\t</span><span class="s">"</span><span class="p">,</span> <span class="n">confidence_interval_norm</span><span class="p">(</span><span class="n">sample</span><span class="p">))</span>

<span class="k">print</span><span class="p">(</span><span class="s">"scipy functions:"</span><span class="p">)</span>
<span class="n">interval</span> <span class="o">=</span> <span class="n">st</span><span class="p">.</span><span class="n">t</span><span class="p">.</span><span class="n">interval</span><span class="p">(</span>
    <span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">,</span> <span class="n">df</span><span class="o">=</span><span class="nb">len</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span> <span class="n">loc</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">mean</span><span class="p">(</span><span class="n">sample</span><span class="p">),</span> <span class="n">scale</span><span class="o">=</span><span class="n">st</span><span class="p">.</span><span class="n">sem</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span>
<span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">"tstat:</span><span class="se">\t</span><span class="s">"</span><span class="p">,</span> <span class="n">interval</span><span class="p">)</span>

<span class="n">interval</span> <span class="o">=</span> <span class="n">st</span><span class="p">.</span><span class="n">norm</span><span class="p">.</span><span class="n">interval</span><span class="p">(</span><span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">,</span> <span class="n">loc</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">mean</span><span class="p">(</span><span class="n">sample</span><span class="p">),</span> <span class="n">scale</span><span class="o">=</span><span class="n">st</span><span class="p">.</span><span class="n">sem</span><span class="p">(</span><span class="n">sample</span><span class="p">))</span>
<span class="k">print</span><span class="p">(</span><span class="s">"norm:</span><span class="se">\t</span><span class="s">"</span><span class="p">,</span> <span class="n">interval</span><span class="p">)</span>

<span class="c1"># defined functions:
# tstat: (0.2756144976802315, 0.7458592632198344)
# norm:	 (0.3070236240157737, 0.7144501368842922)
# scipy functions:
# tstat: (0.2756144976802315, 0.7458592632198344)
# norm:	 (0.3070236240157737, 0.7144501368842922)</span>
</pre></td></tr></tbody></table></code></pre></figure>

<p>The results match. In the following sections, we will use the scipy functions.</p>

<h3 id="ci-t-distribution-vs-ci-z-distribution">CI with t-Distribution vs. CI with Z-Distribution</h3>

<p>We will verify that the confidence intervals from the two methods converge as the sample size increases. We will try both on samples drawn from the following distributions:</p>

<ul>
  <li>Uniform</li>
  <li>Standard normal</li>
  <li>Poisson</li>
</ul>

<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
</pre></td><td class="code"><pre><span class="n">sample_sizes</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">sample_means</span> <span class="o">=</span> <span class="p">[]</span>

<span class="n">t_interval_95_l</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">t_interval_95_r</span> <span class="o">=</span> <span class="p">[]</span>

<span class="n">norm_interval_95_l</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">norm_interval_95_r</span> <span class="o">=</span> <span class="p">[]</span>

<span class="k">for</span> <span class="n">sample_size</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">100</span><span class="p">,</span> <span class="mi">5</span><span class="p">):</span>
    <span class="n">sample</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">random_sample</span><span class="p">(</span><span class="n">sample_size</span><span class="p">)</span>
    <span class="n">t_interval_95</span> <span class="o">=</span> <span class="n">st</span><span class="p">.</span><span class="n">t</span><span class="p">.</span><span class="n">interval</span><span class="p">(</span>
        <span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">,</span> <span class="n">df</span><span class="o">=</span><span class="nb">len</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span> <span class="n">loc</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">mean</span><span class="p">(</span><span class="n">sample</span><span class="p">),</span> <span class="n">scale</span><span class="o">=</span><span class="n">st</span><span class="p">.</span><span class="n">sem</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span>
    <span class="p">)</span>
    <span class="n">norm_interval_95</span> <span class="o">=</span> <span class="n">st</span><span class="p">.</span><span class="n">norm</span><span class="p">.</span><span class="n">interval</span><span class="p">(</span>
        <span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">,</span> <span class="n">loc</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">mean</span><span class="p">(</span><span class="n">sample</span><span class="p">),</span> <span class="n">scale</span><span class="o">=</span><span class="n">st</span><span class="p">.</span><span class="n">sem</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span>
    <span class="p">)</span>
    <span class="n">sample_sizes</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">sample_size</span><span class="p">)</span>
    <span class="n">sample_means</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">mean</span><span class="p">(</span><span class="n">sample</span><span class="p">))</span>
    <span class="n">t_interval_95_l</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">t_interval_95</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
    <span class="n">t_interval_95_r</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">t_interval_95</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
    <span class="n">norm_interval_95_l</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">norm_interval_95</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
    <span class="n">norm_interval_95_r</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">norm_interval_95</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
    <span class="c1"># print(sample_size, t_interval_95, norm_interval_95)
</span>
<span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
<span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">fill_between</span><span class="p">(</span>
    <span class="n">sample_sizes</span><span class="p">,</span> <span class="n">t_interval_95_l</span><span class="p">,</span> <span class="n">t_interval_95_r</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"b"</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.4</span>
<span class="p">)</span>
<span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">fill_between</span><span class="p">(</span>
    <span class="n">sample_sizes</span><span class="p">,</span> <span class="n">norm_interval_95_l</span><span class="p">,</span> <span class="n">norm_interval_95_r</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"r"</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.4</span>
<span class="p">)</span>
<span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">set_xlabel</span><span class="p">(</span><span class="s">"Sample size"</span><span class="p">)</span>
<span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="s">"CI upper and lower bounds"</span><span class="p">)</span>
<span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">set_title</span><span class="p">(</span><span class="s">"Uniform Distribution"</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></figure>

<p>Replace <a href="https://numpy.org/doc/stable/reference/random/generated/numpy.random.random_sample.html"><code class="language-plaintext highlighter-rouge">np.random.random_sample</code></a> with <a href="https://numpy.org/doc/stable/reference/random/generated/numpy.random.standard_normal.html"><code class="language-plaintext highlighter-rouge">np.random.standard_normal</code></a> and <a href="https://numpy.org/doc/stable/reference/random/generated/numpy.random.poisson.html"><code class="language-plaintext highlighter-rouge">np.random.poisson</code></a> to get standard normal and the Poisson random samples.</p>

<p>Here are the results:</p>

<figure class="third t_vs_std_norm gallery-popup">
  
  
  <a href="/assets/2024-09/t_norm_ci_uniform_sample.png" title="Blue part is t-distribution. Red part is standard normal distribution. For each sample drawn from a uniform distribution, the CI bounds are plotted. Blue is wider than red and then they merge as sample size gets large enough.">
    <img src="/assets/2024-09/t_norm_ci_uniform_sample.png" alt="" />
  </a>
  
  
  <a href="/assets/2024-09/t_norm_ci_normal_sample.png" title="Blue part is t-distribution. Red part is standard normal distribution. For each sample drawn from a normal distribution, the CI bounds are plotted. Blue is wider than red and then they merge as sample size gets large enough.">
    <img src="/assets/2024-09/t_norm_ci_normal_sample.png" alt="" />
  </a>
  
  
  <a href="/assets/2024-09/t_norm_ci_poisson_sample.png" title="Blue part is t-distribution. Red part is standard normal distribution. For each sample drawn from a Poisson distribution, the CI bounds are plotted. Blue is wider than red and then they merge as sample size gets large enough.">
    <img src="/assets/2024-09/t_norm_ci_poisson_sample.png" alt="" />
  </a>
  
  
  <figcaption></figcaption>
  
</figure>

<p>In each figure (click to zoom), the blue region corresponds to the t-distribution-based CI, and the red region to the standard-normal-based CI. We can observe that:</p>

<ol>
  <li>t-distribution-based CIs are wider than standard-normal-based CIs.</li>
  <li>As the sample size increases, both converge.</li>
</ol>

<h3 id="ci-width-simulations">CI Width Simulations</h3>

<p>Let’s visualise how the CI width changes with different factors: confidence level, sample size, and standard deviation or variance.</p>

<h4 id="confidence-level---gamma">Confidence Level - \(\gamma\)</h4>

<p>The width of the CI increases as the confidence level increases. Intuition: to be more confident that the interval contains the parameter, we must widen the range between the lower and upper limits.</p>

<details collapse="">
<summary>Code</summary>

<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
</pre></td><td class="code"><pre><span class="n">sample</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">standard_normal</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">1000</span><span class="p">)</span>
<span class="n">sample_mean</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">mean</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span>
<span class="n">sample_sem</span> <span class="o">=</span> <span class="n">st</span><span class="p">.</span><span class="n">sem</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span>

<span class="n">cis</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">t_interval_l</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">t_interval_r</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">ci</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">100</span><span class="p">,</span> <span class="mi">5</span><span class="p">):</span>
    <span class="n">ci</span> <span class="o">=</span> <span class="n">ci</span> <span class="o">*</span> <span class="mf">0.01</span>
    <span class="n">t_interval</span> <span class="o">=</span> <span class="n">st</span><span class="p">.</span><span class="n">t</span><span class="p">.</span><span class="n">interval</span><span class="p">(</span>
        <span class="n">confidence</span><span class="o">=</span><span class="n">ci</span><span class="p">,</span> <span class="n">df</span><span class="o">=</span><span class="nb">len</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span> <span class="n">loc</span><span class="o">=</span><span class="n">sample_mean</span><span class="p">,</span> <span class="n">scale</span><span class="o">=</span><span class="n">sample_sem</span>
    <span class="p">)</span>
    <span class="n">cis</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">ci</span><span class="p">)</span>
    <span class="n">t_interval_l</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">t_interval</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
    <span class="n">t_interval_r</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">t_interval</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>

<span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
<span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">fill_between</span><span class="p">(</span><span class="n">cis</span><span class="p">,</span> <span class="n">t_interval_l</span><span class="p">,</span> <span class="n">t_interval_r</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"g"</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.4</span><span class="p">)</span>
<span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">set_xlabel</span><span class="p">(</span><span class="s">"CI level"</span><span class="p">)</span>
<span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="s">"CI upper and lower bounds"</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></figure>

</details>

<figure class=" ci_width_ci gallery-popup">
  
  
  <a href="/assets/2024-09/ci_level_with_ci.png" title="As the confidence level increases from 0 to 100%, the confidence interval widens. When the confidence level is 0, there will not be any CI. When the confidence level is 100%, the CI will contain all the data.">
    <img src="/assets/2024-09/ci_level_with_ci.png" alt="" />
  </a>
  
  
  <figcaption></figcaption>
  
</figure>

<h4 id="sample-size">Sample Size</h4>

<p>The width of the CI shrinks as the sample size increases. Intuition: a larger sample gives a more confident estimate of the normal distribution's parameters, and thus the confidence interval narrows.</p>

<details collapse="">
<summary>Code</summary>

<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
</pre></td><td class="code"><pre><span class="n">sample_sizes</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">sample_means</span> <span class="o">=</span> <span class="p">[]</span>

<span class="n">t_interval_95_l</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">t_interval_95_r</span> <span class="o">=</span> <span class="p">[]</span>

<span class="n">norm_interval_95_l</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">norm_interval_95_r</span> <span class="o">=</span> <span class="p">[]</span>

<span class="k">for</span> <span class="n">sample_size</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">10000</span><span class="p">,</span> <span class="mi">10</span><span class="p">):</span>
    <span class="n">sample</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">standard_normal</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="n">sample_size</span><span class="p">)</span>
    <span class="n">t_interval_95</span> <span class="o">=</span> <span class="n">st</span><span class="p">.</span><span class="n">t</span><span class="p">.</span><span class="n">interval</span><span class="p">(</span>
        <span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">,</span> <span class="n">df</span><span class="o">=</span><span class="nb">len</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span> <span class="n">loc</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">mean</span><span class="p">(</span><span class="n">sample</span><span class="p">),</span> <span class="n">scale</span><span class="o">=</span><span class="n">st</span><span class="p">.</span><span class="n">sem</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span>
    <span class="p">)</span>
    <span class="n">sample_sizes</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">sample_size</span><span class="p">)</span>
    <span class="n">sample_means</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">mean</span><span class="p">(</span><span class="n">sample</span><span class="p">))</span>
    <span class="n">t_interval_95_l</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">t_interval_95</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
    <span class="n">t_interval_95_r</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">t_interval_95</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>

<span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
<span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">fill_between</span><span class="p">(</span>
    <span class="n">sample_sizes</span><span class="p">,</span> <span class="n">t_interval_95_l</span><span class="p">,</span> <span class="n">t_interval_95_r</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"r"</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.4</span>
<span class="p">)</span>
<span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">set_xlabel</span><span class="p">(</span><span class="s">"Sample size"</span><span class="p">)</span>
<span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="s">"CI upper and lower bounds"</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></figure>

</details>

<figure class=" ci_width_sample_size gallery-popup">
  
  
  <a href="/assets/2024-09/ci_vs_sample_size.png" title="When sample size increases, the confidence interval becomes narrow and more centered around mean, 0.0.">
    <img src="/assets/2024-09/ci_vs_sample_size.png" alt="" />
  </a>
  
  
  <figcaption></figcaption>
  
</figure>

<h4 id="standard-deviation-variance">Standard Deviation (Variance)</h4>

<p>As with the confidence level, increasing the variance widens the CI: a larger variance means more dispersion in the data, which leads to a wider interval.</p>

<details collapse="">
<summary>Code</summary>

<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
</pre></td><td class="code"><pre><span class="n">stds</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">sample_means</span> <span class="o">=</span> <span class="p">[]</span>

<span class="n">t_interval_95_l</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">t_interval_95_r</span> <span class="o">=</span> <span class="p">[]</span>

<span class="n">norm_interval_95_l</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">norm_interval_95_r</span> <span class="o">=</span> <span class="p">[]</span>

<span class="k">for</span> <span class="n">std</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">200</span><span class="p">,</span> <span class="mi">1</span><span class="p">):</span>
    <span class="n">std</span> <span class="o">=</span> <span class="n">std</span> <span class="o">*</span> <span class="mf">0.01</span>
    <span class="n">sample</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">normal</span><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">scale</span><span class="o">=</span><span class="n">std</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">1000</span><span class="p">)</span>
    <span class="n">t_interval_95</span> <span class="o">=</span> <span class="n">st</span><span class="p">.</span><span class="n">t</span><span class="p">.</span><span class="n">interval</span><span class="p">(</span>
        <span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">,</span> <span class="n">df</span><span class="o">=</span><span class="nb">len</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span> <span class="n">loc</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">mean</span><span class="p">(</span><span class="n">sample</span><span class="p">),</span> <span class="n">scale</span><span class="o">=</span><span class="n">st</span><span class="p">.</span><span class="n">sem</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span>
    <span class="p">)</span>
    <span class="n">stds</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">std</span><span class="p">)</span>
    <span class="n">t_interval_95_l</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">t_interval_95</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
    <span class="n">t_interval_95_r</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">t_interval_95</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>

<span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
<span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">fill_between</span><span class="p">(</span><span class="n">stds</span><span class="p">,</span> <span class="n">t_interval_95_l</span><span class="p">,</span> <span class="n">t_interval_95_r</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"r"</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.4</span><span class="p">)</span>
<span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">set_xlabel</span><span class="p">(</span><span class="s">"Standard Deviation"</span><span class="p">)</span>
<span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="s">"CI upper and lower bounds"</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></figure>

</details>

<figure class=" ci_width_std gallery-popup">
  
  
  <a href="/assets/2024-09/ci_vs_stddev.png" title="The x-axis shows the standard deviation of a normal distribution going from 0 to 2. As the standard deviation increases (meaning more vairance in the data,) the 95% confidence interval on the sample widens, and thus, more unreliable.">
    <img src="/assets/2024-09/ci_vs_stddev.png" alt="" />
  </a>
  
  
  <figcaption></figcaption>
  
</figure>

<h3 id="probability-matching">Probability Matching</h3>

<p>The coverage probability will not always equal the nominal coverage probability; when the two match, we have probability matching. In the figure below, 7 out of 100 confidence intervals do not contain the true mean (black). This gives a coverage of 93%, which differs from the nominal 95%, so there is no probability matching.</p>

<details collapse="">
<summary>Code</summary>


<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
</pre></td><td class="code"><pre><span class="n">population</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">standard_normal</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">100000</span><span class="p">)</span>

<span class="n">ci_ids</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">t_interval_l</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">t_interval_r</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">ci_contains_true_value</span> <span class="o">=</span> <span class="p">[]</span>

<span class="n">ci_level</span> <span class="o">=</span> <span class="mf">0.95</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">100</span><span class="p">):</span>
    <span class="n">sample</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">choice</span><span class="p">(</span><span class="n">population</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">10000</span><span class="p">,</span> <span class="n">replace</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
    <span class="n">t_interval</span> <span class="o">=</span> <span class="n">st</span><span class="p">.</span><span class="n">t</span><span class="p">.</span><span class="n">interval</span><span class="p">(</span>
        <span class="n">confidence</span><span class="o">=</span><span class="n">ci_level</span><span class="p">,</span>
        <span class="n">df</span><span class="o">=</span><span class="nb">len</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span>
        <span class="n">loc</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">mean</span><span class="p">(</span><span class="n">sample</span><span class="p">),</span>
        <span class="n">scale</span><span class="o">=</span><span class="n">st</span><span class="p">.</span><span class="n">sem</span><span class="p">(</span><span class="n">sample</span><span class="p">),</span>
    <span class="p">)</span>
    <span class="n">ci_ids</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>
    <span class="n">t_interval_l</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">t_interval</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
    <span class="n">t_interval_r</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">t_interval</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>

    <span class="k">if</span> <span class="n">t_interval</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">&lt;=</span> <span class="mi">0</span> <span class="o">&lt;=</span> <span class="n">t_interval</span><span class="p">[</span><span class="mi">1</span><span class="p">]:</span>
        <span class="n">ci_contains_true_value</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">ci_contains_true_value</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>

<span class="n">cols</span> <span class="o">=</span> <span class="p">[</span><span class="s">"g"</span> <span class="k">if</span> <span class="n">i</span> <span class="o">==</span> <span class="mi">1</span> <span class="k">else</span> <span class="s">"red"</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">ci_contains_true_value</span><span class="p">]</span>
<span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
<span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">vlines</span><span class="p">(</span><span class="n">ci_ids</span><span class="p">,</span> <span class="n">t_interval_l</span><span class="p">,</span> <span class="n">t_interval_r</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="n">cols</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.3</span><span class="p">)</span>
<span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">axhline</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"black"</span><span class="p">)</span>
<span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="s">"CI upper and lower bounds"</span><span class="p">)</span>
<span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">set_xticks</span><span class="p">([])</span>
<span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">set_title</span><span class="p">(</span>
    <span class="sa">f</span><span class="s">"Coverage = </span><span class="si">{</span><span class="nb">sum</span><span class="p">(</span><span class="n">ci_contains_true_value</span><span class="p">)</span> <span class="o">*</span> <span class="mi">100</span> <span class="o">/</span> <span class="nb">len</span><span class="p">(</span><span class="n">ci_contains_true_value</span><span class="p">)</span><span class="si">}</span><span class="s">%"</span>
    <span class="sa">f</span><span class="s">" (Nominal Coverage = </span><span class="si">{</span><span class="n">ci_level</span><span class="o">*</span><span class="mi">100</span><span class="si">}</span><span class="s">%)"</span>
<span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></figure>

</details>

<figure class=" coverage_probab gallery-popup">
  
  
  <a href="/assets/2024-09/coverage_probability_2.png" title="Each line represents a 95% CI on a random sample. Out of 100 confidence intervals, 7 CIs (marked in red) do not contain the true mean (black line.) Thus, we get a coverage of 93%, which is not the same as 95%, hence no probability matching.">
    <img src="/assets/2024-09/coverage_probability_2.png" alt="" />
  </a>
  
  
  <figcaption></figcaption>
  
</figure>

<h3 id="conclusion">Conclusion</h3>

<p>A confidence interval is an interval built from a sample that is expected to contain the distribution parameter we are trying to estimate (e.g., the mean). "Expected" means that not every CI will actually contain the mean. Relative to the population mean, a sample can fall into one of three cases:</p>

<ol>
  <li>It does not contain the mean</li>
  <li>It contains the mean somewhere in the middle</li>
  <li>It contains the mean, but only as an outlier</li>
</ol>

<p>Since the confidence interval is built from this sample using the normal distribution, the CI may fail to contain the mean in the 1st and 3rd scenarios. That is why we take a confidence level of 95% or more: it handles the 3rd scenario (demonstrated in the simulation section).</p>

<p>Since narrower confidence intervals are more informative (and more reliable), we should try to</p>

<ul>
  <li>take higher confidence levels (95% or more);</li>
  <li>have a bigger sample size; and</li>
  <li>have less variance in the data.</li>
</ul>
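<p>These three effects can be sketched together with the same <code class="language-plaintext highlighter-rouge">scipy.stats.t.interval</code> call used throughout this post (a minimal illustration, not code from the simulations above; the sample names are made up for this sketch):</p>

```python
import numpy as np
import scipy.stats as st

rng = np.random.default_rng(0)

def ci_width(sample, confidence=0.95):
    # Width of the t-based confidence interval around the sample mean.
    lo, hi = st.t.interval(
        confidence=confidence,
        df=len(sample) - 1,
        loc=np.mean(sample),
        scale=st.sem(sample),
    )
    return hi - lo

small = rng.normal(loc=0, scale=1, size=100)
large = rng.normal(loc=0, scale=1, size=10_000)
noisy = rng.normal(loc=0, scale=5, size=100)

# Larger sample -> narrower CI; more variance -> wider CI;
# higher confidence level -> wider CI.
print(ci_width(large) < ci_width(small))              # True
print(ci_width(noisy) > ci_width(small))              # True
print(ci_width(small, 0.99) > ci_width(small, 0.95))  # True
```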

<p><br /></p>]]></content><author><name>Shivam Rana</name></author><category term="ML" /><summary type="html"><![CDATA[Confidence Interval (CI)]]></summary></entry><entry><title type="html">Lognormal to Normal Distribution</title><link href="https://trigonaminima.github.io/2024/01/lognormal-to-normal/" rel="alternate" type="text/html" title="Lognormal to Normal Distribution" /><published>2024-01-14T00:00:00+00:00</published><updated>2024-01-14T00:00:00+00:00</updated><id>https://trigonaminima.github.io/2024/01/lognormal-to-normal</id><content type="html" xml:base="https://trigonaminima.github.io/2024/01/lognormal-to-normal/"><![CDATA[<p>The Normal and lognormal distributions are fundamental concepts in statistics. I recently used the relationship between these two distributions in a project. In this blog post, I want to share what I learned.</p>

<p>Outline</p>

<ol>
  <li><a href="#dist">Normal &amp; Lognormal Distributions</a></li>
  <li><a href="#log2normal">Lognormal to Normal</a></li>
  <li><a href="#normal2log">Normal to Lognormal</a></li>
  <li><a href="#conclusion">Conclusion</a></li>
</ol>

<h2 id="normal--lognormal-distributions">Normal &amp; Lognormal Distributions<a name="dist"></a></h2>

<p>The normal distribution is also called the bell curve or Gaussian distribution. The position of the bell's peak marks the mean, and the width of the bell's base represents the spread of values (the standard deviation). Thus, the shape changes as we change mu (\(\mu\)) and sigma (\(\sigma\)): \(\mu\) is the mean or average of the sample, and \(\sigma\) is the standard deviation. We denote a normal distribution as:</p>

\[{\mathcal {N}}(\mu ,\sigma ^{2})\]

<p>Find more details about the normal distribution on <a href="https://en.wikipedia.org/wiki/Normal_distribution">Wikipedia</a>. Here are two ways of defining a normal distribution in Python.</p>

<ul>
  <li>Using python stdlib</li>
</ul>

<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="code"><pre><span class="kn">from</span> <span class="nn">statistics</span> <span class="kn">import</span> <span class="n">NormalDist</span>
<span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span> <span class="o">=</span> <span class="mi">5</span><span class="p">,</span> <span class="p">.</span><span class="mi">5</span>
<span class="n">norm_dist</span> <span class="o">=</span> <span class="n">NormalDist</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></figure>

<ul>
  <li>Using scipy</li>
</ul>

<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="code"><pre><span class="kn">import</span> <span class="nn">scipy.stats</span> <span class="k">as</span> <span class="n">stats</span>
<span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span> <span class="o">=</span> <span class="mi">5</span><span class="p">,</span> <span class="p">.</span><span class="mi">5</span>
<span class="n">norm_dist</span> <span class="o">=</span> <span class="n">stats</span><span class="p">.</span><span class="n">norm</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></figure>

<p><br /></p>

<p>We get a lognormal distribution when we exponentiate a normally distributed variable. The result is a lopsided curve with a longer tail on the right side, where larger values occur. We denote the lognormal distribution as follows:</p>

\[{\displaystyle \ X\sim \operatorname {Lognormal} \left(\ \mu _{x},\sigma _{x}^{2}\ \right)\ }\]

<p>Since the log of the lognormal distribution is a normal distribution, we can denote the relationship as follows:</p>

\[{\displaystyle \ln(X)\sim {\mathcal {N}}(\mu ,\sigma ^{2})}\]

<p>Find more details about the lognormal distribution on <a href="https://en.wikipedia.org/wiki/Log-normal_distribution">Wikipedia</a>. We define a lognormal distribution in Python as follows. The Python stdlib does not have a lognormal implementation.</p>

<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="code"><pre><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">scipy.stats</span> <span class="k">as</span> <span class="n">stats</span>
<span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span> <span class="o">=</span> <span class="mi">5</span><span class="p">,</span> <span class="p">.</span><span class="mi">5</span>
<span class="n">norm_dist</span> <span class="o">=</span> <span class="n">stats</span><span class="p">.</span><span class="n">lognorm</span><span class="p">(</span><span class="n">s</span><span class="o">=</span><span class="n">sigma</span><span class="p">,</span> <span class="n">scale</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">exp</span><span class="p">(</span><span class="n">mu</span><span class="p">))</span>
</pre></td></tr></tbody></table></code></pre></figure>

<p>Note: the <a href="https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.lognorm.html"><code class="language-plaintext highlighter-rouge">scipy.stats.lognorm</code></a> takes mu and sigma of the underlying <em>normal distribution</em> from which we derive the lognormal distribution. While providing the <code class="language-plaintext highlighter-rouge">scale</code> parameter, we take the exponentiation of the mean of the normal distribution. I found the documentation inadequate in explaining the parameters. This <a href="https://stackoverflow.com/q/8870982/2650427">SO question</a> has answers that discuss the meaning of the parameters.</p>
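<p>One way to sanity-check this parameterisation (a quick sketch, not part of the original discussion) is to draw lognormal samples and confirm that their logs recover the underlying normal's mu and sigma:</p>

```python
import numpy as np
import scipy.stats as stats

mu, sigma = 5, 0.5
lognorm_dist = stats.lognorm(s=sigma, scale=np.exp(mu))

# ln(X) of lognormal samples should look like N(mu, sigma^2).
logs = np.log(lognorm_dist.rvs(size=200_000, random_state=0))
print(np.mean(logs))  # close to 5.0
print(np.std(logs))   # close to 0.5
```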

<p><br />
Here is how both the distributions look for the same mu (\(\mu\)) and sigma (\(\sigma\)).</p>

<details close="">
<summary>Code to generate the below plot.</summary>


<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
</pre></td><td class="code"><pre><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">scipy.stats</span> <span class="k">as</span> <span class="n">stats</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>


<span class="c1"># all distributions
</span><span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span> <span class="o">=</span> <span class="mi">5</span><span class="p">,</span> <span class="p">.</span><span class="mi">5</span>
<span class="n">norm_d1</span> <span class="o">=</span> <span class="n">NormalDist</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span><span class="p">)</span>
<span class="n">lognorm_d1</span> <span class="o">=</span> <span class="n">stats</span><span class="p">.</span><span class="n">lognorm</span><span class="p">(</span><span class="n">s</span><span class="o">=</span><span class="n">sigma</span><span class="p">,</span> <span class="n">scale</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">exp</span><span class="p">(</span><span class="n">mu</span><span class="p">))</span>
<span class="n">lognorm_d1</span><span class="p">.</span><span class="n">mu</span><span class="p">,</span> <span class="n">lognorm_d1</span><span class="p">.</span><span class="n">sigma</span> <span class="o">=</span> <span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span>

<span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span> <span class="o">=</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">1</span>
<span class="n">norm_d2</span> <span class="o">=</span> <span class="n">NormalDist</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span><span class="p">)</span>
<span class="n">lognorm_d2</span> <span class="o">=</span> <span class="n">stats</span><span class="p">.</span><span class="n">lognorm</span><span class="p">(</span><span class="n">s</span><span class="o">=</span><span class="n">sigma</span><span class="p">,</span> <span class="n">scale</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">exp</span><span class="p">(</span><span class="n">mu</span><span class="p">))</span>
<span class="n">lognorm_d2</span><span class="p">.</span><span class="n">mu</span><span class="p">,</span> <span class="n">lognorm_d2</span><span class="p">.</span><span class="n">sigma</span> <span class="o">=</span> <span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span>

<span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span> <span class="o">=</span> <span class="mi">4</span><span class="p">,</span> <span class="mf">0.3</span>
<span class="n">norm_d3</span> <span class="o">=</span> <span class="n">NormalDist</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span><span class="p">)</span>
<span class="n">lognorm_d3</span> <span class="o">=</span> <span class="n">stats</span><span class="p">.</span><span class="n">lognorm</span><span class="p">(</span><span class="n">s</span><span class="o">=</span><span class="n">sigma</span><span class="p">,</span> <span class="n">scale</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">exp</span><span class="p">(</span><span class="n">mu</span><span class="p">))</span>
<span class="n">lognorm_d3</span><span class="p">.</span><span class="n">mu</span><span class="p">,</span> <span class="n">lognorm_d3</span><span class="p">.</span><span class="n">sigma</span> <span class="o">=</span> <span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span>

<span class="c1"># norm y
</span><span class="n">norm_x</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="mi">500</span><span class="p">)</span>
<span class="n">norm_y1</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">norm_d1</span><span class="p">.</span><span class="n">pdf</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">norm_x</span><span class="p">])</span>
<span class="n">norm_y2</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">norm_d2</span><span class="p">.</span><span class="n">pdf</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">norm_x</span><span class="p">])</span>
<span class="n">norm_y3</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">norm_d3</span><span class="p">.</span><span class="n">pdf</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">norm_x</span><span class="p">])</span>

<span class="c1"># lognorm y (separate grid so the normal plot keeps its own x-axis)
</span><span class="n">lognorm_x</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">800</span><span class="p">,</span> <span class="mi">500</span><span class="p">)</span>
<span class="n">lognorm_y1</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">lognorm_d1</span><span class="p">.</span><span class="n">pdf</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">lognorm_x</span><span class="p">])</span>
<span class="n">lognorm_y2</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">lognorm_d2</span><span class="p">.</span><span class="n">pdf</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">lognorm_x</span><span class="p">])</span>
<span class="n">lognorm_y3</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">lognorm_d3</span><span class="p">.</span><span class="n">pdf</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">lognorm_x</span><span class="p">])</span>


<span class="c1"># Set the figsize
</span><span class="n">fig1</span><span class="p">,</span> <span class="n">ax1</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">6</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span>
<span class="n">ax1</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">norm_x</span><span class="p">,</span> <span class="n">norm_y1</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="s">"mu = </span><span class="si">{</span><span class="n">norm_d1</span><span class="p">.</span><span class="n">mean</span><span class="si">}</span><span class="s">; sigma = </span><span class="si">{</span><span class="n">norm_d1</span><span class="p">.</span><span class="n">stdev</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
<span class="n">ax1</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">norm_x</span><span class="p">,</span> <span class="n">norm_y2</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="s">"mu = </span><span class="si">{</span><span class="n">norm_d2</span><span class="p">.</span><span class="n">mean</span><span class="si">}</span><span class="s">; sigma = </span><span class="si">{</span><span class="n">norm_d2</span><span class="p">.</span><span class="n">stdev</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
<span class="n">ax1</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">norm_x</span><span class="p">,</span> <span class="n">norm_y3</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="s">"mu = </span><span class="si">{</span><span class="n">norm_d3</span><span class="p">.</span><span class="n">mean</span><span class="si">}</span><span class="s">; sigma = </span><span class="si">{</span><span class="n">norm_d3</span><span class="p">.</span><span class="n">stdev</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
<span class="n">ax1</span><span class="p">.</span><span class="n">legend</span><span class="p">()</span>

<span class="n">fig2</span><span class="p">,</span> <span class="n">ax2</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">6</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span>
<span class="n">ax2</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">lognorm_x</span><span class="p">,</span> <span class="n">lognorm_y1</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="s">"mu = </span><span class="si">{</span><span class="n">lognorm_d1</span><span class="p">.</span><span class="n">mu</span><span class="si">}</span><span class="s">; sigma = </span><span class="si">{</span><span class="n">lognorm_d1</span><span class="p">.</span><span class="n">sigma</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
<span class="n">ax2</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">lognorm_x</span><span class="p">,</span> <span class="n">lognorm_y2</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="s">"mu = </span><span class="si">{</span><span class="n">lognorm_d2</span><span class="p">.</span><span class="n">mu</span><span class="si">}</span><span class="s">; sigma = </span><span class="si">{</span><span class="n">lognorm_d2</span><span class="p">.</span><span class="n">sigma</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
<span class="n">ax2</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">lognorm_x</span><span class="p">,</span> <span class="n">lognorm_y3</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="s">"mu = </span><span class="si">{</span><span class="n">lognorm_d3</span><span class="p">.</span><span class="n">mu</span><span class="si">}</span><span class="s">; sigma = </span><span class="si">{</span><span class="n">lognorm_d3</span><span class="p">.</span><span class="n">sigma</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
<span class="n">ax2</span><span class="p">.</span><span class="n">legend</span><span class="p">()</span>

<span class="n">plt</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>

<span class="n">fig1</span><span class="p">.</span><span class="n">savefig</span><span class="p">(</span><span class="s">'norm_dist.svg'</span><span class="p">,</span> <span class="nb">format</span><span class="o">=</span><span class="s">'svg'</span><span class="p">,</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">1200</span><span class="p">,</span> <span class="n">bbox_inches</span><span class="o">=</span><span class="s">'tight'</span><span class="p">)</span>
<span class="n">fig2</span><span class="p">.</span><span class="n">savefig</span><span class="p">(</span><span class="s">'lognorm_dist.svg'</span><span class="p">,</span> <span class="nb">format</span><span class="o">=</span><span class="s">'svg'</span><span class="p">,</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">1200</span><span class="p">,</span> <span class="n">bbox_inches</span><span class="o">=</span><span class="s">'tight'</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></figure>


For the normal distribution, instead of using <code>NormalDist.pdf()</code>, we can also use <a href="https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.normal.html"><code>numpy.random.Generator.normal</code></a> to draw a sample and plot a histogram of it. Similarly, for the lognormal distribution, instead of <code>stats.lognorm.pdf()</code>, we can use <a href="https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.lognormal.html"><code>numpy.random.Generator.lognormal</code></a>.

</details>
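<p>A minimal sketch of that sampling-based alternative (the seed and parameter values below are illustrative):</p>

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # seed fixed only for reproducibility

mu, sigma = 5, 0.5
norm_samples = rng.normal(mu, sigma, 10_000)
lognorm_samples = rng.lognormal(mu, sigma, 10_000)

# sample statistics approximate the distribution parameters
print(norm_samples.mean(), norm_samples.std())
# lognormal samples are strictly positive
print(lognorm_samples.min())
```

<p>A density-normalised histogram of these samples (e.g. <code>plt.hist(norm_samples, bins=50, density=True)</code>) approximates the pdf curves plotted above.</p>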

<figure class="half distributions gallery-popup">
  
  
  <a href="/assets/2024-01/norm_dist.svg" title="">
    <img src="/assets/2024-01/norm_dist.svg" alt="" />
  </a>
  
  
  <a href="/assets/2024-01/lognorm_dist.svg" title="">
    <img src="/assets/2024-01/lognorm_dist.svg" alt="" />
  </a>
  
  
  <figcaption>Normal and lognormal distributions with different mu and sigma.</figcaption>
  
</figure>

<h2 id="lognormal-to-normal">Lognormal to Normal<a name="log2normal"></a></h2>

<p>As mentioned in the previous section, taking the logarithm of a lognormally distributed variable gives a normally distributed one. So, if \({\displaystyle X\sim \operatorname {Lognormal} \left(\mu ,\sigma ^{2} \right)}\), then \({\displaystyle \ln(X)\sim {\mathcal {N}}(\mu ,\sigma ^{2})}\).</p>

<p>Let us understand this by code.</p>

<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
</pre></td><td class="code"><pre><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>

<span class="n">rng</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">default_rng</span><span class="p">()</span>

<span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span> <span class="o">=</span> <span class="mi">5</span><span class="p">,</span> <span class="p">.</span><span class="mi">5</span>
<span class="n">lognorm_samples</span> <span class="o">=</span> <span class="n">rng</span><span class="p">.</span><span class="n">lognormal</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span><span class="p">,</span> <span class="mi">10000</span><span class="p">)</span>
<span class="c1"># take the log of lognorm samples to derive the normal dist.
</span><span class="n">norm_samples</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">log</span><span class="p">(</span><span class="n">lognorm_samples</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="n">norm_samples</span><span class="p">.</span><span class="n">mean</span><span class="p">(),</span> <span class="n">norm_samples</span><span class="p">.</span><span class="n">std</span><span class="p">())</span>
</pre></td></tr></tbody></table></code></pre></figure>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>5.005339216906491 0.4934326302969564
</code></pre></div></div>

<p>The parameters (mean and std) of the derived normal distribution (line 8) closely match the original parameters we provided to the lognormal dist (line 6).</p>

<details close="">
<summary>Code to generate the below plots</summary>

<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
</pre></td><td class="code"><pre><span class="c1"># log normal dist
</span><span class="n">fig1</span><span class="p">,</span> <span class="n">ax1</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">3</span><span class="p">))</span>
<span class="n">ax1</span><span class="p">.</span><span class="n">hist</span><span class="p">(</span><span class="n">lognorm_samples</span><span class="p">,</span> <span class="n">bins</span><span class="o">=</span><span class="mi">50</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.7</span><span class="p">,</span> <span class="n">density</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"orange"</span><span class="p">)</span>

<span class="n">x1</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">800</span><span class="p">,</span> <span class="mi">500</span><span class="p">)</span>
<span class="n">lognorm_d</span> <span class="o">=</span> <span class="n">stats</span><span class="p">.</span><span class="n">lognorm</span><span class="p">(</span><span class="n">s</span><span class="o">=</span><span class="n">sigma</span><span class="p">,</span> <span class="n">scale</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">exp</span><span class="p">(</span><span class="n">mu</span><span class="p">))</span>
<span class="n">lognorm_y</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">lognorm_d</span><span class="p">.</span><span class="n">pdf</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">x1</span><span class="p">])</span>
<span class="n">ax1</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x1</span><span class="p">,</span> <span class="n">lognorm_y</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="s">"mu = </span><span class="si">{</span><span class="n">mu</span><span class="si">}</span><span class="s">; sigma = </span><span class="si">{</span><span class="n">sigma</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
<span class="n">ax1</span><span class="p">.</span><span class="n">legend</span><span class="p">()</span>

<span class="c1"># normal dist
</span><span class="n">fig2</span><span class="p">,</span> <span class="n">ax2</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">3</span><span class="p">))</span>
<span class="n">ax2</span><span class="p">.</span><span class="n">hist</span><span class="p">(</span><span class="n">norm_samples</span><span class="p">,</span> <span class="n">bins</span><span class="o">=</span><span class="mi">50</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.7</span><span class="p">,</span> <span class="n">density</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"orange"</span><span class="p">)</span>

<span class="n">x2</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">500</span><span class="p">)</span>
<span class="n">norm_d</span> <span class="o">=</span> <span class="n">stats</span><span class="p">.</span><span class="n">norm</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span><span class="p">)</span>
<span class="n">norm_y</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">norm_d</span><span class="p">.</span><span class="n">pdf</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">x2</span><span class="p">])</span>
<span class="n">ax2</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x2</span><span class="p">,</span> <span class="n">norm_y</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="s">"mu = </span><span class="si">{</span><span class="n">mu</span><span class="si">}</span><span class="s">; sigma = </span><span class="si">{</span><span class="n">sigma</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
<span class="n">ax2</span><span class="p">.</span><span class="n">legend</span><span class="p">()</span>

<span class="n">plt</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>
<span class="n">fig1</span><span class="p">.</span><span class="n">savefig</span><span class="p">(</span><span class="s">'lognorm_dist2.svg'</span><span class="p">,</span> <span class="nb">format</span><span class="o">=</span><span class="s">'svg'</span><span class="p">,</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">1200</span><span class="p">,</span> <span class="n">bbox_inches</span><span class="o">=</span><span class="s">'tight'</span><span class="p">)</span>
<span class="n">fig2</span><span class="p">.</span><span class="n">savefig</span><span class="p">(</span><span class="s">'norm_dist2.svg'</span><span class="p">,</span> <span class="nb">format</span><span class="o">=</span><span class="s">'svg'</span><span class="p">,</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">1200</span><span class="p">,</span> <span class="n">bbox_inches</span><span class="o">=</span><span class="s">'tight'</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></figure>

</details>

<figure class="half lognormal_to_normal gallery-popup">
  
  
  <a href="/assets/2024-01/lognorm_dist2.svg" title="">
    <img src="/assets/2024-01/lognorm_dist2.svg" alt="" />
  </a>
  
  
  <a href="/assets/2024-01/norm_dist2.svg" title="">
    <img src="/assets/2024-01/norm_dist2.svg" alt="" />
  </a>
  
  
  <figcaption>Lognormal to Normal conversion.</figcaption>
  
</figure>

<p>Conclusion: to convert from a lognormal to a normal distribution, take the logarithm of the lognormal sample.</p>

<h2 id="normal-to-lognormal">Normal to Lognormal<a name="normal2log"></a></h2>

<p>If the logarithm of a lognormal variable is normally distributed, the reverse also holds: exponentiating a normal variable gives a lognormal one. In notation, if \({\displaystyle Y\sim {\mathcal {N}}(\mu ,\sigma ^{2})}\), then \({\displaystyle \exp(Y)\sim \operatorname {Lognormal} \left(\mu ,\sigma ^{2} \right)}\).</p>

<p>Let’s again understand this through code.</p>

<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
</pre></td><td class="code"><pre><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">scipy.stats</span> <span class="k">as</span> <span class="n">stats</span>

<span class="n">rng</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">default_rng</span><span class="p">()</span>

<span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span> <span class="o">=</span> <span class="mi">5</span><span class="p">,</span> <span class="p">.</span><span class="mi">5</span>
<span class="n">norm_samples</span> <span class="o">=</span> <span class="n">rng</span><span class="p">.</span><span class="n">normal</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span><span class="p">,</span> <span class="mi">10000</span><span class="p">)</span>

<span class="c1"># take the exp of norm samples to derive the lognormal dist.
</span><span class="n">lognorm_samples</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">exp</span><span class="p">(</span><span class="n">norm_samples</span><span class="p">)</span>

<span class="c1"># fit a lognorm distribution to get the mean and std dev
</span><span class="n">shape</span><span class="p">,</span> <span class="n">loc</span><span class="p">,</span> <span class="n">scale</span> <span class="o">=</span> <span class="n">stats</span><span class="p">.</span><span class="n">lognorm</span><span class="p">.</span><span class="n">fit</span><span class="p">(</span><span class="n">lognorm_samples</span><span class="p">)</span>
<span class="n">mean</span><span class="p">,</span> <span class="n">stddev</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">log</span><span class="p">(</span><span class="n">scale</span><span class="p">),</span> <span class="n">shape</span>
<span class="k">print</span><span class="p">(</span><span class="n">mean</span><span class="p">,</span> <span class="n">stddev</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></figure>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>4.984256782660331 0.5067622675605842
</code></pre></div></div>

<p>The parameters (mean and std) of the derived lognormal distribution (line 10) closely match the original parameters we provided to the normal dist (line 7). Note that we used the <code class="language-plaintext highlighter-rouge">scipy.stats.lognorm.fit</code> method to fit the lognorm distribution on the data. It returns three parameters: <code class="language-plaintext highlighter-rouge">shape</code>, <code class="language-plaintext highlighter-rouge">loc</code> and <code class="language-plaintext highlighter-rouge">scale</code>. The <code class="language-plaintext highlighter-rouge">shape</code> is the standard deviation of the underlying normal distribution, and taking the logarithm of the <code class="language-plaintext highlighter-rouge">scale</code> gives its mean. We did not need this step when converting the lognormal to a normal distribution (previous section), because there we could compute the mean and std directly from the log of the samples. Read this <a href="https://stackoverflow.com/a/8748722/2650427">SO answer</a> for more details.</p>
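<p>One caveat, as a hedged sketch: on sampled data, letting <code>fit</code> also estimate <code>loc</code> can drift away from the two-parameter lognormal used in this post. Fixing <code>floc=0</code> recovers it directly (the seed and sample size below are illustrative):</p>

```python
import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(seed=42)  # seed fixed only for reproducibility

mu, sigma = 5, 0.5
lognorm_samples = np.exp(rng.normal(mu, sigma, 10_000))

# fix loc at 0 so the fit matches the two-parameter lognormal
shape, loc, scale = stats.lognorm.fit(lognorm_samples, floc=0)
mean, stddev = np.log(scale), shape
print(mean, stddev)  # close to the original mu = 5 and sigma = 0.5
```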

<details close="">
<summary>Code to generate the below plots</summary>

<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
</pre></td><td class="code"><pre><span class="c1"># normal dist
</span><span class="n">fig1</span><span class="p">,</span> <span class="n">ax1</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">3</span><span class="p">))</span>
<span class="n">ax1</span><span class="p">.</span><span class="n">hist</span><span class="p">(</span><span class="n">norm_samples</span><span class="p">,</span> <span class="n">bins</span><span class="o">=</span><span class="mi">50</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.7</span><span class="p">,</span> <span class="n">density</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"orange"</span><span class="p">)</span>

<span class="n">x1</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">500</span><span class="p">)</span>
<span class="n">norm_d</span> <span class="o">=</span> <span class="n">stats</span><span class="p">.</span><span class="n">norm</span><span class="p">(</span><span class="n">mu</span><span class="p">,</span> <span class="n">sigma</span><span class="p">)</span>
<span class="n">norm_y</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">norm_d</span><span class="p">.</span><span class="n">pdf</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">x1</span><span class="p">])</span>
<span class="n">ax1</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x1</span><span class="p">,</span> <span class="n">norm_y</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="s">"mu = </span><span class="si">{</span><span class="n">mu</span><span class="si">}</span><span class="s">; sigma = </span><span class="si">{</span><span class="n">sigma</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
<span class="n">ax1</span><span class="p">.</span><span class="n">legend</span><span class="p">()</span>

<span class="c1"># lognormal dist
</span><span class="n">fig2</span><span class="p">,</span> <span class="n">ax2</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">3</span><span class="p">))</span>
<span class="n">ax2</span><span class="p">.</span><span class="n">hist</span><span class="p">(</span><span class="n">lognorm_samples</span><span class="p">,</span> <span class="n">bins</span><span class="o">=</span><span class="mi">50</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.7</span><span class="p">,</span> <span class="n">density</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"orange"</span><span class="p">)</span>

<span class="n">x2</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">800</span><span class="p">,</span> <span class="mi">500</span><span class="p">)</span>
<span class="n">lognorm_d</span> <span class="o">=</span> <span class="n">stats</span><span class="p">.</span><span class="n">lognorm</span><span class="p">(</span><span class="n">s</span><span class="o">=</span><span class="n">sigma</span><span class="p">,</span> <span class="n">scale</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">exp</span><span class="p">(</span><span class="n">mu</span><span class="p">))</span>
<span class="n">lognorm_y</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">lognorm_d</span><span class="p">.</span><span class="n">pdf</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">x2</span><span class="p">])</span>
<span class="n">ax2</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x2</span><span class="p">,</span> <span class="n">lognorm_y</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sa">f</span><span class="s">"mu = </span><span class="si">{</span><span class="n">mu</span><span class="si">}</span><span class="s">; sigma = </span><span class="si">{</span><span class="n">sigma</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
<span class="n">ax2</span><span class="p">.</span><span class="n">legend</span><span class="p">()</span>

<span class="n">plt</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>
<span class="n">fig1</span><span class="p">.</span><span class="n">savefig</span><span class="p">(</span><span class="s">'norm_dist3.svg'</span><span class="p">,</span> <span class="nb">format</span><span class="o">=</span><span class="s">'svg'</span><span class="p">,</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">1200</span><span class="p">,</span> <span class="n">bbox_inches</span><span class="o">=</span><span class="s">'tight'</span><span class="p">)</span>
<span class="n">fig2</span><span class="p">.</span><span class="n">savefig</span><span class="p">(</span><span class="s">'lognorm_dist3.svg'</span><span class="p">,</span> <span class="nb">format</span><span class="o">=</span><span class="s">'svg'</span><span class="p">,</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">1200</span><span class="p">,</span> <span class="n">bbox_inches</span><span class="o">=</span><span class="s">'tight'</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></figure>

</details>

<figure class="half normal_to_lognormal gallery-popup">
  
  
  <a href="/assets/2024-01/norm_dist3.svg" title="">
    <img src="/assets/2024-01/norm_dist3.svg" alt="" />
  </a>
  
  
  <a href="/assets/2024-01/lognorm_dist3.svg" title="">
    <img src="/assets/2024-01/lognorm_dist3.svg" alt="" />
  </a>
  
  
  <figcaption>Normal to Lognormal conversion.</figcaption>
  
</figure>

<p>Conclusion: to convert from a normal to a lognormal distribution, take the exponential of the normal sample.</p>

<h2 id="conclusion">Conclusion<a name="conclusion"></a></h2>

<p>We started with the Normal and Lognormal distributions and their definitions in Python, then converted each distribution into the other. It took me some effort to figure out how to do the conversion; with this post, I have tried to clear up the confusion.</p>

<p>If you are interested in how other distributions look, your search is over. This <a href="https://stackoverflow.com/q/37559470/2650427">SO answer</a> has visualisations of all the distributions available in <a href="http://docs.scipy.org/doc/scipy/reference/stats.html">scipy.stats</a>.</p>

<p><strong>Update: 18th Jan</strong>: Someone asked me the following question on reddit.</p>

<blockquote>
  <p>For what purpose are you converting between normal and lognormal? The two functions share the same parameters but thats about it. ln(data) is a non-destructive transformation but the process can obscure patterns just as often as it reveals them. Certain advanced statistical tests that require a normal distribution cannot necessarily have the results applied to the lognormal data.</p>
</blockquote>

<p>This stranger is correct that patterns can be obscured, or rather, that different patterns emerge after a log transformation. In my case, though, it did not matter.</p>

<p>I wanted to match customers with items that fall within their spending range. The formulation was: given the customer and outlet distributions, I can compute their overlap to get a <em>match percentage</em>, which is then applied on top of the relevance scores.</p>
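<p>As a sketch of that idea, the overlap of two normal distributions can be computed with <code>statistics.NormalDist.overlap</code>; the distribution parameters below are hypothetical, not from the actual project:</p>

```python
from statistics import NormalDist

import numpy as np

# hypothetical customer and outlet spend distributions (log scale)
customer = NormalDist(mu=5.0, sigma=0.5)
outlet = NormalDist(mu=5.3, sigma=0.4)

# NormalDist.overlap gives the shared area under the two pdfs (0 to 1)
match_pct = customer.overlap(outlet)

# cross-check with a simple Riemann sum over the pointwise minimum
x = np.linspace(2, 8, 20_000)
pdf_min = np.minimum([customer.pdf(i) for i in x], [outlet.pdf(i) for i in x])
numeric = pdf_min.sum() * (x[1] - x[0])

print(round(match_pct, 3), round(numeric, 3))
```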

<p>Looking at the customers’ spend history, I saw that it was lognormally distributed, and a similar trend held in the restaurants’ order history. Since computing the overlap in the production environment was easier with normal distributions, I was okay with the conversion. I will cover this in more detail in a future post.</p>]]></content><author><name>Shivam Rana</name></author><category term="ML" /><summary type="html"><![CDATA[The Normal and lognormal distributions are fundamental concepts in statistics. I recently used the relationship between these two distributions in a project. In this blog post, I want to share what I learned.]]></summary></entry></feed>