# Stratified Random Sampling

Stratified random sampling goes one step further than simple random sampling. It can be used when the population separates naturally into sub-groups, or strata. For example, our population of 500 stocks can be separated into sub-groups, with each group representing an industry such as Energy, Automobiles, FMCG, Retail, or Infrastructure. From each stratum, we then take a simple random sample. The sample size taken from each stratum depends on the size of that stratum relative to the population.
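The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the function name `stratified_sample`, the ticker labels, and the industry sizes (150, 100, 100, 75, 75) are hypothetical and chosen only so that the 500 stocks divide evenly.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, sample_size, seed=None):
    """Draw a stratified random sample with proportional allocation.

    population  : list of items
    stratum_of  : function mapping an item to its stratum label
    sample_size : total number of items wanted across all strata
    """
    rng = random.Random(seed)

    # Group the population into strata.
    strata = defaultdict(list)
    for item in population:
        strata[stratum_of(item)].append(item)

    # From each stratum, draw a simple random sample whose size is
    # proportional to the stratum's share of the population.
    sample = []
    for members in strata.values():
        k = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population of 500 stocks (industry sizes are assumed).
industry_sizes = {
    "Energy": 150,
    "Automobiles": 100,
    "FMCG": 100,
    "Retail": 75,
    "Infrastructure": 75,
}
stocks = [
    (f"{industry}-{i}", industry)
    for industry, n in industry_sizes.items()
    for i in range(n)
]

sample = stratified_sample(stocks, stratum_of=lambda s: s[1], sample_size=100, seed=42)
```

Because the per-stratum sizes are rounded independently, the total can differ slightly from `sample_size` when the shares are not whole numbers; with the assumed sizes here, every share is exact.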

Bond fund managers use stratified random sampling in bond indexing strategies. In a bond indexing strategy, the investor constructs a portfolio that mimics the performance of a bond index such as the Barclays Capital Aggregate Bond Index.

However, a bond index usually contains thousands of bonds, which makes it difficult for the investor to implement a pure indexing strategy. Even if a manager wanted to hold every bond in the index, doing so would be very expensive because of high transaction costs. In such a situation, the investor can use stratified random sampling to construct a portfolio that has the same characteristics as the index. The bonds in the index can be grouped by risk factors such as duration, credit quality, coupon rate, and maturity. Then, based on the weight of each group in the index, a sample is drawn from each group so that the resulting portfolio replicates the index in terms of these risk factors.

Let’s say there are 1,000 bonds in an index. We can classify these bonds by their duration and coupon rate. For duration we have two intervals: below 5 years and above 5 years. For coupon we have two intervals: below 8% and above 8%. The following table shows this classification:

| | Coupon below 8% | Coupon above 8% |
|---|---|---|
| Duration below 5 years | 250 | 300 |
| Duration above 5 years | 200 | 250 |

As you can see, there are four strata. Suppose we take a sample of 100 bonds. The number of bonds selected from each cell, in proportion to its share of the 1,000 bonds in the index, will be as follows:

- D<5 years, C<8%: (250/1000) × 100 = 25
- D<5 years, C>8%: (300/1000) × 100 = 30
- D>5 years, C<8%: (200/1000) × 100 = 20
- D>5 years, C>8%: (250/1000) × 100 = 25

Stratified random sampling ensures that each type of bond is represented in the sample in the right proportion.
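The allocation arithmetic for the bond example can be reproduced with a short Python sketch. The function name `proportional_allocation` and the stratum labels are hypothetical. The largest-remainder step is an added assumption that is not part of the example in the text; it only matters when the shares are not whole numbers, and it keeps the allocations summing exactly to the requested sample size.

```python
def proportional_allocation(counts, sample_size):
    """Allocate sample_size across strata in proportion to their counts."""
    total = sum(counts.values())

    # Start with the whole-number part of each stratum's exact share.
    alloc = {k: sample_size * n // total for k, n in counts.items()}

    # Give any leftover units to the strata with the largest
    # fractional remainders (largest-remainder method; an assumption here).
    leftover = sample_size - sum(alloc.values())
    by_remainder = sorted(
        counts, key=lambda k: (sample_size * counts[k]) % total, reverse=True
    )
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc


# Cell counts taken from the duration/coupon classification of 1,000 bonds.
bond_counts = {
    ("D<5", "C<8%"): 250,
    ("D<5", "C>8%"): 300,
    ("D>5", "C<8%"): 200,
    ("D>5", "C>8%"): 250,
}

allocation = proportional_allocation(bond_counts, sample_size=100)
for stratum, n in allocation.items():
    print(stratum, n)
```

For the four cells above, the shares are whole numbers, so the result matches the hand calculation: 25, 30, 20, and 25 bonds.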

