Stratified Random Sampling

Stratified random sampling goes one step further than simple random sampling. It can be used when the population separates naturally into sub-groups, or strata. For example, our population of 500 stocks can be separated into sub-groups, with each group representing an industry such as Energy, Automobiles, FMCG, Retail, or Infrastructure. We then take a random sample from each stratum. The size of the sample taken from each stratum is proportional to the size of that stratum relative to the population.

Bond fund managers use stratified random sampling in bond indexing strategies. In a bond indexing strategy, the investor constructs a portfolio that mimics the performance of a bond index such as the Barclays Capital Aggregate Bond Index.

However, a bond index usually contains thousands of bonds, which makes it difficult for the investor to implement a pure indexing strategy. Even if a manager wanted to do so, it would be very expensive because of the high transaction costs. In such a situation, the investor can use stratified random sampling to construct a portfolio that has the same characteristics as the index. For a bond index, the bonds can be grouped by risk factors such as duration, credit quality, coupon rate, and maturity. Then, based on the weight of each group in the population, a random sample is drawn from each group so that the combined sample replicates the index in terms of these risk factors.
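The grouping-and-sampling procedure described above can be sketched in Python. This is only an illustration under stated assumptions: the function name `stratified_sample`, the toy index, and its credit-rating counts are all made up for the example, and each stratum's allocation is simply rounded to the nearest whole bond.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, sample_size, seed=None):
    """Draw a proportional stratified random sample (illustrative sketch).

    population  : list of items (here, bonds)
    stratum_of  : function that maps an item to its stratum label
    sample_size : total number of items we would like to draw
    """
    rng = random.Random(seed)

    # Step 1: group the population into strata.
    strata = defaultdict(list)
    for item in population:
        strata[stratum_of(item)].append(item)

    # Step 2: draw a simple random sample from each stratum,
    # sized in proportion to the stratum's share of the population.
    total = len(population)
    sample = []
    for members in strata.values():
        # Rounding each stratum separately means the final total
        # can differ slightly from sample_size.
        k = round(len(members) / total * sample_size)
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample


# Hypothetical toy index of 1000 bonds, stratified by credit rating.
counts = {"AAA": 400, "AA": 300, "A": 200, "BBB": 100}
index = [{"id": f"{r}-{i}", "rating": r} for r, c in counts.items() for i in range(c)]

portfolio = stratified_sample(index, lambda bond: bond["rating"], 100, seed=42)
```

Here the sampled portfolio holds 100 bonds, and each rating appears in the same proportion as in the toy index. A real indexing strategy would stratify on several risk factors at once and would also have to decide how to weight the selected bonds, which this sketch does not attempt.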

Let’s say there are 1000 bonds in an index. We can classify these bonds based on their duration and the coupon rate. For duration we have two intervals: below 5 years and above 5 years. For coupon we have two intervals: below 8% and above 8%. The following table shows this classification:

|                        | Coupon below 8% | Coupon above 8% |
|------------------------|-----------------|-----------------|
| Duration below 5 years | 250             | 300             |
| Duration above 5 years | 200             | 250             |

As you can see, there are four strata. Suppose we take a sample of 100 bonds. The number of bonds selected from each cell is that cell's share of the index multiplied by the sample size:

  1. D<5 years, C<8% = 250/1000*100 = 25
  2. D<5 years, C>8% = 300/1000*100 = 30
  3. D>5 years, C<8% = 200/1000*100 = 20
  4. D>5 years, C>8% = 250/1000*100 = 25
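The same arithmetic can be checked with a few lines of Python. This is a minimal sketch: the helper name `proportional_allocation` and the stratum labels are chosen for illustration, and the counts are taken from the table above.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Return the number of items to draw from each stratum,
    in proportion to the stratum's share of the population."""
    total = sum(stratum_sizes.values())
    return {
        label: size * sample_size / total
        for label, size in stratum_sizes.items()
    }


# Bond counts per cell, copied from the classification table.
strata = {
    "D<5 years, C<8%": 250,
    "D<5 years, C>8%": 300,
    "D>5 years, C<8%": 200,
    "D>5 years, C>8%": 250,
}

allocation = proportional_allocation(strata, 100)
for label, n in allocation.items():
    print(f"{label}: {n:.0f}")  # 25, 30, 20, 25, matching the list above
```

In this example every allocation comes out as a whole number. With other stratum sizes the results can be fractional, in which case they need to be rounded and the total adjusted so it still matches the intended sample size.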

Stratified random sampling ensures that each type of bond is represented in the sample in the right proportion.
