Stratified Random Sampling

Stratified random sampling goes one step further than simple random sampling. It can be used when the population separates naturally into sub-groups, or strata. For example, our population of 500 stocks can be separated into sub-groups, with each group representing an industry such as Energy, Automobiles, FMCG, Retail, Infrastructure, etc. We then take a random sample from each stratum. The number of items drawn from each stratum is proportional to that stratum's share of the population.
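The proportional rule can be written compactly. The symbols below are notation introduced here for illustration and do not come from the original text: N is the population size, N_h is the size of stratum h, n is the total sample size, and n_h is the number of items drawn from stratum h.

```latex
n_h = n \cdot \frac{N_h}{N}, \qquad \sum_{h} n_h = n
```

The second identity holds exactly only before any rounding; when the shares are not whole numbers, some rounding rule is needed to keep the total at n.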

Bond fund managers use stratified random sampling in bond indexing strategies. In a bond indexing strategy, the investor constructs a portfolio that mimics the performance of a bond index such as the Barclays Capital Aggregate Bond Index.

However, a bond index usually contains thousands of bonds, which makes it difficult for the investor to implement a pure indexing strategy, that is, to buy every bond in the index. Even if a manager wanted to do so, it would be very expensive because of the high transaction costs. In such a situation, the investor can use stratified random sampling to construct a portfolio that has the same characteristics as the index. For a bond index, the bonds can be grouped by risk factors such as duration, credit quality, coupon rate, maturity, etc. Then, based on the weight of each group in the population, bonds are drawn from each group to create a sample that replicates the index in terms of these risk factors.

Let’s say there are 1000 bonds in an index. We can classify these bonds based on their duration and the coupon rate. For duration we have two intervals: below 5 years and above 5 years. For coupon we have two intervals: below 8% and above 8%. The following table shows this classification:

|                        | Coupon below 8% | Coupon above 8% |
|------------------------|-----------------|-----------------|
| Duration below 5 years | 250             | 300             |
| Duration above 5 years | 200             | 250             |

As you can see, there are four strata. Suppose we take a sample of 100 bonds. The number of bonds selected from each stratum (cell) is its share of the 1,000 bonds multiplied by 100:

  1. D<5 years, C<8% = 250/1000*100 = 25
  2. D<5 years, C>8% = 300/1000*100 = 30
  3. D>5 years, C<8% = 200/1000*100 = 20
  4. D>5 years, C>8% = 250/1000*100 = 25
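The arithmetic above can be sketched in Python. This is a minimal illustration, not part of the original article: the function name `proportional_allocation` and the cell labels are hypothetical, and the largest-remainder rounding is an added assumption for cases where the shares are not whole numbers (it is not needed for the figures in this example).

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Split sample_size across strata in proportion to stratum size.

    Rounding uses the largest-remainder rule (an assumption made here),
    so the allocations always sum to sample_size.
    """
    population = sum(stratum_sizes.values())
    # Exact, possibly fractional, share for each stratum.
    exact = {k: size * sample_size / population
             for k, size in stratum_sizes.items()}
    # Start from the rounded-down shares.
    alloc = {k: int(v) for k, v in exact.items()}
    # Give the leftover units to the strata with the largest remainders.
    leftover = sample_size - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k],
                          reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc


# Cell counts taken from the table in the text.
strata = {
    ("D<5", "C<8%"): 250,
    ("D<5", "C>8%"): 300,
    ("D>5", "C<8%"): 200,
    ("D>5", "C>8%"): 250,
}

allocation = proportional_allocation(strata, 100)
for cell, n in allocation.items():
    print(cell, n)
```

With the counts from the table, the result matches the four figures listed above: 25, 30, 20 and 25.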

Stratified random sampling ensures that each type of bond is represented in the sample in the right proportion.
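To make the drawing step concrete, here is a small sketch that selects the bonds themselves. It assumes a synthetic universe of 1,000 placeholder bonds with the cell counts from the table; the `id` and `cell` fields and the function name `stratified_sample` are hypothetical and stand in for whatever a real index data set would provide.

```python
import random
from collections import Counter


def stratified_sample(population, stratum_of, allocation, seed=None):
    """Draw allocation[stratum] items at random, without replacement,
    from each stratum of the population."""
    rng = random.Random(seed)
    # Group the population by stratum.
    groups = {}
    for item in population:
        groups.setdefault(stratum_of(item), []).append(item)
    # Simple random sample within each stratum.
    sample = []
    for stratum, n in allocation.items():
        sample.extend(rng.sample(groups[stratum], n))
    return sample


# Synthetic universe: placeholder bonds, not real index data.
cells = {
    ("D<5", "C<8%"): 250,
    ("D<5", "C>8%"): 300,
    ("D>5", "C<8%"): 200,
    ("D>5", "C>8%"): 250,
}
bonds = []
for cell, count in cells.items():
    for _ in range(count):
        bonds.append({"id": len(bonds), "cell": cell})

# Proportional allocation for a sample of 100 out of 1,000 bonds.
allocation = {cell: count * 100 // 1000 for cell, count in cells.items()}

sample = stratified_sample(bonds, lambda b: b["cell"], allocation, seed=42)
print(Counter(b["cell"] for b in sample))
```

Whatever the seed, the sample contains exactly 25, 30, 20 and 25 bonds from the four cells, whereas a simple random sample of 100 bonds would hit those proportions only on average.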
