Representative Sampling

Statisticians collect data and draw conclusions about hypotheses. In an ideal world with unlimited money, time, and resources, they would survey every member of the population that meets the criteria of the study. Since this is rarely feasible, they do the next best thing: they choose a sample that represents the population and collect data from it. The findings from the sample are then extrapolated to the population.

Representative sampling is not the same as true random sampling, in which each unit has an equal chance of being selected for the sample. Instead, the sample is chosen so that the selected units reflect the characteristics of the population under study.

It is important to document what constitutes the representative sample, that is, the criteria on which the sample is considered representative.

In the pharmaceutical industry, for example, the representative sample for a batch of medicine vials could consist of vials taken from the start, middle, and end of the production run.

Collecting a Representative Sample

The common ways to collect a representative sample are:

  • Simple random sampling
  • Quota sampling
  • Probability sampling
  • Non-probability sampling

Simple random sampling minimizes selection bias. If the selection method involves judgment, it is important to review the selection criteria and the expertise of the person who picks the samples.
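As a rough illustration of simple random sampling, the sketch below draws units without replacement so that every unit has the same chance of selection. The function name simple_random_sample, the sampling frame of IDs 1 to 20,000, and the seed are hypothetical choices made for this example only.

```python
import random

def simple_random_sample(units, n, seed=None):
    """Draw n distinct units; every unit is equally likely to be picked."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(list(units), n)

# Hypothetical sampling frame: respondent IDs 1..20000 (an assumption).
frame = range(1, 20001)
sample = simple_random_sample(frame, 377, seed=42)
print(len(sample), "units drawn, all distinct:", len(set(sample)) == len(sample))
```

Fixing the seed is one simple way to document how the sample was picked, so that the draw can be repeated and audited later.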

The representative sample should be neither too big nor too small. It must also adequately represent the subsets within the population. If the sample is too large, costs, time, and other resource demands escalate, while the extra responses beyond the required sample size add little statistical precision.

Similarly, if the representative sample is too small, the results will not be statistically significant, and the sample will not be truly representative of the population. Decisions based on such a study will be poor because the underlying data is poor.

For a study with a population of 20,000, a confidence level of 95%, a margin of error of 5%, and a population proportion of 50%, the required sample size is 377. This means at least 377 completed surveys are required to be 95% confident that the real value is within ±5% of the surveyed value.
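The figure of 377 can be reproduced with a short calculation. The sketch below assumes the commonly used Cochran formula with a finite population correction and a z-score of 1.96 for 95% confidence. The article does not state which formula it used, so the formula, the function name required_sample_size, and its parameter names are assumptions.

```python
import math

def required_sample_size(population, z=1.96, margin=0.05, proportion=0.5):
    """Sample size for estimating a proportion in a finite population."""
    # Sample size for an infinitely large population (Cochran's formula).
    n0 = (z ** 2) * proportion * (1 - proportion) / margin ** 2
    # Finite population correction shrinks n0 for smaller populations.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)  # round up: a fraction of a survey is not possible

print(required_sample_size(20_000))  # 377
```

Without the correction the same inputs give about 384 responses, so the correction matters little here and becomes more important as the population gets smaller.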

Suppose the study requires responses from both men and women, and the population is split roughly evenly between them. Then the representative sample should include 188 or 189 men and 188 or 189 women. If 300 men and 77 women are surveyed instead, the survey findings will not truly reflect the population.
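One way to arrive at such a split is proportional allocation, where each subgroup receives a share of the sample equal to its share of the population. The sketch below is a minimal version of that idea. The function name allocate_proportionally and the 10,000/10,000 split between men and women are hypothetical, chosen only to match the example above.

```python
def allocate_proportionally(sample_size, strata_counts):
    """Split sample_size across subgroups in proportion to their population counts."""
    total = sum(strata_counts.values())
    exact = {k: sample_size * v / total for k, v in strata_counts.items()}
    # Start from the whole-number part of each exact share.
    alloc = {k: int(x) for k, x in exact.items()}
    leftover = sample_size - sum(alloc.values())
    # Hand the remaining units to the subgroups with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc

# Hypothetical population counts (an assumption, not from the article).
population = {"men": 10_000, "women": 10_000}
quota = allocate_proportionally(377, population)
print(quota)
```

With an even split, one group receives 188 responses and the other 189, because 377 is odd. With uneven population counts, the same function returns quotas that track each group's share.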

Benefits of Representative Sampling

Key benefits of representative sampling are:

  • Statistically significant results
  • Ease of research within the budget
  • Helps back up statements with survey findings extrapolated for the population
  • Allows for further research of sample subsets for a better understanding of the population
  • Helps deliver actionable plans to test products, delight customers and better product offerings

Representative sampling is a practical and useful tool to conduct statistical studies/research on the population under study without compromising on the quality of the study.
