stat_summary for Statistical Summary in ggplot2 R

stat_summary() is a flexible statistical function that lets you choose how the y values at each x position are summarised. With it, you can add a variety of summaries to your plots. For example, in a bar chart, you can draw the bars from a summary statistic such as the mean or median. Similarly, stat_summary() can be used to add mean or median points to a dot plot.

stat_summary() takes several arguments that control the summary.

  • fun: A function to produce the y aesthetic (named fun.y before ggplot2 3.3.0)
  • fun.max: A function to produce the ymax aesthetic (formerly fun.ymax)
  • fun.min: A function to produce the ymin aesthetic (formerly fun.ymin)
  • fun.data: A function to produce a named vector of aesthetics (ymin, y and ymax)

We can pass a function to each of these arguments, and ggplot2 will use the value returned by that function for the corresponding aesthetic. If you pass a function to fun.data, you can compute several summary statistics at once and return them together, with each element named for the aesthetic it should be used for. The ggplot2 documentation describes this return value as a data frame with the variables ymin, y and ymax.
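As a minimal sketch of the single-value arguments, the following uses the built-in mtcars dataset (an assumption for illustration, not the stock data used in this lesson) and the argument names fun, fun.min and fun.max from ggplot2 3.3.0 and later, which replace fun.y, fun.ymin and fun.ymax. The object names p and summary_values are hypothetical.

```r
library(ggplot2)

# mtcars ships with R; cyl and mpg stand in for any group/value pair.
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  stat_summary(
    fun = mean,      # y: mean mpg within each cylinder group
    fun.min = min,   # ymin: smallest mpg in the group
    fun.max = max,   # ymax: largest mpg in the group
    geom = "pointrange"
  )

# layer_data() returns the values ggplot2 computed for the layer,
# which is a convenient way to inspect what each function produced.
summary_values <- layer_data(p)
summary_values[, c("x", "ymin", "y", "ymax")]
```

Each function receives the y values of one group and returns a single number, so any function with that shape can be swapped in.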

Let's understand this with two examples:

Bar Chart with Median Values

We will use the stock_prices.tidy data frame we created earlier to plot a bar chart with the stock symbols on the x-axis and the median price of each stock on the y-axis. We can achieve this using the stat_summary() function as follows:

library(ggplot2)
# fun replaces fun.y, which is deprecated since ggplot2 3.3.0
ggplot(stock_prices.tidy, aes(x = Symbol, y = Prices, fill = Symbol)) +
  stat_summary(fun = median, geom = "bar")

Quartile Points

The following example plots quartile points for each stock. We first create a new function that calculates the quartiles and then supply that function as the fun.data argument of stat_summary().

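# Return the 25th, 50th and 75th percentiles, named for the
# aesthetics that the pointrange geom expects
median.quartile <- function(x) {
  out <- quantile(x, probs = c(0.25, 0.5, 0.75))
  names(out) <- c("ymin", "y", "ymax")
  return(out)
}
# The point marks the median; the line spans the lower to upper quartile
ggplot(stock_prices.tidy, aes(x = Symbol, y = Prices, col = Symbol)) +
  stat_summary(fun.data = median.quartile, geom = "pointrange")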
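The ggplot2 documentation describes fun.data as returning a data frame with the variables ymin, y and ymax. A minimal sketch of that variant follows, assuming the built-in mtcars dataset in place of the stock data; median_quartile_df and p are hypothetical names introduced only for this illustration.

```r
library(ggplot2)

# Hypothetical helper: the same three quartiles as median.quartile,
# returned as a one-row data frame instead of a named vector.
median_quartile_df <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  data.frame(ymin = q[1], y = q[2], ymax = q[3])
}

# Calling the helper directly shows what stat_summary() will receive
median_quartile_df(mtcars$mpg)

# mtcars is used here only because it ships with R
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  stat_summary(fun.data = median_quartile_df, geom = "pointrange")
p
```

Calling the helper on a plain vector before plotting is an easy way to confirm that the column names match the aesthetics the geom needs.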
