Plotting Multiple Datasets on One Chart in R
It is common to plot multiple datasets together on a single graph. For example, we may want to plot the daily returns of multiple stocks on a single chart to see how they trend relative to each other. Similarly, we may want to plot multiple normal distribution curves with different means and standard deviations.
To plot multiple datasets, we first draw a graph with a single dataset using the plot() function. Then we add the second dataset using the points() or lines() function.
Let's learn this with the help of an example where we will plot multiple normal distribution curves.
Generate x-axis data
First, we will generate the data for the x-axis, which will be a sequence of 200 evenly spaced numbers ranging from -5 to 5. We can do this using the seq() function in R.
> x <- seq(-5, 5, length.out=200)
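As a quick sanity check (a small sketch, not part of the original steps), we can confirm that the sequence has 200 points, spans -5 to 5, and is evenly spaced. Note that 200 points means 199 gaps, so the spacing is 10/199 rather than exactly 0.05.

```r
# Generate 200 evenly spaced values from -5 to 5
x <- seq(-5, 5, length.out = 200)

length(x)        # number of points
range(x)         # smallest and largest value
step <- diff(x)  # gaps between consecutive points
all.equal(min(step), max(step))  # TRUE when the spacing is uniform
```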
Calculate Values for Normal Distribution
We can calculate the values in two ways: 1) generate random numbers using rnorm() and then apply the density() function to the data, or 2) compute them directly using the dnorm() function, which gives the density of the normal distribution. In simpler terms, dnorm() gives the height of the probability density curve at each point for a given mean and standard deviation. For our purpose, we will use dnorm() to generate multiple datasets with different means and standard deviations.
> y1<-dnorm(x,mean=0,sd=0.2)
> y2<-dnorm(x,mean=2,sd=0.5)
> y3<-dnorm(x,mean=-2,sd=0.8)
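For comparison, the first approach (rnorm() followed by density()) can be sketched as follows. This is a minimal illustration, assuming a sample size of 10,000 and a fixed seed; both are arbitrary choices and not from the original text. Because the density is estimated from random draws, the resulting curve only approximates the exact curve that dnorm() gives.

```r
set.seed(42)  # arbitrary seed, used only to make the sketch reproducible

# Random draws from a normal distribution with mean 2 and sd 0.5
samples <- rnorm(10000, mean = 2, sd = 0.5)
est <- density(samples)  # kernel density estimate with components $x and $y

# Exact density on a fixed grid, for comparison
x <- seq(-5, 5, length.out = 200)
y2 <- dnorm(x, mean = 2, sd = 0.5)

plot(x, y2, type = "l", main = "Exact vs. Estimated Density", xlab = "x", ylab = "y")
lines(est$x, est$y, lty = 2, col = "red")  # estimated curve overlaid on the exact one
```

The dnorm() approach is usually the simpler choice when the distribution is known, since it needs no sampling and produces a smooth curve.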
Combine Datasets
We can also combine all the data into a single dataframe (optional).
> data<-data.frame(x,y1,y2,y3)
Plot the First Curve
> plot(data$x,data$y1,type="l",main="Normal Distribution",xlab="x",ylab="y")
The plot looks as follows:
[Figure: Normal Distribution]
Add Lines for the Second and Third Normal Densities
We can now add the lines for the second and third densities using the lines() function.
> lines(data$x,data$y2,lty=2,lwd=2,col="green")
> lines(data$x,data$y3,lty=3,lwd=2,col="blue")
The resulting graph shows all three curves on the same chart.
Note that if we were plotting just a scatter plot without lines, we could add more data points to it using the points() function instead of the lines() function.
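As a rough sketch of the points() variant, the example below plots one set of points and then adds a second set to the same chart. The data (x_a, y_a, x_b, y_b) are invented purely for illustration.

```r
# Two small, made-up datasets (hypothetical values for illustration)
x_a <- 1:10
y_a <- x_a^2
x_b <- 1:10
y_b <- 10 * x_b

# Draw the first dataset as a scatter plot (points are the default plot type)
plot(x_a, y_a, pch = 16, col = "black", main = "Two Datasets", xlab = "x", ylab = "y")

# Add the second dataset to the existing plot
points(x_b, y_b, pch = 17, col = "blue")
```

Different pch and col values are used here so that the two datasets remain distinguishable.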
Setting Canvas Size
When adding multiple datasets to a single plot, it is important to size the canvas correctly. Suppose the first dataset you plot has x values ranging from 0 to 100. R will draw the x-axis to fit that 0-100 range. Now suppose the second dataset has x values ranging from 0 to 200. Because the axis limits were fixed by the first plot, any points from the second dataset that lie beyond 100 fall outside the plot region and are cut off. To avoid this, set the axis limits when creating the initial plot, using the xlim and ylim arguments.
Suppose we want to plot two datasets (x1,y1) and (x2,y2). We can compute limits that cover both datasets using the range() function and then pass them to plot() through xlim and ylim.
> xlim <- range(c(x1,x2))
> ylim <- range(c(y1,y2))
> plot(x1, y1, type="l", xlim=xlim, ylim=ylim)
> lines(x2, y2, lty="dashed")
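To make this concrete, here is a small sketch using made-up data that matches the scenario described above: the first dataset spans x values from 0 to 100 and the second spans 0 to 200. The datasets, and the choice of sqrt() and log1p() as curves, are assumptions for illustration only.

```r
# First dataset: x from 0 to 100 (hypothetical data)
x1 <- seq(0, 100, by = 5)
y1 <- sqrt(x1)

# Second dataset: x from 0 to 200 (hypothetical data)
x2 <- seq(0, 200, by = 5)
y2 <- 3 * log1p(x2)

# Compute limits that cover both datasets
x_limits <- range(c(x1, x2))
y_limits <- range(c(y1, y2))

# Set the limits on the first plot so the second dataset fits
plot(x1, y1, type = "l", xlim = x_limits, ylim = y_limits, xlab = "x", ylab = "y")
lines(x2, y2, lty = "dashed")
```

Without the xlim and ylim arguments, the dashed line would be cut off beyond x = 100 and above the largest y1 value.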