In R, we can create a bar plot using the barplot() function. A bar plot (or bar graph) is primarily used to compare values. It presents grouped data using rectangular bars whose lengths are proportional to the values they represent.
Let's take our Product Sales data, where we have the Revenue and Gross Margin for each order along with various attributes such as ProductLine, RetailerType, and OrderMethod. We can use this data to create a bar chart that plots the total or average sales as bars on the y-axis, with the x-axis showing one of the factors we are interested in, such as ProductLine.
The first step is to total the sales for each ProductLine using the tapply() function, as shown below:
> sales_by_productline <- tapply(sales$Revenue, sales$ProductLine, sum)
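To see what tapply() returns, here is a minimal sketch on a small made-up data frame. The rows and values are hypothetical and stand in for the real Product Sales data, which is assumed to have at least the Revenue and ProductLine columns.

```r
# Hypothetical sample data; the real sales data set is assumed to
# contain at least these two columns.
sales <- data.frame(
  ProductLine = c("Camping", "Golf", "Camping", "Golf", "Outdoor"),
  Revenue     = c(100, 250, 300, 50, 120)
)

# Apply sum() to Revenue within each ProductLine group
sales_by_productline <- tapply(sales$Revenue, sales$ProductLine, sum)

# The result is a named vector-like array: one total per product line
print(sales_by_productline)
```

The names of the result become the bar labels when it is passed to barplot(), which is why no separate label vector is needed.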
This gives us the sales to be plotted as bars. We can now plot the chart using the barplot() function:
> barplot(sales_by_productline)
The above function call will create the bar chart. We can enhance the chart by adding a title and axis labels.
> barplot(sales_by_productline, main="Sales by ProductLine", xlab="ProductLine",ylab="$")
The resulting bar plot is displayed below:
Adding Colors to Bars
We can give the bars different colors by adding the col argument, which takes a vector of colors. R has some built-in functions that generate vectors of colors: for example, the gray function generates shades of gray, and the rainbow function generates rainbow colors. Alternatively, you can explicitly supply a vector containing color names or codes. In the following example, we extend our bar plot by painting the bars with rainbow colors:
> barplot(sales_by_productline, col=rainbow(5), main="Sales by ProductLine", xlab="ProductLine",ylab="$")
The graph will now look as follows:
To supply explicit colors, you would pass the col argument as col = c("Red", "Green", "Blue", ...). Alternatively, you can specify the hex codes for the colors.
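The options for col can be compared side by side in a short sketch. The totals below are hypothetical stand-ins for sales_by_productline, and the particular gray levels and hex codes are arbitrary choices for illustration.

```r
# Hypothetical totals standing in for the real sales_by_productline
sales_by_productline <- c(Camping = 400, Golf = 300, Outdoor = 120)
n <- length(sales_by_productline)

# Generated palettes: one color per bar
rainbow_cols <- rainbow(n)
gray_cols    <- gray(seq(0.3, 0.9, length.out = n))  # 0 = black, 1 = white

# Explicit colors, by name or by hex code
named_cols <- c("red", "green", "blue")
hex_cols   <- c("#1B9E77", "#D95F02", "#7570B3")

# Any of the vectors above can be passed as col
barplot(sales_by_productline, col = hex_cols,
        main = "Sales by ProductLine", xlab = "ProductLine", ylab = "$")
```

Sizing the palette with length() rather than a fixed number keeps the colors in step with the data if the number of product lines changes.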
The base graphics library provides only basic features for plotting charts. For example, in the above bar chart, we could have plotted mean revenue values instead of totals, and that would have worked fine. However, suppose we also wanted to plot the maximum and minimum sales for each product line as markers above and below the mean bars. That would be complicated to achieve with barplot() alone, but it can be done easily with other plotting libraries such as gplots. We will learn about the gplots library in a separate course.
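Plotting means instead of totals only requires swapping the function passed to tapply(). The sketch below uses a small hypothetical data frame in place of the real Product Sales data.

```r
# Hypothetical sample data standing in for the Product Sales data
sales <- data.frame(
  ProductLine = c("Camping", "Golf", "Camping", "Golf", "Outdoor"),
  Revenue     = c(100, 250, 300, 50, 120)
)

# Same grouping as before, but with mean() instead of sum()
mean_by_productline <- tapply(sales$Revenue, sales$ProductLine, mean)

barplot(mean_by_productline,
        main = "Mean Revenue by ProductLine",
        xlab = "ProductLine", ylab = "$")
```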
- Load the Product Sales data in your R environment
- Create a bar plot containing the Average Gross Margin, grouped by RetailerType
- Color the bars with shades of gray according to their rank, with shorter bars in lighter shades and taller bars in darker shades.