ggplot2 - Chart Aesthetics and Position Adjustments in R

ggplot2 - Aesthetics

Generally, when we talk about aesthetics, we mean the attributes of a chart such as color, size and shape. In ggplot2, however, an aesthetic is not just about how something looks but also about which variable is mapped to it. In our earlier examples, we mapped Loan.Quality to color. In the aesthetics layer, we map variables to aesthetics such as the x-position, y-position, color, fill and so on. So, when we want to plot the credit amount on the y-axis, we are essentially mapping the variable Credit.amount to the y aesthetic.

Commonly used aesthetics that we can specify include x, y, color, fill, size, alpha, shape and linetype.

Let's take an example and try out these aesthetics. We will use the German Credit Data and create a scatter plot with custom aesthetics.

  • Map Credit.amount on x-axis
  • Map Duration.of.Credit..in.months. on y-axis
  • Change shape to 21 (filled circle with outline). Use ?shape to learn about different shapes available.
  • Map the Loan.Quality to fill
  • Change the shape size to 3
  • Reduce alpha to 0.5
  • Add a plot title - "Loan Amount Vs Duration"
  • Change x-axis label to "Loan Amount"
  • Change y-axis label to "Duration in Months"

I suggest that you try doing this yourself in your R environment and if things don't work out, use the code provided below:

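g <- ggplot(df,aes(x=Credit.amount,y=Duration.of.Credit..in.months.,fill=Loan.Quality))
# shape, size and alpha are fixed values; labs() adds the title and axis labels
g+geom_point(shape=21,size=3,alpha=0.5)+
labs(title ="Loan Amount Vs Duration", x = "Loan Amount", y = "Duration in Months")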
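The code above maps one aesthetic (fill) to a variable inside aes() and sets the others (shape, size, alpha) to fixed values outside aes(). The difference can be sketched as follows. The small data frame here is made up for illustration and only borrows the column names of the German Credit data; "steelblue" is an arbitrary color choice.

```r
library(ggplot2)

# Made-up stand-in for the German Credit data (illustrative values only)
df <- data.frame(
  Credit.amount = c(1200, 2500, 4000, 800, 6000, 3100),
  Duration.of.Credit..in.months. = c(12, 24, 36, 6, 48, 18),
  Loan.Quality = c("Good", "Bad", "Good", "Good", "Bad", "Good")
)

# Mapped: fill sits inside aes(), so it varies with Loan.Quality and gets a legend
mapped <- ggplot(df, aes(x = Credit.amount,
                         y = Duration.of.Credit..in.months.,
                         fill = Loan.Quality)) +
  geom_point(shape = 21, size = 3, alpha = 0.5)

# Set: fill sits outside aes(), so every point gets the same constant color
fixed <- ggplot(df, aes(x = Credit.amount,
                        y = Duration.of.Credit..in.months.)) +
  geom_point(shape = 21, size = 3, alpha = 0.5, fill = "steelblue")

mapped
fixed
```

A useful rule of thumb: if a value should change with the data, put it inside aes(); if it is the same for every point, pass it directly to the geom.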

Let's plot one more graph and we will also make use of facets to split the data into multiple plots.

  • Map Job to x-axis
  • Map Credit.amount to y-axis
  • Map Job to color so that each job category has a different color.
  • Add jitter to the data. Jittering refers to purposely adding noise to your data. This will be helpful here because on x-axis we have a category. If we simply plot the points, they will all be in one line.
  • Change alpha to 0.5
  • Change size to 3

The following code achieves this:

g <- ggplot(df,aes(x=Job,y=Credit.amount,color=Job))
g+geom_jitter(size=3,alpha=0.5)+
labs(title ="Loan Amount by Job", x = "Job", y = "Loan Amount")
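The code above does not yet split the data into facets. A minimal sketch of how that might look is given below. It assumes we facet by Loan.Quality (the text does not say which variable to use) and it runs on a small made-up data frame that only borrows the column names of the German Credit data.

```r
library(ggplot2)

# Made-up stand-in for the German Credit data (illustrative values only)
df <- data.frame(
  Job = rep(c("Management", "Skilled", "Unskilled"), times = 4),
  Credit.amount = c(5200, 2500, 900, 7100, 3100, 1200,
                    4800, 2200, 1500, 6000, 2700, 800),
  Loan.Quality = rep(c("Good", "Bad"), each = 6)
)

p <- ggplot(df, aes(x = Job, y = Credit.amount, color = Job)) +
  geom_jitter(size = 3, alpha = 0.5) +
  # Assumption: one panel per Loan.Quality value
  facet_wrap(~ Loan.Quality) +
  labs(title = "Loan Amount by Job", x = "Job", y = "Loan Amount")

p
```

facet_wrap() draws one panel per level of the faceting variable, with all panels sharing the same axes by default.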

ggplot2 is designed so that you can customize almost anything in the chart and achieve exactly what you want.

ggplot2 - Position Adjustments

ggplot2 allows us to adjust the position of each geom. To do so, we pass the desired adjustment to the position argument of the geom function. In the previous section, we saw jittering, which is one example of a position adjustment.

We have the following position adjustments available:

  • position_identity - This is the default for most geoms. It means no position adjustment, so we are telling ggplot2 to plot the data as it is.
  • position_jitter - Adds random noise to the position of each point, which helps when a plot is hard to read because of many overlapping points. We can specify width and height as arguments - position_jitter(width = NULL, height = NULL)
  • position_dodge - Dodging preserves the vertical position of a geom while adjusting the horizontal position. Format: position_dodge(width = NULL, preserve = c("total", "single")).
  • position_stack - Stacks bars on top of each other. This is the default for geom_bar and geom_area.
  • position_fill - Stacks bars and standardizes each stack to have constant height.
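As a rough illustration of the position argument, the sketch below draws the same points once with no adjustment and once with horizontal jitter. The data frame is made up and only borrows the column names of the German Credit data; the width of 0.2 and the seed of 42 are arbitrary choices.

```r
library(ggplot2)

# Made-up stand-in for the German Credit data (illustrative values only)
df <- data.frame(
  Job = rep(c("Skilled", "Unskilled"), each = 4),
  Credit.amount = c(1200, 2500, 4000, 800, 6000, 3100, 1500, 2200)
)

base <- ggplot(df, aes(x = Job, y = Credit.amount))

# No adjustment: all points of a category sit on one vertical line
p_identity <- base + geom_point(position = "identity")

# Horizontal jitter only (height = 0 leaves the y values untouched);
# seed makes the random noise reproducible
p_jitter <- base +
  geom_point(position = position_jitter(width = 0.2, height = 0, seed = 42))

p_identity
p_jitter
```

A position can be given either as a string such as "identity" or "jitter", or as a call such as position_jitter(...) when we want to pass arguments to it.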

geom_bar in ggplot2

Let's learn about position adjustments using geom_bar in ggplot2. We will use our German Credit dataset.

Simple Bar Chart

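We will start by plotting a simple bar chart with the borrower's Credit History on the x-axis and the amount of loan taken on the y-axis. Because we use stat="identity" together with the default stacking, the height of each bar is the total credit amount of all loans in that category.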

p <- ggplot(data=df, aes(x=Credit.history, y=Credit.amount)) +
  geom_bar(stat="identity")
p
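For reference, geom_col() is a shorthand for geom_bar(stat = "identity"). The sketch below uses it on a small made-up data frame (the category names and amounts are invented, only the column names follow the German Credit data) and computes the per-category totals that the bar heights correspond to.

```r
library(ggplot2)

# Made-up stand-in for the German Credit data (illustrative values only)
df <- data.frame(
  Credit.history = rep(c("Critical", "Delayed", "Paid duly"), each = 4),
  Credit.amount = c(1200, 2500, 4000, 800, 6000, 3100,
                    1500, 2200, 900, 5000, 2700, 1800)
)

# Same chart as geom_bar(stat = "identity"), written with geom_col()
p_col <- ggplot(df, aes(x = Credit.history, y = Credit.amount)) +
  geom_col()

# Total credit amount per category; each bar should reach this height
totals <- tapply(df$Credit.amount, df$Credit.history, sum)

p_col
totals
```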

This will produce the following chart:

Add Colors to Bars

To add colors to the bars, we can map the x variable to the fill aesthetic as well (fill = Credit.history). ggplot2 will assign a color to each value in Credit.history and fill the bars with that color.

p <- ggplot(data=df, aes(x=Credit.history, y=Credit.amount, fill=Credit.history )) +
  geom_bar(stat="identity")
p

Stacked Bars

We can also group the data by a third variable such as Loan.Quality. This is done by mapping the variable Loan.Quality to the fill aesthetic. ggplot2 then splits the data by that variable and, by default, plots a stacked bar chart, because position_stack is the default position for geom_bar.

p <- ggplot(data=df, aes(x=Credit.history, y=Credit.amount, fill=Loan.Quality )) +
  geom_bar(stat="identity")
p

Dodging

If we do not want the bars to be stacked, we can use position_dodge, which preserves the vertical position of the geom while adjusting the horizontal position, so that the bars are placed side by side.

p <- ggplot(data=df, aes(x=Credit.history, y=Credit.amount, fill=Loan.Quality )) +
  geom_bar(stat="identity",position=position_dodge())
p
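One caveat with dodging and stat="identity": as far as I understand, when there are several rows for the same Credit.history and Loan.Quality combination, the dodged bars for those rows are drawn over one another rather than summed. If we want each dodged bar to show a total, one option is to aggregate the data first. The sketch below does this on a made-up data frame that only borrows the column names of the German Credit data.

```r
library(ggplot2)

# Made-up stand-in for the German Credit data (illustrative values only)
df <- data.frame(
  Credit.history = rep(c("Critical", "Delayed", "Paid duly"), each = 4),
  Credit.amount = c(1200, 2500, 4000, 800, 6000, 3100,
                    1500, 2200, 900, 5000, 2700, 1800),
  Loan.Quality = rep(c("Good", "Bad"), times = 6)
)

# One row per Credit.history / Loan.Quality combination, holding the sum
totals <- aggregate(Credit.amount ~ Credit.history + Loan.Quality,
                    data = df, FUN = sum)

# Each dodged bar now corresponds to exactly one row of totals
p_dodge <- ggplot(totals, aes(x = Credit.history, y = Credit.amount,
                              fill = Loan.Quality)) +
  geom_col(position = position_dodge())

p_dodge
```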

100% Stacked Bar

We can use the position adjustment position_fill to turn this into a 100% stacked bar chart, in which every bar has the same height and the segments show each group's share. This is useful for relative comparison.

p <- ggplot(data=df, aes(x=Credit.history, y=Credit.amount, fill=Loan.Quality )) +
  geom_bar(stat="identity",position=position_fill())
p
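With position_fill the y-axis runs from 0 to 1, so it can help to label it as a percentage. A minimal sketch, using a made-up data frame with the same column names as the German Credit data and label_percent() from the scales package (installed alongside ggplot2):

```r
library(ggplot2)

# Made-up stand-in for the German Credit data (illustrative values only)
df <- data.frame(
  Credit.history = rep(c("Critical", "Delayed", "Paid duly"), each = 4),
  Credit.amount = c(1200, 2500, 4000, 800, 6000, 3100,
                    1500, 2200, 900, 5000, 2700, 1800),
  Loan.Quality = rep(c("Good", "Bad"), times = 6)
)

p_fill <- ggplot(df, aes(x = Credit.history, y = Credit.amount,
                         fill = Loan.Quality)) +
  geom_col(position = position_fill()) +
  # Show 0..1 as 0%..100% on the y-axis
  scale_y_continuous(labels = scales::label_percent()) +
  labs(y = "Share of credit amount")

p_fill
```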

Modifying Scales

In ggplot2, each aesthetic has a scale that controls how data values are mapped to it. So, color has a scale just like the x and y axes do. We can access and modify each scale using the "scale_" functions. For example, we can use scale_fill_manual to manually specify the colors to be used in the plot.
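A minimal sketch of scale_fill_manual is shown here. It assumes that Loan.Quality takes the values "Good" and "Bad" (the actual level names in the German Credit data may differ), and the data frame and color choices are made up for illustration.

```r
library(ggplot2)

# Made-up stand-in for the German Credit data (illustrative values only)
df <- data.frame(
  Credit.history = rep(c("Critical", "Delayed", "Paid duly"), each = 4),
  Credit.amount = c(1200, 2500, 4000, 800, 6000, 3100,
                    1500, 2200, 900, 5000, 2700, 1800),
  Loan.Quality = rep(c("Good", "Bad"), times = 6)
)

p_manual <- ggplot(df, aes(x = Credit.history, y = Credit.amount,
                           fill = Loan.Quality)) +
  geom_col() +
  # Named vector: each (assumed) level of Loan.Quality gets its own color
  scale_fill_manual(values = c("Good" = "steelblue", "Bad" = "firebrick"))

p_manual
```

Naming the entries of values ties each color to a specific level, so the colors do not shift if the order of the levels changes.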
