Creating and Using Vectors in R
In R, a vector is a series of data elements of the same data type, for example, a series of numbers, a series of characters, or a series of logical values. In finance, for example, the daily profit/loss from your stock investments could be represented as a vector. A vector is also referred to as a one-dimensional array.
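An important point to remember is that all the data elements in a vector must be of the same type. If you combine values of different types, R does not keep them as they are; it converts (coerces) them to a single common type, for example by turning numbers into character strings.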
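To see what happens when values of different types end up in one vector, here is a minimal sketch. The variable names `mixed` and `flags_and_numbers` are made up for illustration and are not part of the earnings example used later.

```r
# Mixing numbers with a string: every element becomes a character string
mixed <- c(5, "seven", 8)
class(mixed)               # "character"

# Mixing logical values with a number: TRUE becomes 1 and FALSE becomes 0
flags_and_numbers <- c(TRUE, 7, FALSE)
class(flags_and_numbers)   # "numeric"
flags_and_numbers          # 1 7 0
```

Either way, the resulting vector holds a single data type.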
In R, you can create a vector using the combine function, c(). For example, here is a vector containing three numeric values: 5, 7, and 8.
> c(5,7,8)
[1] 5 7 8
>
Let's say John and Ivan, two colleagues, engage in day trading for a week and want to analyze their daily performance. Their daily profit and loss from the trading activity is shown below:
Day         John   Ivan
Monday       120    -30
Tuesday      -40    -60
Wednesday     25     80
Thursday    -130   -250
Friday       260    200
We can use this data in R by creating vectors and assigning them to variables.
#John's earnings in the week
john_earnings <- c(120,-40,25,-130,260)
#Ivan's earnings in the week.
ivan_earnings <- c(-30,-60,80,-250,200)
#Print John's earnings
john_earnings
#Print Ivan's earnings
ivan_earnings
Run this script in RStudio (Ctrl+Enter) and you will see the results in the R console.
> #John's earnings in the week
> john_earnings <- c(120,-40,25,-130,260)
>
> #Ivan's earnings in the week.
> ivan_earnings <- c(-30,-60,80,-250,200)
>
> #Print John's earnings
> john_earnings
[1] 120 -40 25 -130 260
>
> #Print Ivan's earnings
> ivan_earnings
[1] -30 -60 80 -250 200
>
Now that we have the data, we can start working with it to measure performance.
Vector Operations
We will use this earnings data to understand how to perform some important arithmetic operations on vectors.
Combined Earnings Per Day
We can calculate the combined earnings of John and Ivan by simply adding the two vectors. When you add two vectors, the addition is performed member-wise (i.e., each element in one vector is added to the element at the same index in the other vector).
#John's earnings in the week
john_earnings <- c(120,-40,25,-130,260)
#Ivan's earnings in the week.
ivan_earnings <- c(-30,-60,80,-250,200)
#Calculate total earnings
total_earnings_per_day <- john_earnings + ivan_earnings
#Print total earnings
total_earnings_per_day
Run this script on your computer and compare the results with the ones below:
> #Print total earnings
> total_earnings_per_day
[1] 90 -100 105 -380 460
As you can see, earnings for each day are added and assigned to the new variable.
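To make the member-wise behaviour concrete, the short sketch below checks one day by hand. It assumes the two earnings vectors defined above; indexing with `[1]` is covered further down in the Vector Index section, and `length()` is a base R function that is not used elsewhere in this tutorial.

```r
john_earnings <- c(120, -40, 25, -130, 260)
ivan_earnings <- c(-30, -60, 80, -250, 200)
total_earnings_per_day <- john_earnings + ivan_earnings

# Monday's combined figure is simply John's Monday plus Ivan's Monday
total_earnings_per_day[1] == john_earnings[1] + ivan_earnings[1]   # TRUE

# The result has one element per day, just like the inputs
length(total_earnings_per_day)   # 5
```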
Total Earnings
The next thing we want to know is: "What were their total earnings for the whole week?"
We can calculate this using the sum() function in R, which returns the sum of all elements of a vector.
#John's earnings in the week
john_earnings <- c(120,-40,25,-130,260)
#Ivan's earnings in the week.
ivan_earnings <- c(-30,-60,80,-250,200)
#Calculate John's total earnings for the week
john_total <- sum(john_earnings)
#Calculate Ivan's total earnings for the week
ivan_total <- sum(ivan_earnings)
# calculate total combined earnings for the week
total_earnings <- john_total + ivan_total
#Print individual totals and combined total
john_total
ivan_total
total_earnings
Run this script on your computer and compare the results with the ones below:
> #Print individual totals and combined total
> john_total
[1] 235
> ivan_total
[1] -60
> total_earnings
[1] 175
>
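As a quick cross-check, the combined weekly total can be reached by two routes: totalling each trader first, or adding the daily vectors first and then totalling. The sketch below, which uses the hypothetical names `by_trader` and `by_day`, shows that both routes give the same figure.

```r
john_earnings <- c(120, -40, 25, -130, 260)
ivan_earnings <- c(-30, -60, 80, -250, 200)

# Route 1: total each trader's week, then add the two totals
by_trader <- sum(john_earnings) + sum(ivan_earnings)

# Route 2: add the daily vectors member-wise, then total the result
by_day <- sum(john_earnings + ivan_earnings)

by_trader             # 175
by_day                # 175
by_trader == by_day   # TRUE
```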
Compare Performance
By visually scanning the above numbers, we already know that John has performed much better than Ivan. However, when we deal with large datasets, such visual scans will not be possible, so we can give the job of comparing performance to an R script.
In the following script, we compare performance at a daily level as well as in total. To compare the daily performance, we just have to compare the two vectors. For example, we can ask, "Did John perform better than Ivan each day?" To answer this, we use the comparison operator > on the two vectors. R compares the vectors element by element and returns a logical (Boolean) value, TRUE or FALSE, for each day.
#John's earnings in the week
john_earnings <- c(120,-40,25,-130,260)
#Ivan's earnings in the week.
ivan_earnings <- c(-30,-60,80,-250,200)
#Check if John performed better than Ivan each day
john_vs_ivan <- john_earnings > ivan_earnings
#Calculate John's total earnings for the week
john_total <- sum(john_earnings)
#Calculate Ivan's total earnings for the week
ivan_total <- sum(ivan_earnings)
#Check if John performed better than Ivan overall
overall_performance <- john_total > ivan_total
#Print John's performance vis-a-vis Ivan's each day
john_vs_ivan
#Print overall performance
overall_performance
The results should match our visual understanding of the numbers:
> #Print John's performance vis-a-vis Ivan's each day
> john_vs_ivan
[1] TRUE TRUE FALSE TRUE TRUE
>
> #Print overall performance
> overall_performance
[1] TRUE
>
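With a longer vector, reading a row of TRUE and FALSE values is no easier than reading the raw numbers, so it can help to summarize the comparison. The sketch below is one possible way to do that with the base R functions `sum()`, `which()`, `all()`, and `any()`; the name `days_john_ahead` is made up for illustration.

```r
john_earnings <- c(120, -40, 25, -130, 260)
ivan_earnings <- c(-30, -60, 80, -250, 200)
john_vs_ivan <- john_earnings > ivan_earnings

# TRUE counts as 1 and FALSE as 0, so sum() counts the days John was ahead
days_john_ahead <- sum(john_vs_ivan)
days_john_ahead       # 4

# which() returns the positions (days) where the comparison is TRUE
which(john_vs_ivan)   # 1 2 4 5

# Was John ahead on every day? On at least one day?
all(john_vs_ivan)     # FALSE
any(john_vs_ivan)     # TRUE
```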
Other Arithmetic Operations
Just like addition, the other arithmetic operations (subtraction, multiplication, and division) can be performed on vectors, and they are also applied member-wise. The following example shows the difference between John's and Ivan's earnings each day.
#John's earnings in the week
john_earnings <- c(120,-40,25,-130,260)
#Ivan's earnings in the week.
ivan_earnings <- c(-30,-60,80,-250,200)
#Calculate difference in earnings
difference <- john_earnings - ivan_earnings
#Print the difference
difference
The results will be as follows:
> #Print the difference
> difference
[1] 150 20 -55 120 60
>
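Multiplication and division follow the same pattern. The sketch below uses two made-up scenarios that are not part of the original example: the average of the two traders' earnings per day, and John's earnings if every day's figure were doubled. When one side of the operation is a single number, R applies it to every element of the vector.

```r
john_earnings <- c(120, -40, 25, -130, 260)
ivan_earnings <- c(-30, -60, 80, -250, 200)

# Average earnings of the two traders on each day
average_per_day <- (john_earnings + ivan_earnings) / 2
average_per_day   # 45.0 -50.0 52.5 -190.0 230.0

# Hypothetical: John's earnings if each day's figure were doubled
john_doubled <- john_earnings * 2
john_doubled      # 240 -80 50 -260 520
```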
Vector Index
We can retrieve individual values from a vector by referring to their index. Indexes in R start at 1, so you can select the first element in the vector by typing vector_name[1]. Similarly, you can select multiple values from the vector by using the notations demonstrated below:
> #John's earnings in the week
> john_earnings <- c(120,-40,25,-130,260)
>
> #Ivan's earnings in the week.
> ivan_earnings <- c(-30,-60,80,-250,200)
>
> #Select John's earnings on 3rd day
> john_earnings[3]
[1] 25
>
> #Select John's earnings on 2nd and 5th day
> #You can select multiple values by supplying a vector of indexes
>
> john_earnings[c(2,5)]
[1] -40 260
>
> #Select John's earnings from 2nd to 4th day
> #To produce a vector slice between two indexes,
> #you can use the colon operator ":". Helpful for large vectors.
>
> john_earnings[2:4]
[1] -40 25 -130
>
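Two further indexing forms are sometimes useful alongside the ones above. This is a small sketch that goes slightly beyond the original example: a negative index drops an element instead of selecting it, and `length()` can be used to reach the last element without hard-coding its position.

```r
john_earnings <- c(120, -40, 25, -130, 260)

# A negative index removes that position from the result
john_earnings[-1]                      # -40 25 -130 260

# length() gives the number of elements, so this selects the last day
john_earnings[length(john_earnings)]   # 260
```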
Naming Vector Members
We learned how to refer to the elements of a vector, but it's still difficult to tell which earnings belong to which day. In such situations, we can assign names to the members of the vector. In our case, we can assign the names of the days to the elements. We can do so by using the names() function, as shown below:
#John's earnings in the week
john_earnings <- c(120,-40,25,-130,260)
#Ivan's earnings in the week.
ivan_earnings <- c(-30,-60,80,-250,200)
#Assign the names of the day to John's Earnings vector
names(john_earnings) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
#If you're applying these names to multiple vectors, you can even
#create a new vector with these names and then assign it.
days_in_week <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
#Assign the names of the day to Ivan's Earnings vector
names(ivan_earnings) <- days_in_week
#Print John's and Ivan's earnings
john_earnings
ivan_earnings
Results will now look like below:
> #Print John's and Ivan's earnings
> john_earnings
   Monday   Tuesday Wednesday  Thursday    Friday
      120       -40        25      -130       260
> ivan_earnings
   Monday   Tuesday Wednesday  Thursday    Friday
      -30       -60        80      -250       200
>
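Once a vector has names, the names can also be used as indexes in place of numeric positions. The following sketch assumes John's named earnings vector from above and shows selection by one name and by several names.

```r
john_earnings <- c(120, -40, 25, -130, 260)
names(john_earnings) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# Select a single day by its name (the name is kept in the result)
john_earnings["Wednesday"]

# Select several days by supplying a character vector of names
john_earnings[c("Monday", "Friday")]

# names() without an assignment returns the current names
names(john_earnings)
```

Selecting by name keeps working even if the order of the elements changes, which a hard-coded position such as [3] does not.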
Select Only Positive Earnings
Let's learn a few more things about vectors using our earnings example. Let's say we want to create a new variable, which contains only the positive earnings. In this case we will use the combined earnings vector for John and Ivan.
Step 1: Create a logical vector that tells us on which days we have positive earnings
#John's earnings in the week
john_earnings <- c(120,-40,25,-130,260)
#Ivan's earnings in the week.
ivan_earnings <- c(-30,-60,80,-250,200)
#Total Earnings
total_earnings <- john_earnings + ivan_earnings
#Vector for days in week
days_in_week <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
#Assign the names of the day to Total Earnings vector
names(total_earnings) <- days_in_week
#On which days did we have positive earnings
#To do so we use comparison operators on vectors. The comparison operator
#will compare each element in the vector if the condition stated by the
#comparison operator is TRUE or FALSE
positive_vector <- total_earnings > 0
#Print the positive_vector
positive_vector
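The positive vector should look like this:
> #Print the positive_vector
> positive_vector
   Monday   Tuesday Wednesday  Thursday    Friday
     TRUE     FALSE      TRUE     FALSE      TRUE
>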
Step 2: Use the logical vector from Step 1 to select the positive earnings from the total_earnings vector.
> #Select the positive earning days from the total_earnings vector
> positive_earning_days <- total_earnings[positive_vector]
>
> #Print the positive earnings days
> positive_earning_days
   Monday Wednesday    Friday
       90       105       460
>
In the second step, we used the logical vector as an index to slice our earnings vector. If the logical value is TRUE, the corresponding member is included in the resulting slice; otherwise it is left out.
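The two steps can also be written as a single expression by placing the comparison directly inside the square brackets. The sketch below builds the named total_earnings vector in one call to c() for brevity, which is a shortcut not used in the steps above, and also shows the opposite selection.

```r
# Combined daily earnings, with names assigned directly inside c()
total_earnings <- c(Monday = 90, Tuesday = -100, Wednesday = 105,
                    Thursday = -380, Friday = 460)

# Same selection as Steps 1 and 2, without the intermediate logical vector
positive_earning_days <- total_earnings[total_earnings > 0]
names(positive_earning_days)   # "Monday" "Wednesday" "Friday"

# The opposite selection: days with a combined loss
negative_earning_days <- total_earnings[total_earnings < 0]
names(negative_earning_days)   # "Tuesday" "Thursday"

# Total earned on the positive days only
sum(positive_earning_days)     # 655
```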