Importing Data Using data.table – fread in R
R has a package called data.table, which is widely used for data manipulation and is especially useful for cleaning large datasets.
The data.table package comes with a function called fread, a fast and efficient function for reading data from files. It is similar to read.table but faster and more convenient. It detects column types (colClasses) and separators (sep) automatically, although you can always specify them manually. It also detects whether the file has a header row and uses it to name the columns. If no header is found, it assigns default names (V1, V2, and so on).
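As a rough illustration of the automatic detection described above, the sketch below reads a small comma-separated file three ways: with everything detected, with sep, header and colClasses given explicitly, and without a header row. The temporary files and their contents are made up for this example and are not the tutorial's dataset.

```r
library(data.table)

# Hypothetical sample data in a temporary file (not the tutorial's own file).
with_header <- tempfile(fileext = ".txt")
writeLines(c(
  "Time,Open,Last,Volume",
  "1/24/2017,231.86,233.68,4448100",
  "1/23/2017,231.86,232.67,3136100"
), with_header)

# Let fread detect the separator, header row and column types.
auto <- fread(with_header)

# Override the detection: state the separator and header explicitly,
# and read Volume as numeric (double) instead of the detected integer.
manual <- fread(with_header, sep = ",", header = TRUE,
                colClasses = c(Volume = "numeric"))

# The same rows without a header line: fread assigns default column names.
no_header <- tempfile(fileext = ".txt")
writeLines(c(
  "1/24/2017,231.86,233.68,4448100",
  "1/23/2017,231.86,232.67,3136100"
), no_header)
unnamed <- fread(no_header)

sapply(auto, class)    # detected column types
sapply(manual, class)  # Volume is now numeric
names(unnamed)         # V1, V2, V3, V4
```

Passing a named vector to colClasses overrides only the listed columns, so the remaining columns are still detected automatically.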
Installing and Loading data.table Package
Before we can use the functions in the data.table package, we need to install and load it in R. We can do so using the install.packages() and library() commands.
> install.packages("data.table")
> library(data.table)
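In a script that is run repeatedly, it may be convenient to skip the installation when the package is already present. The following is one possible sketch of that pattern using base R's requireNamespace(); it is an optional variation, not a required step.

```r
# Install data.table only if it is not already available, then load it.
if (!requireNamespace("data.table", quietly = TRUE)) {
  install.packages("data.table")
}
library(data.table)

# Show which version was loaded.
packageVersion("data.table")
```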
Importing Data Using fread
Once the package is loaded, we can use the fread function to read the data as shown below:
> mydata <- fread("GS-Stock-Prices.txt")
> mydata
          Time   Open   High    Low   Last  Volume
 1:  1/24/2017 231.86 236.06 230.84 233.68 4448100
 2:  1/23/2017 231.86 233.75 230.75 232.67 3136100
 3:  1/20/2017 231.62 233.23 230.54 232.20 5211800
 4:  1/19/2017 234.07 234.75 230.62 231.41 4561800
 5:  1/18/2017 236.00 237.69 231.52 234.29 7590400
 6:  1/17/2017 242.94 243.06 235.61 235.74 6277100
 7:  1/13/2017 245.43 247.77 242.91 244.30 4186000
 8:  1/12/2017 245.06 245.47 241.57 243.84 4022300
 9:  1/11/2017 242.77 245.84 242.00 245.76 3532500
10:  1/10/2017 240.87 243.44 239.05 242.57 3432900
11:   1/9/2017 243.25 244.69 241.47 242.89 3022700
12:   1/6/2017 242.29 246.20 241.37 244.90 3591000
13:   1/5/2017 242.72 243.23 236.78 241.32 3562600
14:   1/4/2017 241.44 243.32 240.03 243.13 2728700
15:   1/3/2017 242.70 244.97 237.97 241.57 4384200
16: 12/30/2016 238.51 240.50 237.40 239.45 2355500
17: 12/29/2016 240.75 241.07 236.64 238.18 2619000
18: 12/28/2016 243.69 244.50 240.44 240.65 3052900
19: 12/27/2016 241.95 242.59 240.40 241.56 1998100
>
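The object returned by fread is a data.table, which is also a data.frame. The sketch below checks this and shows two further fread arguments, nrows and data.table, that limit the rows read or return a plain data.frame. The rows are copied from the output above, but the comma separator and the temporary file are assumptions made so that the example runs on its own.

```r
library(data.table)

# Assumed layout: a comma-separated copy of the first rows shown above.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "Time,Open,High,Low,Last,Volume",
  "1/24/2017,231.86,236.06,230.84,233.68,4448100",
  "1/23/2017,231.86,233.75,230.75,232.67,3136100",
  "1/20/2017,231.62,233.23,230.54,232.20,5211800"
), tmp)

mydata <- fread(tmp)
class(mydata)  # a data.table, which is also a data.frame

# Read only the first two data rows.
first_two <- fread(tmp, nrows = 2)

# Return a plain data.frame instead of a data.table.
as_df <- fread(tmp, data.table = FALSE)
```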
fread - Drop and Select
The fread command has two special arguments, drop and select, which can be used to drop or select the columns we want to import.
Our dataset has six columns, and we can use these arguments to keep or discard any of them, either by position or by name. Some examples are shown below:
# Drop columns 2 to 4. Import only Time, Last and Volume
fread("GS-Stock-Prices.txt", drop = 2:4)
# Import only columns 1 and 5, i.e., Time and Last
fread("GS-Stock-Prices.txt", select = c(1, 5))
# Drop the 'Open', 'Last' and 'Volume' columns
fread("GS-Stock-Prices.txt", drop = c("Open", "Last", "Volume"))
# Import only the 'Time' and 'Last' columns
fread("GS-Stock-Prices.txt", select = c("Time", "Last"))
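To see what each call keeps, the sketch below runs the same four drop and select variants and prints the resulting column names. It uses a small temporary file with the six column names from the dataset; the comma separator and the file itself are assumptions for illustration.

```r
library(data.table)

# Assumed layout: a comma-separated file with the dataset's six columns.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "Time,Open,High,Low,Last,Volume",
  "1/24/2017,231.86,236.06,230.84,233.68,4448100",
  "1/23/2017,231.86,233.75,230.75,232.67,3136100"
), tmp)

drop_by_index   <- fread(tmp, drop = 2:4)
select_by_index <- fread(tmp, select = c(1, 5))
drop_by_name    <- fread(tmp, drop = c("Open", "Last", "Volume"))
select_by_name  <- fread(tmp, select = c("Time", "Last"))

names(drop_by_index)    # "Time" "Last" "Volume"
names(select_by_index)  # "Time" "Last"
names(drop_by_name)     # "Time" "High" "Low"
names(select_by_name)   # "Time" "Last"
```

Selecting by name is usually more robust than selecting by position, since it keeps working if the column order in the file changes.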