Installing and Loading the data.table Package
Before we can use the functions in the data.table package, we need to install and load it in R. We can do so using the install.packages() and library() commands.
> install.packages("data.table")
> library(data.table)
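If the script may run on a machine where data.table is already installed, we may prefer to skip the reinstall. A minimal sketch, assuming a CRAN mirror is configured for the case where the package is missing:

```r
# Install data.table only when it is not already available.
# Assumption: a CRAN mirror is configured if installation is needed.
if (!requireNamespace("data.table", quietly = TRUE)) {
  install.packages("data.table")
}

# Attach the package so that fread() and friends are available.
library(data.table)
```

requireNamespace() returns TRUE or FALSE without attaching the package, which makes it a convenient guard in front of install.packages().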
Importing Data Using fread
Once the package is loaded, we can use the fread function to read the data as shown below:
> mydata <- fread("GS-Stock-Prices.txt")
> mydata
          Time   Open   High    Low   Last  Volume
 1:  1/24/2017 231.86 236.06 230.84 233.68 4448100
 2:  1/23/2017 231.86 233.75 230.75 232.67 3136100
 3:  1/20/2017 231.62 233.23 230.54 232.20 5211800
 4:  1/19/2017 234.07 234.75 230.62 231.41 4561800
 5:  1/18/2017 236.00 237.69 231.52 234.29 7590400
 6:  1/17/2017 242.94 243.06 235.61 235.74 6277100
 7:  1/13/2017 245.43 247.77 242.91 244.30 4186000
 8:  1/12/2017 245.06 245.47 241.57 243.84 4022300
 9:  1/11/2017 242.77 245.84 242.00 245.76 3532500
10:  1/10/2017 240.87 243.44 239.05 242.57 3432900
11:   1/9/2017 243.25 244.69 241.47 242.89 3022700
12:   1/6/2017 242.29 246.20 241.37 244.90 3591000
13:   1/5/2017 242.72 243.23 236.78 241.32 3562600
14:   1/4/2017 241.44 243.32 240.03 243.13 2728700
15:   1/3/2017 242.70 244.97 237.97 241.57 4384200
16: 12/30/2016 238.51 240.50 237.40 239.45 2355500
17: 12/29/2016 240.75 241.07 236.64 238.18 2619000
18: 12/28/2016 243.69 244.50 240.44 240.65 3052900
19: 12/27/2016 241.95 242.59 240.40 241.56 1998100
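To try the import without the original file, we can build a small stand-in and read it back. The sketch below copies the first three rows from the output above into a temporary file. The tab separator and the name sample_rows are assumptions for illustration, since the delimiter of the real file is not shown.

```r
library(data.table)

# Hypothetical stand-in for GS-Stock-Prices.txt, using the first three
# rows of the output shown above.
sample_rows <- data.table(
  Time   = c("1/24/2017", "1/23/2017", "1/20/2017"),
  Open   = c(231.86, 231.86, 231.62),
  High   = c(236.06, 233.75, 233.23),
  Low    = c(230.84, 230.75, 230.54),
  Last   = c(233.68, 232.67, 232.20),
  Volume = c(4448100L, 3136100L, 5211800L)
)

# Write the rows to a temporary tab-separated file (assumed delimiter).
tmp <- tempfile(fileext = ".txt")
fwrite(sample_rows, tmp, sep = "\t")

# fread() detects the separator and the header row on its own.
mydata <- fread(tmp)

class(mydata)          # the result is a data.table, which is also a data.frame
dim(mydata)            # number of rows and columns
sapply(mydata, class)  # column types chosen by fread()
```

Checking the column types right after the import is a useful habit, because dates written as month/day/year may be read as plain text rather than as dates.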
fread - Drop and Select
The fread command has two special arguments, drop and select, which specify the columns to exclude from or include in the import. Each accepts either column positions or column names.
Our dataset has six columns, and we can use these arguments to keep only the ones we want. Some examples are shown below:
# Drop columns 2 to 4; import only Time, Last and Volume
fread("GS-Stock-Prices.txt", drop = 2:4)
# Import only columns 1 and 5, i.e., Time and Last
fread("GS-Stock-Prices.txt", select = c(1, 5))
# Drop the 'Open', 'Last' and 'Volume' columns
fread("GS-Stock-Prices.txt", drop = c("Open", "Last", "Volume"))
# Import only the 'Time' and 'Last' columns
fread("GS-Stock-Prices.txt", select = c("Time", "Last"))
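A quick way to confirm that drop and select behave as expected is to compare the column names of the results. The sketch below uses a two-row temporary file with the same six headers. The tab delimiter and the names by_position and by_name are assumptions for illustration.

```r
library(data.table)

# Hypothetical two-row sample with the same headers as the dataset above.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "Time\tOpen\tHigh\tLow\tLast\tVolume",
  "1/24/2017\t231.86\t236.06\t230.84\t233.68\t4448100",
  "1/23/2017\t231.86\t233.75\t230.75\t232.67\t3136100"
), tmp)

# Dropping columns 2 to 4 by position ...
by_position <- fread(tmp, drop = 2:4)

# ... should leave the same columns as selecting the other three by name.
by_name <- fread(tmp, select = c("Time", "Last", "Volume"))

names(by_position)
identical(names(by_position), names(by_name))
```

Selecting by name tends to be more robust than selecting by position, since the call keeps working if the column order in the file changes.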