Accessing Built-in Datasets in R
R comes with many built-in datasets which are quite useful while learning R. To begin learning the basics of data visualization in R, we will make use of some of these datasets.
Datasets in the 'datasets' package
Many datasets are included in a package called datasets, which is distributed with R, so they are available as soon as you start an R session. For example, two datasets, cars and pressure, are included in this default datasets package. You can inspect their data with functions such as head(cars), summary(cars), etc.
The following examples show results of calls to these functions:
> head(cars)
  speed dist
1     4    2
2     4   10
3     7    4
4     7   22
5     8   16
6     9   10

> summary(cars)
     speed           dist
 Min.   : 4.0   Min.   :  2.00
 1st Qu.:12.0   1st Qu.: 26.00
 Median :15.0   Median : 36.00
 Mean   :15.4   Mean   : 42.98
 3rd Qu.:19.0   3rd Qu.: 56.00
 Max.   :25.0   Max.   :120.00

> head(pressure)
  temperature pressure
1           0   0.0002
2          20   0.0012
3          40   0.0060
4          60   0.0300
5          80   0.0900
6         100   0.2700

> summary(pressure)
  temperature     pressure
 Min.   :  0   Min.   :  0.0002
 1st Qu.: 90   1st Qu.:  0.1800
 Median :180   Median :  8.8000
 Mean   :180   Mean   :124.3367
 3rd Qu.:270   3rd Qu.:126.5000
 Max.   :360   Max.   :806.0000
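Beyond head() and summary(), a few other base R functions give a quick picture of a dataset's shape and column types. The following is a minimal sketch using only base R; the choice of functions is illustrative and not a complete list.

```r
# Inspect the structure of the built-in cars and pressure data frames.
str(cars)          # compact listing of each column and its type
dim(cars)          # number of rows and number of columns
names(pressure)    # column names of the pressure dataset
nrow(pressure)     # number of observations in pressure
```

str() is often the most convenient first call on an unfamiliar dataset, because it shows the column names, their types, and the first few values in one place.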
To learn more about a dataset, you can use the help function, for example help(cars).
If you want a list of the datasets in all currently attached packages, call the data() function with no arguments.
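The listing can also be restricted to one package and handled programmatically. The sketch below assumes the usual shape of the object returned by data(), namely a results matrix with an "Item" column; the variable names ds and items are illustrative.

```r
# List the datasets shipped in the datasets package.
ds <- data(package = "datasets")

# The results component is a character matrix; the "Item" column
# holds the dataset names.
items <- ds$results[, "Item"]

head(items)              # first few dataset names
"cars" %in% items        # check whether a given dataset is listed
```

Calling data(package = "datasets") interactively, without assigning the result, opens the same listing in a pager window.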
Datasets in Other Packages
Any R package can choose to include datasets. You can access the data from a package using the data() function with the package argument, as follows:
data(datasetname, package="packagename")
For example, the popular MASS package contains datasets such as Cars93. We can access the Cars93 dataset by calling the data() function:
> data(Cars93, package="MASS")
After this call to data(), the Cars93 dataset is available for use in R.
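As a slightly more defensive sketch, the call can be guarded in case MASS is not installed. MASS ships with most R distributions, but that is an assumption about your setup, so the example checks first. The column names used here (Manufacturer, Model, Price) are taken from the Cars93 dataset itself.

```r
# Load Cars93 only if the MASS package is installed.
if (requireNamespace("MASS", quietly = TRUE)) {
  data(Cars93, package = "MASS")   # copies Cars93 into the workspace

  print(dim(Cars93))               # rows and columns
  print(head(Cars93[, c("Manufacturer", "Model", "Price")]))
} else {
  message("MASS is not installed; try install.packages(\"MASS\")")
}
```

The dataset can also be referenced directly as MASS::Cars93 without calling data() first, which avoids creating a copy in the workspace.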