R comes with many built-in datasets, which are quite useful while learning R. To begin learning the basics of data visualization in R, we will use some of these datasets.
Datasets in the ‘datasets’ Package
Many datasets are included in a package called datasets, which is distributed with R, so these datasets are immediately available for use. For example, the cars and pressure datasets are both included in this default package. You can inspect their data with functions such as head(cars) and summary(cars). The following examples show the results of calls to these functions:
> head(cars)
  speed dist
1     4    2
2     4   10
3     7    4
4     7   22
5     8   16
6     9   10
>
> summary(cars)
     speed           dist
 Min.   : 4.0   Min.   :  2.00
 1st Qu.:12.0   1st Qu.: 26.00
 Median :15.0   Median : 36.00
 Mean   :15.4   Mean   : 42.98
 3rd Qu.:19.0   3rd Qu.: 56.00
 Max.   :25.0   Max.   :120.00
>
> head(pressure)
  temperature pressure
1           0   0.0002
2          20   0.0012
3          40   0.0060
4          60   0.0300
5          80   0.0900
6         100   0.2700
>
> summary(pressure)
  temperature     pressure
 Min.   :  0   Min.   :  0.0002
 1st Qu.: 90   1st Qu.:  0.1800
 Median :180   Median :  8.8000
 Mean   :180   Mean   :124.3367
 3rd Qu.:270   3rd Qu.:126.5000
 Max.   :360   Max.   :806.0000
>
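Beyond head() and summary(), a few other base R functions give a quick structural overview of a dataset. The following is a minimal sketch using str(), dim(), and names() on the cars dataset. These functions are not mentioned above and are shown only as an illustration; the variable names cars_dim and cars_cols are made up for this example.

```r
# Inspect the structure of the built-in cars dataset.
str(cars)                  # column types and the first few values

cars_dim  <- dim(cars)     # number of rows and number of columns
cars_cols <- names(cars)   # column names

print(cars_dim)
print(cars_cols)
```

The same calls work on pressure or any other data frame.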
To learn more about a dataset, you can use the help function, for example help(cars).
To list the datasets available in the currently loaded packages, call the data() function with no arguments.
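In an interactive session, data() opens its listing in a viewer. If you would rather work with the dataset names as ordinary values, one possible approach is to read them from the object that data() returns. The sketch below assumes you only want the datasets package; the variable names datasets_info and dataset_names are made up for this example.

```r
# data() returns an object whose $results component is a character matrix
# with the columns "Package", "LibPath", "Item", and "Title".
datasets_info <- data(package = "datasets")$results

# The "Item" column holds the dataset names.
dataset_names <- datasets_info[, "Item"]
print(head(dataset_names))

# To list datasets in every installed package, not only the attached ones:
# data(package = .packages(all.available = TRUE))
```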
Datasets in Other Packages
Any R package can include datasets. You can access a dataset from a package by calling the data() function with its package argument, as follows:
data(datasetname, package="packagename")
For example, there’s a popular package called MASS which contains datasets (such as Cars93). We can access the Cars93 dataset by calling the data() function.
> data(Cars93, package="MASS")
After this call to data(), the Cars93 dataset is available for use in R.
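As a slightly fuller sketch, the snippet below first checks that MASS is installed and then loads and inspects Cars93. The requireNamespace() guard and the choice of columns to print are additions for illustration, not part of the steps described above.

```r
# Load Cars93 only if the MASS package is installed.
if (requireNamespace("MASS", quietly = TRUE)) {
  data(Cars93, package = "MASS")

  # Cars93 is now an ordinary data frame in the workspace.
  print(dim(Cars93))
  print(head(Cars93[, c("Manufacturer", "Model", "Price")]))
} else {
  message("MASS is not installed; run install.packages(\"MASS\") first.")
}
```

An alternative that avoids copying the dataset into your workspace is to refer to it directly as MASS::Cars93.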