Categorical variables represent a qualitative property of a data element, such as sector or industry in a financial context. Understanding how this non-numerical data is distributed matters because it often reveals the structure and segmentation of your dataset.
To analyze categorical data, we will load a new dataset, sp_500_constituents.csv. It lists the S&P 500 companies along with each company's sector and industry.
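As a rough sketch of what loading and a first look at the categories might involve, the example below uses pandas. The column names `Symbol`, `Sector`, and `Industry` are assumptions, since the actual header of sp_500_constituents.csv is not shown here, and the rows are made-up placeholders standing in for the real file so the snippet runs on its own. The helper names `load_constituents` and `category_counts` are hypothetical.

```python
import io

import pandas as pd

# Placeholder rows standing in for sp_500_constituents.csv.
# The column names and values are assumptions for illustration only.
SAMPLE_CSV = """Symbol,Sector,Industry
AAA,Technology,Software
BBB,Technology,Semiconductors
CCC,Financials,Banks
DDD,Technology,Software
"""


def load_constituents(source):
    """Read the constituents table from a file path or file-like object."""
    return pd.read_csv(source)


def category_counts(df, column):
    """Count how many rows fall into each category of the given column."""
    return df[column].value_counts()


# With the real file, pass the path instead:
# constituents = load_constituents("sp_500_constituents.csv")
constituents = load_constituents(io.StringIO(SAMPLE_CSV))

sector_counts = category_counts(constituents, "Sector")
print(sector_counts)
```

`value_counts()` sorts categories from most to least frequent, which makes it a convenient first summary of how a categorical column is distributed.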