Encoding Categorical Data in Python pandas


We have three columns with categorical data: LoanStatus, LoanAmountCategory, and CustomerLoyalty. To demonstrate encoding, we will apply it to the LoanStatus column. Because the values in LoanStatus are nominal (they have no intrinsic order), one-hot encoding is the appropriate technique. It avoids the ordinal relationship that the integer codes produced by label encoding might imply.
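As a rough illustration of what one-hot encoding produces, here is a minimal sketch using `pd.get_dummies`. The DataFrame name `df` and the status values are made up for this example and are not taken from the article's dataset.

```python
import pandas as pd

# Hypothetical sample data; the real LoanStatus values may differ.
df = pd.DataFrame({"LoanStatus": ["Approved", "Rejected", "Pending", "Approved"]})

# One-hot encode LoanStatus: each distinct value becomes its own 0/1 column,
# so no ordering between the categories is implied.
encoded = pd.get_dummies(df, columns=["LoanStatus"], prefix="LoanStatus", dtype=int)
print(encoded)
```

Each row ends up with exactly one of the new `LoanStatus_*` columns set to 1, which is what keeps the categories unordered.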

Before we do this, let’s check the distinct values in this column to make sure there are no discrepancies, such as inconsistent spelling or stray whitespace. The following code gets us the unique values in the LoanStatus column.
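A minimal sketch of that check, assuming the data is already loaded in a DataFrame named `df`. The sample rows below are invented for illustration (including a deliberately inconsistent value) and do not come from the article's dataset.

```python
import pandas as pd

# Hypothetical sample data; "approved " is a deliberate discrepancy
# (lowercase with a trailing space) to show what the check can reveal.
df = pd.DataFrame(
    {"LoanStatus": ["Approved", "Rejected", "approved ", "Pending", "Approved"]}
)

# List the distinct values in the column.
unique_statuses = df["LoanStatus"].unique()
print(unique_statuses)

# Count how often each value occurs, including missing values if any.
status_counts = df["LoanStatus"].value_counts(dropna=False)
print(status_counts)
```

Looking at the counts alongside the unique values makes rare variants easier to spot, since a misspelled or badly formatted value usually appears only a handful of times.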
