German Credit Data: Data Preprocessing and Feature Selection in R


The purpose of preprocessing is to make raw data suitable for data science algorithms. For example, we may want to remove outliers, or to remove or impute missing values.
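As a rough illustration of these two checks, the sketch below counts missing values per column and flags outliers with the common 1.5 × IQR rule. The data frame `credit`, its column names, and its values are made up for this example and are not taken from the German Credit dataset; the IQR rule is one of several possible outlier criteria.

```r
# Toy data frame (hypothetical values, for illustration only)
credit <- data.frame(
  duration = c(6, 12, 24, 36, 48, 12, 18),
  amount   = c(1200, 2500, 3100, NA, 4200, 2800, 45000)
)

# Count missing values in each column
missing_per_column <- colSums(is.na(credit))

# Flag outliers in `amount` using the 1.5 * IQR rule
q <- unname(quantile(credit$amount, c(0.25, 0.75), na.rm = TRUE))
fence <- 1.5 * (q[2] - q[1])
is_outlier <- !is.na(credit$amount) &
  (credit$amount < q[1] - fence | credit$amount > q[2] + fence)

# Drop flagged rows; rows with a missing `amount` are kept for later imputation
credit_clean <- credit[!is_outlier, ]
```

Here only the row with `amount = 45000` is flagged, and the row with the missing `amount` is left in place so that it can be imputed rather than discarded.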

The dataset that we have selected does not have any missing data. In practice, however, a dataset may contain many missing values, which need to be replaced with plausible values derived from the complete observations that are available. The k-nearest neighbours algorithm can be used for this purpose: each missing value is imputed from the most similar complete observations.
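The idea behind k-nearest neighbours imputation can be sketched in base R as follows. This is a minimal illustration, not the article's own implementation: `knn_impute` is a hypothetical helper, it assumes all columns are numeric with non-zero spread, it uses only fully complete rows as donors, and it fills each gap with the mean of the k nearest donors. The toy data are made up.

```r
# Hypothetical helper: impute missing numeric values from the k nearest complete rows
knn_impute <- function(df, k = 3) {
  stopifnot(all(vapply(df, is.numeric, logical(1))))
  is_complete <- complete.cases(df)
  donors <- as.matrix(df[is_complete, , drop = FALSE])
  stopifnot(nrow(donors) >= k)

  # Standardise with donor statistics so that no column dominates the distance
  centers <- colMeans(donors)
  spreads <- apply(donors, 2, sd)
  donors_scaled <- sweep(sweep(donors, 2, centers, "-"), 2, spreads, "/")

  result <- df
  for (i in which(!is_complete)) {
    target <- unlist(df[i, ])
    observed <- !is.na(target)
    target_scaled <- (target[observed] - centers[observed]) / spreads[observed]

    # Euclidean distance to every donor, using only the columns observed in this row
    diffs <- sweep(donors_scaled[, observed, drop = FALSE], 2, target_scaled, "-")
    distances <- sqrt(rowSums(diffs^2))
    nearest <- order(distances)[seq_len(k)]

    # Replace each missing entry with the mean of the k nearest donors
    for (col in names(target)[!observed]) {
      result[i, col] <- mean(donors[nearest, col])
    }
  }
  result
}

# Toy data (hypothetical values): the last row is missing `amount`
credit <- data.frame(
  duration = c(6, 12, 24, 36, 48, 12),
  amount   = c(1000, 2000, 4000, 6000, 8000, NA)
)
imputed <- knn_impute(credit, k = 2)
```

With `k = 2`, the two donors closest in `duration` to the incomplete row are the rows with durations 12 and 6, so the missing `amount` is filled with the mean of 2000 and 1000. For real work, a tested package implementation, such as `kNN()` from the VIM package, would normally be preferable to a hand-written loop.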
