Handling Missing Data - Example - Part 1
In the previous lesson, we learned about various strategies to handle missing data. Let’s now work through our example to handle missing values. Here’s our missing data report:
Let’s work through each column containing missing data.
Loan Amount
This is a numerical field and an important one. One option is to delete all rows in which the loan amount is missing, at the cost of discarding whatever those rows contain in other columns. Alternatively, the missing values can be filled with either the mean or the median of the existing loan amounts. The choice between the two depends on whether the distribution is skewed: if there are outliers (very high or very low loan amounts), the median is the better choice because it is less affected by them, while for roughly symmetric data the mean and median are close and either works. Deleting the rows or filling with the median are both reasonable here, and you can continue with either approach in your analysis.
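The two approaches described for the loan amount can be sketched with pandas. This is a minimal illustration, not the lesson's own code: the DataFrame `loans`, the column name `loan_amnt`, and the sample values are all assumptions made up for the example, so substitute your own dataset and column name.

```python
import pandas as pd

# Hypothetical sample data. The column name "loan_amnt" and the values are
# assumptions for illustration; None marks a missing loan amount.
loans = pd.DataFrame({
    "loan_id": [1, 2, 3, 4, 5, 6],
    "loan_amnt": [5000.0, None, 12000.0, 8000.0, None, 250000.0],
})

# Approach 1: delete every row whose loan amount is missing.
loans_dropped = loans.dropna(subset=["loan_amnt"])

# Approach 2: fill missing loan amounts with the median of the existing ones.
# median() skips missing values by default.
median_amount = loans["loan_amnt"].median()
loans_filled = loans.assign(loan_amnt=loans["loan_amnt"].fillna(median_amount))

# For comparison: the mean is pulled upward by the single very large loan.
mean_amount = loans["loan_amnt"].mean()

print(f"rows before: {len(loans)}, after dropping: {len(loans_dropped)}")
print(f"median: {median_amount}, mean: {mean_amount}")
```

In this made-up sample the one very large loan drags the mean far above the typical amount, while the median stays close to the bulk of the values. That is why the median is the safer fill value when outliers are present.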