Working with NULL values in a Pandas Data Frame

Hoda Saiful
2 min read · Dec 1, 2020

To get a true sense of numerical data, it is important to understand the NULL values present in a dataset and how to act on them. In this post, we discuss ways of discovering and acting upon NULL data points in a pandas DataFrame.

You may download the code from the link.

Dataset

This is a very small dataset named EPLREC, with just 8 rows, 3 of which have a NULL value in the SAL column.
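The original data is not reproduced here, so the following is only a stand-in for experimenting. The SAL column name, the 8 rows, and the 3 missing salaries come from the post; the EMP_ID and NAME columns and every value in them are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for EPLREC. Only the SAL column name, the row count (8)
# and the number of missing salaries (3) are taken from the post.
eplrec = pd.DataFrame(
    {
        "EMP_ID": [1, 2, 3, 4, 5, 6, 7, 8],                 # assumed column
        "NAME": ["A", "B", "C", "D", "E", "F", "G", "H"],   # assumed column
        "SAL": [50000.0, np.nan, 62000.0, 58000.0,
                np.nan, 71000.0, np.nan, 45000.0],          # np.nan marks a NULL
    }
)

print(eplrec)
```

pandas displays the missing salaries as NaN, which is how it represents NULL in a numeric column.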

Understand NULL values in a Dataset

Before starting to work on a dataset, it is important to check it for NULL data points. Built-in functions in pandas make the process of discovering NULL data points easy and a joy to work with.
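A minimal sketch of that discovery step, assuming a small frame with the same shape as the one described above (the EMP_ID column and all salary values are made up):

```python
import numpy as np
import pandas as pd

# Assumed sample data: 8 rows, 3 NULLs in SAL.
df = pd.DataFrame(
    {
        "EMP_ID": [1, 2, 3, 4, 5, 6, 7, 8],
        "SAL": [50000.0, np.nan, 62000.0, 58000.0,
                np.nan, 71000.0, np.nan, 45000.0],
    }
)

# Number of NULL values in each column.
null_counts = df.isnull().sum()

# True/False per column: does it contain at least one NULL?
has_nulls = df.isnull().any()

# The rows where SAL is NULL.
rows_with_nulls = df[df["SAL"].isnull()]

print(null_counts)
print(has_nulls)
print(rows_with_nulls)
```

`isna()` is an alias of `isnull()`, so either spelling works. `df.info()` also reports the non-null count per column if a quick overview is enough.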

Action NULL Data points

Once we understand the NULL values in a dataset, how often they occur relative to the size of the dataset, and the problem statement at hand, we can make an informed decision about how to act on them.
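One way to put a number on "relative to the size of the dataset" is the share of NULLs per column. The sketch below uses assumed sample data and also shows the simplest action, dropping, so its cost is visible; whether that cost is acceptable depends on the problem.

```python
import numpy as np
import pandas as pd

# Assumed sample data: 8 rows, 3 NULLs in SAL.
df = pd.DataFrame(
    {
        "EMP_ID": [1, 2, 3, 4, 5, 6, 7, 8],
        "SAL": [50000.0, np.nan, 62000.0, 58000.0,
                np.nan, 71000.0, np.nan, 45000.0],
    }
)

# Share of NULLs in SAL: the mean of a boolean mask is the fraction of True values.
null_share = df["SAL"].isnull().mean()

# Option 1: drop the rows where SAL is NULL (keeps 5 of the 8 rows).
rows_dropped = df.dropna(subset=["SAL"])

# Option 2: drop every column that contains a NULL (removes SAL entirely).
cols_dropped = df.dropna(axis="columns")

print(f"Share of NULLs in SAL: {null_share:.1%}")
print(len(rows_dropped), "rows left after dropping rows")
print(list(cols_dropped.columns), "columns left after dropping columns")
```

Neither call modifies `df`; `dropna` returns a new frame unless `inplace=True` is passed.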

Populate missing values with a Custom Formula

In scenarios where NULL values make up a significant share of the data, dropping rows or columns may not be the best approach, because it discards much of the dataset. Python and pandas let us write a custom formula for populating NULL values instead.

In this dataset, the SAL column has NULL values in 3 of its 8 rows, so I prefer to populate the missing values with the mean of the available data points in that column.
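A minimal sketch of mean imputation with `fillna`, again on assumed sample data rather than the post's actual values:

```python
import numpy as np
import pandas as pd

# Assumed sample data: 8 rows, 3 NULLs in SAL.
df = pd.DataFrame(
    {
        "EMP_ID": [1, 2, 3, 4, 5, 6, 7, 8],
        "SAL": [50000.0, np.nan, 62000.0, 58000.0,
                np.nan, 71000.0, np.nan, 45000.0],
    }
)

# mean() skips NaN by default, so this is the mean of the 5 available salaries.
sal_mean = df["SAL"].mean()

# Replace each NULL in SAL with that mean and assign the result back.
df["SAL"] = df["SAL"].fillna(sal_mean)

print(sal_mean)
print(df)
```

Any other formula can be swapped in for `sal_mean`, for example `df["SAL"].median()` when the salaries are skewed. Filling with the mean leaves the column mean unchanged but shrinks its variance, which is worth keeping in mind before modelling.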
