Box & Whisker Plots (in Seaborn)

Hoda Saiful
3 min read · Jul 4, 2021

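A box and whisker plot provides 5 critical estimators of any given distribution: the minimum, the first quartile (Q1), the median, the third quartile (Q3) and the maximum. It is also referred to as a 5 point summary plot or a box plot.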

Import Seaborn and download a Dataset

The Tips dataset captures information from a restaurant. The columns are: total_bill, the bill rendered by a table; tip, the tip rendered; sex, whether the person is male or female; smoker, whether the person is a smoker; day, the day of the week; time, the time of the day; and size, the number of people. The Tips dataset comprises 244 rows and 7 columns.
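A minimal sketch of this step, assuming the dataset is loaded through seaborn's built-in load_dataset helper (the variable name tips is my own choice):

```python
import seaborn as sns

# load_dataset fetches the example "tips" data from seaborn's online
# repository on first use, so this needs an internet connection.
tips = sns.load_dataset("tips")

print(tips.shape)             # (number of rows, number of columns)
print(tips.columns.tolist())  # column names
print(tips.head())            # first five rows
```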

Plot the relationship between day and total bill for the entire range of data using a box and whisker plot.
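One way to draw this plot is sketched below. It assumes day goes on the x axis and total_bill on the y axis; the title text is only illustrative.

```python
import seaborn as sns

tips = sns.load_dataset("tips")

# One box per day on the x axis, total_bill on the y axis.
ax = sns.boxplot(x="day", y="total_bill", data=tips)
ax.set_title("Total bill by day")
```

In a notebook the figure renders on its own; in a plain script, matplotlib.pyplot.show() displays it.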

There is a lot of information, so let us break this down.

Plot the data for a single day. For ease of explanation, I pick the day with the fewest customers visiting the restaurant, i.e. Friday.

Summary statistics of the ‘Total Bill’ column for Friday.
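A sketch of how these statistics could be obtained with pandas, assuming Friday is stored under the label "Fri" in the day column (the names friday and summary are mine):

```python
import seaborn as sns

tips = sns.load_dataset("tips")

# Keep only the Friday rows.
friday = tips[tips["day"] == "Fri"]

# describe() reports count, mean, std, min, 25% (Q1), 50% (median),
# 75% (Q3) and max for the column.
summary = friday["total_bill"].describe()
print(summary)
```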

Plot a Box and Whisker plot for the above data
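The single-day plot can be sketched as follows. Passing only y gives one vertical box; that orientation is my assumption, and x="total_bill" would give a horizontal box instead.

```python
import seaborn as sns

tips = sns.load_dataset("tips")
friday = tips[tips["day"] == "Fri"]

# A single box summarising total_bill for Friday only.
ax = sns.boxplot(y="total_bill", data=friday)
ax.set_title("Total bill on Friday")
```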

Break down the above plot

Tally and map the data points.

How are outliers calculated

The ceiling and floor beyond which data points are treated as outliers are calculated as below:

Ceiling (outliers on the higher side): Q3 + (1.5 * IQR)

Floor (outliers on the lower side): Q1 - (1.5 * IQR)

IQR stands for interquartile range and is the difference between the Q3 and Q1 values. In this case 21.75 - 12.095 = 9.655. Any value beyond 21.75 + (1.5 * 9.655) = 36.2325 is treated as an outlier.

40.17 is well above 36.2325, and is treated as an outlier.
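The arithmetic can be checked with a few lines of Python. This sketch applies the two formulas to the Friday quartiles quoted above (Q1 = 12.095, Q3 = 21.75); the helper is_outlier is a hypothetical name of mine, not a seaborn function.

```python
# Quartiles of the Friday total bill, as quoted in the article.
q1 = 12.095
q3 = 21.75

iqr = q3 - q1              # interquartile range
ceiling = q3 + 1.5 * iqr   # values above this are outliers
floor = q1 - 1.5 * iqr     # values below this are outliers


def is_outlier(value):
    """Return True if value lies outside the [floor, ceiling] range."""
    return value < floor or value > ceiling


print(round(iqr, 4), round(ceiling, 4), round(floor, 4))
print(is_outlier(40.17))
```

The multiplier 1.5 is also the default of the whis parameter in sns.boxplot, which controls how far the whiskers may extend.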

Split the plot by categorical Column

The “hue” argument splits the plot by categorical values. We now have 2 box and whisker plots, one for each gender: male and female.
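A sketch of the split, assuming the sex column is passed as hue and day is kept on the x axis, so that each day gets one box per gender:

```python
import seaborn as sns

tips = sns.load_dataset("tips")

# hue="sex" draws a separate, differently coloured box for each gender.
ax = sns.boxplot(x="day", y="total_bill", hue="sex", data=tips)
ax.set_title("Total bill by day and gender")
```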
