Data Visualization using pandas

Beyond simply listing counts and unique values, visualization can greatly aid in comprehending the categorical distribution. Plots can reveal outliers or data errors that aren't always obvious in tables.

We will plot the sector counts that we calculated earlier.

# Visualize the sector distribution
sector_counts.plot(kind='bar')
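To try the bar chart without the earlier dataset, here is a minimal self-contained sketch. The tickers and sector labels are invented for illustration, and `sector_counts` is assumed to come from `value_counts()` on a sector column, as in the previous lesson.

```python
import pandas as pd

# Hypothetical data: tickers and sectors are made up for this example.
stocks = pd.DataFrame({
    "Ticker": ["AAA", "BBB", "CCC", "DDD", "EEE", "FFF"],
    "Sector": ["Technology", "Technology", "Energy",
               "Financials", "Technology", "Energy"],
})

# Count how many stocks fall into each sector (sorted from most to least common).
sector_counts = stocks["Sector"].value_counts()

# Draw one bar per sector and label the axes so the chart reads on its own.
ax = sector_counts.plot(kind="bar", title="Number of Stocks per Sector")
ax.set_xlabel("Sector")
ax.set_ylabel("Count")
```

In a notebook the chart renders inline. In a plain script, you would typically call `matplotlib.pyplot.show()` afterwards to display it.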

Let’s also visualize our stock data. We can plot a time series of Qualcomm's closing prices using the following line of code:

# Plotting the closing stock prices
qcom_df['Close'].plot(title='Historical Closing Prices')
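If you don't have the price data loaded, the sketch below reproduces the idea with a synthetic stand-in for `qcom_df` (a random walk, not real Qualcomm prices). It also overlays a 20-day rolling mean, which is our own addition rather than part of the original plot, because a smoothed line can make the overall trend easier to read.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for qcom_df: a random walk on business days, not real prices.
rng = np.random.default_rng(seed=0)
dates = pd.bdate_range("2023-01-02", periods=250)
close = 100 + rng.normal(loc=0.1, scale=1.5, size=len(dates)).cumsum()
qcom_df = pd.DataFrame({"Close": close}, index=dates)

# Plot the daily closes, then overlay a 20-day rolling mean on the same axes.
ax = qcom_df["Close"].plot(title="Historical Closing Prices", label="Close")
rolling_mean = qcom_df["Close"].rolling(window=20).mean()
rolling_mean.plot(ax=ax, label="20-day mean")

ax.set_xlabel("Date")
ax.set_ylabel("Price")
ax.legend()
```

Because the DataFrame has a `DatetimeIndex`, pandas places the dates on the x-axis automatically. The first 19 values of the rolling mean are missing, since a full 20-day window is not yet available.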

This line chart shows how the stock’s daily closing price has fluctuated over time, which a table of numbers cannot convey as easily. We can also see that the stock is currently on an upward trend. Next, let's create a histogram of the stock's trading volume.

# Plotting a histogram of the trading volume
qcom_df['Volume'].hist(bins=50)

The histogram shows the distribution of daily trading volumes at a glance. Let’s draw some insights from it.

Central Tendency: Most of the data clusters on the left side of the histogram, so on most days the trading volume is relatively low. Daily volume frequently falls below the 0.5 x 10^7 mark on the axis, that is, 5 million shares.
Skewness: The distribution is right-skewed, with a tail extending towards the higher trading volumes. This indicates that there are days with unusually high trading volumes, but these are less frequent.
Outliers: The bars on the far right, separated from the cluster of other bars, suggest that there have been days with particularly high trading volumes that are outliers when compared to the typical trading volume.
Volatility Indication: Days with exceptionally high trading volume can be associated with significant news or events affecting the company, such as earnings reports, product announcements, or broader market volatility.
Liquidity: Bars appear, even if small, across the whole volume range, which suggests that Qualcomm's stock has a liquid market with trading occurring at a variety of volume levels.
Volume Peaks: There are noticeable peaks within the distribution, particularly in the lower volume range. These peaks may represent volume levels that recur on many trading days.
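The observations above come from reading the chart by eye, and they can be cross-checked numerically. The following is a rough sketch using synthetic, right-skewed volumes (drawn from a log-normal distribution, not real Qualcomm data). The 1.5 x IQR cutoff for outliers is a common rule of thumb that we assume here; it is not something the histogram itself defines.

```python
import numpy as np
import pandas as pd

# Synthetic, right-skewed volumes (log-normal); not real Qualcomm data.
rng = np.random.default_rng(seed=42)
volume = pd.Series(rng.lognormal(mean=15.0, sigma=0.5, size=1000), name="Volume")

# Central tendency: in a right-skewed distribution the mean sits above the median.
mean_vol = volume.mean()
median_vol = volume.median()

# Skewness: a positive value indicates a longer tail on the right.
skewness = volume.skew()

# Outliers: flag days more than 1.5 * IQR above the third quartile.
q1, q3 = volume.quantile([0.25, 0.75])
upper_fence = q3 + 1.5 * (q3 - q1)
high_volume_days = volume[volume > upper_fence]

print(f"mean={mean_vol:,.0f}  median={median_vol:,.0f}  skew={skewness:.2f}")
print(f"{len(high_volume_days)} of {len(volume)} days exceed {upper_fence:,.0f}")
```

With real data, you would replace `volume` with `qcom_df['Volume']` and could then look up the dates of the flagged days to see whether they coincide with earnings reports or other news.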

With this lesson, we've completed our introductory exploration of pandas. We began by introducing pandas and its role in data analysis. This was followed by a discussion of how to install pandas and set up your development environment. We then examined the basic data structures in pandas, namely Series and DataFrame, and explored how to load and save data using these structures. Finally, we touched on basic techniques for exploring your data, including how to generate summary statistics and create basic visualizations.

In the next section, we will learn how to manipulate data using pandas, starting with data cleaning and preprocessing techniques.
