Data Visualization using pandas

Beyond simply listing counts and unique values, visualization can greatly aid in comprehending the categorical distribution. Plots can reveal outliers or data errors that aren't always obvious in tables.

We will plot the sector counts that we calculated earlier.

# Visualize the sector distribution
sector_counts.plot(kind='bar')
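For readers who do not have the earlier `sector_counts` at hand, here is a minimal self-contained sketch of the same idea. The `stocks_df` table, its column names, and its values are made up for illustration; only `value_counts()` and `plot(kind='bar')` mirror what the tutorial uses.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the stocks table; the values are invented for illustration.
stocks_df = pd.DataFrame({
    "Symbol": ["AAA", "BBB", "CCC", "DDD", "EEE", "FFF"],
    "Sector": ["Technology", "Technology", "Energy",
               "Health Care", "Technology", "Energy"],
})

# value_counts() returns the number of rows per sector, most frequent first.
sector_counts = stocks_df["Sector"].value_counts()

# Draw the bar chart on an explicit Axes so that labels can be added.
fig, ax = plt.subplots(figsize=(6, 4))
sector_counts.plot(kind="bar", ax=ax, title="Companies per Sector")
ax.set_xlabel("Sector")
ax.set_ylabel("Number of companies")
fig.tight_layout()
```

In a notebook the figure renders automatically; in a plain script, calling `plt.show()` at the end displays it.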

Let’s also visualize our stock data. We can plot a time series of Qualcomm's closing prices using the following line of code:

# Plotting the closing stock prices
qcom_df['Close'].plot(title='Historical Closing Prices')
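A single line can be hard to read when prices are noisy. One common way to make the overall direction easier to judge is to overlay a rolling mean. The sketch below is an illustration only: it uses synthetic random-walk prices in a hypothetical `prices_df` rather than the real `qcom_df`, and the 50-day window is an arbitrary choice.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic random-walk prices standing in for qcom_df; this is not real Qualcomm data.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2022-01-03", periods=250)  # business days only
close = 100 + rng.normal(loc=0.1, scale=1.5, size=len(dates)).cumsum()
prices_df = pd.DataFrame({"Close": close}, index=dates)

# 50-day rolling mean; the first 49 values are NaN because the window is not yet full.
rolling_mean = prices_df["Close"].rolling(window=50).mean()

# Plot the raw series and the smoothed series on the same Axes.
fig, ax = plt.subplots(figsize=(8, 4))
prices_df["Close"].plot(ax=ax, label="Close", title="Historical Closing Prices")
rolling_mean.plot(ax=ax, label="50-day rolling mean")
ax.set_xlabel("Date")
ax.set_ylabel("Price")
ax.legend()
```

Assuming `qcom_df` has a date index, the same two plotting calls should work on `qcom_df['Close']` directly.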

This line chart shows how the stock’s daily closing price has fluctuated over time, which is much harder to see in a table of numbers. It also makes the overall direction visible: the stock is currently on an upward trend. We can also create a histogram of the trading volume.

# Plotting a histogram of the trading volume
qcom_df['Volume'].hist(bins=50)

A histogram shows the distribution of trading volumes at a glance. Let’s draw some insights from it.

  • Central Tendency: The bulk of the data seems to cluster on the left side of the histogram, suggesting that on most days the trading volume is lower rather than higher. Specifically, the daily volume frequently seems to fall below the 0.5 x 10^7 mark on the axis, that is, 5 million.

  • Skewness: The distribution is right-skewed, with a tail extending towards the higher trading volumes. This indicates that there are days with unusually high trading volumes, but these are less frequent.

  • Outliers: The bars on the far right, separated from the cluster of other bars, suggest that there have been days with particularly high trading volumes that are outliers when compared to the typical trading volume.

  • Volatility Indication: Days with exceptionally high trading volume can be associated with significant news or events affecting the company, such as earnings reports, product announcements, or broader market volatility.

  • Liquidity: The consistent presence of bars, even if small, across the volume range suggests that Qualcomm's stock has a liquid market, with trading taking place at a wide range of daily volume levels.

  • Volume Peaks: There are noticeable peaks within the distribution, particularly in the lower volume range. These peaks may represent common volume levels at which trades typically consolidate.
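The observations above are read off the chart by eye, so it can help to back them up with a few numbers. The following is a minimal sketch, not the tutorial's own method: it uses synthetic right-skewed data in a hypothetical `volume` Series instead of `qcom_df['Volume']`, and it flags unusually high days with the common 1.5 x IQR rule, which is only one of several possible outlier definitions.

```python
import numpy as np
import pandas as pd

# Synthetic, right-skewed volumes standing in for qcom_df['Volume']; not real data.
rng = np.random.default_rng(42)
volume = pd.Series(rng.lognormal(mean=15.5, sigma=0.6, size=1000), name="Volume")

# Central tendency: in a right-skewed distribution the mean sits above the median.
mean_volume = volume.mean()
median_volume = volume.median()

# Skewness: a positive value indicates a longer tail towards high volumes.
skewness = volume.skew()

# Outliers: flag days above the upper fence Q3 + 1.5 * IQR.
q1, q3 = volume.quantile([0.25, 0.75])
upper_fence = q3 + 1.5 * (q3 - q1)
high_volume_days = volume[volume > upper_fence]

print(f"mean volume:      {mean_volume:,.0f}")
print(f"median volume:    {median_volume:,.0f}")
print(f"skewness:         {skewness:.2f}")
print(f"upper fence:      {upper_fence:,.0f}")
print(f"days above fence: {len(high_volume_days)}")
```

To run the same checks on the real data, replace `volume` with `qcom_df['Volume']`. The flagged days are natural candidates to compare against earnings dates or other news.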

In this section, we've completed our introductory exploration of pandas. We began by introducing pandas and its role in data analysis. This was followed by a discussion on how to install pandas and set up your development environment. We then examined the basic data structures in pandas, namely Series and DataFrame, and explored how to load and save data using these structures. Finally, we touched on basic techniques for exploring your data, including how to generate summary statistics and create basic visualizations.

In the next section, we will learn how to manipulate data using pandas, starting with data cleaning and preprocessing techniques.
