Exploring Data using pandas

When you first load your data, it's important to perform initial checks to understand its structure, content, and the type of data it contains.

Viewing Data

Here's how you can take a peek at your DataFrame:

# Display the first five rows of the stocks DataFrame
print(stocks_df.head())

# Display the last five rows of the stocks DataFrame
print(stocks_df.tail())

stocks_df.head() displays the first five rows by default and can immediately flag missing data or anomalies.

stocks_df.tail() shows the last five rows, which often reveals how recent the data is and whether it has been truncated.
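Both methods also accept a row count. The following is a minimal sketch using a small made-up DataFrame (sample_df and its values are hypothetical, not the tutorial's stocks data) to show how a peek at both ends can surface missing values and the most recent date:

```python
import pandas as pd

# Hypothetical sample data; a real stocks_df would be loaded from your own source.
sample_df = pd.DataFrame(
    {
        "GOOGL": [750.1, 752.3, None, 760.0, 758.4, 761.2],
        "MSFT": [55.2, 55.9, 56.1, 56.4, None, 57.0],
    },
    index=pd.date_range("2016-01-04", periods=6, freq="B"),  # business days
)

first_rows = sample_df.head(3)  # first three rows instead of the default five
last_rows = sample_df.tail(2)   # last two rows

print(first_rows)
print(last_rows)

# The largest index value in the tail is the most recent date in the data.
latest_date = last_rows.index.max()
print(latest_date)
```

Missing values appear as NaN in the printed rows, so gaps near the start or end of the data are easy to spot this way.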

Data Structure

An understanding of your DataFrame's structure is essential before diving into deeper analysis.

We can use stocks_df.info() to get a summary of the DataFrame, including the number of non-null entries and data types of each column. This can highlight if certain columns contain missing values that need to be addressed or if data types need conversion.

# Print a concise summary of the DataFrame
# (info() prints directly and returns None, so no print() is needed)
stocks_df.info()
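If the summary suggests missing values or an unexpected data type, a follow-up check might look like the sketch below. The sample_df data is hypothetical, and converting with pd.to_numeric is just one possible approach:

```python
import pandas as pd

# Hypothetical sample: GOOGL prices were read in as strings, and each column has a gap.
sample_df = pd.DataFrame(
    {
        "GOOGL": ["750.1", "752.3", None, "760.0"],
        "MSFT": [55.2, 55.9, 56.1, None],
    }
)

# Count missing values in each column.
missing_per_column = sample_df.isna().sum()
print(missing_per_column)

# Inspect the data types; GOOGL is not numeric yet.
print(sample_df.dtypes)

# Convert the string column to numbers; values that cannot be parsed become NaN.
sample_df["GOOGL"] = pd.to_numeric(sample_df["GOOGL"], errors="coerce")
print(sample_df.dtypes)
```

Using errors="coerce" keeps the conversion from failing on a bad value, at the cost of silently turning it into NaN, so it is worth re-checking the missing-value counts afterwards.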

Descriptive Statistics

Descriptive statistics provide a high-level summary of the attributes of your dataset.

stocks_df.describe() gives a statistical summary for numerical columns, useful for a quick assessment of distribution and variability.
Custom aggregations like stocks_df['GOOGL'].mean() help in understanding specific aspects, such as the average price.

# Get a statistical summary
print(stocks_df.describe())

# Find the average price for Google
print(stocks_df['GOOGL'].mean())
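The result of describe() is itself a DataFrame, so individual statistics can be pulled out of it by label. Here is a small sketch with hypothetical values (sample_df is not the tutorial's data) showing that the "mean" row matches a direct call to mean():

```python
import pandas as pd

# Hypothetical sample prices.
sample_df = pd.DataFrame(
    {
        "GOOGL": [750.0, 752.0, 760.0, 758.0],
        "MSFT": [55.0, 56.0, 57.0, 58.0],
    }
)

summary = sample_df.describe()
print(summary)

# Rows of the summary are labeled: count, mean, std, min, 25%, 50%, 75%, max.
googl_mean_from_summary = summary.loc["mean", "GOOGL"]
googl_mean_direct = sample_df["GOOGL"].mean()

print(googl_mean_from_summary, googl_mean_direct)
```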

Aggregation

For more specific summary statistics, you can use aggregation methods like mean(), median(), min(), max(), and sum():

# Calculate the average MSFT price
print(stocks_df['MSFT'].mean())

# Find the maximum MSFT price
print(stocks_df['MSFT'].max())

The results will be 61.96290836653386 and 72.52.
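Several of these statistics can also be computed in one call with agg(). The sketch below uses hypothetical values rather than the tutorial's data, so its numbers differ from the results quoted above:

```python
import pandas as pd

# Hypothetical sample prices.
sample_df = pd.DataFrame(
    {
        "GOOGL": [750.0, 752.0, 760.0, 758.0],
        "MSFT": [55.0, 56.0, 57.0, 58.0],
    }
)

# Apply several aggregations to one column; the result is a Series keyed by function name.
msft_stats = sample_df["MSFT"].agg(["mean", "median", "min", "max", "sum"])
print(msft_stats)

# Apply the same aggregations to every column; the result is a DataFrame.
all_stats = sample_df.agg(["mean", "max"])
print(all_stats)
```

Passing a list of function names keeps related statistics together in one labeled result, which can be easier to read than a series of separate print calls.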
