How to get summary statistics of a Pandas Dataframe in Python?

Does pandas provide a method or function that gives a summary or statistics for all the columns in a dataframe, or do I have to compute them separately for each column? I believe summary statistics can give insight into all the columns and the overall data before diving deep into the analysis. You can use the following sample dataframe I have and apply the methods to it, if such methods are available:

Yes, you can get summary statistics of a pandas DataFrame by using the describe() method. Here’s how it’s done:

  • The df.describe() method generates a statistical summary of the data frame. It returns a new data frame containing the count, mean, standard deviation, minimum, quartiles (25%, 50%, 75%), and maximum for each column.
  • By default, df.describe() only includes columns with numeric data types when the data frame has any, but non-numeric columns can be included through the include parameter.
  • Passing include='all' covers every column. The result then has many NaN values for non-numeric columns, because statistics like mean, standard deviation, and percentiles only work for numeric columns.
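
A minimal sketch of the points above. The sample data here is made up for illustration (the column names and values are assumptions, not the asker's dataframe):

```python
import pandas as pd

# Hypothetical sample data; column names and values are made up for illustration.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dan"],
    "age": [23, 35, 41, 29],
    "salary": [50000.0, 62000.0, 58000.0, 45000.0],
})

# Default call: only the numeric columns (age, salary) are summarized.
numeric_summary = df.describe()

# include="all" adds the non-numeric column as well; statistics that do not
# apply to a column (e.g. the mean of "name") appear as NaN.
full_summary = df.describe(include="all")

print(numeric_summary)
print(full_summary)
```

Both calls return a new data frame, so individual values can be read with .loc, for example numeric_summary.loc["mean", "age"].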

@mubashir_rizvi you can also check the df.info() method. It prints a concise summary of a DataFrame, including:

→ the number of non-null values in each column,
→ the data type of each column, and
→ the memory usage of the DataFrame.

This method can be used to quickly assess the shape and structure of a DataFrame, as well as identify potential issues such as missing values or incorrect data types.
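
As a rough sketch of how this looks in practice, assuming a small made-up DataFrame with one missing value (the data is hypothetical):

```python
import io
import pandas as pd

# Hypothetical sample data; the None in "age" becomes a missing value (NaN).
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dan"],
    "age": [23, 35, None, 29],
})

# info() prints the summary to stdout and returns None.
df.info()

# To keep the summary as a string instead, pass a text buffer via buf.
buffer = io.StringIO()
df.info(buf=buffer)
report = buffer.getvalue()
```

The non-null count for age is lower than the number of rows, which is how missing values show up in this summary.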

Hey @mubashir_rizvi, summary statistics are what we all crave, and you can also get them by using the following method:

If you have trouble following the flow of the code, share it with me. I would love to explain it.