I’m working with a large Pandas dataframe and need a quick overview of its contents, such as the number of rows and columns and the data type of each column. Are there functions or methods available for this? If so, please let me know about them, ideally with example code. You can use the following sample dataframe:
Hi @mubashir_rizvi, this may help you:
- The `shape` attribute returns a tuple containing the number of rows and columns.
- The number of rows is accessed using `[0]`, since it is at the first position in the tuple.
- The number of columns is accessed using `[1]`.
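As a minimal sketch of the `shape` approach: the dataframe below is a small hypothetical stand-in (its column names and values are assumptions, not the sample from the question).

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dev"],
    "age": [28, 35, 41, 23],
    "score": [88.5, 92.0, 79.5, 95.0],
})

# shape is a (rows, columns) tuple, so it can be unpacked or indexed.
n_rows, n_cols = df.shape
print(df.shape)     # (4, 3)
print(df.shape[0])  # number of rows
print(df.shape[1])  # number of columns
```

Unpacking the tuple once is handy when both values are needed; indexing with `[0]` or `[1]` reads better when only one is.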
- `len()` is a built-in function that returns the number of items in an object.
- Calling `len()` on the dataframe object itself returns the total number of rows in the dataframe.
- To get the number of columns, we call it on `df.columns`, which is an Index object (list-like, but not a plain list) containing all column names.
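The `len()` approach might look like the sketch below, again on a small hypothetical dataframe whose names and values are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dev"],
    "age": [28, 35, 41, 23],
    "score": [88.5, 92.0, 79.5, 95.0],
})

# len() on the dataframe counts rows; on df.columns it counts columns.
row_count = len(df)
column_count = len(df.columns)
print(row_count, column_count)
```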
Hey @mubashir_rizvi, you can use these two methods:
- `dtypes` is an attribute of a Pandas dataframe that returns the data type of each column.
- The `df.info()` method prints a concise summary of a dataframe: the number of rows and columns, the data type of each column, the number of non-null values in each column, and the amount of memory used by the dataframe. It can therefore also be used to find the data type of each column.
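Both options can be tried together, as in this rough sketch. The dataframe is a hypothetical stand-in with assumed column names, and the exact dtype reported for the text column can differ between pandas versions.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dev"],
    "age": [28, 35, 41, 23],
    "score": [88.5, 92.0, 79.5, 95.0],
})

# dtypes returns a Series mapping each column name to its data type.
column_types = df.dtypes
print(column_types)

# info() prints its summary (index range, per-column non-null counts,
# dtypes, memory usage) and returns None.
df.info()
```

`dtypes` is the better fit when the result is needed in later code, since it returns a Series; `info()` is meant for reading on screen.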
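Hello @mubashir_rizvi, you can also achieve your task by using the `applymap()` method.
Passing the built-in `type` function to `applymap()` returns the Python type of every element in the dataframe. Since the elements of a particular column usually share one type (object columns holding mixed values are the exception), `loc[0]` is used to get the first row of the result, which is sufficient for knowing the type for each column. Note that `loc[0]` selects by label, so it assumes the index contains the label 0, as the default index does. Also, `applymap()` has been deprecated since pandas 2.1 in favor of `DataFrame.map()`.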
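A possible version of this approach is sketched below. The dataframe is hypothetical, and two details are my own choices rather than part of the answer: the code picks `DataFrame.map` when the installed pandas provides it and falls back to `applymap` otherwise, and it uses `iloc[0]` so that it works whatever the index labels are.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dev"],
    "age": [28, 35, 41, 23],
    "score": [88.5, 92.0, 79.5, 95.0],
})

# DataFrame.map superseded applymap in pandas 2.1; use whichever exists.
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap

# Apply type() to every element, then keep the first row of the result.
element_types = elementwise(type)
first_row_types = element_types.iloc[0]
print(first_row_types)
```

This reports the Python-level type of each value rather than the column's pandas dtype, and it runs once per element, so on a large dataframe `df.dtypes` is usually the cheaper choice.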