In this thread, we will discuss various techniques for merging or combining multiple series simultaneously to create a DataFrame. If you are new to the concept of series, you can refer to the Building Pandas series with several datatypes thread. A DataFrame is a two-dimensional labeled data structure that comprises columns with potentially varying data types. It resembles a spreadsheet or SQL table, where each column can hold different data types such as integers, floats, strings, etc., and each row represents an observation or record.
There are several ways to combine multiple series into a single DataFrame. Here are some of them:
1. Using "pd.concat()" function:
The pd.concat() function concatenates two or more pandas objects along a particular axis. You can specify:
- axis=0 or axis='index' to concatenate objects row-wise.
- axis=1 or axis='columns' to concatenate objects column-wise.
The sample code below concatenates 3 different series column-wise.
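As a minimal sketch of this approach, assuming three small unnamed series whose values are made up purely for illustration (the names s1, s2, s3, and df_concat are not from the original):

```python
import pandas as pd

# Three illustrative, unnamed series (values are invented for this example)
s1 = pd.Series([1, 2, 3])
s2 = pd.Series(["a", "b", "c"])
s3 = pd.Series([1.5, 2.5, 3.5])

# axis=1 places each series in its own column, aligning rows on the index
df_concat = pd.concat([s1, s2, s3], axis=1)
print(df_concat)

# For contrast, axis=0 stacks the series end to end and returns one long Series
stacked = pd.concat([s1, s2, s3], axis=0)
print(len(stacked))
```

Because pd.concat() aligns on the index, series with different index labels would produce NaN in the positions where a label is missing.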
Note: If the series have no name, the resulting columns are labeled 0, 1, and 2 respectively. A series that has a name keeps that name as its column label.
2. Using "pd.DataFrame()" constructor:
- pd.DataFrame() is the constructor that creates a new DataFrame object. It accepts various inputs, including lists, arrays, and dictionaries.
- In the example code below, 3 series are passed to it enclosed in a list to create a DataFrame, but you can pass as many as you want.
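A short sketch of the list-based constructor call, again with made-up values and hypothetical variable names (s1, s2, s3, df_rows, df_cols):

```python
import pandas as pd

# Three illustrative series (values are invented for this example)
s1 = pd.Series([1, 2, 3])
s2 = pd.Series([4, 5, 6])
s3 = pd.Series([7, 8, 9])

# Each series in the list becomes one ROW of the resulting DataFrame
df_rows = pd.DataFrame([s1, s2, s3])
print(df_rows)

# If columns are wanted instead, transposing with .T swaps rows and columns
df_cols = df_rows.T
print(df_cols)
```

The transpose at the end is only one possible workaround and is not part of the original example. The other methods in this thread build the column-wise layout directly.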
Note: Remember that with this method, the series are combined row-wise and not column-wise.
3. Passing a dictionary to "pd.DataFrame()" function:
- Since pd.DataFrame() can also take a dictionary as an argument, in this method we create a dictionary whose keys are the desired column names and whose values are the series, and pass it to the constructor.
- The biggest advantage of this technique is that it joins the series column-wise and lets you name each column as you like through the dictionary keys.
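The dictionary approach can be sketched as follows. The column names "id", "letter", and "score" and the series values are hypothetical, chosen only to show how the keys become column labels:

```python
import pandas as pd

# Three illustrative series (values are invented for this example)
s1 = pd.Series([1, 2, 3])
s2 = pd.Series(["a", "b", "c"])
s3 = pd.Series([1.5, 2.5, 3.5])

# Dictionary keys become the column names; each series becomes one column
df_named = pd.DataFrame({"id": s1, "letter": s2, "score": s3})
print(df_named)
```

Each column keeps the data type of the series it came from, so a single DataFrame can hold integer, string, and float columns side by side.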