Retrieving feature names from a ColumnTransformer

This discussion covers how to use ColumnTransformer in scikit-learn (Python) and how to retrieve the names of the features it outputs.

class sklearn.compose.ColumnTransformer(transformers, *, remainder='drop', sparse_threshold=0.3, n_jobs=None, transformer_weights=None, verbose=False, verbose_feature_names_out=True)

Applies transformers to columns of an array or Pandas DataFrame.

This estimator allows different columns or column subsets of the input to be transformed separately, and the features generated by each transformer will be concatenated to form a single feature space. This is useful for heterogeneous or columnar data because it allows you to combine several feature extraction mechanisms or transformations into a single transformer.
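As a rough illustration of this concatenation, the sketch below applies two different scalers to two columns of a small NumPy array and checks the shape of the combined output. The array values and the transformer names std and minmax are made up for the example.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical numeric data with three columns.
X = np.array([
    [1.0, 10.0, 100.0],
    [2.0, 20.0, 200.0],
    [3.0, 30.0, 300.0],
])

ct = ColumnTransformer(
    transformers=[
        ("std", StandardScaler(), [0]),    # standardize column 0
        ("minmax", MinMaxScaler(), [1]),   # rescale column 1 to [0, 1]
    ]
)  # column 2 is not listed, so the default remainder='drop' discards it

out = ct.fit_transform(X)

# One output column per transformer, concatenated side by side.
print(out.shape)
print(out)
```

Each transformer only sees the columns assigned to it, and the results are stacked horizontally in the order the transformers were listed.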

Example

To get the feature names output by a ColumnTransformer, fit it first and then call its get_feature_names_out method, which returns the names for the whole concatenated output. To inspect the individual transformers, use the named_transformers_ attribute of the fitted ColumnTransformer object. This attribute is a dictionary-like object whose keys are the names given to the transformers and whose values are the fitted transformer objects themselves.

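For each transformer that implements it, you can access its feature names by calling the get_feature_names_out method. This method returns an array of feature names for that transformer’s output, without the transformer-name prefix that the ColumnTransformer adds by default.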

Here’s an example code snippet that demonstrates how to get the feature names output by a ColumnTransformer:

In this example, we first define a ColumnTransformer object that applies a StandardScaler transformer to columns 0 and 1 and a OneHotEncoder transformer to column 2 of the input data.

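StandardScaler is used to scale the first two numeric columns, and OneHotEncoder is used to one-hot encode the remaining categorical column (column 2). X_transformed stores the transformed data, and get_feature_names_out, called on the fitted ColumnTransformer, retrieves the feature names of the transformed data. Because verbose_feature_names_out is True by default, each name is prefixed with the name of the transformer that produced it.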