In this discussion, we will learn how to use ColumnTransformer in scikit-learn.
class sklearn.compose.ColumnTransformer(transformers, *, remainder='drop', sparse_threshold=0.3, n_jobs=None, transformer_weights=None, verbose=False, verbose_feature_names_out=True)
Applies transformers to columns of an array or Pandas DataFrame.
This estimator allows different columns or column subsets of the input to be transformed separately, and the features generated by each transformer will be concatenated to form a single feature space. This is useful for heterogeneous or columnar data because it allows you to combine several feature extraction mechanisms or transformations into a single transformer.
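As a minimal sketch of this concatenation behavior, the snippet below applies two different scalers to separate column subsets of a small, made-up numeric array. The transformer names ("standard", "minmax") and the data are illustrative assumptions, not anything prescribed by scikit-learn.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical toy data: three numeric columns.
X = np.array([
    [1.0, 10.0, 100.0],
    [2.0, 20.0, 200.0],
    [3.0, 30.0, 300.0],
])

# Each entry is (name, transformer, columns). The names are arbitrary labels.
ct = ColumnTransformer(
    transformers=[
        ("standard", StandardScaler(), [0, 1]),  # standardize columns 0 and 1
        ("minmax", MinMaxScaler(), [2]),         # rescale column 2 to [0, 1]
    ]
)

# The outputs of both transformers are concatenated side by side.
X_out = ct.fit_transform(X)
print(X_out.shape)
```

The output columns follow the order of the transformers list, so the two standardized columns come first and the min-max scaled column comes last.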
Example
To get the feature names output by a ColumnTransformer, you can access the named_transformers_ attribute of the fitted ColumnTransformer object. This attribute is a dictionary-like object in which the keys are the names given to the transformers and the values are the fitted transformer objects themselves.
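For each transformer, you can access its feature names by calling the get_feature_names_out method. This method returns an array of feature names for the transformer’s output. Calling get_feature_names_out on the ColumnTransformer itself returns the feature names of the combined output.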
Here’s an example code snippet that demonstrates how to get the feature names output by a ColumnTransformer:
In this example, we first define a ColumnTransformer object that applies a StandardScaler transformer to columns 0 and 1 and a OneHotEncoder transformer to column 2 of the input data.
StandardScaler is used to scale the first two numeric columns, and OneHotEncoder is used to one-hot encode the categorical column (column 2). X_transformed stores the transformed data, and get_feature_names_out is used to retrieve the feature names of the transformed data.