Using OrdinalEncoder instead of OneHotEncoder for tree-based models

When working with tree-based models in scikit-learn, it can be beneficial to use OrdinalEncoder instead of OneHotEncoder for encoding categorical features.

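The OrdinalEncoder is a transformer that encodes each categorical feature as a single column of integer codes. By default the codes follow the sorted order of the category values, which is arbitrary unless an explicit order is passed through the categories parameter. Tree-based models cope well with this because they do not assume the codes are evenly spaced or meaningful: a tree can split on the same feature several times to isolate individual categories or groups of them.
In contrast, OneHotEncoder creates a binary feature for each category, resulting in a larger feature space. This can be problematic for tree-based models when a feature has many categories: training slows down as the number of columns grows, and each split on a binary column can only separate one category from all the others, so the tree tends to need more depth to reach the same partition.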
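
To make the difference in output shape concrete, here is a minimal sketch that applies both encoders to the same column. The color values are made up for illustration and are not part of the original example.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Hypothetical single categorical column with four distinct values.
X = np.array([["red"], ["green"], ["blue"], ["yellow"], ["green"]])

# OrdinalEncoder: one output column, integer codes in sorted category order.
ordinal = OrdinalEncoder().fit(X)
X_ordinal = ordinal.transform(X)

# OneHotEncoder: one output column per category (sparse by default).
onehot = OneHotEncoder().fit(X)
X_onehot = onehot.transform(X).toarray()

print(ordinal.categories_[0])  # categories in sorted order
print(X_ordinal.shape)         # (5, 1)
print(X_onehot.shape)          # (5, 4)
```

With four categories the one-hot output is already four times as wide; the gap grows with the number of distinct values.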

Here’s an example of using OrdinalEncoder instead of OneHotEncoder:
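
The sketch below is one possible way to set this up, assuming a small made-up dataset with one categorical column ("color") and one numeric column ("size"). The data, column names, and choice of DecisionTreeClassifier are illustrative assumptions.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: one categorical and one numeric feature.
X = pd.DataFrame({
    "color": ["red", "green", "blue", "green", "red", "blue"],
    "size": [1.0, 2.5, 3.0, 2.0, 1.5, 3.5],
})
y = [0, 1, 1, 1, 0, 1]

# Ordinal-encode the categorical column and pass the numeric one through.
preprocessor = make_column_transformer(
    (OrdinalEncoder(), ["color"]),
    remainder="passthrough",
)

# Chain preprocessing and the tree so both are fitted together.
model = make_pipeline(preprocessor, DecisionTreeClassifier(random_state=0))
model.fit(X, y)

# Accuracy on the same data the tree was fitted on.
train_accuracy = model.score(X, y)
print(train_accuracy)
```

Swapping OrdinalEncoder() for OneHotEncoder() in the column transformer is the only change needed to compare the two encodings on the same pipeline.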

Note that evaluating the tree on the training data may not be a good indicator of the model’s performance on new, unseen data. It’s important to evaluate the model on a separate testing set to get a more accurate estimate of its performance.
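
As a rough illustration of that point, the following sketch holds out part of a synthetic dataset and reports both scores. The data generation, the 10% label noise, and the 25% test split are assumptions made for this example. The handle_unknown setting is included so that categories absent from the training split would not raise an error at prediction time.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical synthetic data: one categorical and one numeric feature.
rng = np.random.RandomState(0)
n = 200
X = pd.DataFrame({
    "color": rng.choice(["red", "green", "blue"], size=n),
    "size": rng.uniform(0, 4, size=n),
})

# Target depends on both features; flip about 10% of labels as noise.
y = ((X["color"] == "red") ^ (X["size"] > 2)).astype(int).to_numpy()
flip = rng.rand(n) < 0.1
y = np.where(flip, 1 - y, y)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Unknown categories at predict time are mapped to -1 instead of raising.
encoder = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)
preprocessor = make_column_transformer(
    (encoder, ["color"]),
    remainder="passthrough",
)
model = make_pipeline(preprocessor, DecisionTreeClassifier(random_state=0))
model.fit(X_train, y_train)

train_score = model.score(X_train, y_train)
test_score = model.score(X_test, y_test)
print(f"train accuracy: {train_score:.2f}")
print(f"test accuracy:  {test_score:.2f}")
```

An unpruned tree fits its training data almost perfectly, noise included, so the held-out score is the one to report.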