Purity in Decision Trees: Gini Vs Entropy

toobamukhtar · February 25, 2019, 6:39pm

For determining the purity of a node in decision trees, which is the better metric: Gini or entropy? Also, are there cases where one should be preferred over the other?

Arslan97 · March 1, 2019, 10:26pm

Gini impurity and entropy are pretty much the same thing. They’re often used interchangeably. The reason for this is that mathematically they are quite similar.

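For example, in the discrete case, with $p_i$ the relative frequency of class $i$ in the node, entropy is

$$H = -\sum_{i} p_i \log_2 p_i$$

whereas the Gini impurity is

$$G = 1 - \sum_{i} p_i^2$$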

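Apart from the constant term of 1 in the Gini formula, both are weighted sums of the relative class frequencies. So to determine the purity of a node they should both give a similar answer: each is zero for a pure node and largest when the classes are evenly mixed.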
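To make the comparison concrete, here is a minimal sketch that evaluates both measures on a few two-class nodes. The function names `gini` and `entropy`, the base-2 logarithm, and the example class distributions are my own choices for illustration, not something fixed by any particular library.

```python
import math

def gini(probs):
    """Gini impurity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in probs)

def entropy(probs):
    """Shannon entropy in bits: -sum(p_i * log2(p_i)); terms with p_i = 0 are skipped."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Hypothetical class distributions for a two-class node
nodes = {
    "pure": [1.0, 0.0],
    "skewed": [0.9, 0.1],
    "even": [0.5, 0.5],
}

for name, probs in nodes.items():
    print(f"{name:>6}: gini={gini(probs):.3f}  entropy={entropy(probs):.3f}")
```

Both measures are 0 for the pure node and peak at the even split (0.5 for Gini, 1 bit for entropy), so they rank these three nodes in the same order even though the absolute values differ.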

There are, however, scenarios where one would use Gini impurity instead of entropy: Gini doesn’t require you to take logs, which can save time in the calculations.
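If you want to check the cost difference yourself, a rough sketch with `timeit` could look like the following. The helper names, the example distribution, and the repetition count are assumptions for illustration, and the numbers will vary by machine and implementation, so treat any gap you see as indicative only.

```python
import math
import timeit

def gini(probs):
    """Gini impurity: 1 - sum(p_i^2); no logarithms needed."""
    return 1.0 - sum(p * p for p in probs)

def entropy(probs):
    """Shannon entropy in bits; needs one log2 call per non-zero class."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Hypothetical four-class node distribution
probs = [0.4, 0.3, 0.2, 0.1]
repeats = 20000  # arbitrary; raise it for steadier timings

t_gini = timeit.timeit(lambda: gini(probs), number=repeats)
t_entropy = timeit.timeit(lambda: entropy(probs), number=repeats)

print(f"gini:    {t_gini:.4f} s for {repeats} calls")
print(f"entropy: {t_entropy:.4f} s for {repeats} calls")
```

This only times the impurity calculation in isolation; in a full tree-building run, other work such as sorting feature values may dominate, so the practical saving can be small.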