What is the difference between **standardizing** the data or **normalizing** it? Can someone provide me examples of scenarios where one method is preferred over another?

**Standardization** transforms variables/attributes that follow a Gaussian distribution with differing means and standard deviations into a standard Gaussian distribution with a mean of 0 and a standard deviation of 1.

This approach is suitable for techniques that assume a Gaussian distribution in the input variables and work better with rescaled data, such as linear regression, logistic regression, and linear discriminant analysis.
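A minimal sketch of standardization (the z-score) in plain Python; the function name and sample data are illustrative, not from any particular library:

```python
# Standardize a list of values: subtract the mean, divide by the
# (population) standard deviation, yielding mean 0 and std 1.
def standardize(values):
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5
    return [(v - mean) / std for v in values]

data = [2.0, 4.0, 6.0, 8.0]
print(standardize(data))  # values centered on 0 with unit variance
```

In practice you would compute the mean and standard deviation on the training set only and reuse them to transform new data.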

**Normalizing** refers to rescaling each observation (row) to have a length of 1 (called a unit norm in linear algebra).
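A sketch of rescaling one observation (row) to unit length; `unit_norm` is a hypothetical helper, not a library function:

```python
# Rescale a row to unit L2 norm: divide every component by the
# row's Euclidean length, so the result has length exactly 1.
def unit_norm(row):
    length = sum(v ** 2 for v in row) ** 0.5
    return [v / length for v in row]

print(unit_norm([3.0, 4.0]))  # [0.6, 0.8], a vector of length 1
```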

What is the difference between the patterns in the sets of numbers (0, 5, 10) and (0, 0.5, 1)? None.

Both carry the same relative pattern, so if we wanted to learn from this data, either set would serve the purpose; one of them simply has larger values.

Larger values mean bigger computations and, in the case of neural networks, exploding gradients.

The second set is a normalized version of the first one: we have rescaled it to retain the pattern while discarding the unnecessary magnitude.
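Min-max normalization reproduces this example exactly; a minimal sketch, with `min_max` as an illustrative helper name:

```python
# Min-max scaling: map the smallest value to 0, the largest to 1,
# and everything else linearly in between.
def min_max(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

print(min_max([0, 5, 10]))  # [0.0, 0.5, 1.0]
```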

Normalization rescales the values into the range [0, 1]. This can be useful when all parameters need to be on the same positive scale. However, the method is sensitive to outliers: a single extreme value stretches the range and squashes the remaining values into a narrow band.
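A quick demonstration of that outlier sensitivity, reusing the same sketch of min-max scaling (illustrative helper, not a library call):

```python
def min_max(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# One outlier (100) dominates the range, so the original values
# 0, 5, 10 are compressed into the narrow band [0, 0.1].
print(min_max([0, 5, 10, 100]))  # [0.0, 0.05, 0.1, 1.0]
```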

**Standardization**, on the other hand, converts the variable so that it has zero mean and unit variance. One of the most common assumptions in statistics is that the data follows a Gaussian/normal distribution, and to make inferences with it we often work with the standard normal distribution, which has zero mean and unit variance. So standardization is used to rescale data points onto a standard normal distribution.