
Feature scaling


  • Feature scaling is a method used to normalize the range of independent variables or features of data.
  • In data processing, it is also known as data normalization and is generally performed during the data preprocessing step.
  • Since the ranges of values in raw data vary widely, the objective functions of some machine learning algorithms will not work properly without normalization.
  • For example, many classifiers calculate the distance between two points using the Euclidean distance.
  • If one of the features has a broad range of values, the distance will be governed by that particular feature.
  • Therefore, the range of all features should be normalized so that each feature contributes approximately proportionately to the final distance, as illustrated in the sketch after this list.
  • Another reason feature scaling is applied is that gradient descent converges much faster with feature scaling than without it.
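As a quick illustration of the distance argument above, here is a minimal sketch (with made-up age/income values, not data from this notebook) showing how one broad-range feature dominates the Euclidean distance until the features are standardized:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: income (large range) vs. age (small range)
X = np.array([[25,  50_000.0],
              [40, 120_000.0],
              [33,  75_000.0]])

# Raw Euclidean distance between rows 0 and 1 is driven almost
# entirely by the income column
raw_dist = np.linalg.norm(X[0] - X[1])

# After standardization each feature has mean 0 and unit variance,
# so both features contribute comparably to the distance
X_scaled = StandardScaler().fit_transform(X)
scaled_dist = np.linalg.norm(X_scaled[0] - X_scaled[1])

print(f"raw distance:    {raw_dist:.2f}")   # ~70000, dominated by income
print(f"scaled distance: {scaled_dist:.2f}")  # both features matter
```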

Before Scaling

  1. Original Data Set Distribution

After Scaling

  1. StandardScaler
  2. MinMaxScaler
  3. RobustScaler
  4. MaxAbsScaler
  5. PowerTransformer (Box-Cox)
  6. PowerTransformer (Yeo-Johnson)
  7. QuantileTransformer (uniform output)
  8. QuantileTransformer (Gaussian output)
  9. Normalizer
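A minimal sketch of applying each of the scikit-learn transformers listed above to the same toy data set (the skewed lognormal data here is an assumption for illustration, not the notebook's original data set):

```python
import numpy as np
from sklearn.preprocessing import (
    StandardScaler, MinMaxScaler, RobustScaler, MaxAbsScaler,
    PowerTransformer, QuantileTransformer, Normalizer,
)

rng = np.random.default_rng(0)
# Skewed, strictly positive toy data (Box-Cox requires positive values)
X = rng.lognormal(mean=0.0, sigma=1.0, size=(1000, 2))

scalers = {
    "StandardScaler": StandardScaler(),
    "MinMaxScaler": MinMaxScaler(),
    "RobustScaler": RobustScaler(),
    "MaxAbsScaler": MaxAbsScaler(),
    "PowerTransformer (Box-Cox)": PowerTransformer(method="box-cox"),
    "PowerTransformer (Yeo-Johnson)": PowerTransformer(method="yeo-johnson"),
    "QuantileTransformer (uniform)": QuantileTransformer(output_distribution="uniform"),
    "QuantileTransformer (Gaussian)": QuantileTransformer(output_distribution="normal"),
    "Normalizer": Normalizer(),  # scales each sample (row) to unit norm
}

for name, scaler in scalers.items():
    X_t = scaler.fit_transform(X)
    print(f"{name:34s} min={X_t.min():8.3f} max={X_t.max():8.3f}")
```

Note the design difference: the first eight transformers scale each feature (column) independently, while Normalizer rescales each sample (row) to unit norm.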

Reference

  1. Feature scaling – Wikipedia: https://en.wikipedia.org/wiki/Feature_scaling

Which Type of Scaling to Apply to Which Kind of Data: