BLOG · 2/10/2024

Regression Metrics

lekha dh

In regression analysis, several metrics are used to evaluate the performance of a model. Below are the most commonly used regression metrics, with clear explanations and mathematical formulae:

1. Mean Absolute Error (MAE)

  • Definition: MAE measures the average of the absolute differences between actual and predicted values.
  • Formula: \[ \text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \]
  • Interpretation: MAE gives an easy-to-interpret measure of average error, with no emphasis on the direction of the error.
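
A minimal sketch of computing MAE with scikit-learn; the y_true and y_pred arrays are made-up illustrative values, not from any particular dataset:

```python
from sklearn.metrics import mean_absolute_error

y_true = [3.0, -0.5, 2.0, 7.0]   # actual values (illustrative)
y_pred = [2.5,  0.0, 2.0, 8.0]   # model predictions (illustrative)

mae = mean_absolute_error(y_true, y_pred)
print(mae)  # 0.5 -> on average, predictions are off by 0.5 units
```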

2. Mean Squared Error (MSE)

  • Definition: MSE computes the average of the squared differences between actual and predicted values, penalizing larger errors more.
  • Formula: \[ \text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \]
  • Interpretation: Larger errors have a higher penalty due to squaring, making MSE sensitive to outliers.
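
The same toy values can be used to compute MSE with scikit-learn (a sketch on illustrative numbers):

```python
from sklearn.metrics import mean_squared_error

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5,  0.0, 2.0, 8.0]

mse = mean_squared_error(y_true, y_pred)
print(mse)  # 0.375 -> the single error of 1.0 contributes far more than the two errors of 0.5
```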

3. Root Mean Squared Error (RMSE)

  • Definition: RMSE is the square root of the MSE, which allows the error to be interpreted in the same unit as the target variable.
  • Formula: \[ \text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \]
  • Interpretation: RMSE is useful for understanding the magnitude of error in the same scale as the original data.
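
One simple way to obtain RMSE is to take the square root of the MSE, for example with NumPy (again on the same illustrative values):

```python
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5,  0.0, 2.0, 8.0]

rmse = np.sqrt(mean_squared_error(y_true, y_pred))
print(rmse)  # ~0.612, expressed in the same units as y
```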

4. R-squared (R²)

  • Definition: Also known as the coefficient of determination, R² measures the proportion of variance in the dependent variable that is explained by the model.
  • Formula: \[ R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \]
  • Interpretation:
    • \( R^2 = 1 \): Perfect fit.
    • \( R^2 = 0 \): The model does not explain any variance.
    • Negative values indicate that the model performs worse than a simple mean-based model.
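
A short sketch with scikit-learn's r2_score on the same toy values:

```python
from sklearn.metrics import r2_score

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5,  0.0, 2.0, 8.0]

print(r2_score(y_true, y_pred))  # ~0.949 -> the model explains roughly 95% of the variance
```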

5. Adjusted R-squared

  • Definition: Adjusted R² modifies the R² value to account for the number of predictors in the model, penalizing the model for unnecessary complexity.

  • Formula: \[ \text{Adjusted } R^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - p - 1} \] where:

    • \( n \) = number of data points
    • \( p \) = number of predictors
  • Interpretation: Unlike R², Adjusted R² allows a fair comparison between models with different numbers of predictors and discourages adding predictors that do not improve the fit.
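
scikit-learn does not ship an adjusted R² function, so a small helper can be written directly from the formula above (adjusted_r2 is a hypothetical name and the data values are illustrative):

```python
from sklearn.metrics import r2_score

def adjusted_r2(y_true, y_pred, p):
    """Adjusted R²: penalize R² for the number of predictors p."""
    n = len(y_true)                 # number of data points
    r2 = r2_score(y_true, y_pred)
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

y_true = [3.0, -0.5, 2.0, 7.0, 4.2, 1.1]
y_pred = [2.5,  0.0, 2.0, 8.0, 4.0, 1.5]

print(adjusted_r2(y_true, y_pred, p=2))  # slightly lower than plain R² on the same data
```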


6. Mean Absolute Percentage Error (MAPE)

  • Definition: MAPE measures prediction accuracy as a percentage of the actual values.
  • Formula: \[ \text{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \]
  • Interpretation: MAPE provides an easy-to-understand percentage error, but it can be undefined if any actual value \( y_i = 0 \).
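
scikit-learn (version 0.24 or later) provides mean_absolute_percentage_error; note that it returns a fraction rather than a percentage, so multiply by 100 to match the formula above (toy values again):

```python
from sklearn.metrics import mean_absolute_percentage_error

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5,  0.0, 2.0, 8.0]

mape = mean_absolute_percentage_error(y_true, y_pred)  # returned as a fraction
print(mape * 100)  # ~32.7 -> predictions are off by about 32.7% of the actual values on average
```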

7. Explained Variance Score

  • Definition: This score measures how much of the variance in the target variable is explained by the model.
  • Formula: \[ \text{Explained Variance} = 1 - \frac{\text{Var}(y - \hat{y})}{\text{Var}(y)} \]
  • Interpretation: A higher score indicates that the model explains more variance in the target variable. This metric is similar to R², but unlike R² it ignores any systematic offset (bias) in the predictions, so the two can differ when the residuals do not average to zero.
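
scikit-learn exposes this directly as explained_variance_score (a sketch on the same illustrative values):

```python
from sklearn.metrics import explained_variance_score

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5,  0.0, 2.0, 8.0]

print(explained_variance_score(y_true, y_pred))  # ~0.957, close to but not identical to R² (~0.949)
```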
