Regression Metrics
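In regression analysis, several metrics are used to evaluate the performance of a model. Below are the most commonly used regression metrics, each with a definition, a formula, and an interpretation. Throughout, \( y_i \) is the actual value, \( \hat{y}_i \) the predicted value, \( \bar{y} \) the mean of the actual values, and \( n \) the number of observations.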
1. Mean Absolute Error (MAE)
- Definition: MAE measures the average of the absolute differences between actual and predicted values.
- Formula:
\[
\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|
\]
- Interpretation: MAE gives an easy-to-interpret measure of average error, with no emphasis on the direction of the error.
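The MAE formula can be sketched in a few lines of plain Python. The helper name `mae` and the sample values are assumptions chosen for illustration, not part of any library.

```python
def mae(y_true, y_pred):
    """Mean absolute error: average of |y_i - y_hat_i|."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(y - y_hat) for y, y_hat in zip(y_true, y_pred)) / len(y_true)


# Illustrative data (hypothetical values)
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

print(mae(y_true, y_pred))  # 0.5
```

The absolute errors here are 0.5, 0.5, 0 and 1, so their average is 0.5, expressed in the same units as the target.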
2. Mean Squared Error (MSE)
- Definition: MSE computes the average of the squared differences between actual and predicted values, penalizing larger errors more.
- Formula:
\[
\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
\]
- Interpretation: Larger errors have a higher penalty due to squaring, making MSE sensitive to outliers.
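A minimal sketch of MSE, assuming the same illustrative data as a plain Python list. The helper `mse` is a hypothetical name. The second call shows how a single large error dominates the result.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of (y_i - y_hat_i)^2."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / len(y_true)


# Illustrative data (hypothetical values)
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(mse(y_true, y_pred))  # 0.375

# Same predictions, but the last one is now off by 10 instead of 1
y_pred_outlier = [2.5, 0.0, 2.0, 17.0]
print(mse(y_true, y_pred_outlier))  # 25.125
```

Raising one error from 1 to 10 multiplies its squared contribution by 100, which is why MSE reacts so strongly to outliers.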
3. Root Mean Squared Error (RMSE)
- Definition: RMSE is the square root of the MSE, which allows the error to be interpreted in the same unit as the target variable.
- Formula:
\[
\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
\]
- Interpretation: RMSE is useful for understanding the magnitude of error in the same scale as the original data.
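RMSE can be sketched as the square root of the mean squared error, using only the standard library. The helper name `rmse` and the data are assumptions for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt of the average squared difference."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_sq = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mean_sq)


# Illustrative data (hypothetical values)
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(round(rmse(y_true, y_pred), 4))
```

Because of the square root, the result is in the units of the target rather than squared units, which makes it directly comparable with MAE on the same data.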
4. R-squared (R²)
- Definition: Also known as the coefficient of determination, R² measures the proportion of variance in the dependent variable that is explained by the model.
- Formula:
\[
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
\]
- Interpretation:
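  - \( R^2 = 1 \): Perfect fit; every prediction equals the actual value.
  - \( R^2 = 0 \): The model explains none of the variance; it does no better than always predicting the mean \( \bar{y} \).
  - Negative values indicate that the model performs worse than always predicting the mean.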
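The three cases above can be checked with a small sketch of the R² formula. The helper `r_squared` and the data are assumptions for illustration; the function raises an error when the actual values are constant, since the denominator is then zero.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    y_mean = sum(y_true) / len(y_true)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    ss_tot = sum((y - y_mean) ** 2 for y in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1 - ss_res / ss_tot


# Illustrative data (hypothetical values)
y_true = [3.0, -0.5, 2.0, 7.0]
y_mean = sum(y_true) / len(y_true)

print(r_squared(y_true, y_true))                 # perfect fit: 1.0
print(r_squared(y_true, [y_mean] * 4))           # mean-only baseline: 0.0
print(r_squared(y_true, [7.0, 2.0, -0.5, 3.0]))  # worse than the mean: negative
```

The last call uses predictions whose squared errors sum to more than the total variation around the mean, so the ratio exceeds 1 and R² drops below zero.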
5. Adjusted R-squared
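- Definition: Adjusted R² modifies the R² value to account for the number of predictors in the model, penalizing predictors that add complexity without improving the fit.
- Formula:
\[
\text{Adjusted } R^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - p - 1}
\]
where:
  - \( n \) = number of data points
  - \( p \) = number of predictors
- Interpretation: For a least-squares fit evaluated on its training data, R² never decreases when a predictor is added. Adjusted R² rises only if the new predictor improves the fit more than would be expected by chance, which makes it better suited to comparing models with different numbers of predictors. It discourages overfitting but does not by itself prevent it.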
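A short sketch of the adjustment, assuming R² has already been computed. The helper `adjusted_r_squared` and the numbers passed to it are hypothetical. The formula requires n > p + 1, so the function rejects other inputs.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n data points and p predictors."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 for adjusted R^2 to be defined")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Hypothetical case: the same R^2 of 0.90 on 11 data points,
# reached with 2 predictors versus 8 predictors.
print(round(adjusted_r_squared(0.90, n=11, p=2), 3))
print(round(adjusted_r_squared(0.90, n=11, p=8), 3))
```

With the same R², the model that needs eight predictors receives a much lower adjusted score than the one that needs two, which reflects the penalty for added complexity.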
6. Mean Absolute Percentage Error (MAPE)
- Definition: MAPE measures prediction accuracy as a percentage of the actual values.
- Formula:
\[
\text{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|
\]
- Interpretation: MAPE provides an easy-to-understand percentage error, but it is undefined if any actual value \( y_i = 0 \) and becomes very large when actual values are close to zero.
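A minimal sketch of MAPE that makes the zero-value problem explicit. The helper `mape` and the data are assumptions for illustration; raising an error on a zero actual value is one possible design choice, not the only one.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(y == 0 for y in y_true):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((y - y_hat) / y) for y, y_hat in zip(y_true, y_pred))
    return 100 * total / len(y_true)


# Illustrative data (hypothetical values)
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]
print(round(mape(y_true, y_pred), 2))  # relative errors are 10%, 10% and 0%
```

Note that the first two predictions miss by different absolute amounts (10 and 20) yet contribute the same 10% each, because every error is scaled by its own actual value.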
7. Explained Variance Score
- Definition: This score measures how much of the variance in the target variable is explained by the model.
- Formula:
\[
\text{Explained Variance} = 1 - \frac{\text{Var}(y - \hat{y})}{\text{Var}(y)}
\]
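- Interpretation: A higher score indicates that the model explains more variance in the target variable. This metric is similar to R², but because it uses the variance of the residuals rather than their sum of squares, it ignores any systematic offset (bias) in the predictions. The two scores are equal when the residuals have zero mean.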
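The relationship to R² can be seen in a small sketch. The helpers `variance`, `explained_variance` and `r_squared` are hypothetical names, population variance (dividing by n) is assumed, and the data is illustrative. Predictions shifted by a constant keep a perfect explained variance score while R² drops.

```python
def variance(values):
    """Population variance (divides by n)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def explained_variance(y_true, y_pred):
    """1 - Var(y - y_hat) / Var(y)."""
    residuals = [y - y_hat for y, y_hat in zip(y_true, y_pred)]
    return 1 - variance(residuals) / variance(y_true)


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot, included here only for comparison."""
    y_mean = sum(y_true) / len(y_true)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    ss_tot = sum((y - y_mean) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


# Illustrative data (hypothetical values)
y_true = [3.0, -0.5, 2.0, 7.0]
y_biased = [y + 2.0 for y in y_true]  # every prediction is too high by 2

print(explained_variance(y_true, y_biased))      # 1.0, the constant offset is ignored
print(round(r_squared(y_true, y_biased), 3))     # below 1, the offset is penalized
```

A large gap between the two scores is therefore a hint that the predictions are systematically too high or too low.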