- Joined
- Jun 27, 2017
- Professional Status
- Certified General Appraiser
- State
- California
Multivariable regression has several limitations, including:
- Assumption of Linearity: Multivariable regression assumes a linear relationship between the independent variables and the dependent variable. If the true relationship is curved, the model will be systematically off over parts of the range.
- Assumption of Independence: Multivariable regression assumes that no independent variable is an exact linear combination of the others. When the independent variables are strongly correlated with each other (multicollinearity), the coefficient estimates become unstable, which affects the model's accuracy and interpretation.
- Overfitting: Including too many independent variables in the model can lead to overfitting, where the model performs well on the data used to create it but poorly on new data.
- Interpretability: As the number of independent variables increases, it becomes more difficult to interpret the individual impact of each variable on the dependent variable.
- Sample Size: Multivariable regression requires a sample that is large relative to the number of independent variables. With a small sample, the coefficient estimates are unreliable.
- Outliers and Influential Points: Multivariable regression can be sensitive to outliers and influential points, which can disproportionately affect the results.
- Assumption of Normality and Homoscedasticity: Multivariable regression assumes that the errors are normally distributed and have constant variance. Violating these assumptions can invalidate the model's standard errors, confidence intervals, and significance tests.
It's important to be aware of these limitations when using multivariable regression and to consider alternative modeling approaches when these limitations may impact the validity of the results.
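Of the limitations listed, multicollinearity is one that can be checked directly. A common diagnostic is the variance inflation factor (VIF): regress each independent variable on the others and compute 1 / (1 - R²). The sketch below is only an illustration using NumPy. The data are synthetic, and the variable names (`gla`, `rooms`, `lot`) and the rule of thumb that a VIF above roughly 5 to 10 deserves a closer look are assumptions, not something taken from the post.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n_samples x n_features)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = []
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress column j on an intercept plus all the other columns.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

# Synthetic, made-up data purely for illustration.
rng = np.random.default_rng(0)
n = 200
gla = rng.normal(2000, 400, n)              # living area, sq ft
rooms = gla / 250 + rng.normal(0, 0.3, n)   # room count, strongly tied to GLA
lot = rng.normal(6000, 1500, n)             # lot size, generated independently

vifs = vif(np.column_stack([gla, rooms, lot]))
print(dict(zip(["gla", "rooms", "lot"], vifs.round(1))))
```

In this made-up data set, `gla` and `rooms` should show high VIFs because one was generated from the other, while `lot` should stay near 1.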
QUESTION: Should the above be included in your report's Certifications and Limitations page?
1. Multivariable regression is just regression based on more than one variable. So pretty much all appraisal regression is multivariable regression.
2. Multi-linear regression constrains the regression function for each variable to a straight line. This greatly reduces accuracy whenever the true relationship is curved. For example, the value contribution of GLA usually goes up fast in the lower ranges, then levels off, and may even start going downward for very large homes, which are not desired in many areas.
3. I hope you are aware of the fact that I NEVER use multi-linear regression. IT DOES NOT WORK in the SF Bay Area.
4. What I use is non-linear regression. I use MARS (multivariate adaptive regression splines), which fits segmented linear functions that break or change direction at "knots," effectively splitting each variable's range into segments. So, for example, the value contribution of bathroom count may go up from 1 to 3 bathrooms, but then start to drop off at the 4th bathroom.
5. Segmented linear regression gives about the same R² as curvilinear regression, or higher. It has the advantage that it is faster to calculate and easier to use in practice.
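To make the "knot" idea concrete, here is a minimal sketch of a segmented linear fit using a single hinge function, compared against a straight line. It is not MARS itself: MARS searches for knot locations automatically, whereas this example fixes one knot by hand. The data are synthetic, and the 3,000 sq ft knot, the dollar figures, and the helper names (`hinge`, `fit_predict`, `r_squared`) are all assumptions made for illustration.

```python
import numpy as np

def hinge(x, knot):
    """Hinge basis: 0 below the knot, (x - knot) above it."""
    return np.maximum(0.0, x - knot)

def fit_predict(A, y):
    """Ordinary least squares; returns coefficients and fitted values."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, A @ coef

def r_squared(y, y_hat):
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

# Synthetic data: value rises $300/sq ft up to 3,000 sq ft,
# then declines $50/sq ft beyond that, plus noise.
rng = np.random.default_rng(1)
n = 300
knot = 3000.0                      # assumed knot, fixed by hand
gla = rng.uniform(800, 5000, n)
value = (200_000 + 300 * np.minimum(gla, knot)
         - 50 * hinge(gla, knot) + rng.normal(0, 30_000, n))

# Straight-line model: intercept + GLA.
A_lin = np.column_stack([np.ones(n), gla])
_, pred_lin = fit_predict(A_lin, value)

# Segmented model: intercept + GLA + hinge term that changes the slope at the knot.
A_seg = np.column_stack([np.ones(n), gla, hinge(gla, knot)])
coef_seg, pred_seg = fit_predict(A_seg, value)

r2_lin = r_squared(value, pred_lin)
r2_seg = r_squared(value, pred_seg)
slope_below = coef_seg[1]                 # $/sq ft below the knot
slope_above = coef_seg[1] + coef_seg[2]   # $/sq ft above the knot

print(f"R2 straight line: {r2_lin:.3f}")
print(f"R2 segmented:     {r2_seg:.3f}")
print(f"slope below knot: {slope_below:.0f}, above knot: {slope_above:.0f}")
```

The hinge coefficient is the change in slope at the knot, so the slope above the knot is the sum of the two coefficients. Because the straight-line model is a special case of the segmented one, the segmented R² can never be lower on the same data.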