- Joined: Jun 27, 2017
- Professional Status: Certified General Appraiser
- State: California
I've written a new article, and in my "factory" for publishing articles, I put it on Zenodo first as a near-final draft. Eventually I'll put the final version in my new journal, "Valuation Engineer Journal", at https://journal.valuation-engineer.com
doi.org
What I am attempting to do here is explain how Earth (aka MARS) works for appraisers with upper-division knowledge of mathematics, primarily Linear Algebra and Statistics.
I expose the internals of earth() in pseudocode with explanations. I also try to anticipate questions that might arise and answer them. Unlike academic articles that assume advanced knowledge of math, I try to explain many things that wouldn't otherwise be explained. Nonetheless, you will encounter terms such as Householder and Givens transformations, among others. Try not to get put off by these terms or to think you have to understand them completely.
What you should try to understand is how the algorithm works. Roughly:
1. It takes your spreadsheet of sales transactions (rows) and features (columns such as GLA, bath count, bedroom count, age, lot_size, ...), many of which are correlated or collinear, and builds basis functions from them, such as max(0, GLA-2000).
2. It regresses on these basis functions to find their coefficients. To do so, it first transforms them into independent variables (an orthogonal, or perpendicular, set of coordinates), regresses on those, and when it finishes transforms the result back and outputs the model in terms of the basis functions.
3. It does this with a forward pass and a backward pruning pass. In the forward pass, it tries each value in the sorted array of feature values (e.g., every GLA value in your spreadsheet) as the candidate knot of a basis function and selects the value that minimizes the RSS (residual sum of squares).
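To make the forward-pass knot search concrete, here is a minimal Python sketch for one feature and one basis function. It is not earth()'s actual code: the function names (hinge, fit_rss, best_knot) and the toy GLA/price numbers are made up for illustration, and it refits an ordinary two-parameter least-squares line for every candidate knot rather than using the orthogonal transformations described above.

```python
def hinge(x, knot):
    """Basis function max(0, x - knot)."""
    return max(0.0, x - knot)


def fit_rss(h, y):
    """Least-squares fit of y = b0 + b1*h; returns (b0, b1, rss)."""
    n = len(y)
    h_mean = sum(h) / n
    y_mean = sum(y) / n
    sxx = sum((hi - h_mean) ** 2 for hi in h)
    if sxx == 0.0:
        # The hinge is constant (knot at the largest value): intercept only.
        b1 = 0.0
    else:
        sxy = sum((hi - h_mean) * (yi - y_mean) for hi, yi in zip(h, y))
        b1 = sxy / sxx
    b0 = y_mean - b1 * h_mean
    rss = sum((yi - b0 - b1 * hi) ** 2 for hi, yi in zip(h, y))
    return b0, b1, rss


def best_knot(x, y):
    """Try every distinct feature value as a knot; keep the lowest RSS."""
    best = None
    for knot in sorted(set(x)):
        h = [hinge(xi, knot) for xi in x]
        b0, b1, rss = fit_rss(h, y)
        if best is None or rss < best[3]:
            best = (knot, b0, b1, rss)
    return best


# Hypothetical toy data: price (in $1000s) is flat at 300 up to 2000 sq ft,
# then rises by 0.1 per additional square foot of GLA.
gla = [1200, 1500, 1800, 2000, 2200, 2500, 3000]
price = [300 + 0.1 * max(0, g - 2000) for g in gla]

knot, b0, b1, rss = best_knot(gla, price)
print(f"knot={knot}, intercept={b0:.1f}, slope={b1:.3f}, rss={rss:.6f}")
```

On this toy data the search settles on a knot at 2000, because max(0, GLA-2000) is the only candidate that reproduces the prices exactly. The real algorithm repeats a search like this across all features and keeps adding basis functions, then prunes them in the backward pass.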
My review panel for these articles is Anthropic Claude - much cheaper than paying people who know far less to do the review. But if any of you have the courage or wherewithal to trudge through the paper and have questions, send me an email, and I'll try to answer them. I also have papers coming on glmnet() and mgcv(), but earth() is the most important.
earth for Appraisers