Appraisal Statistics: Regression - Do You Really Know How Much Work It Is?

Status
Not open for further replies.

RCA

Elite Member
Gold Supporting Member
Joined
Jun 27, 2017
Professional Status
Certified General Appraiser
State
California
A couple of weeks ago, an AI instructor told me that you have to put all comparables for your subject in the sales adjustment grid. I asked if she meant all properties the adjustments are based on, and she said "Yes." I'm not sure, though, that that is really what she meant.

But if so, and if that is what reviewers really think, that's a big deal.

Typically I would run regression on all the neighborhood sales within a certain GLA sq. ft. range of the subject to extract patterns and the impact of various features on price. I would then select from these the 6 most comparable for the grid. I actually think that is the way it has to go. You could add maybe another 6 comps; but that is going to be more work and more "exposure" and probably affect things for the worse. So, I don't think that is really an issue. In any case, I wouldn't myself use more than 12 comps.

More importantly:

1. Regression only models the attributes that you input - and only if there are enough non-null (known) values for each attribute. Typically we do not get certain values like Condition, Quality and View from the MLS as quantities. You may get these indications inexactly. For example, view may be listed as categories: "Ocean, Mountains, Neighborhood, ..". In that case you just need to go through all the data and make sure the categories are consistent. Before running the model, you have to mark the attribute/variable as "categorical". MARS will then put a basis function in your model, something like BF8 = (VIEW$ in ("Ocean")), and the adjustment would be value = BF8 * 20000 + ....., or BF9 = (CONDITION$ in ("C-1", "C-2")) and value = BF9 * 30000 + .....

2. Condition and Quality, or similar important variables, should be coded with numbers based on your best guess from what you know about the neighborhood, the MLS and street view.

3. Then you run your regression to create a model.

4. Next you run the model against the input data to predict the prices of comps.

5. Then you calculate the difference between the model prediction and the actual sale price. If your model is good, the difference should be small. If it is not, then look at the sales with large differences and see if you can determine why there is such a large difference.

6. Make improvements and iterate until you can't do any better.

7. Then choose the comps you want to put in the grid. How many? Well that seems open to debate.
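
Steps 1 through 5 above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: it swaps in ordinary least squares for MARS, the sales figures are invented, and the names `fit_ols` and `design_row` and the 5% flagging threshold are hypothetical choices, not anything taken from MLS data or from a particular software package.

```python
# Hypothetical neighborhood sales: (GLA in sq ft, view category, sale price).
# Every figure here is invented for illustration only.
sales = [
    (1500, "Neighborhood", 450_000),
    (1800, "Neighborhood", 510_000),
    (2100, "Neighborhood", 570_000),
    (1600, "Ocean", 490_000),
    (2000, "Ocean", 570_000),
    (1900, "Neighborhood", 600_000),  # deliberately out of line with the rest
]


def design_row(gla, view):
    """Step 1: code the inputs. View becomes a 0/1 indicator for "Ocean"."""
    return [1.0, float(gla), 1.0 if view == "Ocean" else 0.0]


def fit_ols(rows, targets):
    """Step 3: least-squares fit via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * y for r, y in zip(rows, targets))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


X = [design_row(gla, view) for gla, view, _ in sales]
y = [float(price) for _, _, price in sales]

coef = fit_ols(X, y)  # [intercept, $ per sq ft, ocean-view premium]

# Step 4: run the model against the same input data.
predicted = [sum(b * x for b, x in zip(coef, row)) for row in X]

# Step 5: actual minus predicted, flagging sales more than 5% off.
residuals = [actual - pred for actual, pred in zip(y, predicted)]
flagged = [i for i, (res, actual) in enumerate(zip(residuals, y))
           if abs(res) > 0.05 * actual]

print("coefficients:", [round(b, 2) for b in coef])
print("sales to look at more closely (by index):", flagged)
```

The flagged sales are the ones to revisit in step 6, for example by rechecking a guessed Condition or Quality value before refitting.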

In any case, I think you can see that it can be a lot of work to add in the variable values that may be missing from your MLS data and then tweak the model by seeing if you can improve your supplied (most likely subjective) values. Also, you should be able to understand why AVMs are so far off: they don't know the values of all the important contributors to value that are typically not in the MLS, or at least cannot be estimated very well from the data in the MLS. [But interestingly, if the AVM companies can get some of that data, they can use it to incrementally improve their models - thus the interest in getting their hands on appraiser data.]
 
I see a lot of problems with the process you laid out above.

Not now, but maybe next week when I have more time, may I PM you about this?
 
Interesting that you choose your comps after viewing the regression. I don't do a regression until after making comp selection; however, I see how it could be beneficial in drawing out the statistically "better" comparables. I suppose that I try to locate comparables nearest to the subject and most recently sold, which might not always be the most central choices looking at a regression analysis.
 
1. It doesn't matter whether you choose your comps before or after the regression. How could you prove that you did it before or after? No proof - not an issue. Also, one could add that regression indicates which attributes are important for bracketing - and that is a reason to do regression first.
2. Usually you choose the comps that bracket different important features, if they are available. You want to tell a story, provide an explanation with the comps. Is there a real purpose to showing comps that are essentially duplicates of other comps, when you can weight the comps?
3. Proximity to the subject and date of sale are only two attributes of comparables. There may be more important ones. In fact if there is no adjustment for market conditions, date of sale may not be important at all as long as it is within a given period (e.g. 1 year) and relative proximity is often not nearly as important as other features. OMG. You ought to see my street!!! GLAs, views, lot sizes, sale prices jump all over the place as you walk down the street, one house to the next. Each house a different story.

I've appraised homes in 17 counties in Northern California. I am slow but detailed in my work. I give it thought. I learn more in doing one appraisal than many other appraisers do in a lifetime (so I like to think). I hope you learn to do the same.
 
I see a lot of problems with the process you laid out above.

Not now, but maybe next week when I have more time, may I PM you about this?

There is no need to PM, if you are sure of yourself. That is to say, I'd like to hear your most important objections and/or questions. I can provide clarification.
 
What you posted is backwards: 1) choosing comps after regression, which is NOT how a buyer chooses an alternate property for the subject, so the "comps" may not be the best substitutes for the subject. (On what basis do you choose the comps from a regression analysis?)

Then tweaking the regression to get the values for each comp close to the original sales price... what does that mean? The purpose of making an adjustment is to bring the comparable closer to the physical characteristics of the subject, to derive a value for the subject.

If you are using an RA to bring adjusted values as close as possible to the original sale price, why do it at all? Just keep the original sales prices, make no adjustments and compare the properties qualitatively.

The purpose of RA is to derive adjustments and apply them to the comps you have already chosen, chosen for the reasons a buyer would.

A statistically similar sale may or may not be a good comp, and sometimes a sale is needed as a comp because it has a high value similarity with the subject, such as an ocean view.
 
How many data points per item do you have, typically, for your regression analysis?
 
5. Then you calculate the difference between the model prediction and the actual sale price. If your model is good, the difference should be small. If it is not, then look at the sales with large differences and see if you can determine why there is such a large difference.


Are you appraising to hit the sales price? What does the above mean?
 
Regression is done only on past sales, sales that have already transpired, and it does not know about the sale price of the subject if it is a pending sale. With respect to past sales, regression methods use the past sale prices as the "target" or "dependent" variable, with the other attributes/variables used as independent/predictor variables. The usual least-squares methods attempt to reduce the "Sum of Squared Errors" (SSE) to the smallest possible value, where each error is the sale price predicted from the current model minus the actual sale price. So you could say "regression is to sale price," but to be specific, to past sale prices. Then, just as the regression model is designed to do a good job of replicating past sale prices based on the inputs, it is used to predict the current sale price. But more specifically, we use the parts of the model that equate to adjustments for property features to adjust our chosen comps, and then average those values, possibly with weighting, to provide a most likely sale price for the subject.
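
In symbols, the quantity being minimized could be written as below. The notation is introduced here only for illustration: n is the number of past sales in the data, y_i is the actual price of sale i, and \hat{y}_i is the price the current model predicts for that sale.

```latex
\mathrm{SSE} = \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2
```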


One could argue that with regression, we might as well drop the comps and just use the model. However, the comps "demonstrate" how good the model is by applying its recommended adjustments to comps with varying values on important features indicated by the regression model. We could have two values: (1) The value obtained by adjusting comp sales prices and (2) the value predicted directly from the model. I think it would be good to provide both, especially since the data going into the regression analysis is usually not complete nor 100% accurate. Running the output of the regression through the comp grid is a way to make up for the failings of the regression method (primarily lack of good quantifiable data). Using regression gives us, on the other hand, an unbiased way to determine adjustments that should work fairly well over a large number of similar properties in the subject neighborhood or market area.
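
The two values mentioned above can be compared with a small Python sketch. Everything in it is an assumption made for illustration: the coefficients, the subject, the three comps and their weights are invented, and `adjusted_price` is a hypothetical helper, not part of any appraisal software.

```python
# Hypothetical coefficients, as if read off a fitted regression model.
INTERCEPT = 150_000
PER_SQFT = 200        # dollars per sq ft of GLA
OCEAN_VIEW = 20_000   # premium for an ocean view

subject = {"gla": 1800, "ocean": 1}

# Invented comps: (features, sale price, relative weight in the reconciliation).
comps = [
    ({"gla": 1600, "ocean": 1}, 495_000, 4),
    ({"gla": 2000, "ocean": 0}, 545_000, 3),
    ({"gla": 1800, "ocean": 0}, 512_000, 3),
]


def adjusted_price(comp, price):
    """Adjust a comp's sale price toward the subject's features."""
    return (price
            + PER_SQFT * (subject["gla"] - comp["gla"])
            + OCEAN_VIEW * (subject["ocean"] - comp["ocean"]))


# (1) Value from the grid: weighted average of the adjusted comp prices.
adjusted = [adjusted_price(comp, price) for comp, price, _ in comps]
total_weight = sum(weight for _, _, weight in comps)
grid_value = sum(a * w for a, (_, _, w) in zip(adjusted, comps)) / total_weight

# (2) Value predicted directly from the model for the subject.
model_value = (INTERCEPT
               + PER_SQFT * subject["gla"]
               + OCEAN_VIEW * subject["ocean"])

print("adjusted comp prices:", adjusted)
print("grid value:", grid_value)
print("model value:", model_value)
```

The two results differ because each comp's sale price carries information the model does not have, which is the point of running the model's adjustments through the grid.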
 