
Appraisal Statistics

Status
Not open for further replies.
...Cause and effect are good to know - but I think we can assume if we create a good fitting model based on all 20+ sales over the past year, and another potential sale comes along - it will probably fall into the same pattern, regardless of the cause...
Yes but...
(I'm working off memory here so please forgive details and correct if anyone remembers more betterer...Austin/Santora debates) I recall some posts a few years ago that referenced a university regression study that assigned like $10,000 to screen doors. The model fit well...the cause and effect was way off...so it's more than important IMO.
 
That sounds like overfitting. You can take something like MARS and configure it to zigzag up and down in an attempt to fit the model to every single point. If the house with the screen door is on the high end, and the only feature MARS can find to weight is that screen door, it will add it to the model, which will then fail miserably when you apply it to a new sale that is not in your input data. So you always have to review the model to make sure you are not overfitting and that it makes sense. That is the purpose of MARS: to generate models that make sense and are understandable. To remedy this problem in MARS, you can set the "minimum observations between knots" to something like 3 rather than 1, so that it takes at least 3 sales to create a new regression segment in your model. There is also a penalty for adding new basis functions, and so on.
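To make the screen-door problem concrete, here is a minimal sketch in Python using plain least squares (not MARS itself). The sales are synthetic and the numbers are made up for illustration. When a feature appears on only one sale, the regression is free to hand that feature whatever dollar amount makes the sale fit perfectly.

```python
import numpy as np

# Hypothetical, synthetic sales: price is exactly 50,000 + 100 * GLA,
# except the last sale, which sold 10,000 higher for reasons not in the data.
gla = np.array([1000, 1200, 1400, 1600, 1800, 2000, 2200, 2400], dtype=float)
price = 50_000 + 100 * gla
price[-1] += 10_000

# Only that last (highest-priced) sale happens to have a screen door.
screen_door = np.zeros_like(gla)
screen_door[-1] = 1.0

# Ordinary least squares: price ~ intercept + GLA + screen_door
X = np.column_stack([np.ones_like(gla), gla, screen_door])
coef, *_ = np.linalg.lstsq(X, price, rcond=None)
intercept, per_sf, door_value = coef

residuals = price - X @ coef
print(f"screen door coefficient: {door_value:,.0f}")
print(f"largest residual: {np.abs(residuals).max():.6f}")
```

The fit is essentially perfect and the screen door is "worth" about $10,000, yet nothing in the data says the door caused the premium. Requiring several sales before a feature or segment can enter the model removes this kind of single-sale fit.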

But MARS will give you a model (after you interpret the formula for a user) that is something like this:
GLA:
  600-1100 sf: $300/sf
  1101-1800 sf: $250/sf
  1801-2500 sf: $200/sf
  2501-3200 sf: $100/sf
  3201+ sf: $0/sf
Bathrooms:
  1-2: $15,000/bath
  3-4: $10,000/bath
  5+: $0/bath
and so on.
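One way to read the GLA schedule above is as marginal rates, where each rate applies only to the square feet inside its band. That reading is an assumption, as are the knot positions (600, 1100, 1800, 2500, 3200 sf) and the function names. Under it, the schedule can be sketched in Python both band by band and as a sum of hinge terms, which is the form a MARS-style formula usually takes:

```python
# Assumption: each rate is a marginal rate applied only to the square feet
# inside its band, with knots at 600, 1100, 1800, 2500 and 3200 sf.
GLA_BANDS = [
    (600, 1100, 300),
    (1100, 1800, 250),
    (1800, 2500, 200),
    (2500, 3200, 100),
]  # above 3200 sf the rate is $0/sf


def gla_contribution(gla):
    """Dollar contribution of GLA relative to a 600 sf house (banded form)."""
    total = 0
    for lo, hi, rate in GLA_BANDS:
        total += rate * max(0, min(gla, hi) - lo)
    return total


def hinge(x):
    """Hinge basis function: zero below the knot, linear above it."""
    return max(0, x)


def gla_contribution_hinge(gla):
    """Same schedule written as a sum of hinge basis functions."""
    return (300 * hinge(gla - 600)
            - 50 * hinge(gla - 1100)
            - 50 * hinge(gla - 1800)
            - 100 * hinge(gla - 2500)
            - 100 * hinge(gla - 3200))


# GLA adjustment for a 2,000 sf comp against a 1,500 sf subject.
subject, comp = 1500, 2000
adjustment = gla_contribution(subject) - gla_contribution(comp)
print(gla_contribution(subject), gla_contribution(comp), adjustment)
# prints: 250000 365000 -115000
```

Each hinge coefficient after the first is the change in slope at its knot (300 to 250 is -50, and so on), which is why both functions return the same number.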

Because you have broken up the typical single linear regression into linear segments, you get a much closer fit, which results in tighter adjusted comps.
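A small Python sketch of why the segmented fit is closer, using synthetic, noise-free sales and a knot picked by hand. This is not MARS, which searches for its own knots and prunes them. It only compares one straight line against the same line plus a single hinge term.

```python
import numpy as np

# Synthetic, noise-free sales that follow a two-slope pattern:
# $300/sf up to 1,800 sf and $100/sf beyond (hypothetical numbers).
gla = np.arange(1000, 2601, 200, dtype=float)  # 1000, 1200, ..., 2600
price = 100_000 + 300 * np.minimum(gla, 1800) + 100 * np.maximum(gla - 1800, 0)


def fit_rmse(X, y):
    """Least-squares fit; returns coefficients and root-mean-square error."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, float(np.sqrt(np.mean((y - X @ coef) ** 2)))


ones = np.ones_like(gla)

# Model A: one straight line through all sales.
_, rmse_line = fit_rmse(np.column_stack([ones, gla]), price)

# Model B: the same line plus one hinge term at a hand-picked knot (1,800 sf).
knot = 1800.0
coef_seg, rmse_seg = fit_rmse(
    np.column_stack([ones, gla, np.maximum(gla - knot, 0)]), price)

print(f"straight line RMSE: {rmse_line:,.0f}")
print(f"segmented RMSE:     {rmse_seg:,.2f}")
print(f"slope below knot: {coef_seg[1]:.1f}, above knot: {coef_seg[1] + coef_seg[2]:.1f}")
```

On real sales the segmented error will not drop to zero, and every added knot is another chance to overfit, so the improvement should be checked on sales that were held out of the fit.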

More interesting are those areas where these models are the only thing that can get you out of the woods. Especially when things get complex; Carmel, CA comes to mind.
 
Gotcha on your objective for the "statistics forum" and agree with your thinking, but it's not a functional open-forum topic. Maybe a forum open to reading but closed to posting except invitation-only would fit your concept better.

Why do I say this? You see how easily threads are hijacked here (and on every board like this), people can't resist making distracting segues into jokes, slightly off-topic replies, etc. and you can only sustain a cohesive line of logic a short time before a thread disintegrates. This concept is deep - really DEEP. Everything is math - there's little judgement involved and appraisal professionals are used to relying heavily on - and expressing in terms of - judgement. Keeping a thought on track here long enough to resolve a complex issue is like, as we say, "Herding cats."

The theory needs to be developed like scientific theory - publishing brief but complete manuscripts, and then defending challenges or making adjustments for the next article. Repeat. A forum like this is too informal.

That's why I say it won't work (in my opinion) although I am as eager as you are to see the theories and processes advanced.

As for the Appraisal Institute, I agree so much with you that the AI courses need to be re-thought... maybe individual instructors have rounded their courses out, since I'm sure that they want to provide relevant education, but as you know, we need a lot more than a course to make econometrics useful and credible. Maybe the people who authorize courses didn't actually understand this topic's course content well enough, and it was put into practice too soon. Or, maybe it is going according to plan - baby steps; we don't know the overall plan if there is one. So, that may be a place to begin - get involved in an educational platform like that and improve the system. Publish... just not only here if you want progress.

It's better than it used to be. You'd think maybe the Appraisal Institute forum ... but they are so sparse ... empty, dead! This is an active forum; despite the noise, there's something to be said for it. If you had a separate sub-forum, a lot of members not interested in statistics would likely just skip over it. Not so under "General".
 
More interesting are those areas where these models are the only thing that can get you out of the woods. Especially when things get complex; Carmel, CA comes to mind.


You have totally lost me on your analysis. The more complex a property becomes, the less reliable regression analysis becomes. The more heterogeneous a market becomes, the less reliable regression analysis becomes.
 
referenced a university regression study that assigned like $10,000 to screen doors
When that happens it is telling you something. Namely that screen doors are having zero impact on value. Drop that variable.
 
Because you have broken up the typical single linear regression into linear segments, you get a much closer fit, which results in tighter adjusted comps.

That's what I was referring to earlier. Segmented, step-wise...thought it was the same concept. Can this be easily done in R, do you know? I'm considering playing with R because I'm not going to pay for MARS.

Why are we talking about VA delinquency rates?
 
You have totally lost me on your analysis. The more complex a property becomes, the less reliable regression analysis becomes. The more heterogeneous a market becomes, the less reliable regression analysis becomes.

You are right, in general. I didn't say different. However, that is not saying that regression is not useful in heterogeneous or complex markets. In fact it can allow you to see patterns that would otherwise be very difficult to see because of all the noise. And pertaining to other posts in this thread, I would venture to guess what is at work here are the brokers - whom I've talked to over the years (leastways going back 10+ years). The brokers too are overwhelmed by the complexity and have developed rules of thumb: e.g. backyard cottages go for the same value/sf as the residence. And, that made sense, because owners could and did rent these out or use them for children, in-laws or friends who needed a place to stay. In fact, they are often worth more. Also, in a place like Carmel, people often come in with cash, talk down the price if they can, otherwise pretty much pay for what is asked. So, the brokers and sellers probably have more influence on setting the final sales price than in other areas. In fact, the sellers often take the advice of their brokers. Yet despite all the superficial complexity you can find some general rules underneath the surface that you may or may not be able to confirm from your interviews with brokers (who unfortunately often hide their pricing methods). And vice versa, if a broker tells you he figures price in such and such a way - your regression can serve as verification, one way or another.

Regression is a tool to:
1. Find patterns of buying behavior.
2. Find patterns of broker/seller pricing behavior.
3. Verify rules of pricing you pick up in interviews with sellers and buyers - or in fact appraisers.
4. Don't forget the role of appraisers in influencing prices - they can lower an offer to get a loan. And of course appraisers have their own valuation patterns/rules.
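As a rough illustration of point 3, a broker's rule of thumb such as "backyard cottages go for the same value/sf as the residence" can be checked by giving the main house and the cottage separate square-foot variables and comparing the fitted rates. The sketch below is in Python. The sales are synthetic and built so the rule holds exactly, and the variable names and dollar figures are hypothetical.

```python
import numpy as np

# Hypothetical, synthetic sales constructed so the broker rule holds exactly:
# every square foot, main house or cottage, is priced at $500.
main_sf = np.array([1200, 1500, 1800, 2100, 1600, 2400, 1300, 2000], dtype=float)
cottage_sf = np.array([0, 400, 0, 600, 300, 0, 500, 350], dtype=float)
price = 200_000 + 500 * main_sf + 500 * cottage_sf

# Ordinary least squares: price ~ intercept + main_sf + cottage_sf
X = np.column_stack([np.ones_like(main_sf), main_sf, cottage_sf])
coef, *_ = np.linalg.lstsq(X, price, rcond=None)
_, main_rate, cottage_rate = coef

# A ratio near 1.0 is consistent with the rule of thumb; a ratio well away
# from 1.0 would be a reason to question it.
ratio = cottage_rate / main_rate
print(f"main: {main_rate:.2f}/sf, cottage: {cottage_rate:.2f}/sf, ratio: {ratio:.3f}")
```

With real sales the two rates will never match exactly, so the useful question is whether the gap is large relative to the noise in the data.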

Again, with respect to some other posts in this thread: we usually appraise for market value, because that is what is asked. How that value is used for deciding LTV or collateral risk is not really our concern, although many, including myself, think that we should be a lot more concerned with the value of a property as long-range collateral for a loan, rather than just current market value. That is a whole different subject: long-range value means you have to look at long-term trends, filter out current market tastes, calculate in the influence of loan rates, predict loan rates and a whole lot of other stuff - as if we didn't already have enough on our plate.
 
Isn't that what underwriters do? That is scary scope-creep.
 
OK, the question is "Why do we have to pay for Salford Systems MARS? R must have something equivalent."

I really don't have any relationship to the company, which is now owned by Minitab. This is the thing: Salford Systems products have been developed over many years and are highly optimized. The source is in C++. Steps in the original algorithm, which are understandable, have been replaced by lots of intermediate caching and deltas stacked on top of each other. If the Chinese, or whoever, "decompile" the executable code, what they will get is source code with all the variable names changed into nonsense names. Because of that, and because of all the optimization, the code they get is rubbish without someone to explain what it all means. In other words, optimization divorces the code from the underlying algorithm. So, the Chinese cannot replicate MARS, CART or the other highly optimized programs. They could try to create new algorithms and a new code base to compete with Salford Systems - but that would take many years. In particular, the CART and MARS programs have been under development since the 1980s and have been improved over the past 30+ years.

So, in my opinion, based on my experience, there is one and only one usable MARS program - the one written by Salford Systems. If someone else can find something that comes close, please let us all know.

Now, an added argument is that the DMA modeling competition has been won by one company for the past 10 years: Data Labs USA. If you look at their Careers section, where they advertise for developers, you can discover what they use:

1. R Language
2. SAS
3. Salford Systems CART and TreeNet (MARS is an extension of CART).
4. Microsoft .Net, C#, SQL Server

In particular - they don't use Python - for those pandas worshipers out there.
 
Copyright © 2000-, AppraisersForum.com, All Rights Reserved