Moving from R to Python/Pandas

RCA (Elite Member, Gold Supporting Member)
Joined: Jun 27, 2017
Professional Status: Certified General Appraiser
State: California
R has a nasty set of problems when your code base starts getting huge: bugs become difficult to find, such as duplicate names in R6 classes. There really isn't anything around that automates finding those duplicates; you have to go through the code manually, and that gets to be a big nuisance. R's syntax also has other weird features that make debugging difficult.

In Python, you can use an IDE like PyCharm that will quickly find syntax problems.
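To give a feel for the duplicate-name problem on the Python side, here is a minimal sketch that uses only the standard-library `ast` module to flag methods defined twice in the same class. It is not how PyCharm does it internally; the function name `duplicate_methods` and the sample class `CompGrid` are made up for illustration.

```python
import ast
from collections import Counter


def duplicate_methods(source: str) -> dict[str, list[str]]:
    """Map each class name to the method names defined more than once in its body."""
    duplicates = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            # Count only functions defined directly in the class body.
            counts = Counter(
                item.name
                for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            )
            repeated = sorted(name for name, n in counts.items() if n > 1)
            if repeated:
                duplicates[node.name] = repeated
    return duplicates


# Hypothetical class for illustration: the second adjust() silently replaces the first.
SAMPLE = '''
class CompGrid:
    def adjust(self, price):
        return price * 1.05

    def summary(self):
        return "grid"

    def adjust(self, price):
        return price * 0.95
'''

print(duplicate_methods(SAMPLE))  # {'CompGrid': ['adjust']}
```

One caveat: a property with a getter and a setter legitimately reuses the same name, so this simple count would flag it as a false positive.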

Now, I will still "look" at R and maybe use it at times. ....

Disadvantages of Python/Pandas:

1. You will need to do some extra conversion between R and Python data structures. But that is worth the effort.
2. It can be a tad slower. I'll let you know.
3. .... - I'll let you know.

Advantages:
1. Static syntax checking is the big one.

Note: You still need to know R if you use Python.
 
If you find it to be too slow for your needs, you can try using Polars instead of Pandas.
 
Well, then, assuming you read in your MLS data as Polars data frames, you have to convert them to Pandas data frames and then to R data frames before you can pass the MLS data to earth. After earth is done, you may very well need to convert back in the opposite direction. It does depend on your workflow, but that is a distinct downside of using Python with R packages (earth is written in R and C++). And don't forget that earth itself calls into many other R libraries.

Where I really want to use Python is in graphing workflows. I can generate a ton of graphs, and the logic gets complicated. So I could use R to read in the data and run earth (i.e., MARS) and then pass off the subsequent graphing and other processing to Python. It is when the code gets massive and complicated that I need Python's static syntax checking to save time and help ensure the code is error free.

On the other hand, if I were to start building models of large areas, like a complete metro area such as San Jose, I would be very tempted to handle the initial data processing with Polars or possibly NumPy (which, however, would be a bit inconvenient, since a NumPy array only allows one datatype per array).
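To show what the one-datatype limitation looks like in practice, here is a small sketch, assuming `numpy` is installed. The values are invented for illustration.

```python
import numpy as np

# Mixing numbers and text in one array coerces every element to text.
mixed = np.array([1850, "San Jose", 3])
print(mixed.dtype.kind)  # 'U' means a Unicode string array

# Keeping one homogeneous array per column keeps numeric columns numeric.
sqft = np.array([1850, 2100, 1420])
city = np.array(["San Jose", "San Jose", "Campbell"])
print(sqft.dtype.kind, city.dtype.kind)
print(sqft.mean())
```

That is the inconvenience: with plain arrays you end up managing one array per column yourself, which is the bookkeeping a data frame normally does for you.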
 