Among the many branches of computer science, machine learning has been the most powerful tool I've applied outside the classroom. During graduate school, I learned core ML concepts that later helped me assist my family in building risk assessment models for tenant evictions.
My father started a real estate holdings business that leases units to tenants. Leasing carries risk, and the main one is eviction. Evictions impose real overhead on a business: they are time-consuming, because several documents have to be filled out and submitted to court, and evicted tenants tend to leave a property in poor shape.
After the financial crisis, eviction risk declined from 2009 onward, but any tenant can still become a risk at any time. Most (but not all) of the tenants who were evicted shared certain characteristics: a lower credit score, a higher percentage of monthly expenses relative to income, and a history of defaults. This inspired the idea: what if we could predict risk beforehand?
With only a few rows of data and a handful of fields per tenant, I needed informative features to improve prediction accuracy. Credit score was an obvious one. I also engineered a ratio of monthly expenses to annual income: the higher this ratio, the more financially strained the tenant likely was. These features defined the model's input space.
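To make the ratio feature concrete, here is a minimal sketch. The function name, field names, and sample numbers are hypothetical and are not taken from the real tenant data. Treating a missing or zero income as an infinite ratio is also just an assumption for illustration.

```python
import math


def expense_to_income_ratio(monthly_expenses: float, annual_income: float) -> float:
    """Ratio of monthly expenses to annual income (higher suggests more strain)."""
    if annual_income <= 0:
        # Assumption: no reported income is treated as maximally strained.
        return math.inf
    return monthly_expenses / annual_income


# Hypothetical applicants, for illustration only.
applicants = [
    {"credit_score": 720, "monthly_expenses": 1500.0, "annual_income": 60000.0},
    {"credit_score": 580, "monthly_expenses": 2400.0, "annual_income": 36000.0},
]

# Attach the engineered feature to each record.
for row in applicants:
    row["expense_ratio"] = expense_to_income_ratio(
        row["monthly_expenses"], row["annual_income"]
    )
```

In this toy data the second applicant ends up with the higher ratio, which is the direction the feature is meant to capture.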
I framed the task as a binary classification problem: 0 for not evicted, 1 for evicted. I experimented with SVC, Decision Trees, and Random Forest, and eventually built a stacking ensemble from XGBoost, Gradient Boosting, and an MLP neural network. The ensemble tied for our best results: 79% accuracy, with precision of 0.67 and recall of 0.73 on the evicted class.
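For readers who have not seen stacking before, the sketch below shows the general shape of such an ensemble in scikit-learn. It is not the exact pipeline we used. The data is synthetic, the hyperparameters and the logistic-regression meta-learner are assumptions, and XGBoost is left out so the example depends on scikit-learn alone.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; the real tenant records are not reproduced here.
X, y = make_classification(
    n_samples=400, n_features=5, n_informative=3, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base models whose out-of-fold predictions feed the meta-learner.
base_models = [
    ("gb", GradientBoostingClassifier(random_state=0)),
    (
        "mlp",
        make_pipeline(
            StandardScaler(),  # MLPs generally train better on scaled inputs
            MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0),
        ),
    ),
    # An xgboost.XGBClassifier could be added here as a third base model
    # if the xgboost package is installed.
]

# Assumption: logistic regression as the final (meta) estimator.
stack = StackingClassifier(
    estimators=base_models, final_estimator=LogisticRegression(), cv=5
)
stack.fit(X_train, y_train)
y_pred = stack.predict(X_test)

# Metrics for the positive class (1 = evicted).
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision_evicted": precision_score(y_test, y_pred, pos_label=1),
    "recall_evicted": recall_score(y_test, y_pred, pos_label=1),
}
```

The numbers this produces come from synthetic data, so they say nothing about the tenant model itself. The point is only how the base models and the meta-learner fit together.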
This was my first practical ML experience after graduate school. I learned that behind raw data lie meaningful patterns waiting to be engineered. No single model reigns supreme in isolation, but together they can compensate for each other's weaknesses.
Now, each applicant's data is run through our model. Any applicant whose class 0 (not evicted) score is below 0.65 is rejected. Over the past year, evictions have noticeably decreased. The model isn't perfect yet, but it has already made a real impact.
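The screening rule can be sketched in a few lines. This assumes the "class 0 score" is the model's predicted probability of class 0 (not evicted). The function name and the sample scores are hypothetical.

```python
REJECT_BELOW = 0.65  # threshold on the class 0 (not evicted) score


def screen_applicant(p_not_evicted: float, threshold: float = REJECT_BELOW) -> str:
    """Return "reject" when the class 0 score falls below the threshold."""
    if not 0.0 <= p_not_evicted <= 1.0:
        raise ValueError("score must be a probability in [0, 1]")
    return "reject" if p_not_evicted < threshold else "accept"


# Hypothetical scores, for illustration only.
decisions = [screen_applicant(p) for p in (0.90, 0.65, 0.40)]
```

With a fitted scikit-learn classifier whose `classes_` are `[0, 1]`, the class 0 score for each applicant would be the first column of `predict_proba`. Note that a score of exactly 0.65 passes under this reading of "below".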
| Model | Accuracy | Precision (Evicted=1) | Recall (Evicted=1) | F1 Score (Evicted=1) |
|---|---|---|---|---|
| SVC | 0.76 | 0.63 | 0.62 | 0.63 |
| Decision Tree | 0.79 | 0.67 | 0.73 | 0.70 |
| Random Forest | 0.79 | 0.66 | 0.74 | 0.70 |
| Gradient Boosting | 0.79 | 0.65 | 0.74 | 0.70 |
| XGBoost | 0.79 | 0.67 | 0.73 | 0.70 |
| MLP (Neural Net) | 0.79 | 0.66 | 0.71 | 0.69 |
| Stacking Classifier | 0.79 | 0.67 | 0.73 | 0.70 |
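
As a quick sanity check on the table, F1 is the harmonic mean of precision and recall. The short sketch below recomputes it for the Stacking Classifier row. The function name is hypothetical.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall from the Stacking Classifier row.
stacking_f1 = f1_from(0.67, 0.73)
```

Rounded to two decimals, this reproduces the 0.70 in the table. Because the precision and recall shown are themselves rounded, recomputing F1 this way can be off by 0.01 for some of the other rows.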