View from Magstræde, Copenhagen, Denmark. Photo: Ozan Aygun
Predicting the housing market in Iowa:
extreme gradient boosting solution to a traditional regression problem
Today I feel like it is time again to tackle a regression problem - this time a traditional one!
Let's get our feet wet by using the Ames housing data set to perform a fairly comprehensive analysis and
predictive modeling to estimate house prices. Here, I have extensively explored the training set,
then performed feature engineering, missing value imputation and feature selection using the training set.
Finally, I trained linear models such as lasso regression and principal component regression (PCR), as well as more
complex algorithms including gradient boosting, extreme gradient boosting, random forest, and support vector machines.
I obtained a model that predicts house sale prices fairly well, with an RMSE of 0.12717 on
the test data set. Feel free to fork the reproducible code from my GitHub page and improve the model with your own solution!
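
Since the whole analysis is scored by RMSE (root mean squared error), here is a minimal sketch of how that metric can be computed. The function name `rmse`, the `log_scale` option, and the three example prices are all made up for illustration and are not taken from the Ames data or from my actual code. The log-scale variant is only an assumption about how an error as small as the one above might arise; this post does not state on which scale the score was computed.

```python
import math


def rmse(actual, predicted, log_scale=False):
    """Root mean squared error between two equal-length sequences.

    With log_scale=True the natural logs of the values are compared.
    That option is an illustrative assumption, not something the post states.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if log_scale:
        actual = [math.log(a) for a in actual]
        predicted = [math.log(p) for p in predicted]
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical sale prices, purely illustrative (not from the Ames data).
actual_prices = [200000.0, 150000.0, 320000.0]
predicted_prices = [210000.0, 140000.0, 300000.0]

dollar_rmse = rmse(actual_prices, predicted_prices)
log_rmse = rmse(actual_prices, predicted_prices, log_scale=True)

print(f"RMSE in dollars:    {dollar_rmse:.2f}")
print(f"RMSE on log prices: {log_rmse:.5f}")
```

On the log scale the error is relative: missing a cheap house by 10% costs about as much as missing an expensive one by 10%, so the score is not dominated by the priciest homes.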