Abstract:
This research project applies statistical techniques to the valuation of land and
properties through various models. It compares the predictive accuracy of mass
valuation models using a dataset of 500 single-family property transactions in two
neighbourhoods within Nairobi city. Several statistical models are used
for the mass valuation of properties. The first step in this study was to gather property
sales data for the development of a base model and a proposed model. Each of the property
units in the database was geocoded and vectorized. The data was screened and visualized
to investigate the nature of the potential association between the response, Y, and predictor
variables, X. The predictor variables were tested for multicollinearity, and a regression model
was developed based on hypothesized relationships. The model was tested for lack of fit by
examining the residuals using residual scatterplots and histograms. Finally, the fit
statistics were reviewed by looking at the spread of the plot, evaluating observed values
around the regression line, and examining how accurately the independent variables
predict the dependent variable. The results revealed an overall level of 0.96 for Komarock
and 0.98 for Runda estate. One measure of how well the model predicts is to
compute the correlation between the actual values in the holdout sample and the predicted
values. The correlation should be high when the model is valid. The correlation between the
assessed value and the actual selling price is 0.71 and 0.98 for Komarock and Runda
respectively. Determining the quality of the valuation output also requires measuring
uniformity: uniformity between groups of properties and uniformity within groups (Abidoye,
Huang, Amidu, & Javad, 2021). The coefficient of dispersion (COD) is the most widely used measure
of valuation uniformity. The results show a COD of 18% for Komarock and 10% for Runda.
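
The two accuracy measures described in this abstract can be illustrated with a minimal sketch. This is not the study's own computation. It assumes the COD is calculated in the conventional way, as the average absolute deviation of the assessed-value-to-sale-price ratios from their median, expressed as a percentage of that median. The function names and the sample figures are hypothetical.

```python
from statistics import mean, median


def coefficient_of_dispersion(assessed, sold):
    """COD in percent, assuming the conventional ratio-study definition."""
    # Ratio of assessed (predicted) value to actual selling price.
    ratios = [a / s for a, s in zip(assessed, sold)]
    med = median(ratios)
    # Average absolute deviation from the median ratio, relative to it.
    return 100.0 * mean(abs(r - med) for r in ratios) / med


def pearson_correlation(x, y):
    """Pearson correlation between predicted and actual values."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Hypothetical holdout sample (illustrative numbers, not from the study).
assessed_values = [90.0, 100.0, 110.0]
selling_prices = [100.0, 100.0, 100.0]
cod = coefficient_of_dispersion(assessed_values, selling_prices)
print(f"COD: {cod:.2f}%")
```

A lower COD indicates more uniform valuations within a group, and a correlation close to 1 indicates that predicted values track actual selling prices closely.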