Abstract:
Social media has become an increasingly important part of daily life in recent years. With the convenience built into smart devices, social-media applications have made possible many new ways of monitoring and predicting product sales. Predicting the sales of products such as books, video games and movie tickets is an area of substantial research. A number of models have been used to predict future sales; however, these models rest on the assumption that the independent variables are truly independent. In theory, there should be zero correlation between any of the independent variables. In practice, however, many variables are related, sometimes quite highly. Different prediction techniques have therefore been, and continue to be, researched and proposed to address this drawback. The aim of this study was to identify ways of improving the prediction of mobile phone sales. To that end, the study developed a predictive model that classifies sentiments from social media by combining natural language processing with a probabilistic classifier that computes the probability of each class. The process involved analysing sentiments from Facebook and Twitter. The model performed classification on 300 annotated Facebook and Twitter sentiments, and we compared its results against an open-source Markov model. The naïve Bayes model recorded a total precision of 93.33%, while the area under its receiver operating characteristic curve was 97%. The model predicted that 150 of the sentiments belong to preference class No, with a precision of 96.43%. This means that 96.43% of the sentiments the model assigned to class No actually belonged to that class.
We therefore conclude from the receiver operating characteristic curve that the performance of the model used in this study is acceptable and hence that the posterior probabilities it generates are informative. The Markov model recorded a total precision of 91.67%, while the area under its receiver operating characteristic curve was 97.86%. It predicted that 161 of the sentiments belong to preference class No, with a precision of 98.57%; that is, 98.57% of the sentiments it assigned to class No actually belonged to that class. Comparing the two models, naïve Bayes performs better on total precision (93.33% versus 91.67% for the Markov model). The experimental results indicate that the model can classify sentiments obtained from social media with an accuracy of 93.33%. This is close to human-level accuracy, since people reportedly agree on sentiment only around 80% of the time. Most of the sentiments in this data set are expressed partly in informal language. It can therefore be concluded that the classification model is accurate and efficient enough to support sales prediction in e-commerce. This will assist phone manufacturing companies in predicting the future sales of their products.
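As an illustration only, the kind of classifier described above can be sketched as a multinomial naïve Bayes model that returns posterior probabilities for each preference class, together with a per-class precision measure. This is a minimal sketch under stated assumptions, not the implementation used in the study: the whitespace tokenizer, the add-one smoothing, the function names, the class label Yes and the four toy sentences are all invented for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Count documents per class and words per class (hypothetical helper)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()  # assumption: plain whitespace tokenizer
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def posterior(model, text):
    """Return P(class | text) for every class, using add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        n_words = sum(word_counts[label].values())
        for tok in text.lower().split():
            if tok in vocab:  # words never seen in training are ignored
                count = word_counts[label][tok]
                score += math.log((count + 1) / (n_words + len(vocab)))
        log_scores[label] = score
    # Normalize in log space so the posteriors sum to 1.
    top = max(log_scores.values())
    unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


def precision(y_true, y_pred, target):
    """Fraction of items predicted as `target` that truly are `target`."""
    hits = [t for t, p in zip(y_true, y_pred) if p == target]
    return sum(t == target for t in hits) / len(hits) if hits else 0.0


# Toy data invented for this sketch; not taken from the study's corpus.
docs = [
    "love this phone great camera",
    "battery is terrible would not buy",
    "great screen will buy",
    "terrible phone not worth it",
]
labels = ["Yes", "No", "Yes", "No"]

model = train_naive_bayes(docs, labels)
post = posterior(model, "great camera")
predicted = max(post, key=post.get)
```

Here `precision(y_true, y_pred, "No")` corresponds to the per-class precision reported for preference class No: of all sentiments the model assigns to No, the share that were annotated as No.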