Abstract:
A central challenge in the card payments industry is identifying cardholders' purchasing intent and the key drivers of their transactions. Prior studies have focused mostly on e-commerce, credit card transactions, and fraud detection, often overlooking other card types and merchant categories. To fill this gap, this study used a sequential card transaction dataset to analyze cardholder purchasing patterns in four merchant acquiring sectors: restaurants, health care facilities, fuel stations, and social joints. The main objective was to construct a predictive model that accurately profiles cardholders and detects their intent beyond the scope of fraud detection. Unlike conventional methods such as Naïve Bayes, decision trees, and support vector machines (SVM), the proposed model represents transactional behavior dynamically using a Hidden Markov Model (HMM). To achieve resilience and flexibility, the methodology addressed three HMM problems: initialization, decoding, and evaluation. Performance optimization techniques, namely feature engineering, principal component analysis (PCA), sensitivity analysis, and 5-fold cross-validation, were also employed. By combining a surrogate decision tree model with PCA-transformed HMM outputs, this research introduced a novel interpretable framework that links the predictive power of HMMs with stakeholders' decision-making requirements. This hybrid approach potentially overcomes a major drawback of opaque sequential modeling techniques by enabling stakeholders to understand both the anticipated purchasing behaviors and the transactional patterns that lead to specific consumer intent classifications.
The experimental results showed strong performance: 100% accuracy and precision, 99% recall, a 98.5% F1-score, and a ROC-AUC of 0.992. These results outperform conventional models such as SVM, decision trees, and Naïve Bayes, as well as transformer-based models and LSTM networks. Despite the encouraging outcomes, the study acknowledged several important limitations. The dataset covers only four merchant categories, which may limit its applicability to the wider payments ecosystem. The near-perfect performance metrics were largely driven by the extensive optimization pipeline and could pose a significant risk of overfitting. Future studies should use more broadly applicable methodologies and cover a greater variety of merchant sectors. Overall, this study demonstrates that HMMs can predict cardholder behavior effectively, offering merchants and stakeholders useful insights.
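
The evaluation and decoding problems named in the abstract can be illustrated with a minimal sketch. The code below is not the study's model: the two hidden states (`routine`, `discretionary`), the observation symbols taken from the four merchant sectors, and every probability are hypothetical values chosen only to show how the forward algorithm scores a transaction sequence and how the Viterbi algorithm recovers the most likely hidden-state path.

```python
# Toy HMM: all states, symbols, and probabilities below are illustrative
# assumptions, not values from the study.
STATES = ["routine", "discretionary"]
SYMBOLS = ["restaurant", "health", "fuel", "social"]

START = {"routine": 0.6, "discretionary": 0.4}
TRANS = {
    "routine": {"routine": 0.7, "discretionary": 0.3},
    "discretionary": {"routine": 0.4, "discretionary": 0.6},
}
EMIT = {
    "routine": {"restaurant": 0.2, "health": 0.2, "fuel": 0.5, "social": 0.1},
    "discretionary": {"restaurant": 0.4, "health": 0.1, "fuel": 0.1, "social": 0.4},
}


def forward_likelihood(obs):
    """Evaluation problem: P(obs | model) via the forward algorithm."""
    # alpha[s] = probability of the observations so far, ending in state s
    alpha = {s: START[s] * EMIT[s][obs[0]] for s in STATES}
    for o in obs[1:]:
        alpha = {
            s: EMIT[s][o] * sum(alpha[p] * TRANS[p][s] for p in STATES)
            for s in STATES
        }
    return sum(alpha.values())


def viterbi(obs):
    """Decoding problem: most likely hidden-state path for obs."""
    # delta[s] = probability of the best path so far that ends in state s
    delta = {s: START[s] * EMIT[s][obs[0]] for s in STATES}
    backpointers = []
    for o in obs[1:]:
        step, new_delta = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: delta[p] * TRANS[p][s])
            step[s] = best_prev
            new_delta[s] = delta[best_prev] * TRANS[best_prev][s] * EMIT[s][o]
        backpointers.append(step)
        delta = new_delta
    # Trace the best final state back through the stored predecessors
    path = [max(STATES, key=lambda s: delta[s])]
    for step in reversed(backpointers):
        path.append(step[path[-1]])
    return list(reversed(path))


if __name__ == "__main__":
    sequence = ["fuel", "fuel", "restaurant", "social"]
    print("likelihood:", forward_likelihood(sequence))
    print("decoded path:", viterbi(sequence))
```

For long sequences, a practical implementation would work in log space or rescale the forward variables to avoid numerical underflow; the plain probabilities here are kept only for readability.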