Improved Adaptive Boosting in Heterogeneous Ensembles for Outlier Detection: Prioritizing Minimization of Bias, Variance and Order of Base Learners

Bii, Joash Kiprotich

dc.contributor.author	Bii, Joash Kiprotich
dc.date.accessioned	2023-05-29T07:51:42Z
dc.date.available	2023-05-29T07:51:42Z
dc.date.issued	2023-05
dc.identifier.uri	http://localhost/xmlui/handle/123456789/6107
dc.description	Doctor of Philosophy in Computer Science	en_US
dc.description.abstract	Real-world data are corrupted by human errors (for instance rounding errors, wrong measurements, and biases), by faults, and by rare events, including malicious activities such as credit card fraud or cyber activities, all of which produce unusual patterns, or outliers, in the data. Detecting outliers is a difficult task that requires complex ensemble models. The ideal outlier detection ensemble should assess the strengths of its base detectors, optimize their results, and carefully combine their outputs to create a robust overall model that achieves unbiased accuracy with minimal variance. Existing outlier detection ensembles fuse numerous detectors (weak learners) in either parallel or sequential order and obtain a combined result through a majority vote in order to increase detection accuracy. However, trusting the results of all weak learners may degrade overall ensemble performance, because some learners may produce erroneous results depending on the type of data and their underlying rules. The general objective was to develop an outlier detection model that integrates multiple different (heterogeneous) base detectors into one model (an ensemble). The model first selects highly accurate base detectors by training every candidate and evaluating it by its error rate. It then applies the adaptive boosting technique, in which misclassified samples are fed back to the next detector (to minimize bias), and finally combines all the detectors' decisions strategically through a combination function (to minimize variance) in order to obtain a strong detector. The specific objectives of the research were: identifying weak learners by analyzing their initial biases and variances; analyzing fusion strategies; and developing and evaluating an outlier detection model with a focus on minimizing bias, variance, and the order of base learners. The CRISP-DM methodology was employed. Outlier datasets were drawn from the ODDS library.
The model was validated against four other baselines, and test results were compared using performance measures such as recall, precision, ROC, and AUC values. The experiments showed improved average ROC AUC on at least eight of the ten datasets, even when the proportion of outliers was very small (from single cases up to 10%).	en_US
dc.description.sponsorship	Dr. Richard Rimiru, PhD, JKUAT, Kenya; Prof. Waweru Ronald Mwangi, PhD, JKUAT, Kenya	en_US
dc.language.iso	en	en_US
dc.publisher	JKUAT-COPAS	en_US
dc.subject	Outliers	en_US
dc.subject	Weak learners	en_US
dc.subject	Ensembles	en_US
dc.subject	Bias	en_US
dc.subject	Variance	en_US
dc.title	Improved Adaptive Boosting in Heterogeneous Ensembles for Outlier Detection: Prioritizing Minimization of Bias, Variance and Order of Base Learners	en_US
dc.type	Thesis	en_US
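
The boosting idea summarized in the abstract (re-weight misclassified samples for the next detector, then combine the detectors' decisions with a weighted vote) can be illustrated with a minimal sketch. This is not the thesis's algorithm: it is plain discrete AdaBoost run over a fixed pool of already-trained detectors, and the function names, the toy labels, and the detector names A to D are all hypothetical.

```python
import math

def adaboost_over_pool(pool_preds, labels, rounds=3, eps=1e-10):
    """Discrete AdaBoost over a fixed pool of heterogeneous detectors.

    pool_preds: dict mapping a (hypothetical) detector name to its
                predictions in {-1, +1}, where +1 means "outlier".
    labels:     true labels in {-1, +1}.
    Returns a list of (detector name, vote weight alpha) pairs.
    """
    n = len(labels)
    weights = [1.0 / n] * n          # start with uniform sample weights
    chosen = []
    for _ in range(rounds):
        # Weighted error of every detector under the current sample weights.
        errors = {
            name: sum(w for w, p, y in zip(weights, preds, labels) if p != y)
            for name, preds in pool_preds.items()
        }
        best = min(errors, key=errors.get)
        err = min(max(errors[best], eps), 1 - eps)
        if err >= 0.5:               # no detector beats chance: stop early
            break
        alpha = 0.5 * math.log((1 - err) / err)
        chosen.append((best, alpha))
        # Up-weight the samples this detector got wrong, down-weight the rest.
        weights = [
            w * math.exp(-alpha * y * p)
            for w, p, y in zip(weights, pool_preds[best], labels)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return chosen

def combine(pool_preds, chosen):
    """Weighted vote of the selected detectors; +1 means "outlier"."""
    n = len(next(iter(pool_preds.values())))
    scores = [
        sum(alpha * pool_preds[name][i] for name, alpha in chosen)
        for i in range(n)
    ]
    return [1 if s > 0 else -1 for s in scores]

# Toy data (made up for illustration): 2 outliers among 8 samples.
labels = [1, 1, -1, -1, -1, -1, -1, -1]
pool = {
    "A": [1, -1, -1, -1, -1, -1, -1, -1],   # misses one outlier
    "B": [1, 1, 1, -1, -1, -1, -1, -1],     # one false alarm
    "C": [1, 1, -1, 1, -1, -1, -1, -1],     # a different false alarm
    "D": [-1, -1, 1, 1, 1, -1, -1, -1],     # worse than chance
}
chosen = adaboost_over_pool(pool, labels)
combined = combine(pool, chosen)
selected_names = [name for name, _ in chosen]
```

In this toy run each of the three usable detectors makes a different single mistake, so the weighted vote recovers every label, while the worse-than-chance detector D is never selected. A real outlier ensemble would work with continuous outlier scores and often without labels, which this sketch does not address.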

Files in this item

Name: Bii, Joash Kiprotich ...

Size: 3.525Mb

Format: PDF

Description: THESIS


This item appears in the following Collection(s)

College of Pure and Applied Sciences (COPAS) [404]
Depts. in this collection: Mathematics, Chemistry, Physics, ICT, Biochemistry, Microbiology