| dc.contributor.author | Gebremichael, Sheferaw Kibret | |
| dc.date.accessioned | 2026-03-05T11:50:12Z | |
| dc.date.available | 2026-03-05T11:50:12Z | |
| dc.date.issued | 2026-03-05 | |
| dc.identifier.citation | GebremichaelKS2026 | en_US |
| dc.identifier.uri | http://localhost/xmlui/handle/123456789/6909 | |
| dc.description | PhD in Information Technology | en_US |
| dc.description.abstract | Speech coding is essential for the efficient transmission and storage of audio signals. However, the fixed linear predictor of the Interactive Multimedia Association Adaptive Differential Pulse Code Modulation (IMA-ADPCM) codec struggles with the dynamic, non-stationary nature of human speech, which limits coding quality. This thesis addresses that limitation by proposing and implementing a system that embeds a Gated Recurrent Unit (GRU)-based neural network predictor directly within the standard IMA-ADPCM codec architecture. The core methodological contribution is the GRU-IMA-ADPCM codec system together with a training approach that uses paired Pulse-Code Modulation (PCM) speech samples and ADPCM predictor outputs to optimize the GRU's ability to capture complex, non-linear temporal dependencies for better signal reconstruction. Using the DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus, the system was evaluated in three experimental configurations: the baseline fixed-predictor IMA-ADPCM codec, an online-learning GRU predictor, and a batch-learning GRU predictor embedded in the IMA-ADPCM speech decoder. The batch-learning GRU predictor with IMA-ADPCM consistently outperformed traditional methods in both objective and subjective evaluations. Objective evaluation showed Signal-to-Noise Ratio (SNR) values as high as 45 dB, and subjective evaluation confirmed enhanced perceived quality, with Mean Opinion Scores (MOS) between 3.8 and 4.3. The key contribution of this work is the development and validation of a computationally efficient, integrated GRU predictor algorithm that significantly enhances speech coding accuracy and provides a robust solution for real-time speech signal processing. | en_US |
| dc.description.sponsorship | Prof. Waweru Mwangi, PhD, JKUAT, Kenya; Dr. Michael Kimwele, PhD, JKUAT, Kenya; Dr. Adane Mamuye, PhD, Addis Ababa University Institute of Technology, Ethiopia | en_US |
| dc.language.iso | en | en_US |
| dc.publisher | COPAS-JKUAT | en_US |
| dc.subject | Speech Coding | en_US |
| dc.subject | GRU Predictor | en_US |
| dc.subject | Interactive Multimedia Association Adaptive Differential Pulse Code Modulation (IMA-ADPCM) Codec System | en_US |
| dc.title | Enhancing Speech Coding Quality by Embedding the GRU Predictor with the Interactive Multimedia Association Adaptive Differential Pulse Code Modulation (IMA-ADPCM) Codec System | en_US |
| dc.type | Thesis | en_US |