Abstract:
Speech coding is important for effective storage and transmission of audio signals. However,
current Interactive Multimedia Association Adaptive Differential Pulse Code Modulation (IMA-ADPCM)
speech coding techniques use a fixed predictor, which limits how well they encode dynamic,
non-stationary speech signals. This limitation of the fixed predictor in IMA-ADPCM speech coding is the
motivation for this study. Our goal is to improve on the fixed predictor by integrating a Gated Recurrent
Unit (GRU) predictor that can adapt to dynamic speech signals and predict them more accurately. We
evaluated the performance of the
IMA-ADPCM encoding baseline and the GRU predictor embedded with the IMA-ADPCM codec algorithm.
The proposed encoding system, based on a pre-trained GRU predictor, outperformed the baseline, reaching a
maximum Signal-to-Noise Ratio (SNR) of 43.2 dB and Mean Opinion Score (MOS) values of 3.8 to 4.3 out of
5.0, and our results demonstrated considerable improvements in audio quality. The main contribution of
this study is the development of a GRU predictor
that is integrated into the IMA-ADPCM coding algorithm and built from a dataset pairing IMA-ADPCM output
speech samples with the actual PCM speech samples. By integrating the GRU predictor model in accordance
with these data samples, the newly designed algorithm significantly improved the quality of the
IMA-ADPCM speech codec.
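
For readers unfamiliar with the SNR figure quoted above, the following is a minimal sketch of how a codec's SNR is conventionally computed from the original PCM samples and the decoded samples. The abstract does not state the exact formula used in the study, so the standard definition (signal energy over error energy, in decibels) is assumed here, and the function name `snr_db` is hypothetical.

```python
import math


def snr_db(reference, decoded):
    """Return the SNR in dB between reference PCM samples and decoded samples.

    Assumes the conventional definition:
        SNR = 10 * log10( sum(x^2) / sum((x - y)^2) )
    where x are the original samples and y the decoded samples.
    """
    if len(reference) != len(decoded):
        raise ValueError("reference and decoded must have the same length")

    signal_energy = sum(x * x for x in reference)
    noise_energy = sum((x - y) ** 2 for x, y in zip(reference, decoded))

    if noise_energy == 0:
        return math.inf   # perfect reconstruction
    if signal_energy == 0:
        return -math.inf  # silent reference with a non-zero error

    return 10.0 * math.log10(signal_energy / noise_energy)


# Illustrative values only (not data from the study):
# every decoded sample is off by 1 on an amplitude of 100.
original = [100, -100, 100, -100]
reconstructed = [99, -99, 99, -99]
print(snr_db(original, reconstructed))
```

Under this definition, a predictor that tracks the signal more closely leaves a smaller reconstruction error and therefore a higher SNR, which is the sense in which a better predictor can raise the metric.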