Integrating Automatic Speech Recognition and Emotion Detection: A Conformer-XGBoost Framework for Human-Centered
Published in Journal of Artificial Intelligence and Capsule Networks, 2025
This paper presents an integrated approach to speech processing by combining a Conformer-based speech recognition system with an XGBoost-driven emotion classification component. The proposed solution aims to simultaneously transcribe spoken utterances and estimate the speaker’s emotional state, enabling more context-sensitive human-machine interaction. Experimental results demonstrate the feasibility of this unified approach and provide a foundation for developing more empathic and adaptive voice systems.
Recommended citation: K C, M. B., Adhikari, S., & Thapa, T. B. (2025). "Integrating Automatic Speech Recognition and Emotion Detection: A Conformer-XGBoost Framework for Human-Centered." Journal of Artificial Intelligence and Capsule Networks, 7(4).
