Automatic Speech Recognition
Overview
This project combines three AI capabilities in one system: speech-to-text recognition, emotion detection from audio, and automatic text summarization. It is designed to handle diverse accents and challenging acoustic environments, and to interpret the emotional content of spoken language.
Key Features
- Speech Recognition: Conformer model for Automatic Speech Recognition (ASR)
- Emotion Detection: XGBoost for Speech Emotion Recognition (SER)
- Text Summarization: BART Large for automatic text summarization
- Robustness: Designed for diverse accents and challenging acoustic environments
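
The three features above could be wired together roughly as follows. This is a minimal sketch, not the project's actual code: the names `SpeechPipeline`, `AnalysisResult`, and the stub functions are hypothetical, and it assumes that emotion detection runs on the raw audio while summarization runs on the ASR transcript. Each stage is a plain callable so that real Conformer, XGBoost, and BART Large wrappers can be dropped in later.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical type alias: a mono waveform as a sequence of samples.
Audio = Sequence[float]


@dataclass
class AnalysisResult:
    transcript: str
    emotion: str
    summary: str


@dataclass
class SpeechPipeline:
    """Chains the three stages; every stage is injected as a callable."""
    transcribe: Callable[[Audio], str]      # would wrap the Conformer ASR model
    detect_emotion: Callable[[Audio], str]  # would wrap the XGBoost SER classifier
    summarize: Callable[[str], str]         # would wrap BART Large

    def run(self, audio: Audio) -> AnalysisResult:
        transcript = self.transcribe(audio)
        # Assumption: emotion is predicted from the audio, not from the text.
        emotion = self.detect_emotion(audio)
        # Skip summarization when nothing was recognized.
        summary = self.summarize(transcript) if transcript.strip() else ""
        return AnalysisResult(transcript, emotion, summary)


# Stub stages stand in for the real models so the sketch runs on its own.
def stub_asr(audio: Audio) -> str:
    return "the meeting is moved to friday at ten"


def stub_ser(audio: Audio) -> str:
    return "neutral"


def stub_summarizer(text: str) -> str:
    return " ".join(text.split()[:5])


pipeline = SpeechPipeline(stub_asr, stub_ser, stub_summarizer)
result = pipeline.run([0.0, 0.1, -0.1])
print(result)
```

Keeping the stages as injected callables means each model can be tested or replaced independently of the other two.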
Technologies
- Models: Conformer, XGBoost, BART Large
- Training Data: Established speech datasets
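
XGBoost works on fixed-length feature vectors, whereas audio clips vary in length, so some pooling step is needed before the emotion classifier. The README does not say which acoustic features or pooling the project uses; the sketch below assumes frame-level features (for example MFCCs) have already been computed and simply concatenates the per-feature mean and standard deviation. The function name `pool_frames` is hypothetical.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def pool_frames(frames: Sequence[Sequence[float]]) -> List[float]:
    """Collapse a (num_frames x num_features) matrix into [means..., stds...].

    The output length is 2 * num_features regardless of how many frames
    the clip has, which is what a tabular model such as XGBoost expects.
    """
    if not frames:
        raise ValueError("need at least one frame")
    width = len(frames[0])
    if any(len(frame) != width for frame in frames):
        raise ValueError("all frames must have the same number of features")

    # Transpose so that each tuple holds one feature across all frames.
    columns = list(zip(*frames))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]  # population standard deviation
    return means + stds


# Two clips of different lengths map to vectors of the same size.
short_clip = [[1.0, 2.0], [3.0, 6.0]]
long_clip = [[0.5, 1.0], [1.5, 2.0], [2.5, 3.0], [3.5, 4.0]]
short_vec = pool_frames(short_clip)
long_vec = pool_frames(long_clip)
print(len(short_vec), len(long_vec))
```

The pooled vectors, paired with emotion labels, would then form the rows of the training table for the classifier.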
Use Cases
- Customer service analytics
- Meeting transcription and analysis
- Media content analysis
- Educational applications
