Keywords: Speech, Artificial Bandwidth Extension, Audio Compression, Synthesis
Audio and video streaming accounted for the vast majority of internet traffic in 2021. Most speech enhancement, denoising, dereverberation, and bandwidth extension methods focus on filtering out or masking unwanted noise and assume that the recorded speech itself is clear. However, the typical consumer stream is captured with a low-fidelity microphone or in a poorly treated acoustic space. As a result, current methods struggle to reconstruct a clear, natural-sounding voice.
A second major issue is the acute need to compress the data stream to an extreme degree and then reconstruct it in real time, with no delay.
The researcher developed a real-time method that addresses both extreme compression and high-quality generation of speech and audio.
The researcher investigates methods for:
- A representation that is fast and suitable for synthesis
- A real-time high-fidelity speech and audio synthesizer
The developed model will allow: i) generating high-quality speech and audio from a lightweight compressed representation; ii) improving speech and audio compression rates.
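To make the bandwidth extension setting concrete, the sketch below shows a naive baseline, not the researcher's method. It simulates a narrowband stream by low-pass filtering and decimating a test signal, then upsamples it again with plain FFT interpolation. The sample rate, the two test tones, and the helper names (`band_energy`, `to_narrowband`, `naive_upsample`) are all assumptions chosen for illustration. The point is only that interpolation preserves the low band but cannot restore the discarded high band, which is the part a synthesizer would have to generate.

```python
import numpy as np


def band_energy(x, sr, lo_hz, hi_hz):
    """Sum of squared FFT magnitudes of x for frequencies in [lo_hz, hi_hz)."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    mask = (freqs >= lo_hz) & (freqs < hi_hz)
    return float(np.sum(np.abs(spec[mask]) ** 2))


def to_narrowband(x, factor):
    """Brick-wall low-pass in the FFT domain, then keep every factor-th sample."""
    spec = np.fft.rfft(x)
    cutoff_bin = (len(x) // factor) // 2  # bin of the new Nyquist frequency
    spec[cutoff_bin:] = 0.0
    return np.fft.irfft(spec, n=len(x))[::factor]


def naive_upsample(x, factor):
    """FFT interpolation: zero-pad the spectrum, so no new content is created."""
    n_out = len(x) * factor
    spec = np.fft.rfft(x)
    padded = np.zeros(n_out // 2 + 1, dtype=complex)
    padded[: len(spec)] = spec
    # irfft normalizes by n_out, so rescale to keep the original amplitude.
    return np.fft.irfft(padded, n=n_out) * factor


# Hypothetical test signal: 1 second at 16 kHz with a low and a high tone.
sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 1000 * t) + 0.5 * np.sin(2 * np.pi * 6000 * t)

nb = to_narrowband(x, factor=2)        # 8 kHz stream, half as many samples
recon = naive_upsample(nb, factor=2)   # back to 16 kHz, same length as x

low_ratio = band_energy(recon, sr, 0, 4000) / band_energy(x, sr, 0, 4000)
high_ratio = band_energy(recon, sr, 4000, 8000) / band_energy(x, sr, 4000, 8000)
print("low-band energy kept:", low_ratio)
print("high-band energy kept:", high_ratio)
```

In this toy setup the narrowband stream carries half as many samples as the original, which is the compression side of the trade-off; the missing high band is what a learned model would be asked to synthesize at the receiver.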
The research group is open to academic collaborations and commercial implementations with telecommunication companies.
Papers & Preliminary Results:
Aero: Audio Super Resolution in the Spectral Domain