Category: Computer Science
Keywords: Speech, Artificial Bandwidth Extension, Audio Compression, Synthesis

Application

Audio and video streaming accounted for the vast majority of internet traffic in 2021. Most speech enhancement, denoising, dereverberation, and bandwidth extension methods focus on filtering out or masking unwanted noise, and assume that the recorded speech itself is clear. However, the typical consumer stream is captured with a low-fidelity microphone or in a poorly treated acoustic space. As a result, current methods struggle to reconstruct a clear, natural-sounding voice.

Another major issue is the acute need to compress the data stream aggressively and then reconstruct it in real time, without delay.

Our Innovation

The researcher developed a real-time method that addresses both extreme compression and high-quality generation of speech and audio.

The researcher investigates methods for:

  • A representation that is fast and suitable for synthesis
  • A real-time, high-fidelity speech and audio synthesizer

The developed model will allow: i) generating high-quality speech and audio from a lightweight compressed representation; ii) improving speech and audio compression rates.
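
As a rough illustration of the trade-off behind these goals, the following minimal Python sketch (not the researchers' method) reduces a toy signal to a quarter of its sample rate and then restores the rate by plain spectral zero-padding. The function names, the 16 kHz rate, the factor of 4, and the two test tones are all assumptions chosen for the example. It shows that a naive reconstruction keeps the low band but recovers none of the high band, which is the content a bandwidth-extension synthesizer would have to regenerate.

```python
import numpy as np

def lowpass_decimate(x, factor):
    """Band-limit x in the FFT domain, then keep every `factor`-th sample."""
    spectrum = np.fft.rfft(x)
    cutoff = len(spectrum) // factor      # first bin at or above the new Nyquist
    spectrum[cutoff:] = 0.0
    return np.fft.irfft(spectrum, n=len(x))[::factor]

def naive_upsample(y, factor, n):
    """Zero-pad the spectrum: restores the sample rate, adds no new content."""
    spectrum = np.fft.rfft(y)
    padded = np.zeros(n // 2 + 1, dtype=complex)
    padded[:len(spectrum)] = spectrum
    return np.fft.irfft(padded, n=n) * factor   # rescale for the longer length

def band_energy(x, sr, lo, hi):
    """Sum of squared FFT magnitudes for frequencies in [lo, hi)."""
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    power = np.abs(np.fft.rfft(x)) ** 2
    return power[(freqs >= lo) & (freqs < hi)].sum()

# Toy input (assumed values): one second at 16 kHz with a low and a high tone.
sr, factor = 16000, 4
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 5000 * t)

low = lowpass_decimate(x, factor)           # 4x fewer samples to transmit
restored = naive_upsample(low, factor, len(x))

new_nyquist = sr / (2 * factor)             # 2000 Hz
low_kept = band_energy(restored, sr, 0, new_nyquist) / band_energy(x, sr, 0, new_nyquist)
high_kept = band_energy(restored, sr, new_nyquist, sr) / band_energy(x, sr, new_nyquist, sr)
print(f"samples sent: {len(low)} of {len(x)}")
print(f"low-band energy kept: {low_kept:.3f}, high-band energy kept: {high_kept:.3e}")
```

In this toy setting the saving comes entirely from discarding the upper band; a learned synthesizer aims to make that loss inaudible by generating plausible high-frequency content at the receiver.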

Opportunity

The research group is open to academic collaborations and commercial implementations with telecommunication companies.

Papers & Preliminary Results:

Aero: Audio Super Resolution in the Spectral Domain