On modern techniques for parallel waveform generation of speech
Industrial seminar

With Lorenzo Foglianti, Papercup


At Papercup, we aim to translate the world’s content. In practice, this means translating audio from an input language into an output language. In this talk, we focus on what we consider the most interesting part of this problem: the function mapping text to audio. Over the past few years, machine learning research has made a giant leap forward in the quality of synthesised audio compared to more traditional methods. However, these methods are inherently autoregressive: each audio sample is generated conditioned on the previous ones, so generation cannot be parallelised across time on modern machines. Synthesis time is therefore limited by the nature of the model rather than by the hardware, and as a result these methods can rarely be deployed in practice. We present a new class of models, called Flows, which allows us to generate audio in a non-autoregressive way. We will also play sample audio synthesised by state-of-the-art models.

