Nov 3 – 4, 2022
IT4Innovations
Europe/Prague timezone

Using large-scale pre-trained models for building speech applications

Nov 4, 2022, 9:00 AM
30m
atrium (IT4Innovations)

Studentská 6231/1B, 708 00 Ostrava-Poruba
Keynote · Users' talks · Keynote III

Speaker

Dr Oldrich Plchot (Brno University of Technology)

Description

Recently, self-supervised Transformer-based models have become an integral part of state-of-the-art speech modeling and are being integrated into many speech applications such as Automatic Speech Recognition (ASR), Speaker Verification (SV), Language Identification (LID), emotion detection, etc. These models are trained on datasets comprising tens or even hundreds of thousands of hours of speech and can reach several hundred million parameters. In my talk, I will give a brief overview of their architecture and of a self-supervised training paradigm based on masked speech prediction. I will then describe a use case in speaker verification, where we take these pre-trained models and fine-tune them to serve as powerful feature extractors for speaker embedding extraction. I will also discuss methods that can be employed for fine-tuning such large models when only a relatively small amount of labeled target-domain data is available.
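To make the described use case concrete, the sketch below shows one common way of wrapping a pre-trained self-supervised backbone as a speaker-embedding extractor: pooled hidden states feed a small projection head, and the backbone can be frozen when labeled target data is scarce. This is a minimal illustration, not the speaker's actual pipeline; the model name (WavLM from HuggingFace transformers), the statistics-pooling head, embedding size, and the freezing strategy are all illustrative assumptions.

```python
# Minimal sketch: pre-trained self-supervised backbone as a speaker-embedding
# feature extractor. Model choice, head design and freezing are assumptions.
import torch
import torch.nn as nn
from transformers import WavLMModel


class SpeakerEmbedder(nn.Module):
    def __init__(self, backbone_name="microsoft/wavlm-base-plus", emb_dim=256):
        super().__init__()
        # Backbone pre-trained with masked speech prediction
        self.backbone = WavLMModel.from_pretrained(backbone_name)
        hidden = self.backbone.config.hidden_size
        # Statistics pooling (mean + std over time) followed by a linear projection
        self.proj = nn.Linear(2 * hidden, emb_dim)

    def forward(self, waveform):  # waveform: (batch, samples) at 16 kHz
        hidden_states = self.backbone(waveform).last_hidden_state  # (B, T, H)
        mean = hidden_states.mean(dim=1)
        std = hidden_states.std(dim=1)
        return self.proj(torch.cat([mean, std], dim=-1))  # (B, emb_dim)


model = SpeakerEmbedder()

# With little labeled target data, one option is to freeze the backbone first,
# train only the head, and later unfreeze with a small learning rate.
for p in model.backbone.parameters():
    p.requires_grad = False

dummy = torch.randn(2, 16000)  # two 1-second dummy utterances
print(model(dummy).shape)      # torch.Size([2, 256])
```

In practice such embeddings would be trained with a speaker-classification or metric-learning objective and compared with cosine scoring for verification; the sketch only covers the feature-extraction part the abstract refers to.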

Primary author

Dr Oldrich Plchot (Brno University of Technology)

Presentation materials

There are no materials yet.