Speech Recognition for Swiss German

The FHNW Institute for Data Science is working on speech recognition technology to transform speech in various Swiss dialects to Standard German text.

Try our speech recognition for Swiss German yourself:

GO TO DEMO

Current state of the art

Speech recognition for English or Standard German works fairly well and is already part of our daily lives through assistants such as Alexa and Siri. Unfortunately, this is not the case for Swiss German. The main reasons are the diversity of dialects, the lack of a standardized writing system, and the small number of speakers. Solutions for specific use cases, e.g. a scenario with a restricted domain and only one dialect, are available, but they are expensive and not reusable.

Goals

The goal of this project is to create a speech recognition system that works for all domains and dialects. This would considerably decrease costs and enable many new applications, including voice assistants, transcription of meetings or phone calls, and voice-controlled robots.

Results

Our approach is based on the latest research in Deep Learning and Natural Language Processing. We trained a single combined speech recognition and translation model to directly translate Swiss German speech to Standard German text. This requires a huge amount of training data, i.e. hundreds or thousands of hours of spoken Swiss German sentences aligned to the corresponding Standard German text.

To acquire enough data, we developed an alignment procedure to automatically extract sentence-level speech-text pairs from long speech recordings and their text transcripts, e.g. parliament discussions with detailed transcripts. Details can be found in our paper. We also published a dataset created using this procedure. It can be downloaded here.

Our latest model achieves a word error rate of 15% and a BLEU score of 72 on a test set with speakers from all major Swiss German dialect regions. Try our demo here to put the model’s recognition and translation skills to the test.
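For readers unfamiliar with the metric, the word error rate mentioned above is conventionally the number of word-level substitutions, deletions, and insertions needed to turn the system output into the reference text, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the project's actual evaluation code; the function name `word_error_rate`, the whitespace tokenization, and the example sentences are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace and compared exactly
    (no lowercasing or punctuation stripping), which a real evaluation
    setup may handle differently.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missing)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    return prev[-1] / len(ref_words)


# Illustrative sentences (made up): one of five reference words is missing.
print(word_error_rate("das ist ein kleiner test", "das ist ein test"))  # 0.2
```

Over a whole test set, the rate is usually computed by summing the edit counts and the reference word counts across all sentences before dividing, rather than by averaging per-sentence rates.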

Project information

Partners	FHNW Institute for Data Science, SwissNLP, ZHAW, Universität Zürich
Project team	Prof. Dr. Manfred Vogel, Christian Scheller, Claudio Paonessa, Michel Plüss, Yanick Schraner