Speech Recognition for Swiss German

The FHNW Institute for Data Science is working on speech recognition technology to transform speech in various Swiss dialects to Standard German text.

Current state of the art

Speech recognition for English or Standard German works fairly well and is already part of our daily lives through voice assistants such as Alexa and Siri. Unfortunately, this is not the case for Swiss German. The main reasons are the diversity of dialects, the lack of a standardized writing system and the small number of speakers. Solutions for specific use cases, e.g. a scenario with a restricted domain and only one dialect, are available, but they are expensive and not reusable.

Goals

The goal of this project is to create a speech recognition system that works for all domains and dialects. This would considerably decrease costs and enable many new applications, including voice assistants, transcription of meetings or phone calls and voice-controlled robots.

Results

Our approach is based on the latest research in Deep Learning and Natural Language Processing. We trained a single combined speech recognition and translation model to directly translate Swiss German speech to Standard German text. This requires a huge amount of training data, i.e. hundreds or thousands of hours of spoken Swiss German sentences aligned to the corresponding Standard German text.

To acquire enough data, we developed an alignment procedure that automatically extracts sentence-level speech-text pairs from long speech recordings and their transcripts, e.g. parliament discussions with detailed transcripts. Details can be found in our paper. We also published a dataset created with this procedure; it can be downloaded here. Our latest model achieves a word error rate of 15 % and a BLEU score of 72 on a test set with speakers from all major Swiss German dialect regions. Try our demo here to put the model’s recognition and translation skills to the test.
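For readers unfamiliar with the metric, the word error rate quoted above is conventionally defined as the word-level edit distance (substitutions, deletions and insertions) between the system output and the reference text, divided by the number of words in the reference. The following is a minimal sketch of that standard definition, not the project's own evaluation code; the function name `word_error_rate` and the example sentences are hypothetical, and details such as text normalization before scoring are not described on this page and are omitted here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative sketch of the standard WER definition; tokenization is a
    plain whitespace split, which is an assumption made for this example.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical Standard German reference and system output (one word missing).
reference = "das ist ein kurzer test satz"
hypothesis = "das ist ein kurzer satz"
wer = word_error_rate(reference, hypothesis)
```

In this made-up example, one of six reference words is missing from the output, so the result is 1/6, or roughly 17 %. A word error rate of 15 % therefore corresponds to about 15 word-level errors per 100 reference words.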

Project information

Partners

FHNW Institute for Data Science, SwissNLP, ZHAW, Universität Zürich

Project team

Prof. Dr. Manfred Vogel, Christian Scheller, Claudio Paonessa, Michel Plüss, Yanick Schraner

Institute for Data Science
Prof. Dr. sc. nat. Manfred Vogel, dipl. Math. ETH

Head Information Processing

Telephone

+41 56 202 77 36

E-mail

manfred.vogel@fhnw.ch

Address

FHNW University of Applied Sciences and Arts Northwestern Switzerland, School of Computer Science, Bahnhofstrasse 6, CH-5210 Windisch
