How to use FremyCompany/xls-r-nl-v1-cv8-lm with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("automatic-speech-recognition", model="FremyCompany/xls-r-nl-v1-cv8-lm")

# Load model directly
from transformers import AutoProcessor, AutoModelForCTC
processor = AutoProcessor.from_pretrained("FremyCompany/xls-r-nl-v1-cv8-lm")
model = AutoModelForCTC.from_pretrained("FremyCompany/xls-r-nl-v1-cv8-lm")

This model is a version of facebook/wav2vec2-xls-r-2b-22-to-16 fine-tuned mainly on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - NL dataset (see details below), with a small 5-gram language model, built from the Common Voice training corpus, added on top. It achieves the following results on the evaluation set (of Common Voice 8.0):
The model takes 16 kHz audio as input and uses a Wav2Vec2ForCTC decoder with a 48-letter alphabet to produce the final transcript.
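As a rough illustration of what a CTC decoder does with per-frame letter predictions, here is a minimal greedy-decoding sketch. The five-symbol vocabulary, the blank symbol `_`, and the frame ids are made up for the example; they are not the model's actual 48-letter alphabet.

```python
from itertools import groupby

# Hypothetical toy vocabulary; the real model uses a 48-letter alphabet.
BLANK = "_"
VOCAB = [BLANK, "a", "d", "g", " "]


def ctc_greedy_decode(frame_ids, vocab=VOCAB, blank=BLANK):
    """Collapse repeated frame predictions, then drop CTC blanks."""
    collapsed = [idx for idx, _ in groupby(frame_ids)]
    return "".join(vocab[i] for i in collapsed if vocab[i] != blank)


# One predicted symbol id per audio frame (made-up values).
frames = [2, 2, 0, 1, 1, 1, 0, 3, 3]
print(ctc_greedy_decode(frames))  # dag
```

A blank between two identical letters is what lets CTC emit a doubled letter: `[1, 0, 1]` decodes to `aa`, while `[1, 1]` decodes to `a`.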
To improve accuracy, a beam decoder is used; the beams are scored with a 5-gram language model trained on the Common Voice 8 corpus.
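The idea of scoring beams with a language model can be sketched with a toy example. This is a simplification under stated assumptions: it rescores finished candidates instead of applying the language model during the search, it uses a hypothetical bigram table in place of the 5-gram model, and the weight `alpha` and all scores are invented for illustration.

```python
import math

# Hypothetical bigram log-probabilities standing in for the 5-gram model.
TOY_LM = {
    ("<s>", "ik"): math.log(0.5),
    ("ik", "ben"): math.log(0.4),
    ("ik", "bent"): math.log(0.01),
}
UNSEEN = math.log(1e-4)  # assumed floor for word pairs missing from the table


def lm_score(words):
    """Sum bigram log-probabilities over the word sequence."""
    history = ["<s>"] + words[:-1]
    return sum(TOY_LM.get(pair, UNSEEN) for pair in zip(history, words))


def rescore(beams, alpha=0.5):
    """Return the beam with the best acoustic + alpha * LM score."""
    return max(beams, key=lambda b: b[1] + alpha * lm_score(b[0].split()))


# (text, acoustic log-score) pairs; values are made up for illustration.
beams = [("ik bent", -1.0), ("ik ben", -1.2)]
best_text, _ = rescore(beams)
print(best_text)  # ik ben
```

With `alpha=0.0` the acoustically stronger but ungrammatical "ik bent" wins; the language model term is what tips the choice toward "ik ben".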
This model can be used to transcribe spoken Dutch, including Flemish, to text (without punctuation).
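A minimal sketch of a transcription helper built on the Transformers pipeline might look like this. The function name `transcribe`, its `asr` parameter, and the file name in the usage comment are hypothetical; only the pipeline call and the model id come from the usage snippet on this page.

```python
MODEL_ID = "FremyCompany/xls-r-nl-v1-cv8-lm"


def transcribe(audio_path, asr=None):
    """Return the transcript of one audio file as a plain string."""
    if asr is None:
        # Imported lazily so the helper can be exercised without loading the model.
        from transformers import pipeline

        asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    # The speech recognition pipeline returns a dict with a "text" entry.
    return asr(audio_path)["text"]


# Example call (hypothetical file name; downloads the model on first use):
# print(transcribe("recording.wav"))
```

Passing an already constructed pipeline as `asr` avoids reloading the model for every file.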
2000 iterations (batch size 32) on the dutch configuration of the multilingual_librispeech dataset.
2000 iterations (batch size 32) on the nl configuration of the common_voice_8_0 dataset.
6000 iterations (batch size 32) on the cgn dataset.
6000 iterations (batch size 32) on the nl configuration of the common_voice_8_0 dataset.
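To get a feel for the size of each training stage, the iteration counts can be converted into the number of samples processed. This is a back-of-the-envelope sketch that assumes one batch of 32 per iteration with no gradient accumulation; samples may repeat across epochs, so these are not counts of unique recordings.

```python
BATCH_SIZE = 32

# (dataset, iterations) for each training stage, in the order listed.
STAGES = [
    ("multilingual_librispeech/dutch", 2000),
    ("common_voice_8_0/nl", 2000),
    ("cgn", 6000),
    ("common_voice_8_0/nl", 6000),
]


def samples_seen(iterations, batch_size=BATCH_SIZE):
    """Number of samples processed, assuming one batch per iteration."""
    return iterations * batch_size


for name, iterations in STAGES:
    print(f"{name}: {samples_seen(iterations):,} samples")

total = sum(samples_seen(iterations) for _, iterations in STAGES)
print(f"total: {total:,} samples")  # total: 512,000 samples
```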