DeepSpeech - Speech to Text

DeepSpeech now provides precompiled packages via pip, so you won't have to build it yourself anymore.

To install DeepSpeech, run:

  [~]$ sudo pip3 install deepspeech

Note: DeepSpeech did not run very well on the Raspberry Pi because the language model was too large to fit in memory, and without it DeepSpeech returns only the acoustic model's raw output, uncorrected by the language model.

Next, download the pre-trained models that match your DeepSpeech version (0.5.1 in this example):

  [~]$ wget https://github.com/mozilla/DeepSpeech/releases/download/v0.5.1/deepspeech-0.5.1-models.tar.gz

Then extract the downloaded file:

  [~]$ tar xvfz deepspeech-0.5.1-models.tar.gz
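
Before moving on, it can help to confirm that the extraction produced the four files the profile refers to later (output_graph.pb, alphabet.txt, lm.binary and trie). The following is only a minimal sketch: check_models is a hypothetical helper name, not part of DeepSpeech or Naomi, and the directory you pass to it is whatever directory the archive actually extracted into on your system.

```shell
# Hypothetical helper (not part of DeepSpeech or Naomi): report which of the
# four model files named in the profile are missing from a directory.
check_models() {
  dir="$1"
  missing=0
  for f in output_graph.pb alphabet.txt lm.binary trie; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f"
      missing=1
    fi
  done
  # Exit status 0 when all four files are present, 1 otherwise.
  return "$missing"
}
```

For example, if the archive extracted into a directory named deepspeech-0.5.1-models (an assumption; check the tar output), you would run check_models deepspeech-0.5.1-models and expect no output.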

Great! You just downloaded the pre-trained model from the DeepSpeech project. Take a look at the Common Voice project (https://voice.mozilla.org/en) and consider contributing to help produce higher-quality models in the future.

The last step is to add DeepSpeech to your profile. I recommend using PocketSphinx for passive listening and DeepSpeech for active listening. To use DeepSpeech as the active listener with Naomi, add a section like this to your profile.yml file, adjusting the paths to wherever you extracted the models:

active_stt:
  engine: deepspeech-stt
deepspeech:
  model: '/home/user/models/output_graph.pb'
  alphabet: '/home/user/models/alphabet.txt'
  language_model: '/home/user/models/lm.binary'
  trie: '/home/user/models/trie'
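
A wrong path in this section is an easy mistake to make, so a quick check of the profile may save some debugging. The sketch below is a rough illustration rather than anything Naomi provides: check_profile_paths is a hypothetical name, it assumes the four keys are indented and single-quoted exactly as in the example above, it assumes the paths contain no spaces, and it will also pick up any other section of the profile that happens to use the same key names.

```shell
# Hypothetical helper (not part of Naomi): pull the single-quoted values of
# the model, alphabet, language_model and trie keys out of a profile file
# and report any that do not exist on disk.
check_profile_paths() {
  paths=$(sed -nE "s/^[[:space:]]+(model|alphabet|language_model|trie):[[:space:]]*'([^']*)'.*/\2/p" "$1")
  missing=0
  # Unquoted on purpose: relies on the paths containing no whitespace.
  for p in $paths; do
    if [ ! -f "$p" ]; then
      echo "missing: $p"
      missing=1
    fi
  done
  # Exit status 0 when every listed file exists, 1 otherwise.
  return "$missing"
}
```

Call it with the path to your own profile.yml as the only argument; no output means every listed file was found.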