It's been a few months since I last built DeepSpeech (today is August 13th, 2018), so these instructions probably need to be updated. They are for building DeepSpeech on Debian or a derivative, but should be fairly easy to translate to other systems by changing the package manager and package names.
[~]$ sudo apt-get install pkg-config zip g++ zlib1g-dev unzip
You will also need a JDK installed (for example, openjdk-8-jdk), since bazel's bootstrap script compiles Java sources.
Apparently you need bazel in order to build bazel, so download a copy and unzip it
[~]$ wget https://github.com/bazelbuild/bazel/releases/download/0.4.5/bazel-0.4.5-dist.zip
[~]$ unzip -d bazel-0.4.5-dist bazel-0.4.5-dist.zip
[~]$ cd bazel-0.4.5-dist
The bootstrap script is unpacked in mode 555 (read and execute only), so you have to make it writable:
[~/bazel-0.4.5-dist]$ chmod a+w scripts/bootstrap/compile.sh
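If the octal modes are unfamiliar, here is a scratch-file sketch of what that chmod does (demo.sh is just a throwaway file, not part of the build): 555 is read/execute for everyone, and a+w adds the write bit for everyone, giving 777.

```shell
# Scratch demonstration of the permission change above
touch demo.sh
chmod 555 demo.sh          # read/execute only, like the unpacked script
stat -c '%a' demo.sh       # shows 555
chmod a+w demo.sh          # add the write bit for user, group, and other
stat -c '%a' demo.sh       # shows 777
rm demo.sh
```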
If you are building for the Raspberry Pi, do these things (note that DeepSpeech did not run very well on the Raspberry Pi the last time I checked. The language model was too large to fit in memory, and without it DeepSpeech just returns raw phonemes):
- vi scripts/bootstrap/compile.sh
- Go to line 117 and add -J-Xmx500M (this passes a 500 MB maximum-heap option through javac to the underlying JVM, so the compile fits in the Pi's limited memory)
- Save and quit
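If you'd rather script that edit than open an editor, this sed one-liner simply appends the flag to the end of line 117. Line numbers are specific to bazel 0.4.5 and may have shifted in other versions, so check what is on that line first:

```shell
# Check what line 117 currently says before editing it
sed -n '117p' scripts/bootstrap/compile.sh
# Append -J-Xmx500M to the end of line 117
sed -i '117s/$/ -J-Xmx500M/' scripts/bootstrap/compile.sh
```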
It also sounds like the cc_configure.bzl script can have trouble figuring out the platform on the Pi, so give it a little help: edit the _get_cpu_value function so that it always returns "arm".
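A sketch of scripting that edit as well. The path tools/cpp/cc_configure.bzl and the function layout are from memory and may differ in your tree, so locate the file first; the sed inserts an early return right after the function's def line:

```shell
# Find where _get_cpu_value is defined (path below is an assumption)
grep -rl '_get_cpu_value' tools/
# Insert 'return "arm"' as the first statement of the function
sed -i '/^def _get_cpu_value/s/$/\n  return "arm"/' tools/cpp/cc_configure.bzl
```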
Now build it
[~/bazel-0.4.5-dist]$ ./compile.sh
Build successful! Binary is here: ~/bazel-0.4.5-dist/output/bazel
[~/bazel-0.4.5-dist]$ sudo cp -iv output/bazel /usr/local/bin/
[~/bazel-0.4.5-dist]$ cd ..
With bazel installed, clone Mozilla's tensorflow and DeepSpeech forks, link native_client into the tensorflow tree, and build:
[~]$ git clone https://github.com/mozilla/tensorflow.git
[~]$ git clone https://github.com/mozilla/DeepSpeech.git
[~]$ cd tensorflow/
[~/tensorflow]$ ln -s ../DeepSpeech/native_client/ ./
[~/tensorflow]$ ./configure
[~/tensorflow]$ bazel build -c opt --copt=-O3 //native_client:libctc_decoder_with_kenlm.so
[~/tensorflow]$ bazel build --config=monolithic -c opt --copt=-O3 --copt=-fvisibility=hidden --incompatible_load_argument_is_label=false //native_client:libdeepspeech.so //native_client:deepspeech_utils //native_client:libctc_decoder_with_kenlm.so //native_client:generate_trie
[~/tensorflow]$ cd native_client
[~/tensorflow/native_client]$ make deepspeech
[~/tensorflow/native_client]$ sudo PREFIX=/usr/local make install
[~/tensorflow/native_client]$ make bindings
[~/tensorflow/native_client]$ pip install dist/deepspeech-*.whl
(Note the PREFIX assignment goes after sudo, not before it: sudo resets the environment by default, but passes VAR=value arguments through to the command.)
I recommend using PocketSphinx for passive listening and DeepSpeech for active listening. To use it as the active listener with Naomi, you will need to add a section like this to your profile.yml file:
active_stt:
  engine: deepspeech-stt

deepspeech:
  model: '/home/user/models/output_graph.pb'
  alphabet: '/home/user/models/alphabet.txt'
  language_model: '/home/user/models/lm.binary'
  trie: '/home/user/models/trie'