mirror of https://github.com/mozilla/DeepSpeech.git synced 2025-10-26 11:19:39 +00:00

Igor Fritzsch e8169160b6 Improved Nodejs streaming inference with VAD and FFmpeg		2019-03-18 14:32:33 +01:00

FFmpeg VAD Streaming

Streaming inference from an arbitrary source (any FFmpeg input) to DeepSpeech, using VAD (voice activity detection). This is a fairly simple example demonstrating the DeepSpeech streaming API in Node.js.

This example was successfully tested with a mobile phone streaming a live feed to an RTMP server (nginx-rtmp), which this script then used as its source for near-real-time speech recognition.
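The general idea of VAD-driven streaming can be sketched as follows: a recognition stream is opened when voice is detected, fed while voice continues, and finished at the first silence so that a transcript is emitted per utterance. This is a minimal sketch and not the code in index.js. The `recognizer` object with `start()`, `feed()` and `finish()` is a hypothetical wrapper standing in for the DeepSpeech streaming calls, and `createUtteranceRouter` is a name invented for this illustration.

```javascript
// Hypothetical glue code: routes audio chunks into one recognition stream
// per utterance. `recognizer` is an ASSUMED wrapper exposing start(),
// feed(chunk) and finish() -> string; it is not the DeepSpeech API itself.
function createUtteranceRouter(recognizer, onTranscript) {
  let active = false; // true while an utterance stream is open

  return {
    // Call once per audio chunk with the VAD decision for that chunk.
    push(chunk, isVoice) {
      if (isVoice) {
        if (!active) {
          recognizer.start(); // open a new stream on the first voiced chunk
          active = true;
        }
        recognizer.feed(chunk);
      } else if (active) {
        active = false;
        onTranscript(recognizer.finish()); // silence closes the utterance
      }
    },
    // Call when the source ends so a trailing utterance is not lost.
    end() {
      if (active) {
        active = false;
        onTranscript(recognizer.finish());
      }
    },
  };
}
```

Keeping the recognizer behind a small interface like this makes the start/feed/finish lifecycle easy to exercise with a stub before wiring in a real model.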

Installation

npm install

FFmpeg must also be installed:

sudo apt-get install ffmpeg

Usage

Here is an example for a local audio file:

node ./index.js --audio <AUDIO_FILE> \
                --model $HOME/models/output_graph.pbmm \
                --alphabet $HOME/models/alphabet.txt

Here is an example for a remote RTMP stream:

node ./index.js --audio rtmp://<IP>:1935/live/teststream \
                --model $HOME/models/output_graph.pbmm \
                --alphabet $HOME/models/alphabet.txt
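The two commands differ only in the value of --audio, because FFmpeg accepts both a file path and a stream URL as input. The sketch below shows one way such an input might be decoded to raw PCM and read from a Node.js child process. It assumes a target format of 16 kHz, mono, signed 16-bit little-endian samples; the helper names `buildFfmpegArgs`, `streamPcm` and `bytesForDuration` are invented for this illustration and may not match what index.js does.

```javascript
const { spawn } = require('child_process');

// ASSUMED target format: 16 kHz, mono, signed 16-bit little-endian PCM.
const SAMPLE_RATE = 16000;
const BYTES_PER_SAMPLE = 2;

// Build an FFmpeg argument list that decodes any input to raw PCM on stdout.
function buildFfmpegArgs(input) {
  return [
    '-hide_banner',
    '-loglevel', 'error',
    '-i', input,                 // local file or URL such as rtmp://...
    '-vn',                       // ignore any video stream
    '-acodec', 'pcm_s16le',      // 16-bit little-endian samples
    '-ac', '1',                  // downmix to mono
    '-ar', String(SAMPLE_RATE),  // resample to the target rate
    '-f', 's16le',               // raw samples, no container
    'pipe:1',                    // write to stdout
  ];
}

// Start FFmpeg and pass every decoded PCM chunk (a Buffer) to onChunk.
function streamPcm(input, onChunk) {
  const ffmpeg = spawn('ffmpeg', buildFfmpegArgs(input));
  ffmpeg.stdout.on('data', onChunk);
  return ffmpeg;
}

// Number of PCM bytes that cover `ms` milliseconds at the assumed format.
function bytesForDuration(ms) {
  return (SAMPLE_RATE * BYTES_PER_SAMPLE * ms) / 1000;
}
```

At this format one second of audio is 32000 bytes, so a 20 ms frame is 640 bytes, which is useful when sizing the chunks handed to a VAD.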

Examples

Real-time streaming inference with DeepSpeech's example audio (audio-0.4.1.tar.gz):

node ./index.js --audio $HOME/audio/2830-3980-0043.wav \
                --lm $HOME/models/lm.binary \
                --trie $HOME/models/trie \
                --model $HOME/models/output_graph.pbmm \
                --alphabet $HOME/models/alphabet.txt

node ./index.js --audio $HOME/audio/4507-16021-0012.wav \
                --lm $HOME/models/lm.binary \
                --trie $HOME/models/trie \
                --model $HOME/models/output_graph.pbmm \
                --alphabet $HOME/models/alphabet.txt

node ./index.js --audio $HOME/audio/8455-210777-0068.wav \
                --lm $HOME/models/lm.binary \
                --trie $HOME/models/trie \
                --model $HOME/models/output_graph.pbmm \
                --alphabet $HOME/models/alphabet.txt

Real-time streaming inference in combination with an RTMP server:

node ./index.js --audio rtmp://<HOST>/<APP>/<KEY> \
                --lm $HOME/models/lm.binary \
                --trie $HOME/models/trie \
                --model $HOME/models/output_graph.pbmm \
                --alphabet $HOME/models/alphabet.txt

Notes

To get the best results for your own scenario, it may help to adjust the VAD_MODE and DEBOUNCE_TIME parameters.
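To illustrate why the debounce time matters, the sketch below turns per-frame voice/silence decisions into speech segments, closing a segment only after silence has lasted at least the debounce time. It is an illustration of the concept under stated assumptions, not the logic used by the script or by its VAD library: `segmentSpeech` is a hypothetical helper, and both the frame length and the debounce time are assumed to be in milliseconds.

```javascript
// Hypothetical helper: groups per-frame VAD decisions into speech segments.
// frames:     array of booleans, true = voice detected in that frame
// frameMs:    duration of each frame in milliseconds (assumed unit)
// debounceMs: silence must last at least this long to close a segment
function segmentSpeech(frames, frameMs, debounceMs) {
  const segments = [];
  let start = null;     // start time of the open segment, or null
  let lastVoice = null; // end time of the most recent voiced frame

  frames.forEach((isVoice, i) => {
    const t = i * frameMs;
    if (isVoice) {
      if (start === null) start = t;
      lastVoice = t + frameMs;
    } else if (start !== null && t + frameMs - lastVoice >= debounceMs) {
      segments.push({ start, end: lastVoice });
      start = null;
    }
  });

  // Close a segment that is still open when the input ends.
  if (start !== null) segments.push({ start, end: lastVoice });
  return segments;
}

const frames = [true, false, true, false, false, false];
console.log(segmentSpeech(frames, 20, 20).length); // 2
console.log(segmentSpeech(frames, 20, 60).length); // 1
```

A short debounce time splits speech at every brief pause and yields many short utterances, while a longer one merges them at the cost of later transcripts. A more aggressive VAD mode generally classifies fewer frames as voice, which shifts the same trade-off.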