Changes:
1. Added streaming API support to the GUI tool
2. Minor modifications to how models are loaded upon repeated transcriptions
3. Updated to DeepSpeech v0.3.0
4. Changed the image in the documentation
Changes v2:
1. Added streaming support to the command line interface as well
Prerequisites
-------------
~/Deepspeech$ sudo apt install virtualenv
~/Deepspeech$ cd examples/vad_transcriber
~/Deepspeech/examples/vad_transcriber$ virtualenv -p python3 venv
~/Deepspeech/examples/vad_transcriber$ source venv/bin/activate
(venv) ~/Deepspeech/examples/vad_transcriber$ pip3 install -r requirements.txt
Command line tool
-----------------
The command line tool processes a wav file of any duration and returns a transcript,
which is saved in the same directory as the input audio file.
(venv) ~/Deepspeech/examples/vad_transcriber
$ python3 audioTranscript_cmd.py --aggressive 1 --audio ./audio/guido-van-rossum.wav --model ./models/0.2.0/
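
As a rough illustration of the invocation above, the three flags could be parsed with `argparse` along the following lines. This is a minimal sketch, not the actual contents of `audioTranscript_cmd.py`: the 0 to 3 range for `--aggressive` assumes a WebRTC-style VAD, and `transcript_path` with its `.txt` suffix is a hypothetical helper showing one way to place the transcript next to the input file.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a parser mirroring the three flags shown in the command above."""
    parser = argparse.ArgumentParser(
        description="Transcribe a long wav file using VAD-based segmentation")
    # Assumption: aggressiveness is an integer from 0 to 3, as in WebRTC VAD.
    parser.add_argument("--aggressive", type=int, choices=range(4), default=1,
                        help="VAD aggressiveness (0 = least, 3 = most)")
    parser.add_argument("--audio", required=True,
                        help="path to the input wav file")
    parser.add_argument("--model", required=True,
                        help="directory containing the model files")
    return parser


def transcript_path(audio_path):
    """Hypothetical helper: same directory and stem as the audio, .txt suffix."""
    return Path(audio_path).with_suffix(".txt")


# Parse an explicit argument list so the sketch runs without a real wav file.
args = build_parser().parse_args([
    "--aggressive", "1",
    "--audio", "./audio/guido-van-rossum.wav",
    "--model", "./models/0.2.0/",
])
print("Transcript would be written to", transcript_path(args.audio))
```

Deriving the output path from the input path keeps the transcript alongside the audio without needing a separate output flag.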
Minimalistic GUI
----------------
The GUI tool does the same job as the CLI tool. The VAD is fixed at an aggressiveness of 1.
The output is displayed in the transcription window and is also saved in the same directory
as the input audio file.
(venv) ~/Deepspeech/examples/vad_transcriber
$ python3 audioTranscript_gui.py
Changes (v1):
1. Using the DeepSpeech Python module instead of subprocess
2. Moved VAD code to a module
3. Moved all files to bin/ and renamed README.md to Audio_Transcription.md
Changes (v2):
1. Renamed files
Changes (v2.1):
1. Refactoring between CMD and GUI code
2. Documenting prerequisites with a virtualenv
3. Loading model only once per long wav file
4. CMD and GUI tools do the same job: perform VAD and consolidate the output.
5. Chunks are no longer saved to disk; a NumPy integer array stores them instead.
Changes (v2.2):
1. Argparse module for command line arguments
2. Everything in virtualenv, with a requirements.txt
3. Older APIs aligned with 0.2.0 release
4. Moved all files into examples/vad_transcriber
Changes (v2.3):
1. Updated requirements.txt