176 Commits

Author SHA1 Message Date
Reuben Morais
904ab1e288 Centralize progress logging and progress bar logic 2019-04-16 11:06:26 -03:00
Reuben Morais
9586fbbd30 Rename --train_cached_features_path to --feature_cache 2019-04-16 11:06:26 -03:00
Reuben Morais
911a1ce4b1 Do separate test epochs if multiple input files are specified 2019-04-16 11:06:26 -03:00
Reuben Morais
a85af3da49 Do separate validation epochs if multiple input files are specified 2019-04-16 11:01:38 -03:00
Reuben Morais
6a0c186b5c Correct mistake in len check
X-DeepSpeech: NOBUILD
2019-04-11 12:07:04 -03:00
Reuben Morais
13757a4258 Fix pylint warnings 2019-04-11 07:02:21 -03:00
Reuben Morais
6ab91f37ec Don't calculate dataset size by hand, use tf.errors.OutOfRangeError 2019-04-08 16:18:15 -03:00
Reuben Morais
5779d298e1 Merge branch 'more-metadata' 2019-04-05 14:38:56 -03:00
Reuben Morais
5b80f21668 Rename language flag 2019-04-05 11:54:02 -03:00
Reuben Morais
7f6fd8b48b Embed more metadata in exported model and read it in native client 2019-04-05 09:35:23 -03:00
Reuben Morais
97c36291af Rename epoch flag to epochs 2019-04-05 09:30:50 -03:00
Reuben Morais
2f3f095048 Ignore epochs in checkpoints, always start epoch count from zero 2019-04-05 00:21:04 -03:00
Reuben Morais
5ee856d075 Clarify early stopping dependency on validation 2019-04-04 22:41:38 -03:00
Reuben Morais
ed15caf3c5 Check if train/dev/test files were passed in instead of having explicit flags 2019-04-04 22:41:38 -03:00
Reuben Morais
1cea2b0fe8 Rewrite input pipeline to use tf.data API 2019-04-02 18:31:32 -03:00
Tilman Kamp
a179a2389f Fix #1986 - Remove distributed training support 2019-04-01 18:43:22 +02:00
Tilman Kamp
42f04dc9aa Fix #1972 2019-03-21 13:39:12 +01:00
Tilman Kamp
6c6a4e08ca Fix #1962 2019-03-18 18:53:04 +01:00
josh
584312540b skip file header of CSV file when reading alphabet 2019-03-05 00:55:32 +01:00
Alexandre Lissy
7ef82af74f Force comments on output to allow piping
Fixes #1882
2019-02-19 18:46:29 +01:00
Daniel Winkler
3bc2f8bdd5 Fixed typo in relu_clip description. 2019-02-19 11:33:28 +01:00
Reuben Morais
f5aa47b0a8 Merge pull request #1872 from mozilla/issue1738
Fix shape of loaded preprocessed features (Fixes #1738 and #1772)
2019-02-13 11:03:53 -02:00
Josh Meyer
4b2e3bc714 Merge pull request #1874 from mozilla/check-chars
--alphabet-format flag
2019-02-13 01:02:45 +01:00
josh
a56d968b73 quotes 2019-02-13 01:01:28 +01:00
josh
de932752c5 --alphabet-format flag will print found alphabet to terminal and can by copy-pasted into alphabet.txt. Also now script makes use of argparse. 2019-02-12 23:57:06 +01:00
Reuben Morais
c1212ffbb2 Fix shape of loaded preprocessed features 2019-02-12 20:31:24 -02:00
josh
32bf1a685a prettier error message to the terminal when alphabets are mis-matched. 2019-02-12 23:10:46 +01:00
Alexandre Lissy
c0cd365544 Add TFLite accuracy estimation tool
Fixes #1852
2019-02-12 13:03:20 +01:00
Reuben Morais
12c62756c7 Switch wer_cer_batch to compute real CER over corpus 2019-02-05 09:29:47 -02:00
Reuben Morais
f3613da82a Use tf.contrib.layers.dense_to_sparse instead of util/ctc.py 2019-02-04 09:19:48 -02:00
Reuben Morais
7a14bcc4de Clean up and split TensorFlow deps of text.py 2019-02-04 08:35:43 -02:00
Reuben Morais
fa7cb1a983 Update decoder hyperparameters 2018-12-28 16:12:09 -02:00
Alexandre Lissy
0e617ea6f1 Fetch ds_ctcdecoder from VERSION-based URL by default
Fixes #1801
2018-12-19 14:50:10 +01:00
Reuben Morais
9ae2e5b3b2 Use a random seed that overfits LDC93S1 in 75 epochs 2018-12-11 22:44:34 -02:00
Reuben Morais
1df9602c95 Use longer MFCC step instead of throwing away features. 2018-12-10 11:08:48 -02:00
Jan Engelmohr
18ef670b05 Character checking: be more verbose if something fails
This way one can easily debug what file causes problems
2018-12-03 12:08:19 +01:00
Jan Engelmohr
113e582b8a Preprocessing: use all available threads
...the limitation to 8 threads looks a bit random to me.
2018-11-18 21:39:45 +01:00
Rob
b6e74c17bc missing numpy 2018-11-13 14:06:30 -05:00
Rob
c6e06d0642 missing self argument 2018-11-13 11:03:09 -05:00
Alexandre Lissy
17b7cddaba Run CTC Decoder build on merge and expose artifacts 2018-11-13 14:50:41 +01:00
Reuben Morais
d125acfb3d Merge pull request #1696 from mozilla/remove-old-ctc
Remove old CTC decoder (Fixes #1675)
2018-11-12 15:55:32 -02:00
Reuben Morais
cfed8ccd4f Cache common objects in decoder build 2018-11-12 14:17:30 -02:00
Reuben Morais
5cb1aff531 Rename config singleton from C to Config 2018-11-11 18:03:52 -02:00
Reuben Morais
a5bcecbe40 Address more review comments 2018-11-11 15:24:31 -02:00
Reuben Morais
c65c22fe31 Move globals handling out of util/coordinator.py 2018-11-09 21:08:03 -02:00
Reuben Morais
38b54479a9 Add documentation and remove unneeded build of decoder packages 2018-11-09 18:36:21 -02:00
Josh Meyer
76e81f34ff Merge pull request #1715 from JRMeyer/check-alphabet
Check for transcript & alphabet mis-match
2018-11-09 09:31:42 -08:00
Reuben Morais
56dc024d29 Centralize WER report code into evaluate.py, call it from DeepSpeech.py 2018-11-09 00:26:50 -02:00
josh
3c274d4f62 reubens comments, kept newlines because prettier 2018-11-08 14:16:41 -08:00
josh
5a8f88d922 added error message to text.py when the transcripts and alphabet.txt file dont match, and a file to find the unique character set from the {train/dev/test} csv files 2018-11-07 10:04:22 -08:00