Reuben Morais
699e4ebcd7
Revert to a pipelined approach for test epochs to avoid CPU OOM with large alphabets
2019-05-13 23:49:14 -03:00
Reuben Morais
13757a4258
Fix pylint warnings
2019-04-11 07:02:21 -03:00
Tilman Kamp
42f04dc9aa
Fix #1972
2019-03-21 13:39:12 +01:00
Tilman Kamp
6c6a4e08ca
Fix #1962
2019-03-18 18:53:04 +01:00
josh
32bf1a685a
Prettier error message to the terminal when alphabets are mismatched.
2019-02-12 23:10:46 +01:00
Reuben Morais
12c62756c7
Switch wer_cer_batch to compute real CER over corpus
2019-02-05 09:29:47 -02:00
Reuben Morais
7a14bcc4de
Clean up and split TensorFlow deps of text.py
2019-02-04 08:35:43 -02:00
Josh Meyer
76e81f34ff
Merge pull request #1715 from JRMeyer/check-alphabet
Check for transcript & alphabet mis-match
2018-11-09 09:31:42 -08:00
josh
3c274d4f62
reubens comments, kept newlines because prettier
2018-11-08 14:16:41 -08:00
josh
5a8f88d922
Added error message to text.py when the transcripts and alphabet.txt file don't match, and a file to find the unique character set from the {train/dev/test} CSV files
2018-11-07 10:04:22 -08:00
Reuben Morais
bb4551caa9
Extend Python Alphabet with config file and decode method
2018-11-02 14:00:11 -03:00
Reuben Morais
9c338b76db
Fix handling of Unicode messages when using custom alphabets (Fixes #849)
2017-09-26 14:26:41 -03:00
Reuben Morais
f3ea690b38
Make sure custom alphabet code works properly on Python 2 (#806)
2017-09-01 11:21:01 +02:00
Reuben Morais
1c4cbf1813
Support custom alphabet mappings (Fixes #692) (#797)
Support custom alphabet mappings
2017-08-31 11:51:15 +02:00
Reuben Morais
23fa1f71a5
Don't duplicate spaces in the source text when converting to integer labels
2017-08-02 17:00:30 -03:00
Reuben Morais
2da9e7849f
Process corpora in a single pass when possible, and save their definition in CSV files for performance and cleaner code.
2017-04-14 17:21:48 -03:00
Mike C. Fletcher
bae4989660
Further fixes to get the initial test run to finish
2017-03-27 00:53:23 -04:00
Alexandre Lissy
3838c0a9ce
Upgrade to run on Tensorflow 1.0.0
2017-02-22 14:48:47 +01:00
Reuben Morais
d6fb444287
Convert code comments to Sphinx RST docstrings
2017-02-02 23:42:36 -02:00
Andre Natal
32a436309e
Switchboard importer
2017-01-03 12:17:02 -08:00
Reuben Morais
33c9521a6f
Add validation and cleanup function to util/text.py
2016-12-21 14:37:21 -02:00
Kelly Davis
0707ad89d2
Merge branch 'master' into issue109_inputops
2016-11-09 12:26:18 +01:00
Reuben Morais
d989e8de09
Make sure the initializer passed to tf.scan doesn't break the API contract
We need to make sure the initializer shape matches the return value
of the callable passed to tf.scan.
This also adds an assertion on the shape of labels and the values
in label_lengths that enforces a condition that is needed for
ctc_label_dense_to_sparse to work.
2016-11-08 12:19:42 -02:00
Chris Lord
6178c31a20
Write a Tensorflow Serving client
2016-11-08 11:45:28 +01:00
Reuben Morais
182e20187a
Switch importers to new input pipeline
2016-11-08 02:35:50 -02:00
Reuben Morais
ca98c5aab8
Expose text_to_char_array in util/text.py
2016-11-07 16:12:58 -02:00
Tilman Kamp
0849efd6da
Fix #111; documented and revisited WER calculation
2016-11-01 16:11:04 +01:00
Kelly Davis
43303a2199
Fix #67
WER is calculated using Levenshtein distance on chars, not words
2016-10-17 12:48:20 -04:00
Kelly Davis
f8c0b57578
Fixed typo in wers
2016-10-17 08:47:27 -04:00
Kelly Davis
a3abc9d92a
Merge of pull requests #49, #50, and #52. Fixes issues #2, #4, #11, #12, #46, #47, and #48
2016-10-13 15:15:39 -04:00