Difference between revisions of "2014-06-03"

Latest revision as of 02:28, 3 June 2014 (Tue)

Resource Building

Release management has been started

Leftover questions

Asymmetric window: Large improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set. Overfitting?
Multi GPU training: Error encountered
Multilanguage training
Investigating LOUDS FST.
CLG embedded decoder plus online compiler.
DNN-GMM co-training

AM development

Sparse DNN

GA-based block sparsity (+++++)

Noise training

All experiments completed.
Paper writing will be started this week

GFbank

Test on the Tencent database is done. Better performance observed than with Fbank.
Equal-loudness pre-filter added; slightly better performance was obtained.
Sinovoice 8k 1400 + 100 mixture training is running. 9 xEnt iterations completed.

Multilingual ASR

Multilingual LM decoding
non-tag bug investigation with some digit-string recordings
Revert to hanzi numbers

English model


(state-gauss = 10000 100000, various LM, beam 13)

1. Shujutang 100h chi-eng 16k:

  LM/AM  |  xEnt   |  mpe_1  |  mpe_2  |  mpe_3  |  mpe_4  |
--------- --------- --------- --------- --------- ---------
   wsj   |  23.86  |  20.95  |  20.90  |  20.84  |  20.81  |
   cmu   |  22.22  |    -    |    -    |    -    |  18.83  |
   giga  |  21.77  |    -    |    -    |    -    |  18.61  |
  armid  |  20.45  |    -    |    -    |    -    |    -    |


2. Shujutang 100h chi-eng 8k:

  LM/AM  |  xEnt   |  mpe_1  |  mpe_2  |  mpe_3  |  mpe_4  |
--------- --------- --------- --------- --------- ---------
   wsj   |  26.27  |  23.63  |  23.14  |  22.93  |  23.00  |
   cmu   |  24.11  |    -    |    -    |    -    |  20.36  |
   giga  |  23.11  |    -    |    -    |    -    |  20.11  |
  armid  |    -    |    -    |    -    |    -    |    -    |


3. voxforge pure eng 16k:

  LM/AM  |  xEnt   |  mpe_1  |  mpe_2  |  mpe_3  |  mpe_4  |
--------- --------- --------- --------- --------- ---------
   wsj   |  21.38  |  24.89  |  24.50  |  23.31  |  23.13  |
   cmu   |  24.00  |    -    |    -    |    -    |  21.33  |
   giga  |  18.75  |    -    |    -    |    -    |  22.45  |
  armid  |    -    |    -    |    -    |    -    |    -    |

4. fisher pure eng 8k:
Not finished yet.
  LM/AM  |  xEnt   |  mpe_1  |  mpe_2  |  mpe_3  |  mpe_4  |
--------- --------- --------- --------- --------- ---------
   wsj   |  40.65  |  36.16  |  35.94  |  35.88  |  35.80  |
   cmu   |  35.07  |    -    |    -    |    -    |  31.16  |
   giga  |  41.18  |    -    |    -    |    -    |  36.23  |
  armid  |    -    |    -    |    -    |    -    |    -    |
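The gain from MPE training over the xEnt baseline can be read off the tables as a relative reduction. The sketch below is illustrative only: it assumes the table entries are WER percentages, copies the (xEnt, mpe_4) pairs from the four tables, and skips rows without an mpe_4 figure. The function and variable names are hypothetical.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Relative reduction in percent; a negative value means the score got worse."""
    return 100.0 * (baseline - improved) / baseline


# (xEnt, mpe_4) pairs copied from the tables above; "armid" rows have no mpe_4 value.
RESULTS = {
    "Shujutang 16k": {"wsj": (23.86, 20.81), "cmu": (22.22, 18.83), "giga": (21.77, 18.61)},
    "Shujutang 8k": {"wsj": (26.27, 23.00), "cmu": (24.11, 20.36), "giga": (23.11, 20.11)},
    "voxforge 16k": {"wsj": (21.38, 23.13), "cmu": (24.00, 21.33), "giga": (18.75, 22.45)},
    "fisher 8k": {"wsj": (40.65, 35.80), "cmu": (35.07, 31.16), "giga": (41.18, 36.23)},
}


def summarize(results):
    """Return one formatted line per (corpus, LM) pair."""
    lines = []
    for corpus, by_lm in results.items():
        for lm, (xent, mpe4) in by_lm.items():
            change = relative_reduction(xent, mpe4)
            lines.append(f"{corpus:14s} {lm:5s} {change:+6.2f}%")
    return lines


if __name__ == "__main__":
    print("\n".join(summarize(RESULTS)))
```

Rows where mpe_4 is higher than xEnt (for example voxforge with the wsj and giga LMs) come out negative.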

Denoising & Farfield ASR

Investigating DAE model
Kaldi-based MSE obj training toolkit preparation

VAD

DNN-based VAD (7.49) shows much better performance than energy-based VAD (45.74)
Need to test small-scale networks (+)

600-800 network
100 X 4 + 2
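For reference, an energy-based VAD of the kind used as the baseline above can be sketched in a few lines. This is a generic illustration, not the system that was evaluated: the frame length, hop size, and the -35 dB threshold are assumed values, and input samples are assumed to be floats in [-1, 1].

```python
import math


def frame_energies_db(samples, frame_len=400, hop=160):
    """Mean power of each frame in dB (400/160 samples = 25 ms / 10 ms at 16 kHz)."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # Small floor avoids log10(0) on digital silence.
        energies.append(10.0 * math.log10(power + 1e-10))
    return energies


def energy_vad(samples, threshold_db=-35.0, frame_len=400, hop=160):
    """Label a frame as speech (True) when its energy exceeds a fixed threshold."""
    return [e > threshold_db for e in frame_energies_db(samples, frame_len, hop)]


if __name__ == "__main__":
    # 0.1 s of silence followed by 0.1 s of a 440 Hz tone at 16 kHz.
    silence = [0.0] * 1600
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(1600)]
    print(energy_vad(silence + tone))
```

A fixed threshold like this is sensitive to background noise level, which is one common reason a trained classifier can do better.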

Scoring

Bug for the stream mode fixed

Embedded decoder

word list graph test passed
wlist2LG toolkit checked in
Prepare to deliver Android compiler options (.mk)
Interface design should be completed in one day
Prepare HCLG for 20k LM, decoding in progress.

LM development

Domain specific LM

Retrieve both Baidu & microblog
PPL testing
Need to check into gitLab.
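As a reminder of what the PPL test measures, perplexity is the exponential of the negative mean log-probability the LM assigns to the test tokens. The sketch below assumes natural-log token probabilities are already available from whatever LM toolkit is in use; the function name is hypothetical.

```python
import math


def perplexity(log_probs):
    """Perplexity from per-token natural-log probabilities."""
    if not log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(log_probs) / len(log_probs))


if __name__ == "__main__":
    # A model that gives every token probability 1/4 has perplexity 4.
    print(perplexity([math.log(0.25)] * 10))
```

Comparing this number for the domain-specific LM and a general LM on the same in-domain text is the usual way to check whether the retrieved data helps.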

NN LM

Character-based NNLM (6700 chars, 7gram), 500M data training done.

Inconsistent WER patterns were found on the Tencent test sets;
another test set is probably needed for the investigation.

Investigate MS RNN LM training
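To make the character-based 7-gram setup concrete: each training example for such an NNLM pairs the 6 preceding characters with the next character to predict. The sketch below shows only this data preparation step, under the assumption that sentence starts are padded with a `<s>` symbol; the padding scheme and function name are illustrative and not taken from the actual training recipe.

```python
def char_ngram_examples(text, order=7, bos="<s>"):
    """Return (context, target) pairs with order-1 characters of left context."""
    chars = [bos] * (order - 1) + list(text)
    examples = []
    for i in range(len(text)):
        context = tuple(chars[i:i + order - 1])
        target = chars[i + order - 1]
        examples.append((context, target))
    return examples


if __name__ == "__main__":
    for context, target in char_ngram_examples("语音识别"):
        print(" ".join(context), "->", target)
```

With a vocabulary of about 6700 characters, each context and target would then be mapped to integer ids before being fed to the network.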

@@ Line 29 / Line 29 @@
 * Multilingual LM decoding
-* Fixing the non-tag bug ???
+* non-tag bug investigation with some digit-string recordings
+* Revert to hanzi numbers
 ===English model===
@@ Line 106 / Line 108 @@
 ===Domain specific LM===
-* English lexicon done, build HCLG
-* Re-build LM with the new lexicon
-* Tested on Dianxin dev set
+* Retrieve both Baidu & microblog
+* PPL testing
+* Need to check into gitLab.
 ===NN LM===
@@ Line 118 / Line 120 @@
 * Investigate MS RNN LM training
-==QA==
-===FST-based matching===
-:* Word-based FST 1-2 seconds with 1600 patterns. Huilan's implementation <1 second.
-:* THRAX toolkit for grammar to FST
-* Investigate determinization of G embedding
-:* Refer to Kaldi new code
