2014-05-16
Resource Building
- Maxi onboard
- Release management should be started: Zhiyong (+)
- Blaster 0.1 & vivian 0.0 system release
Leftover questions
- Asymmetric window: great improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set. Overfitting? (See the window sketch after this list.)
- Multi-GPU training: error encountered
- Multilingual training
- Investigating LOUDS FST.
- CLG embedded decoder plus online compiler.
- DNN-GMM co-training
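For reference, a minimal numpy sketch of what an asymmetric analysis window can look like; the split point and the Hann halves are illustrative assumptions, not the window actually used in the experiments.

<pre>
import numpy as np

def asymmetric_window(n_left, n_right):
    """Concatenate a slow-rising left half and a fast-falling right
    half at the peak, so the window weights recent samples more."""
    left = np.hanning(2 * n_left)[:n_left]     # rising half
    right = np.hanning(2 * n_right)[n_right:]  # falling half
    return np.concatenate([left, right])

# Example: a 400-sample frame (25 ms at 16 kHz) with the peak
# placed 3/4 of the way through the frame.
win = asymmetric_window(300, 100)
</pre>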
AM development
Sparse DNN
- GA-based block sparsity (+++); a sketch of the masking step follows this list
- Found a paper from 2000 with similar ideas
- Try to get a student working on high-performance computing to do the optimization
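A minimal sketch of the block-sparsity idea, assuming the GA chromosome is a binary keep/drop mask over fixed-size weight tiles (block size and keep ratio below are illustrative):

<pre>
import numpy as np

def apply_block_mask(W, mask, block):
    """Zero out whole (block x block) tiles of weight matrix W
    according to a binary mask with one bit per tile; this is the
    decoding step before a masked network is scored for fitness."""
    out = W.copy()
    for i in range(W.shape[0] // block):
        for j in range(W.shape[1] // block):
            if mask[i, j] == 0:
                out[i*block:(i+1)*block, j*block:(j+1)*block] = 0.0
    return out

rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512))              # one hidden layer
mask = (rng.random((64, 64)) < 0.3).astype(int)  # keep ~30% of 8x8 tiles
W_sparse = apply_block_mask(W, mask, block=8)
</pre>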
Noise training
- More with-clean training completed; 2 conditions left
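A minimal sketch of how a noisy training condition can be simulated by mixing noise into clean speech at a target SNR (the actual noise types and SNR levels of the remaining conditions are not recorded here):

<pre>
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so the clean-to-noise power ratio equals
    `snr_db` decibels, then add it to the clean waveform."""
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[:len(clean)]    # match lengths
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    target = p_clean / (10 ** (snr_db / 10.0))   # desired noise power
    return clean + noise * np.sqrt(target / p_noise)
</pre>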
GFbank
- 8k training
  - GFBank sinovoice 1400 MPE stream
- 16k training
  - GFBank sinovoice 6000 MPE1 stream: worse than 1700h (10.18-11.11); see the filterbank sketch below
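For context, GFbank replaces the Mel filterbank with a gammatone filterbank. A minimal sketch of one 4th-order gammatone channel on the ERB scale; the constants are the standard Glasberg & Moore values, not necessarily the exact settings used above.

<pre>
import numpy as np

def erb(f):
    """Equivalent rectangular bandwidth (Hz) at center frequency f,
    after Glasberg & Moore (1990)."""
    return 24.7 * (4.37 * f / 1000.0 + 1.0)

def gammatone_ir(fc, fs, duration=0.025, order=4, b=1.019):
    """Impulse response of one gammatone channel centered at fc (Hz),
    sampled at fs (Hz)."""
    t = np.arange(int(duration * fs)) / fs
    return (t ** (order - 1) * np.exp(-2 * np.pi * b * erb(fc) * t)
            * np.cos(2 * np.pi * fc * t))

# One channel of the bank applied to a waveform:
# y = np.convolve(wav, gammatone_ir(1000.0, 8000.0), mode="same")
</pre>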
Multilingual ASR
- Test sharing scheme:
  - Decision-tree sharing: improvement obtained with xent training, none with MPE (Chinese slightly worse, English slightly better)
English model
<pre>
              mic         tel
pure eng      voxforge    fisher
chinese eng   shujutang   convert-from-shujutang
</pre>
Denoising & Farfield ASR
- Baseline: close-talk model decoding far-field speech: 92.65
- Will investigate a DAE model (sketch below)
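A minimal numpy sketch of the DAE idea for this task: learn a frame-level mapping from far-field features back to parallel close-talk features. The layer sizes, feature dimension, and learning rate are illustrative assumptions.

<pre>
import numpy as np

rng = np.random.default_rng(0)
dim, hidden = 40, 256                  # 40-d filterbank features (assumed)
W1 = rng.standard_normal((dim, hidden)) * 0.01
b1 = np.zeros(hidden)
W2 = rng.standard_normal((hidden, dim)) * 0.01
b2 = np.zeros(dim)

def dae_step(x_far, x_close, lr=1e-3):
    """One SGD step minimizing the MSE between the network's output
    on far-field frames and the parallel close-talk frames."""
    global W1, b1, W2, b2
    h = np.tanh(x_far @ W1 + b1)           # encoder
    y = h @ W2 + b2                        # decoder
    err = y - x_close                      # (batch, dim)
    dh = (err @ W2.T) * (1.0 - h ** 2)     # backprop through tanh
    # In-place SGD updates, averaged over the batch.
    W2 -= lr * (h.T @ err) / len(x_far)
    b2 -= lr * err.mean(0)
    W1 -= lr * (x_far.T @ dh) / len(x_far)
    b1 -= lr * dh.mean(0)
    return float(np.mean(err ** 2))
</pre>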
Kaiser Window
<pre>
Window function test on the 8k WSJ database, 23 Mel channels:
window function   %WER                ins  del  sub
kaiser            278 / 5643 = 4.93    39   15  224
povey             265 / 5643 = 4.70    34   14  217

Window function test on the 8k WSJ database, 30 Mel channels:
window function   %WER                ins  del  sub
kaiser            270 / 5643 = 4.78    38   17  215
povey             283 / 5643 = 5.02    36   24  223
</pre>
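For reference, the two windows under comparison, as a minimal numpy sketch. The Povey window is Kaldi's default, a Hann window raised to the power 0.85; the Kaiser shape parameter beta used in the test is not recorded, so the value below is an assumption.

<pre>
import numpy as np

N = 200  # 25 ms frame at 8 kHz
n = np.arange(N)

# Kaiser window; beta = 8.0 is a placeholder, not the tested value.
kaiser = np.kaiser(N, 8.0)

# Povey window (Kaldi default): strictly positive at the frame edges.
povey = (0.5 - 0.5 * np.cos(2 * np.pi * n / (N - 1))) ** 0.85

# Each frame is multiplied elementwise by the window before the FFT:
# spectrum = np.fft.rfft(frame * povey)
</pre>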
VAD
- DNN-based VAD (24.77) showed better performance than energy-based VAD (45.73); a sketch of the energy baseline follows
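A minimal sketch of the energy-based baseline (per-frame log energy against a threshold relative to the loudest frame); frame size and threshold are illustrative. The DNN VAD replaces the threshold with a trained per-frame speech/non-speech classifier.

<pre>
import numpy as np

def energy_vad(wav, fs=8000, frame_ms=25, shift_ms=10, thresh_db=-40.0):
    """Label each frame 1 (speech) or 0 (non-speech) by comparing its
    log energy, relative to the loudest frame, to a threshold."""
    flen = int(fs * frame_ms / 1000)
    fshift = int(fs * shift_ms / 1000)
    starts = range(0, len(wav) - flen + 1, fshift)
    energy = np.array([np.sum(wav[s:s + flen] ** 2) for s in starts])
    log_e = 10.0 * np.log10(energy + 1e-10)
    return (log_e > log_e.max() + thresh_db).astype(int)
</pre>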
Scoring
- Online scoring done?
- Checked into gitlab?
Word to Vector
- Paper submitted
LM development
Domain-specific LM
- Prepare the English lexicon (loader sketch below)
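A minimal sketch of loading a CMUdict-style English lexicon (one "WORD PH1 PH2 ..." entry per line); the file path is a placeholder.

<pre>
def load_lexicon(path):
    """Parse a CMUdict-style file into word -> list of pronunciations
    (each pronunciation a list of phones). ';;;' lines are comments;
    alternative pronunciations are marked WORD(2), WORD(3), ..."""
    lexicon = {}
    with open(path, encoding="latin-1") as f:
        for line in f:
            if line.startswith(";;;") or not line.strip():
                continue
            word, *phones = line.split()
            word = word.split("(")[0]
            lexicon.setdefault(word, []).append(phones)
    return lexicon

# lex = load_lexicon("cmudict.0.7a")   # placeholder path
</pre>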
NN LM
- Character-based NNLM (6700 chars, 7-gram), training on 500M of data done; see the example-extraction sketch after this list
  - Inconsistent WER patterns were found on the Tencent test sets
  - Probably need another test set for further investigation
- Investigate MS RNN LM training
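A minimal sketch of cutting 7-gram training examples for the character NNLM: six characters of history predict the seventh; the padding symbol is an assumption.

<pre>
def char_ngram_examples(sentence, order=7, pad="<s>"):
    """Slide an `order`-character window over the sentence: the first
    order-1 characters are the context, the last is the target.
    Sentence starts are padded so every character has full context."""
    chars = [pad] * (order - 1) + list(sentence)
    examples = []
    for i in range(order - 1, len(chars)):
        examples.append((chars[i - order + 1:i], chars[i]))
    return examples

# Each pair feeds the NNLM one (6-char context, next char) example:
# char_ngram_examples("今天天气不错") -> 6 training pairs
</pre>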
QA
FST-based matching
- Word-based FST matching takes 1-2 seconds with 1600 patterns; Huilan's implementation takes <1 second (see the matcher sketch below)
- Thrax toolkit for grammar-to-FST compilation
- Investigate determinization of G embedding
  - Refer to the new Kaldi code
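For reference, matching a fixed pattern set with a word-based FST is equivalent to walking a word-level trie (a deterministic acceptor); a minimal sketch, with toy patterns standing in for the real 1600:

<pre>
def build_trie(patterns):
    """Compile word-sequence patterns into a trie: each arc is a word,
    and a '<final>' key marks an accepting state."""
    root = {}
    for pat in patterns:
        node = root
        for word in pat.split():
            node = node.setdefault(word, {})
        node["<final>"] = pat
    return root

def match(trie, words):
    """Report every pattern matching a contiguous span of `words`."""
    hits = []
    for start in range(len(words)):
        node = trie
        for word in words[start:]:
            if word not in node:
                break
            node = node[word]
            if "<final>" in node:
                hits.append(node["<final>"])
    return hits

trie = build_trie(["turn on the light", "turn off"])
print(match(trie, "please turn on the light now".split()))
# -> ['turn on the light']
</pre>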