2014-07-18

Resource Building

Leftover questions

Asymmetric window: large improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set.
Multi-GPU training: error encountered.
Multilanguage training
Investigating LOUDS FST.
CLG embedded decoder plus online compiler.
DNN-GMM co-training

AM development

Sparse DNN

GA-based block sparsity (++++++++++)

Noise training

Journal paper writing ongoing.

Multilingual ASR


LM = Tel201406.HW.v2.1.1

     AM\testset    |JS27H_100|  JS_2h  |ShanXi_2h|ShaanXi2h|Unknown2h|   ENG   |
 Tel201406.v1.0.S  |         |    -    |    -    |    -    |    -    |    -    |
 Tel201406.v1.1.S  |    -    |    -    |    -    |    -    |    -    |    -    |
Tel201406.HW.v2.0.B|  20.18  |  17.49  |  23.85  |  22.81  |  22.48  |  83.28  |
Tel201406.HW.v2.0.S|  19.95  |  17.74  |  23.73  |  22.36  |  22.49  |  67.70  |
Tel201406.HW.v2.1.B|  19.14  |  16.97  |  24.26  |  22.28  |  22.97  |  85.41  |
Tel201406.HW.v2.1.S|  19.44  |  17.62  |  24.49  |  23.06  |  23.60  |  74.61  |

v1.*: no English words involved.
v2.*: with English words involved.
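
To compare two acoustic models across the test sets, it can help to look at the relative WER change per column. The following is a minimal sketch that copies the two v2.0 rows from the table above; the helper names `relative_change` and `compare` are hypothetical and only serve the illustration.

```python
# WER (%) copied from the table above for two of the acoustic models.
TESTSETS = ["JS27H_100", "JS_2h", "ShanXi_2h", "ShaanXi2h", "Unknown2h", "ENG"]
WER = {
    "Tel201406.HW.v2.0.B": [20.18, 17.49, 23.85, 22.81, 22.48, 83.28],
    "Tel201406.HW.v2.0.S": [19.95, 17.74, 23.73, 22.36, 22.49, 67.70],
}


def relative_change(baseline, candidate):
    """Relative WER change in percent; negative means the candidate is better."""
    return 100.0 * (candidate - baseline) / baseline


def compare(base_name, cand_name):
    """Return {testset: relative WER change in percent} for two models."""
    base, cand = WER[base_name], WER[cand_name]
    return {t: relative_change(b, c) for t, b, c in zip(TESTSETS, base, cand)}


if __name__ == "__main__":
    result = compare("Tel201406.HW.v2.0.B", "Tel201406.HW.v2.0.S")
    for testset, delta in result.items():
        print(f"{testset:>10s}: {delta:+.2f}%")
```

Relative change is used here rather than absolute difference because the ENG column sits at a much higher WER level than the Chinese test sets.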

Denoising & Farfield ASR

Sparse linear prediction. Need to correct the model.
Use cross-entropy (xEnt) as the adaptation objective instead of MSE-based feature mapping.
Use the simulation tool to add reverberation.
[1]
Investigate the impact of speech rate. Use Tencent 200h data to conduct the experiments.
Investigate the correlation between phone speed & entropy.
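
One possible way to set up the speed/entropy investigation is sketched below. It assumes that "entropy" means the Shannon entropy of frame-level posteriors averaged over an utterance, and that "phone speed" means phones per second; both readings are assumptions, as are all function names. The Pearson coefficient is just one choice of correlation measure.

```python
import math


def posterior_entropy(posterior):
    """Shannon entropy (in nats) of one posterior distribution over classes."""
    return -sum(p * math.log(p) for p in posterior if p > 0.0)


def mean_entropy(posteriors):
    """Average frame-level entropy over one utterance."""
    return sum(posterior_entropy(p) for p in posteriors) / len(posteriors)


def phone_rate(num_phones, duration_sec):
    """Phones per second for one utterance (assumed definition of phone speed)."""
    return num_phones / duration_sec


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Placeholder values for illustration only, not measured data.
    rates = [phone_rate(30, 3.0), phone_rate(45, 3.0), phone_rate(60, 3.0)]
    entropies = [
        mean_entropy([[0.9, 0.1], [0.8, 0.2]]),
        mean_entropy([[0.7, 0.3], [0.6, 0.4]]),
        mean_entropy([[0.5, 0.5], [0.5, 0.5]]),
    ]
    print(pearson(rates, entropies))
```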

VAD

Waiting for engineering work

Scoring

Refine the acoustic model with the AMIDA database. Problem solved by involving both WSJ and AMIDA.
Model ready for picking up

Embedded decoder

The first delivery is Emb201407_BG_v0.0.
Train two smaller networks: 500x4+600, 400x4+500.

LM development

Domain specific LM

h2. Domain specific LM construction

h3. TAG LM

Some problems with the tagging: all numbers are tagged.

h3. Chatting LM

Building chatting lexicon

Word2Vector

W2V based doc classification

Initial results with the variational Bayesian GMM obtained. Performance is not as good as the conventional GMM.
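
For orientation, the general shape of word-vector-based document classification can be sketched as follows. This is not the GMM setup reported above: a nearest-centroid classifier with cosine similarity stands in for the GMM purely to keep the example short, and the word vectors, labels, and function names are all hypothetical toy values.

```python
import math

# Hypothetical 3-dimensional word vectors; real word2vec vectors are learned
# from data and usually have far more dimensions.
WORD_VECTORS = {
    "stock": [0.9, 0.1, 0.0],
    "market": [0.8, 0.2, 0.1],
    "goal": [0.1, 0.9, 0.0],
    "match": [0.0, 0.8, 0.2],
}


def doc_vector(words, vectors=WORD_VECTORS):
    """Average the vectors of in-vocabulary words; None if no word is known."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify(words, centroids):
    """Return the label whose centroid is most similar to the document vector."""
    vec = doc_vector(words)
    if vec is None:
        return None
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


if __name__ == "__main__":
    # Class centroids built from toy "training documents".
    centroids = {
        "finance": doc_vector(["stock", "market"]),
        "sports": doc_vector(["goal", "match"]),
    }
    print(classify(["stock", "match", "market"], centroids))
```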

Semantic word tree

Version v2.0 released (filter with query log)
Please deliver to /nfs/disk/perm/data/corpora/semanticTree (Xingchao)
Version v3.0 in progress: further refinement with the Baidu Baike hierarchy.

NN LM

Character-based NNLM (6700 chars, 7gram), 500M data training done.
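
As a reminder of what the 7-gram setting implies for training data, the sketch below shows how (history, target) pairs could be formed for a character-based model: six history characters predict the seventh. The vocabulary limit of 6700 comes from the note above; the special symbols, the padding scheme, and the function names are assumptions made for illustration.

```python
from collections import Counter

ORDER = 7          # 7-gram model: 6 history characters predict the next one
VOCAB_SIZE = 6700  # character inventory size from the note above
UNK, BOS = "<unk>", "<s>"  # hypothetical special symbols


def build_vocab(corpus, size=VOCAB_SIZE):
    """Assign ids to the most frequent characters; ids 0 and 1 are UNK and BOS."""
    counts = Counter(ch for line in corpus for ch in line)
    chars = [ch for ch, _ in counts.most_common(size - 2)]
    return {sym: i for i, sym in enumerate([UNK, BOS] + chars)}


def ngram_examples(line, vocab, order=ORDER):
    """Yield (history_ids, target_id) pairs for one line of text.

    The line is left-padded with BOS so that the first character also
    gets a full-length history.
    """
    ids = [vocab[BOS]] * (order - 1)
    ids += [vocab.get(ch, vocab[UNK]) for ch in line]
    for i in range(order - 1, len(ids)):
        yield tuple(ids[i - order + 1:i]), ids[i]


if __name__ == "__main__":
    corpus = ["语音识别", "语言模型"]
    vocab = build_vocab(corpus)
    for history, target in ngram_examples(corpus[0], vocab):
        print(history, "->", target)
```

Each history tuple would then be mapped through an embedding layer and fed to the network, with the target id as the softmax label.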

Inconsistent WER patterns were found on the Tencent test sets;
another test set is probably needed for the investigation.

Investigate MS RNN LM training

Speaker ID

Reading materials.
Preparing to run SRE08.

Translation

Initial version released.
Collecting more data (Xinhua parallel text, Bible, named entities) for the second version.
