2014-07-18
Resource Building
Leftover questions
- Asymmetric window: great improvement on the training set (WER from 34% to 24%), but the improvement is lost on the test set (a minimal windowing sketch follows this list).
- Multi-GPU training: error encountered.
- Multilingual training
- Investigating LOUDS FST.
- CLG embedded decoder plus online compiler.
- DNN-GMM co-training
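For the asymmetric-window item above, a minimal sketch of what such a window could look like, assuming it is built from two half-Hann segments of unequal length; the exact shape and sizes used in the experiments are not recorded here, so all numbers below are illustrative:

```python
import numpy as np

def asymmetric_window(n_left, n_right):
    """Asymmetric analysis window: a slow half-Hann rise over n_left
    samples followed by a faster half-Hann fall over n_right samples."""
    rise = np.hanning(2 * n_left)[:n_left]    # first half of a long Hann window
    fall = np.hanning(2 * n_right)[n_right:]  # second half of a short Hann window
    return np.concatenate([rise, fall])

# Example: a 400-sample frame (25 ms at 16 kHz) weighted toward its tail.
frame = np.random.randn(400)
win = asymmetric_window(300, 100)
print((frame * win).shape)  # (400,)
```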
AM development
Sparse DNN
- GA-based block sparsity (++++++++++)
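The report does not spell out the GA itself; below is a toy sketch of one way GA-based block sparsity could work, assuming the fitness of a block mask is the (negative) energy of the weights it removes and the sparsity budget is fixed. The matrix size, block size, and GA settings are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))  # weight matrix to sparsify
BS = 8                             # block size -> an 8x8 grid of blocks
GRID = W.shape[0] // BS
N_BLOCKS = GRID * GRID
KEEP = 16                          # budget: number of blocks kept non-zero

def expand(mask):
    """Turn a flat block mask into an elementwise mask on W."""
    return np.kron(mask.reshape(GRID, GRID), np.ones((BS, BS)))

def fitness(mask):
    """Negative Frobenius norm of the weights the mask removes."""
    return -np.linalg.norm(W * (1 - expand(mask)))

def repair(mask):
    """Force exactly KEEP active blocks after crossover/mutation."""
    while mask.sum() > KEEP:
        mask[rng.choice(np.flatnonzero(mask))] = 0
    while mask.sum() < KEEP:
        mask[rng.choice(np.flatnonzero(mask == 0))] = 1
    return mask

pop = [repair(rng.integers(0, 2, N_BLOCKS)) for _ in range(30)]
for gen in range(50):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]                        # truncation selection
    children = []
    while len(children) < 20:
        a, b = rng.choice(10, 2, replace=False)
        cut = int(rng.integers(1, N_BLOCKS))  # one-point crossover
        child = np.concatenate([parents[a][:cut], parents[b][cut:]])
        flip = rng.random(N_BLOCKS) < 0.02    # bit-flip mutation
        children.append(repair(np.where(flip, 1 - child, child)))
    pop = parents + children
print("best fitness:", fitness(max(pop, key=fitness)))
```

In a real setting the fitness would be recognition accuracy (or a cheap proxy) of the pruned network rather than weight energy.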
Noise training
- Journal paper writing ongoing.
Multilingual ASR
LM = Tel201406.HW.v2.1.1; numbers are WER (%).
| AM \ test set       | JS27H_100 | JS_2h | ShanXi_2h | ShaanXi2h | Unknown2h | ENG   |
| Tel201406.v1.0.S    |     -     |   -   |     -     |     -     |     -     |   -   |
| Tel201406.v1.1.S    |     -     |   -   |     -     |     -     |     -     |   -   |
| Tel201406.HW.v2.0.B |   20.18   | 17.49 |   23.85   |   22.81   |   22.48   | 83.28 |
| Tel201406.HW.v2.0.S |   19.95   | 17.74 |   23.73   |   22.36   |   22.49   | 67.70 |
| Tel201406.HW.v2.1.B |   19.14   | 16.97 |   24.26   |   22.28   |   22.97   | 85.41 |
| Tel201406.HW.v2.1.S |   19.44   | 17.62 |   24.49   |   23.06   |   23.60   | 74.61 |
- v1.* models: no English words involved.
- v2.* models: English words involved.
Denoising & Farfield ASR
- Sparse linear prediction. Need to correct the model.
- Use cross-entropy (xEnt) as the adaptation objective, instead of MSE-based feature mapping.
- Use the simulation tool to add reverberation (a convolution-based sketch follows this list).
- Investigate the impact of speech rate. Use Tencent 200h data to conduct the experiments.
- Investigate the correlation between phone speed & entropy.
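For the reverberation item above, a minimal sketch of the usual simulation recipe: convolve clean speech with a room impulse response. The report's simulation tool is not named, so a synthetic exponentially decaying RIR stands in for a measured or image-method one:

```python
import numpy as np
from scipy.signal import fftconvolve

def synthetic_rir(fs=16000, rt60=0.5):
    """Stand-in room impulse response: exponentially decaying noise whose
    decay reaches -60 dB at t = rt60 (the usual RT60 definition)."""
    t = np.arange(int(rt60 * fs)) / fs
    rir = np.random.randn(len(t)) * np.exp(-3.0 * np.log(10) * t / rt60)
    return rir / np.max(np.abs(rir))

def add_reverb(clean, rir):
    """Convolve with the RIR, trim to the original length, normalize."""
    wet = fftconvolve(clean, rir)[: len(clean)]
    return wet / (np.max(np.abs(wet)) + 1e-9)

fs = 16000
clean = np.random.randn(fs)  # 1 s placeholder for a clean waveform
reverbed = add_reverb(clean, synthetic_rir(fs, rt60=0.5))
print(reverbed.shape)        # (16000,)
```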
VAD
- Waiting for engineering work
Scoring
- Refine the acoustic model with the AMIDA database; the problem was solved by involving both WSJ and AMIDA data.
- Model ready for pickup.
Embedded decoder
- The first delivery is Emb201407_BG_v0.0.
- Train two smaller networks: 500x4+600 and 400x4+500 (a parameter-count sketch follows).
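Assuming the shorthand means hidden_size x num_hidden_layers + output_size, a rough parameter count shows why the second network is smaller; the 440-dim input (e.g., 11 spliced frames of 40-dim features) is an illustrative guess, not a figure from the report:

```python
def dnn_params(in_dim, hidden, n_hidden, out_dim):
    """Weights + biases of a plain feed-forward DNN."""
    p = in_dim * hidden + hidden                      # input layer
    p += (n_hidden - 1) * (hidden * hidden + hidden)  # hidden-to-hidden layers
    p += hidden * out_dim + out_dim                   # output layer
    return p

print(dnn_params(440, 500, 4, 600))  # 500x4+600 -> 1,272,600 parameters
print(dnn_params(440, 400, 4, 500))  # 400x4+500 -> 858,100 parameters
```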
LM development
Domain specific LM
Domain specific LM construction
TAG LM
- Some problems with the tagging: all numbers are tagged (a tagging sketch follows).
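A minimal sketch of tag-LM style number tagging and one possible fix for the over-tagging: keep a whitelist of numbers that should stay literal. The tag symbol and whitelist entries are illustrative, not taken from the actual tagger:

```python
import re

NUM = re.compile(r"^\d+(\.\d+)?$")
KEEP = {"110", "119", "120"}  # illustrative: numbers that should stay literal

def tag_numbers(tokens):
    """Replace free-standing numbers with a <NUM> class tag; whitelisted
    numbers stay literal, so not *all* numbers end up tagged."""
    return ["<NUM>" if NUM.match(t) and t not in KEEP else t for t in tokens]

print(tag_numbers("call 110 and order 3 cars".split()))
# ['call', '110', 'and', 'order', '<NUM>', 'cars']
```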
Chatting LM
- Building chatting lexicon
Word2Vector
W2V based doc classification
- Initial results with the variational Bayesian GMM obtained; performance is not as good as the conventional GMM (a sketch of the setup follows).
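A hedged sketch of the comparison setup: fit one mixture per class on word2vec document vectors and classify by log-likelihood. sklearn's GaussianMixture and BayesianGaussianMixture stand in for whatever implementations were actually used, and random vectors stand in for real w2v document embeddings:

```python
import numpy as np
from sklearn.mixture import GaussianMixture, BayesianGaussianMixture

rng = np.random.default_rng(0)
# Placeholders for w2v document vectors (e.g., mean of word embeddings).
docs_a = rng.normal(0.0, 1.0, (200, 20))
docs_b = rng.normal(0.5, 1.0, (200, 20))

# One mixture per class; a test document goes to the class whose mixture
# assigns it the higher log-likelihood.
models = {}
for name, Mix in [("gmm", GaussianMixture), ("vb-gmm", BayesianGaussianMixture)]:
    models[name] = [Mix(n_components=4, covariance_type="diag",
                        random_state=0).fit(d) for d in (docs_a, docs_b)]

test = rng.normal(0.0, 1.0, (50, 20))  # drawn from class A
for name, (ma, mb) in models.items():
    acc = (ma.score_samples(test) > mb.score_samples(test)).mean()
    print(name, "fraction of class-A test docs classified as A:", acc)
```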
Semantic word tree
- Version v2.0 released (filtered with the query log).
- Please deliver to /nfs/disk/perm/data/corpora/semanticTree (Xingchao).
- Version v3.0 ongoing: further refinement with the Baidu Baike hierarchy.
NN LM
- Character-based NNLM (6,700 characters, 7-gram); training on the 500M data set is done (a toy sketch follows this list).
- Inconsistent WER patterns were found on the Tencent test sets.
- Probably need another test set for further investigation.
- Investigate MS RNN LM training
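For the character-based 7-gram NNLM above, a toy Bengio-style sketch: six context characters predict the seventh through an embedding layer and one hidden layer. The real system's 6,700-character vocabulary and 500M training set are far beyond this toy; all hyperparameters below are illustrative:

```python
import torch
import torch.nn as nn

text = "the quick brown fox jumps over the lazy dog " * 50
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
CTX = 6  # 7-gram: 6 context characters predict the next one

class CharNNLM(nn.Module):
    """Feed-forward NNLM: embed the context, one tanh hidden layer,
    softmax (via cross-entropy) over the character vocabulary."""
    def __init__(self, vocab, emb=32, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(CTX * emb, hidden),
                                 nn.Tanh(), nn.Linear(hidden, vocab))
    def forward(self, x):  # x: (batch, CTX) character ids
        return self.net(self.emb(x))

ids = torch.tensor([stoi[c] for c in text])
X = torch.stack([ids[i:i + CTX] for i in range(len(ids) - CTX)])
y = ids[CTX:]

model = CharNNLM(len(chars))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(200):  # full-batch training on the toy corpus
    loss = nn.functional.cross_entropy(model(X), y)
    opt.zero_grad(); loss.backward(); opt.step()
print("final training loss:", loss.item())
```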
Speaker ID
- Reading materials.
- Preparing to run SRE08 (NIST 2008 Speaker Recognition Evaluation).
Translation
- Initial version released.
- Collecting more data (Xinhua parallel text, the Bible, named entities) for the second version.