2014-07-05
From cslt Wiki
Contents
Resource Building
Leftover questions
- Asymmetric window: Great improvement on the training set (WER 34% to 24%), however the improvement is lost on the test set.
- Multi-GPU training: error encountered
- Multilanguage training
- Investigating LOUDS FST.
- CLG embedded decoder plus online compiler.
- DNN-GMM co-training
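The asymmetric-window item above refers to splicing more past frames than future frames when forming DNN input, which reduces look-ahead latency. A minimal sketch of such splicing (the context sizes here are illustrative, not the settings actually used):

```python
import numpy as np

def splice(frames, left=8, right=2):
    """Splice each frame with asymmetric context: `left` past frames and
    `right` future frames, padding at the edges by repeating the boundary frame."""
    n, d = frames.shape
    padded = np.concatenate([np.repeat(frames[:1], left, axis=0),
                             frames,
                             np.repeat(frames[-1:], right, axis=0)])
    return np.stack([padded[i:i + left + 1 + right].reshape(-1)
                     for i in range(n)])

feats = np.random.randn(100, 40)          # 100 frames of 40-dim Fbank features
spliced = splice(feats, left=8, right=2)  # shape (100, (8+1+2)*40)
```

With `left > right` the network still sees wide acoustic context but needs only a short look-ahead at decoding time.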
AM development
Sparse DNN
- GA-based block sparsity (+++++++++)
Noise training
- Journal paper writing ongoing
Multilingual ASR
                          HW 27h (HW TR LM not involved)   HW 27h (HW TR LM involved)
Fbank stream (monolang)   21.64                            20.72
Fbank non-stream (MPE4)   22.23                            21.38
Fbank stream (MPE4)       21.99                            -
Denoising & Farfield ASR
- Reverberant data delivered
- Global CMN-based spectrum checking done. The signal/feature transform with DNN does not seem to be a very reasonable approach here.
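Global CMN as used above subtracts a single mean vector, estimated over the whole corpus, from every frame (per-utterance CMN would instead use each utterance's own mean). A minimal sketch:

```python
import numpy as np

def global_cmn(utterances):
    """Subtract one corpus-level mean from all utterances.
    `utterances` is a list of (n_frames, dim) feature arrays."""
    mean = np.mean(np.concatenate(utterances, axis=0), axis=0)
    return [u - mean for u in utterances]

# Two toy utterances with a shared offset that global CMN removes.
utts = [np.random.randn(50, 40) + 3.0, np.random.randn(80, 40) + 3.0]
normed = global_cmn(utts)
```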
VAD
- Waiting for engineering work
Scoring
- Refined the acoustic model with the AMIDA database; the problem was solved by involving both WSJ and AMIDA.
Embedded decoder
- WER vs RT vs graph size done.
- The first delivery is Emb201407_BG_v0.0
- Demo done
LM development
Domain specific LM
Domain specific LM construction
Mixture LM
- TAG model: 127h HuaWei tag analysis done.
- Performance of the NUM-tagged model is under testing.
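A mixture LM, as in the items above, linearly interpolates word probabilities from several component models with weights that sum to one (weights are typically tuned on held-out text). A toy sketch with unigram components; the models and weights here are purely illustrative:

```python
def mixture_prob(word, models, weights):
    """P(word) under a linear interpolation of component LMs.
    `models` is a list of dicts mapping word -> probability."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * m.get(word, 0.0) for m, w in zip(models, weights))

general = {"the": 0.05, "model": 0.001}   # general-domain component
domain  = {"the": 0.04, "model": 0.01}    # domain-specific component
p = mixture_prob("model", [general, domain], [0.7, 0.3])
```

The same interpolation applies per n-gram context in a real n-gram mixture; the domain component boosts in-domain words the general model underestimates.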
Word2Vector
W2V based doc classification
- Good performance obtained with SSA (semantic space allocation): train a general GMM, then represent each doc as the vector of its GMM weights.
- APSIPA paper submitted
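The SSA representation above can be sketched as: fit one general GMM on all word vectors, then represent each document by the component weights re-estimated from its own words (here approximated by averaged component posteriors). A numpy-only sketch; the component count, dimensions, and the simplified spherical "GMM" are illustrative stand-ins for a properly EM-trained model:

```python
import numpy as np

rng = np.random.default_rng(0)
word_vecs = rng.normal(size=(500, 20))   # stand-in for word2vec vectors

# 1. "General GMM": K unit-variance spherical components whose means are
#    taken from the data (a real setup would fit means/covariances by EM).
K = 8
means = word_vecs[:K]

def posteriors(x):
    """Soft assignment of word vectors to the K components."""
    d2 = ((x[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    logp = -0.5 * d2
    logp -= logp.max(axis=1, keepdims=True)   # for numerical stability
    p = np.exp(logp)
    return p / p.sum(axis=1, keepdims=True)

# 2. A document's SSA vector = averaged component posteriors of its words.
def doc_vector(doc_word_vecs):
    return posteriors(doc_word_vecs).mean(axis=0)

v = doc_vector(word_vecs[:30])           # a toy "document" of 30 words
```

The resulting K-dim vectors can then feed any standard document classifier.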
Semantic word tree
- Version v2.0 released (filter with query log)
- Please deliver to /nfs/disk/perm/data/corpora/semanticTree (Xingchao)
- Version v3.0 ongoing. Further refinement with the Baidu Baike hierarchy.
NN LM
- Character-based NNLM (6700 chars, 7-gram): training on 500M of data done.
- Inconsistent WER patterns were found on the Tencent test sets.
- Probably need to use another test set for investigation.
- Investigate MS RNN LM training
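The character NNLM above is a feed-forward n-gram model: the 6 preceding characters are embedded, concatenated, passed through a hidden layer, and a softmax over the character vocabulary gives the next-character distribution. A minimal untrained forward pass; all dimensions here are illustrative and far smaller than the 6700-character setup:

```python
import numpy as np

rng = np.random.default_rng(0)
V, E, H, CTX = 100, 16, 32, 6          # vocab, embed dim, hidden dim, 7-gram context

emb = rng.normal(scale=0.1, size=(V, E))         # character embedding table
W1  = rng.normal(scale=0.1, size=(CTX * E, H))   # projection -> hidden
W2  = rng.normal(scale=0.1, size=(H, V))         # hidden -> output logits

def next_char_probs(context_ids):
    """P(next char | previous CTX chars) for one (untrained) model."""
    x = emb[context_ids].reshape(-1)   # concatenate the CTX embeddings
    h = np.tanh(x @ W1)
    z = h @ W2
    z -= z.max()                       # stable softmax
    p = np.exp(z)
    return p / p.sum()

p = next_char_probs([5, 17, 3, 99, 0, 42])   # distribution over V characters
```

Training would fit `emb`, `W1`, `W2` by cross-entropy over 7-gram windows of the character stream.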