2014-08-01
Resource Building
Leftover questions
- Investigating LOUDS FST.
- CLG embedded decoder plus online compiler.
- DNN-GMM co-training
- NN LM
AM development
Sparse DNN
- WSJ sparse DNN shows slightly better performance than the non-sparse case when the network is large-scale (a pruning sketch follows this list)
- Pre-training does work for DNN training (for 4-, 5-, and 6-layer networks)
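The notes do not describe the sparsification recipe, so the following is only a minimal numpy sketch of one common approach, magnitude-based weight pruning, to illustrate the idea behind a sparse DNN; the threshold and layer size are made-up values, not settings from the experiment.

import numpy as np

def prune_weights(W, threshold=0.05):
    """Zero out weights whose magnitude falls below the threshold.

    W         : 2-D weight matrix of one DNN layer
    threshold : pruning cutoff (illustrative value only)
    Returns the pruned matrix and the resulting sparsity ratio.
    """
    mask = np.abs(W) >= threshold
    return W * mask, 1.0 - mask.mean()

# Example on a randomly initialised 1024x1024 hidden layer
W = 0.05 * np.random.randn(1024, 1024)
W_pruned, sparsity = prune_weights(W, threshold=0.05)
print("pruned %.1f%% of the weights" % (100 * sparsity))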
Noise training
- Journal paper writing is ongoing
Multilingual ASR
- Native English speakers + Chinglish speakers obtained better performance.
Drop out & convolutional network
- Changed the learning rate to 0.001; the training process can now be started.
- Frame accuracy goes to: (with/without drop probability normalization; a dropout sketch follows this list)
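The "(with/without drop probability normalization)" comparison presumably refers to whether the surviving activations are rescaled by 1/(1-p) after units are dropped; below is a minimal numpy sketch of the two variants. The batch size, layer width, and p=0.5 are illustrative assumptions.

import numpy as np

def dropout(h, p=0.5, normalize=True):
    """Apply dropout to a batch of hidden-layer activations.

    p         : probability of dropping a unit
    normalize : if True, rescale surviving units by 1/(1-p) so the
                expected activation matches the no-dropout network
                ("drop probability normalization"); if False, leave
                the surviving activations unscaled.
    """
    mask = (np.random.rand(*h.shape) >= p).astype(h.dtype)
    out = h * mask
    if normalize:
        out /= (1.0 - p)
    return out

h = np.random.rand(8, 1200)                    # 8 frames, 1200 hidden units
h_with_norm    = dropout(h, p=0.5, normalize=True)
h_without_norm = dropout(h, p=0.5, normalize=False)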
Denoising & Farfield ASR
- By tuning the parameters of late-response lag and response time, obtained a performance improvement with Lasso (results below; a Lasso sketch follows the tables).
Simulation results:

Baseline:
--------------------------------------------------------------------------
model/test     |  far_evl92  |  near_evl92
--------------------------------------------------------------------------
clean_ce       |    59.38    |    19.25
mpe_clean_ce   |    40.46    |    12.94
--------------------------------------------------------------------------

Lasso with optimal parameters (lambda=0.05, delta=5, N=10):
--------------------------------------------------------------------------
model/test     |  far_evl92  |  near_evl92
--------------------------------------------------------------------------
clean_ce       |    54.63    |    15.75
mpe_clean_ce   |    36.58    |    11.64
--------------------------------------------------------------------------

Real data results:
--------------------------------------------------------------------------
model/test     |  far_evl92  |  near_evl92
--------------------------------------------------------------------------
clean_ce       |    94.86    |    63.48
mpe_clean_ce   |    92.29    |    58.37
--------------------------------------------------------------------------

Dereverberated recording:
--------------------------------------------------------------------------
model/test     |  far_evl92  |  near_evl92
--------------------------------------------------------------------------
clean_ce       |    94.91    |    61.03
mpe_clean_ce   |    91.28    |    54.16
--------------------------------------------------------------------------
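The notes do not spell out how Lasso is applied, so the sketch below is only one plausible reading: the late reverberation in each frequency bin is predicted from magnitude-spectrum frames delayed by the late-response lag over N taps with an L1-regularised regression, and the prediction is subtracted. The function itself, the spectral-subtraction step, and the reading of lambda, delta, and N as the regularisation weight, lag, and number of taps are assumptions for illustration only.

import numpy as np
from sklearn.linear_model import Lasso

def suppress_late_reverb(spec, delta=5, N=10, lam=0.05):
    """Hypothetical dereverberation sketch.

    spec  : (T, F) magnitude spectrogram of the far-field recording
    delta : late-response lag in frames (assumed meaning of "delta")
    N     : number of delayed frames used as regressors
    lam   : Lasso regularisation weight (assumed meaning of "lambda")
    """
    T, F = spec.shape
    enhanced = spec.copy()
    for f in range(F):
        x = spec[:, f]
        # Design matrix of delayed frames x[t - delta - n], n = 0..N-1
        X = np.array([[x[t - delta - n] for n in range(N)]
                      for t in range(delta + N, T)])
        y = x[delta + N:]
        late = Lasso(alpha=lam, positive=True).fit(X, y).predict(X)
        enhanced[delta + N:, f] = np.maximum(y - late, 0.0)
    return enhanced

# enhanced = suppress_late_reverb(np.abs(far_field_stft), delta=5, N=10, lam=0.05)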
- Adaptation is under way
VAD
- Waiting for testing results
Scoring
- Refined the acoustic model with the AMIDA database; the problem was solved by involving both WSJ and AMIDA.
Confidence
- Getting familiar with Kaldi
- Need to extract lattice and DNN features
Embedded decoder
- Chatting LM released (80k)
- Training two smaller networks (500x4+600 and 400x4+500): ongoing
- Need to upload the new client code to git (+)
- Build a new graph with the MPE3 AM and the chatting LM.
LM development
Domain specific LM
TAG LM
- The TAG LM obtained better performance (a tag-substitution sketch follows)
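The notes do not define the TAG LM; assuming it is a class-based LM in which named entities in the training text are replaced by class tags before n-gram counting (and expanded back at decoding time), the tag-substitution step might look like the minimal sketch below. The tag names and entity list are invented for illustration.

# Hypothetical tag lexicon: entity word -> class tag
TAG_LEXICON = {
    "北京": "<CITY>",
    "上海": "<CITY>",
    "张三": "<NAME>",
}

def tag_sentence(words):
    """Replace known entity words with their class tags before LM training."""
    return [TAG_LEXICON.get(w, w) for w in words]

print(tag_sentence(["我", "想", "去", "北京"]))   # ['我', '想', '去', '<CITY>']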
Chatting LM
- First version released (80k lexicon)
- Preparing the 2nd release (120k lexicon)
- Test on Xiaotang long
Word2Vector
W2V based doc classification
- Initial results with the variational Bayesian GMM obtained; performance is not as good as the conventional GMM (a comparison sketch follows).
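A minimal sketch of the kind of comparison reported above, assuming each document is represented by an averaged word2vec vector and assigned to the class whose mixture model gives the highest likelihood; the class names, feature dimensionality, component count, and the sklearn models used here are illustrative assumptions, not the actual setup.

import numpy as np
from sklearn.mixture import GaussianMixture, BayesianGaussianMixture

rng = np.random.RandomState(0)
# Illustrative data: averaged 100-dim word2vec features per training document
doc_vectors = {
    "sport":   rng.randn(200, 100) + 1.0,
    "finance": rng.randn(200, 100) - 1.0,
}

def train_models(model_cls):
    """Fit one mixture model per document class."""
    return {c: model_cls(n_components=4, covariance_type="diag").fit(X)
            for c, X in doc_vectors.items()}

def classify(models, x):
    """Pick the class whose mixture gives the highest log-likelihood."""
    return max(models, key=lambda c: models[c].score(x.reshape(1, -1)))

gmm_models = train_models(GaussianMixture)           # conventional GMM
vb_models  = train_models(BayesianGaussianMixture)   # variational Bayesian GMM

test_doc = rng.randn(100) + 1.0
print("conventional GMM        :", classify(gmm_models, test_doc))
print("variational Bayesian GMM:", classify(vb_models, test_doc))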
Speaker ID
- The full-data SRE trial is entering its final stage
- Results will be ready soon
Translation
- Collecting more data (Xinhua parallel text, Bible, named entities) for the second version
- Checking possible parameters to control the phrase-pair lexicon