2014-01-10
AM development
Sparse DNN
- Optimal Brain Damage (OBD):
- Online OBD is on hold.
- Investigation of OBD + L1 norm has started (a pruning sketch follows this list).
- Efficient computing
- Rearranging the matrix structure to compose zero blocks, aiming for better computing speed.
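A minimal numpy sketch of OBD-style pruning, assuming the usual diagonal-Hessian saliency 0.5*h*w^2; the L1 helper shows how an L1 penalty pushes small weights toward zero so that pruning removes them cheaply. All names and the prune fraction are illustrative, not our actual setup.

    import numpy as np

    def obd_prune(W, H_diag, prune_frac=0.5):
        """Zero out the weights with the lowest OBD saliency.
        W      : weight matrix of one DNN layer
        H_diag : diagonal Hessian approximation, same shape as W
        OBD saliency of a weight w with curvature h is 0.5 * h * w**2.
        """
        saliency = 0.5 * H_diag * W ** 2
        k = int(prune_frac * W.size)
        thresh = np.partition(saliency.ravel(), k)[k]  # k-th smallest saliency
        mask = saliency >= thresh                      # keep only salient weights
        return W * mask, mask

    def l1_grad(W, lam=1e-4):
        """Subgradient of an L1 penalty; added to the weight gradient during
        training, it drives small weights to zero before OBD removes them."""
        return lam * np.sign(W)

The zeros this produces are exactly what the matrix rearrangement above tries to gather into contiguous blocks.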
Efficient DNN training
- Asymmetric window: great improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set. Overfitting?
- Fbank features used to train GMM+DNN give very high training accuracy but reduce accuracy on the test set.
- Fbank energy trim: with threshold=10 we get a performance gain in a safe way; larger thresholds may reduce performance (a trimming sketch follows this list).
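One plausible reading of the energy trim, as a sketch: drop frames whose per-frame log-energy falls below the threshold. The threshold semantics and the energy proxy are assumptions, not the exact recipe used here.

    import numpy as np

    def trim_low_energy(fbank, threshold=10.0):
        """Drop frames whose per-frame log-energy falls below `threshold`
        (one reading of 'energy trim, threshold=10'); a larger threshold
        trims more frames and risks cutting real speech.
        fbank : (num_frames, num_bins) log Mel filterbank features
        """
        log_e = fbank.sum(axis=1)        # crude per-frame log-energy proxy
        return fbank[log_e >= threshold]

Under this reading, a small threshold only removes near-silent frames (the "safe" gain), while a large one starts deleting speech, matching the reported performance drop.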
Optimal phoneset
- Ch/En training with the concatenated phone set is complete.
- Errors encountered in testing.
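For reference, a toy sketch of what a concatenated bilingual phone set looks like: each language's phones are tagged so the two inventories cannot collide. Phone names and the tagging scheme are illustrative only.

    def concat_phone_sets(ch_phones, en_phones):
        """Merge two monolingual phone inventories into one phone set by
        language-tagging each phone, so Ch and En units never collide."""
        merged = ['CH_' + p for p in ch_phones] + ['EN_' + p for p in en_phones]
        return {p: i for i, p in enumerate(merged)}   # phone -> integer id

    ids = concat_phone_sets(['a', 'i', 'zh'], ['AA', 'IY', 'ZH'])
    print(ids['CH_zh'], ids['EN_ZH'])                 # distinct units: 2 5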
Engine optimization
- Investigating LOUDS FSTs. In progress (a toy encoding sketch follows).
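LOUDS (Level-Order Unary Degree Sequence) is a succinct tree representation that stores roughly two bits per node. Below is a toy Python sketch of the bit encoding only; how it plugs into the FST decoder is the part still under investigation.

    from collections import deque

    def louds_bits(children):
        """Encode a tree as a LOUDS bit string: visit nodes in level order
        and emit one '1' per child followed by a terminating '0'.
        `children` maps a node id to the list of its child ids; 0 is the root.
        """
        bits, queue = ['10'], deque([0])     # '10' is the super-root convention
        while queue:
            node = queue.popleft()
            kids = children.get(node, [])
            bits.append('1' * len(kids) + '0')
            queue.extend(kids)
        return ''.join(bits)

    # root 0 with children 1 and 2; node 1 with child 3
    print(louds_bits({0: [1, 2], 1: [3]}))   # -> '101101000'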
LM development
NN LM
- Working on an NN LM based on word2vector (a sketch follows this list).
- Reading more material on word2vector.
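A minimal numpy sketch of the idea: a feedforward NN LM whose input is the concatenation of pretrained word vectors for the context words. The dimensions, initialization, and single hidden layer are assumptions for illustration, not our actual configuration.

    import numpy as np

    rng = np.random.default_rng(0)
    V, D, H, CTX = 10000, 100, 256, 3      # vocab, vector dim, hidden, context

    E  = rng.normal(0, 0.1, (V, D))        # word vectors, e.g. from word2vec
    W1 = rng.normal(0, 0.1, (CTX * D, H))
    W2 = rng.normal(0, 0.1, (H, V))

    def nnlm_probs(context_ids):
        """P(next word | context): concatenate the context word vectors,
        pass them through one tanh layer, then softmax over the vocab."""
        x = E[context_ids].reshape(-1)     # (CTX*D,) concatenated embeddings
        h = np.tanh(x @ W1)
        z = h @ W2
        z -= z.max()                       # numerically stable softmax
        p = np.exp(z)
        return p / p.sum()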
Embedded development
- Benchmarked on Liuchao's cellphone (Qualcomm Snapdragon MSM8960, Krait @ 1.5 GHz, using 1 core): small nnet 100/600/600/600/600/1264 with MFCC input, i.e. roughly 1.9M multiply-adds per frame for this topology.
- Working on layer-by-layer DNN training (a sketch follows this list).
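A schematic of greedy layer-by-layer training, one common reading of the phrase: grow the network one hidden layer at a time, fitting the newest layer on the frozen features from below. The `train_one_layer` callback is hypothetical; this is a scaffold under stated assumptions, not our actual recipe.

    def train_layerwise(X, y, hidden_dims, train_one_layer):
        """Greedy layer-by-layer DNN training: add hidden layers one at a
        time; each new layer is trained on the (frozen) output of the stack
        built so far.  `train_one_layer(H, y, dim)` is an assumed helper
        returning (trained weights, activations of the new layer).
        """
        H, layers = X, []
        for dim in hidden_dims:
            W, H = train_one_layer(H, y, dim)  # fit new layer on current features
            layers.append(W)                   # freeze it; its output feeds the next
        return layers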
Speech QA
- Using N-best lists to expand matching in QA gives better performance (a matching sketch closes this section):
- 1-best matches 96/121
- 10-best matches 102/121
- Use N-best lists to recover errors in the entity check. In progress.
- Use Pinyin to recover errors in the entity check. Future work.
- Investigating some errors in the entity-based LM.
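A minimal sketch of the N-best expansion: try to match QA entries against each ASR hypothesis in rank order and accept the first hit. `qa_index` and the exact-match lookup are stand-ins for the real matcher.

    def nbest_match(hypotheses, qa_index):
        """Match QA entries against each ASR hypothesis in turn, best-ranked
        first, and accept the first hypothesis that hits.
        `qa_index` (assumed) maps normalized question strings to answers.
        """
        for hyp in hypotheses:             # hypotheses sorted by ASR score
            ans = qa_index.get(hyp.strip())
            if ans is not None:
                return ans
        return None                        # no hypothesis matched

With a 1-best list only the top hypothesis is consulted (96/121 above); widening to 10-best recovered six more matches (102/121).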