Difference between revisions of "2014-01-10"

From cslt Wiki

Revision as of 01:30, 10 January 2014 (Fri)

AM development

Sparse DNN

  • Optimal Brain Damage (OBD).
  1. Online OBD is on hold.
  2. Investigation of OBD + L1 norm has started.
  • Efficient computing
  1. Rearranging the matrix structure to compose zero blocks, with the aim of better computing speed.
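
For readers unfamiliar with OBD, the pruning criterion can be sketched as follows. This is a minimal illustration, not the code used in these experiments: it assumes a diagonal Hessian approximation is already available, and the names `obd_prune`, `hessian_diag`, and `sparsity` are hypothetical. Each weight w_k gets the saliency h_kk * w_k^2 / 2, and the least salient weights are set to zero.

```python
import numpy as np

def obd_prune(weights, hessian_diag, sparsity):
    """Zero the fraction `sparsity` of weights with the lowest OBD saliency.

    weights      : weight matrix of one layer
    hessian_diag : diagonal Hessian estimate, same shape as `weights`
                   (assumed to be given; how it is estimated is not shown)
    sparsity     : fraction of weights to remove, in [0, 1]
    """
    # OBD saliency: estimated increase in loss when a weight is deleted.
    saliency = 0.5 * hessian_diag * weights ** 2
    k = int(sparsity * weights.size)
    mask = np.ones(weights.size, dtype=bool)
    if k > 0:
        # Indices of the k smallest saliencies (order among them is irrelevant).
        lowest = np.argpartition(saliency.ravel(), k - 1)[:k]
        mask[lowest] = False
    mask = mask.reshape(weights.shape)
    return weights * mask, mask

# Toy usage with a unit Hessian diagonal (hypothetical values).
w = np.array([[1.0, -2.0], [0.1, 3.0]])
pruned, mask = obd_prune(w, np.ones_like(w), sparsity=0.5)
```

With a unit Hessian diagonal the criterion reduces to magnitude pruning, which makes the toy case easy to check by hand.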


Efficient DNN training

  1. Asymmetric window: great improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set. Overfitting?
  2. Fbank features used to train GMM+DNN lead to very high training accuracy but reduce accuracy on the test set.
  3. Fbank energy trim: with threshold=10, we get a performance gain in a safe way. Larger thresholds may reduce performance.
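
The report does not define "asymmetric window". One possible reading, assumed here purely for illustration, is a frame-splicing context with a different number of left and right frames. The sketch below shows that reading only; the function `splice` and its parameters `left` and `right` are hypothetical names, and edge frames are padded by repetition.

```python
import numpy as np

def splice(feats, left, right):
    """Stack each frame with `left` past and `right` future frames.

    feats : array of shape (T, D), one feature vector per frame
    Returns an array of shape (T, (left + right + 1) * D).
    Frames beyond the utterance edges are replaced by the first/last frame.
    """
    num_frames, dim = feats.shape
    # Row t holds the frame indices t-left .. t+right.
    idx = np.arange(num_frames)[:, None] + np.arange(-left, right + 1)[None, :]
    idx = np.clip(idx, 0, num_frames - 1)
    return feats[idx].reshape(num_frames, (left + right + 1) * dim)

# Toy usage: 4 frames of 2-dim features, 2 frames of left context, 1 of right.
feats = np.arange(8.0).reshape(4, 2)
spliced = splice(feats, left=2, right=1)
```

A symmetric window is the special case `left == right`.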

Optimal phoneset

  • Ch/En training with concatenated phone set is completed.
  • Errors in testing.

Engine optimization

  • Investigating LOUDS FST. In progress.


LM development

NN LM

  • Working on NN LM based on word2vector.
  • Reading more materials on word2vector.

Embedded development

  • Liuchao's cellphone, Qualcomm Snapdragon Krait MSM8960 @ 1.5GHz, using 1 core

small nnet 100/600/600/600/600/1264 with MFCC input

  • Work on layer-by-layer DNN training
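
To give a sense of the size of the small network listed above, the following sketch counts its parameters. It assumes the slash-separated numbers are layer sizes (100 inputs, four hidden layers of 600 units, 1264 outputs) and that every layer is fully connected with a bias vector; the helper `count_params` is a hypothetical name.

```python
def count_params(layer_sizes):
    """Return (weights, biases) for a fully connected net with these layer sizes."""
    # One weight matrix between each pair of adjacent layers.
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    # One bias per unit in every layer except the input.
    biases = sum(layer_sizes[1:])
    return weights, biases

weights, biases = count_params([100, 600, 600, 600, 600, 1264])
print(weights, biases, weights + biases)  # 1898400 3664 1902064
```

Under these assumptions the model has about 1.9 million parameters, which is the number of multiply-accumulate operations needed per frame on the single core.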

Speech QA

  • Use N-best to expand match in QA. Better performance was obtained:
  • 1-best matches 96/121
  • 10-best matches 102/121
  • Use N-best to recover errors in entity check. Working on.
  • Use Pinyin to recover errors in entity check. Future work.
  • Investigate some errors in entity-based LM.
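
The N-best expansion reported above can be sketched as follows. This is an assumption about how the matching might work, not the actual QA system: `nbest_match` and `match_rate` are hypothetical helpers, and a "match" is simplified here to exact membership in a set of known questions.

```python
def nbest_match(hypotheses, known_questions, n):
    """Return the first of the top-n recognition hypotheses found in the
    known question set, or None if none of them matches."""
    for hyp in hypotheses[:n]:
        if hyp in known_questions:
            return hyp
    return None

def match_rate(matched, total):
    """Match rate as a percentage."""
    return 100.0 * matched / total

# Figures from the report: 121 test queries.
rate_1best = round(match_rate(96, 121), 1)    # 79.3
rate_10best = round(match_rate(102, 121), 1)  # 84.3

# Toy usage with made-up strings: the correct query is the 2nd hypothesis.
known = {"what is the weather today"}
nbest = ["what is the whether today", "what is the weather today"]
hit_1 = nbest_match(nbest, known, n=1)    # None
hit_10 = nbest_match(nbest, known, n=10)  # "what is the weather today"
```

Going from 1-best to 10-best recovers 6 more queries out of 121, roughly 5 percentage points.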