<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="http://cslt.org/mediawiki/skins/common/feed.css?303"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="zh-cn">
		<id>http://cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=2014-01-03</id>
		<title>2014-01-03 - Revision history</title>
		<link rel="self" type="application/atom+xml" href="http://cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=2014-01-03"/>
		<link rel="alternate" type="text/html" href="http://cslt.org/mediawiki/index.php?title=2014-01-03&amp;action=history"/>
		<updated>2026-04-15T05:13:01Z</updated>
		<subtitle>Revision history for this page on the wiki</subtitle>
		<generator>MediaWiki 1.23.3</generator>

	<entry>
		<id>http://cslt.org/mediawiki/index.php?title=2014-01-03&amp;diff=8934&amp;oldid=prev</id>
		<title>Cslt: Created page with “== AM development ==  === Sparse DNN ===  * Optimal Brain Damage(OBD).   # Online OBD held.  # OBD + L1 norm start to investigation.   * Efficient computing  # Conducti...”</title>
		<link rel="alternate" type="text/html" href="http://cslt.org/mediawiki/index.php?title=2014-01-03&amp;diff=8934&amp;oldid=prev"/>
				<updated>2014-01-03T02:20:27Z</updated>
		
		<summary type="html">&lt;p&gt;以内容“== AM development ==  === Sparse DNN ===  * Optimal Brain Damage(OBD).   # Online OBD held.  # OBD + L1 norm start to investigation.   * Efficient computing  # Conducti...”创建新页面&lt;/p&gt;
&lt;p&gt;&lt;b&gt;新页面&lt;/b&gt;&lt;/p&gt;&lt;div&gt;== AM development ==&lt;br /&gt;
&lt;br /&gt;
=== Sparse DNN ===&lt;br /&gt;
&lt;br /&gt;
* Optimal Brain Damage (OBD).&lt;br /&gt;
&lt;br /&gt;
# Online OBD is on hold.&lt;br /&gt;
# Started investigating OBD + L1 norm.&lt;br /&gt;
&lt;br /&gt;
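A minimal sketch of the OBD ranking step, assuming plain Python lists for the weights and a precomputed diagonal Hessian; the function name and interface are illustrative, not from the report:

```python
def obd_prune(weights, hess_diag, frac):
    """Zero the fraction `frac` of weights with the lowest OBD saliency.

    OBD saliency under the diagonal approximation: s_i = 0.5 * h_ii * w_i**2,
    the estimated increase in loss from deleting weight i.
    """
    sal = [0.5 * h * w * w for w, h in zip(weights, hess_diag)]
    k = int(len(weights) * frac)
    # indices of the k least salient weights
    drop = set(sorted(range(len(sal)), key=sal.__getitem__)[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]
```

Combining this with an L1 penalty during training (the second item above) would push more weights toward zero before pruning.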
* Efficient computing&lt;br /&gt;
&lt;br /&gt;
# Rearranging the matrix structure to compose zero blocks via smart approaches, leading to better computing speed.&lt;br /&gt;
&lt;br /&gt;
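One way the zero-block rearrangement can pay off is to store only the nonzero blocks and skip absent ones entirely during a matrix-vector product. A toy sketch with a dict-of-blocks layout; the names and representation are illustrative assumptions:

```python
def block_sparse_matvec(blocks, n_rows, block, x):
    """Multiply a block-sparse matrix by vector x.

    `blocks` maps (block_row, block_col) to a dense block (list of rows);
    keys that are absent stand for all-zero blocks and are never touched,
    which is where the speedup over a dense multiply comes from.
    """
    y = [0.0] * n_rows
    for (bi, bj), b in blocks.items():
        for r, row in enumerate(b):
            acc = 0.0
            for c, v in enumerate(row):
                acc += v * x[bj * block + c]
            y[bi * block + r] += acc
    return y
```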
&lt;br /&gt;
=== Efficient DNN training ===&lt;br /&gt;
&lt;br /&gt;
# L1-L2 grid search: L1/L2 (&amp;lt; 1e-6) seems good for record1900 but worse for the other test sets.&lt;br /&gt;
&lt;br /&gt;
[http://192.168.0.50:3000/series/?action=view&amp;amp;series=199,199.0,199.1,199.2,199.3,199.4,199.5,199.6,199.7,199.8,199.9 link here]&lt;br /&gt;
&lt;br /&gt;
# Asymmetric window: great improvement on the training set (WER 34% to 24%); however, the improvement is lost on the test set. Overfitting?&lt;br /&gt;
# Frame skipping: skipping 1 frame speeds up decoding consistently while largely retaining accuracy; skipping more frames leads to unacceptable performance degradation.&lt;br /&gt;
# Interpolation does not provide a performance gain.&lt;br /&gt;
&lt;br /&gt;
[http://192.168.0.50:3000/series/?action=view&amp;amp;series=199,199.5,199.6,199.7,199.8,199.9,198,198.0,198.1,198.2,198.3,198.4,198.5,198.6 link here]&lt;br /&gt;
&lt;br /&gt;
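The frame-skipping scheme above can be sketched as running the network only on every (skip+1)-th frame and reusing that output for the skipped frames; `frame_skip_posteriors` and the callable `nnet` are illustrative stand-ins, not the actual decoder:

```python
def frame_skip_posteriors(frames, nnet, skip=1):
    """Evaluate `nnet` on every (skip + 1)-th frame and copy its output
    to the skipped frames, trading a small accuracy loss for roughly
    (skip + 1)x fewer forward passes during decoding."""
    out = []
    last = None
    for i, f in enumerate(frames):
        if i % (skip + 1) == 0:
            last = nnet(f)      # real forward pass on the kept frame
        out.append(last)        # skipped frames reuse the last output
    return out
```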
=== Optimal phone set ===&lt;br /&gt;
&lt;br /&gt;
* Analyzed the Tencent English phone set; found some errors in CH/EN phone sharing.&lt;br /&gt;
* Developed a new sharing scheme and started training the new system.&lt;br /&gt;
* Started training a system with all phones separated.&lt;br /&gt;
* Started training a mixed system with Chinglish data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Engine optimization===&lt;br /&gt;
&lt;br /&gt;
* Investigating LOUDS FST. In progress.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==LM development==&lt;br /&gt;
&lt;br /&gt;
===NN LM===&lt;br /&gt;
&lt;br /&gt;
* Collecting a bigger lexicon: 40k words related to music, 56k words from an official dictionary.&lt;br /&gt;
* Working on an NN LM based on word2vec.&lt;br /&gt;
&lt;br /&gt;
==Embedded development==&lt;br /&gt;
&lt;br /&gt;
* Liuchao's cellphone: Qualcomm Snapdragon Krait MSM8960 @ 1.5 GHz, using 1 core;&lt;br /&gt;
a small nnet 100/600/600/600/600/1264 with MFCC input.&lt;br /&gt;
&lt;br /&gt;
* 4500 words:&lt;br /&gt;
&lt;br /&gt;
:* construct LG: 0.41s&lt;br /&gt;
:* compose HCLG with det: 13.70s, 5.318 MB&lt;br /&gt;
:* compose HCLG without det: 6.61s, 5.488 MB&lt;br /&gt;
&lt;br /&gt;
* 950 words:&lt;br /&gt;
&lt;br /&gt;
:* construct LG: 0.15s&lt;br /&gt;
:* compose HCLG with det: 2.63s, 0.947 MB, decode RT 0.649&lt;br /&gt;
:* compose HCLG without det: 1.74s, 0.998 MB, decode RT 0.548&lt;br /&gt;
&lt;br /&gt;
* For word lists or simple grammars, determinization adds a small RT increase, while skipping it speeds up HCLG compilation dramatically (see the timings above). This is particularly relevant for embedded devices.&lt;br /&gt;
* Accuracy does not change with or without determinization.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Speech QA==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* Using the N-best list to expand matching in QA. Better performance was obtained.&lt;br /&gt;
:* 1-best matches 96/121 &lt;br /&gt;
:* 10-best matches 102/121&lt;br /&gt;
&lt;br /&gt;
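The N-best expansion can be sketched as trying each hypothesis in turn against the question database, which recovers questions that the 1-best hypothesis misrecognizes; the exact-match criterion and the names here are assumptions for illustration:

```python
def nbest_match(nbest, question_db):
    """Return the first hypothesis in the N-best list that matches an
    entry in the question database, or None when nothing matches.
    Scanning deeper into the list is what lifts 96/121 to 102/121."""
    for hyp in nbest:
        if hyp in question_db:
            return hyp
    return None
```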
* Using N-best to recover errors in entity checking. In progress.&lt;br /&gt;
* Using Pinyin to recover errors in entity checking. Future work.&lt;/div&gt;</summary>
		<author><name>Cslt</name></author>	</entry>

	</feed>