<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="http://cslt.org/mediawiki/skins/common/feed.css?303"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="zh-cn">
		<id>http://cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=2014-05-23</id>
		<title>2014-05-23 - Revision history</title>
		<link rel="self" type="application/atom+xml" href="http://cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=2014-05-23"/>
		<link rel="alternate" type="text/html" href="http://cslt.org/mediawiki/index.php?title=2014-05-23&amp;action=history"/>
		<updated>2026-04-15T01:04:36Z</updated>
		<subtitle>Revision history of this page on the wiki</subtitle>
		<generator>MediaWiki 1.23.3</generator>

	<entry>
		<id>http://cslt.org/mediawiki/index.php?title=2014-05-23&amp;diff=9993&amp;oldid=prev</id>
		<title>Cslt: Created page with “==Resource Building== * Release management has been started * Blaster 0.1 &amp; vivian 0.0 system release  == Leftover questions== * Asymmetric window: Great improvement on...”</title>
		<link rel="alternate" type="text/html" href="http://cslt.org/mediawiki/index.php?title=2014-05-23&amp;diff=9993&amp;oldid=prev"/>
				<updated>2014-05-23T01:55:40Z</updated>
		
		<summary type="html">&lt;p&gt;Created page with “==Resource Building== * Release management has been started * Blaster 0.1 &amp;amp; vivian 0.0 system release  == Leftover questions== * Asymmetric window: Great improvement on...”&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;==Resource Building==&lt;br /&gt;
* Release management has been started&lt;br /&gt;
* Blaster 0.1 &amp;amp; vivian 0.0 system release&lt;br /&gt;
&lt;br /&gt;
== Leftover questions ==&lt;br /&gt;
* Asymmetric window: Great improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set. Overfitting?&lt;br /&gt;
* Multi-GPU training: error encountered&lt;br /&gt;
* Multilingual training&lt;br /&gt;
* Investigating LOUDS FST. &lt;br /&gt;
* CLG embedded decoder plus online compiler.&lt;br /&gt;
* DNN-GMM co-training&lt;br /&gt;
&lt;br /&gt;
== AM development ==&lt;br /&gt;
&lt;br /&gt;
=== Sparse DNN ===&lt;br /&gt;
* GA-based block sparsity (++++)&lt;br /&gt;
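The GA-based block-sparsity search can be sketched as below. This is a toy illustration, not the group's implementation: the block size, population settings, and the magnitude-based fitness are assumptions (in practice the fitness would be held-out recognition accuracy).&lt;br /&gt;

```python
import numpy as np

rng = np.random.default_rng(0)

def ga_block_mask(weight, block, n_keep, pop=30, gens=40):
    """Toy genetic search for a block-sparsity pattern: choose `n_keep`
    blocks of `weight` to retain; fitness = retained squared magnitude.
    (A real system would score masks by held-out recognition accuracy.)"""
    rows, cols = weight.shape[0] // block, weight.shape[1] // block
    n_blocks = rows * cols
    # Energy of each block of the weight matrix, flattened row-major.
    energy = (weight.reshape(rows, block, cols, block) ** 2).sum(axis=(1, 3)).ravel()

    def fitness(mask):
        return energy[mask].sum()

    # Each individual is an array of `n_keep` distinct block indices.
    population = [rng.choice(n_blocks, n_keep, replace=False) for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop // 2]
        children = []
        for parent in survivors:
            child = parent.copy()
            # Mutation: swap one kept block for a currently dropped block.
            dropped = [b for b in range(n_blocks) if b not in child]
            child[rng.integers(n_keep)] = rng.choice(dropped)
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)
```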
&lt;br /&gt;
===Noise training===&lt;br /&gt;
:* All experiments completed. Combining experiments.&lt;br /&gt;
&lt;br /&gt;
===GFbank===&lt;br /&gt;
&lt;br /&gt;
* WSJ clean condition done. Obtained the same performance as the time-domain implementation.&lt;br /&gt;
* Should experiment with the Tencent training set. &lt;br /&gt;
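GFbank here refers to gammatone-filterbank features, whose filter center frequencies are conventionally spaced on the ERB scale. A minimal sketch using the standard Glasberg &amp;amp; Moore constants (the band edges and filter count below are illustrative, not the settings used in these experiments):&lt;br /&gt;

```python
import numpy as np

def erb_space(low_hz, high_hz, n_filters):
    """Center frequencies equally spaced on the ERB-rate scale
    (Glasberg & Moore constants EarQ = 9.26449, minBW = 24.7)."""
    ear_q, min_bw = 9.26449, 24.7
    c = ear_q * min_bw
    # Invert the ERB-rate scale on a uniform grid between the band edges.
    step = np.log((low_hz + c) / (high_hz + c)) / n_filters
    pts = np.arange(1, n_filters + 1)
    cfs = -c + (high_hz + c) * np.exp(pts * step)
    return cfs[::-1]  # ascending order, from low_hz up toward high_hz
```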
&lt;br /&gt;
===Multilingual ASR===&lt;br /&gt;
&lt;br /&gt;
* Multilingual LM decoding&lt;br /&gt;
* Fixing the non-tag bug&lt;br /&gt;
&lt;br /&gt;
===English model===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
RESULTS:&lt;br /&gt;
(state-gauss = 10000 100000)&lt;br /&gt;
&lt;br /&gt;
1. Shujutang 100h chi-eng 16k:&lt;br /&gt;
&lt;br /&gt;
  LM/AM  |  xEnt   |  mpe_1  |  mpe_2  |  mpe_3  |  mpe_4  |&lt;br /&gt;
--------- --------- --------- --------- --------- ---------&lt;br /&gt;
   wsj   |  23.86  |  20.95  |  20.90  |  20.84  |  20.81  |&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
2. Shujutang 100h chi-eng 8k:&lt;br /&gt;
&lt;br /&gt;
  LM/AM  |  xEnt   |  mpe_1  |  mpe_2  |  mpe_3  |  mpe_4  |&lt;br /&gt;
--------- --------- --------- --------- --------- ---------&lt;br /&gt;
   wsj   |  26.27  |  23.63  |  23.14  |  22.93  |  23.00  |&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
3. voxforge pure eng 16k:&lt;br /&gt;
&lt;br /&gt;
  LM/AM  |  xEnt   |  mpe_1  |  mpe_2  |  mpe_3  |  mpe_4  |&lt;br /&gt;
--------- --------- --------- --------- --------- ---------&lt;br /&gt;
   wsj   |  21.38  |  24.89  |  24.50  |  23.31  |  23.13  |&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. fisher pure eng 8k:&lt;br /&gt;
Not finished yet.&lt;br /&gt;
  LM/AM  |  xEnt   |  mpe_1  |  mpe_2  |  mpe_3  |  mpe_4  |&lt;br /&gt;
--------- --------- --------- --------- --------- ---------&lt;br /&gt;
   wsj   |  40.65  |    -    |    -    |    -    |    -    |&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Need to experiment with a gigabyte-scale LM&lt;br /&gt;
* Need to check the AM settings and the LM used in the Kaldi egs/fisher recipe&lt;br /&gt;
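The tables above report WER; for reference, the metric is the word-level Levenshtein distance normalized by reference length, e.g.:&lt;br /&gt;

```python
def wer(ref, hyp):
    """Word error rate: (substitutions + deletions + insertions) / len(ref),
    computed with standard Levenshtein dynamic programming over words."""
    r, h = ref.split(), hyp.split()
    # d[i][j] = edit distance between first i ref words and first j hyp words
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return 100.0 * d[len(r)][len(h)] / max(len(r), 1)
```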
&lt;br /&gt;
===Denoising &amp;amp; Farfield ASR===&lt;br /&gt;
&lt;br /&gt;
*  Baseline: close-talk model decoding far-field speech: 92.65&lt;br /&gt;
*  Will investigate a DAE (denoising autoencoder) model&lt;br /&gt;
&lt;br /&gt;
===Kaiser Window ===&lt;br /&gt;
* Tested different numbers of Fbank filters: no significant difference among 23/30/40 (for both 8k and 16k) [#223]&lt;br /&gt;
* Tested Kaiser vs. Povey windows: no significant difference for either 8k or 16k [#224, #225]&lt;br /&gt;
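The two windows compared above can be generated as follows. This is a sketch: NumPy's `np.kaiser` with an illustrative beta (not necessarily the value used in these tests), and the Povey window as defined in Kaldi's feature extraction, i.e. a Hann window raised to the power 0.85.&lt;br /&gt;

```python
import numpy as np

def povey_window(n):
    """Povey window from Kaldi feature extraction:
    (0.5 - 0.5*cos(2*pi*i/(n-1)))**0.85 for i = 0..n-1."""
    i = np.arange(n)
    return (0.5 - 0.5 * np.cos(2.0 * np.pi * i / (n - 1))) ** 0.85

frame_len = 400                           # 25 ms frame at 16 kHz
kaiser = np.kaiser(frame_len, beta=8.0)   # beta=8.0 is an illustrative choice
povey = povey_window(frame_len)
```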
&lt;br /&gt;
===VAD===&lt;br /&gt;
&lt;br /&gt;
* DNN-based VAD (7.49) shows much better performance than energy-based VAD (45.74)&lt;br /&gt;
* [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/1/1d/Dnn_vad_VS_energy_vad.pdf click here]&lt;br /&gt;
* Need to test small-scale networks:&lt;br /&gt;
:* 600-800 network&lt;br /&gt;
:* 100 X 4 + 2&lt;br /&gt;
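For context, the energy-based baseline amounts to thresholding per-frame log energy. A minimal sketch (the frame sizes and the -35 dB threshold are illustrative choices, not the settings behind the numbers above):&lt;br /&gt;

```python
import numpy as np

def energy_vad(signal, sample_rate, frame_ms=25, shift_ms=10, threshold_db=-35.0):
    """Flag each frame as speech when its log energy exceeds a fixed
    threshold relative to the loudest frame (simple energy-based VAD)."""
    flen = int(sample_rate * frame_ms / 1000)
    shift = int(sample_rate * shift_ms / 1000)
    n_frames = max(0, (len(signal) - flen) // shift + 1)
    energies = np.array([
        np.sum(signal[i * shift : i * shift + flen] ** 2) + 1e-12
        for i in range(n_frames)
    ])
    # Log energy in dB relative to the loudest frame.
    log_e = 10.0 * np.log10(energies / energies.max())
    return log_e > threshold_db
```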
&lt;br /&gt;
===Scoring===&lt;br /&gt;
&lt;br /&gt;
* Fixing bug for the stream mode&lt;br /&gt;
&lt;br /&gt;
==LM development==&lt;br /&gt;
&lt;br /&gt;
===Domain specific LM===&lt;br /&gt;
* English lexicon done; building HCLG&lt;br /&gt;
* Re-build LM with the new lexicon&lt;br /&gt;
* Tested on Dianxin dev set&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===NN LM===&lt;br /&gt;
&lt;br /&gt;
* Character-based NNLM (6700 chars, 7gram), 500M data training done.&lt;br /&gt;
:* Inconsistent WER patterns were found on the Tencent test sets&lt;br /&gt;
:* Probably need another test set for further investigation.&lt;br /&gt;
&lt;br /&gt;
* Investigate MS RNN LM training&lt;br /&gt;
&lt;br /&gt;
==QA==&lt;br /&gt;
&lt;br /&gt;
===FST-based matching===&lt;br /&gt;
:* Word-based FST matching takes 1-2 seconds with 1,600 patterns; Huilan's implementation takes &amp;lt;1 second.&lt;br /&gt;
:* Use the Thrax toolkit to compile grammars into FSTs&lt;br /&gt;
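The idea of compiling all patterns into one shared matching structure can be illustrated with a word-level prefix trie, a stand-in for a union-of-patterns FST acceptor (the patterns below are invented examples, not the real 1,600-pattern set):&lt;br /&gt;

```python
def build_trie(patterns):
    """Compile word-sequence patterns into a shared prefix trie,
    analogous to the union of pattern acceptors in an FST."""
    root = {}
    for pat in patterns:
        node = root
        for word in pat.split():
            node = node.setdefault(word, {})
        node[None] = pat  # accepting state remembers the matched pattern
    return root

def match(trie, sentence):
    """Return the pattern exactly matching the sentence, or None."""
    node = trie
    for word in sentence.split():
        if word not in node:
            return None
        node = node[word]
    return node.get(None)
```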
&lt;br /&gt;
* Investigate determinization of G embedding &lt;br /&gt;
:* Refer to the new Kaldi code&lt;/div&gt;</summary>
		<author><name>Cslt</name></author>	</entry>

	</feed>