<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="http://cslt.org/mediawiki/skins/common/feed.css?303"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="zh-cn">
		<id>http://cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=2014-05-23</id>
		<title>2014-05-23 - Revision history</title>
		<link rel="self" type="application/atom+xml" href="http://cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=2014-05-23"/>
		<link rel="alternate" type="text/html" href="http://cslt.org/mediawiki/index.php?title=2014-05-23&amp;action=history"/>
		<updated>2026-04-15T01:04:36Z</updated>
		<subtitle>Revision history of this page on the wiki</subtitle>
		<generator>MediaWiki 1.23.3</generator>

	<entry>
		<id>http://cslt.org/mediawiki/index.php?title=2014-05-23&amp;diff=9993&amp;oldid=prev</id>
		<title>Cslt: Created page with “==Resource Building== * Release management has been started * Blaster 0.1 &amp; vivian 0.0 system release  == Leftover questions== * Asymmetric window: Great improvement on...”</title>
		<link rel="alternate" type="text/html" href="http://cslt.org/mediawiki/index.php?title=2014-05-23&amp;diff=9993&amp;oldid=prev"/>
				<updated>2014-05-23T01:55:40Z</updated>
		
		<summary type="html">&lt;p&gt;Created page with “==Resource Building== * Release management has been started * Blaster 0.1 &amp;amp; vivian 0.0 system release  == Leftover questions== * Asymmetric window: Great improvement on...”&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;==Resource Building==&lt;br /&gt;
* Release management has been started&lt;br /&gt;
* Blaster 0.1 &amp;amp; vivian 0.0 system release&lt;br /&gt;
&lt;br /&gt;
== Leftover questions ==&lt;br /&gt;
* Asymmetric window: Great improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set. Overfitting?&lt;br /&gt;
* Multi-GPU training: error encountered&lt;br /&gt;
* Multilingual training&lt;br /&gt;
* Investigating LOUDS FST. &lt;br /&gt;
* CLG embedded decoder plus online compiler.&lt;br /&gt;
* DNN-GMM co-training&lt;br /&gt;
&lt;br /&gt;
== AM development ==&lt;br /&gt;
&lt;br /&gt;
=== Sparse DNN ===&lt;br /&gt;
* GA-based block sparsity (++++)&lt;br /&gt;
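The GA-based block-sparsity search can be sketched as below. This is a toy illustration, not the group's implementation: the block size, population settings, and the magnitude-based fitness are assumptions (in practice the fitness would be held-out recognition accuracy).&lt;br /&gt;

```python
import numpy as np

rng = np.random.default_rng(0)

def ga_block_mask(weight, block, n_keep, pop=30, gens=40):
    """Toy genetic search for a block-sparsity pattern: choose `n_keep`
    blocks of `weight` to retain; fitness = retained squared magnitude.
    (A real system would score masks by held-out recognition accuracy.)"""
    rows, cols = weight.shape[0] // block, weight.shape[1] // block
    n_blocks = rows * cols
    # Energy of each block of the weight matrix, flattened row-major.
    energy = (weight.reshape(rows, block, cols, block) ** 2).sum(axis=(1, 3)).ravel()

    def fitness(mask):
        return energy[mask].sum()

    # Each individual is an array of `n_keep` distinct block indices.
    population = [rng.choice(n_blocks, n_keep, replace=False) for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop // 2]
        children = []
        for parent in survivors:
            child = parent.copy()
            # Mutation: swap one kept block for a currently dropped block.
            dropped = [b for b in range(n_blocks) if b not in child]
            child[rng.integers(n_keep)] = rng.choice(dropped)
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)
```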
&lt;br /&gt;
===Noise training===&lt;br /&gt;
:* All experiments completed. Combining experiments.&lt;br /&gt;
&lt;br /&gt;
===GFbank===&lt;br /&gt;
&lt;br /&gt;
* WSJ clean condition done. Obtained the same performance as the time-domain implementation.&lt;br /&gt;
* Should experiment with the Tencent training set. &lt;br /&gt;
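GFbank here refers to gammatone-filterbank features, whose filter center frequencies are conventionally spaced on the ERB scale. A minimal sketch using the standard Glasberg &amp;amp; Moore constants (the band edges and filter count below are illustrative, not the settings used in these experiments):&lt;br /&gt;

```python
import numpy as np

def erb_space(low_hz, high_hz, n_filters):
    """Center frequencies equally spaced on the ERB-rate scale
    (Glasberg & Moore constants EarQ = 9.26449, minBW = 24.7)."""
    ear_q, min_bw = 9.26449, 24.7
    c = ear_q * min_bw
    # Invert the ERB-rate scale on a uniform grid between the band edges.
    step = np.log((low_hz + c) / (high_hz + c)) / n_filters
    pts = np.arange(1, n_filters + 1)
    cfs = -c + (high_hz + c) * np.exp(pts * step)
    return cfs[::-1]  # ascending order, from low_hz up toward high_hz
```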
&lt;br /&gt;
===Multilingual ASR===&lt;br /&gt;
&lt;br /&gt;
* Multilingual LM decoding&lt;br /&gt;
* Fixing the non-tag bug&lt;br /&gt;
&lt;br /&gt;
===English model===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
RESULTS:&lt;br /&gt;
(state-gauss = 10000 100000)&lt;br /&gt;
&lt;br /&gt;
1. Shujutang 100h chi-eng 16k:&lt;br /&gt;
&lt;br /&gt;
  LM/AM  |  xEnt   |  mpe_1  |  mpe_2  |  mpe_3  |  mpe_4  |&lt;br /&gt;
--------- --------- --------- --------- --------- ---------&lt;br /&gt;
   wsj   |  23.86  |  20.95  |  20.90  |  20.84  |  20.81  |&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
2. Shujutang 100h chi-eng 8k:&lt;br /&gt;
&lt;br /&gt;
  LM/AM  |  xEnt   |  mpe_1  |  mpe_2  |  mpe_3  |  mpe_4  |&lt;br /&gt;
--------- --------- --------- --------- --------- ---------&lt;br /&gt;
   wsj   |  26.27  |  23.63  |  23.14  |  22.93  |  23.00  |&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
3. voxforge pure eng 16k:&lt;br /&gt;
&lt;br /&gt;
  LM/AM  |  xEnt   |  mpe_1  |  mpe_2  |  mpe_3  |  mpe_4  |&lt;br /&gt;
--------- --------- --------- --------- --------- ---------&lt;br /&gt;
   wsj   |  21.38  |  24.89  |  24.50  |  23.31  |  23.13  |&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
4. fisher pure eng 8k:&lt;br /&gt;
Not finished yet.&lt;br /&gt;
  LM/AM  |  xEnt   |  mpe_1  |  mpe_2  |  mpe_3  |  mpe_4  |&lt;br /&gt;
--------- --------- --------- --------- --------- ---------&lt;br /&gt;
   wsj   |  40.65  |    -    |    -    |    -    |    -    |&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Need to experiment with a gigabyte-scale LM&lt;br /&gt;
* Need to check the AM settings and the LM used in the Kaldi egs/fisher recipe&lt;br /&gt;
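The tables above report WER; for reference, the metric is the word-level Levenshtein distance normalized by reference length, e.g.:&lt;br /&gt;

```python
def wer(ref, hyp):
    """Word error rate: (substitutions + deletions + insertions) / len(ref),
    computed with standard Levenshtein dynamic programming over words."""
    r, h = ref.split(), hyp.split()
    # d[i][j] = edit distance between first i ref words and first j hyp words
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return 100.0 * d[len(r)][len(h)] / max(len(r), 1)
```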
&lt;br /&gt;
===Denoising &amp;amp; Farfield ASR===&lt;br /&gt;
&lt;br /&gt;
*  Baseline: close-talk model decoding far-field speech: 92.65&lt;br /&gt;
*  Will investigate a DAE (denoising autoencoder) model&lt;br /&gt;
&lt;br /&gt;
===Kaiser Window ===&lt;br /&gt;
* Tested different numbers of Fbank filters: no significant difference among 23/30/40 (for both 8k and 16k) [#223]&lt;br /&gt;
* Tested Kaiser vs. Povey windows: no significant difference for either 8k or 16k [#224, #225]&lt;br /&gt;
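The two windows compared above can be generated as follows. This is a sketch: NumPy's `np.kaiser` with an illustrative beta (not necessarily the value used in these tests), and the Povey window as defined in Kaldi's feature extraction, i.e. a Hann window raised to the power 0.85.&lt;br /&gt;

```python
import numpy as np

def povey_window(n):
    """Povey window from Kaldi feature extraction:
    (0.5 - 0.5*cos(2*pi*i/(n-1)))**0.85 for i = 0..n-1."""
    i = np.arange(n)
    return (0.5 - 0.5 * np.cos(2.0 * np.pi * i / (n - 1))) ** 0.85

frame_len = 400                           # 25 ms frame at 16 kHz
kaiser = np.kaiser(frame_len, beta=8.0)   # beta=8.0 is an illustrative choice
povey = povey_window(frame_len)
```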
&lt;br /&gt;
===VAD===&lt;br /&gt;
&lt;br /&gt;
* DNN-based VAD (7.49) shows much better performance than energy-based VAD (45.74)&lt;br /&gt;
* [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/1/1d/Dnn_vad_VS_energy_vad.pdf click here]&lt;br /&gt;
* Need to test small-scale networks:&lt;br /&gt;
:* 600-800 network&lt;br /&gt;
:* 100 X 4 + 2&lt;br /&gt;
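For context, the energy-based baseline amounts to thresholding per-frame log energy. A minimal sketch (the frame sizes and the -35 dB threshold are illustrative choices, not the settings behind the numbers above):&lt;br /&gt;

```python
import numpy as np

def energy_vad(signal, sample_rate, frame_ms=25, shift_ms=10, threshold_db=-35.0):
    """Flag each frame as speech when its log energy exceeds a fixed
    threshold relative to the loudest frame (simple energy-based VAD)."""
    flen = int(sample_rate * frame_ms / 1000)
    shift = int(sample_rate * shift_ms / 1000)
    n_frames = max(0, (len(signal) - flen) // shift + 1)
    energies = np.array([
        np.sum(signal[i * shift : i * shift + flen] ** 2) + 1e-12
        for i in range(n_frames)
    ])
    # Log energy in dB relative to the loudest frame.
    log_e = 10.0 * np.log10(energies / energies.max())
    return log_e > threshold_db
```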
&lt;br /&gt;
===Scoring===&lt;br /&gt;
&lt;br /&gt;
* Fixing bug for the stream mode&lt;br /&gt;
&lt;br /&gt;
==LM development==&lt;br /&gt;
&lt;br /&gt;
===Domain specific LM===&lt;br /&gt;
* English lexicon done; building HCLG&lt;br /&gt;
* Re-build LM with the new lexicon&lt;br /&gt;
* Tested on Dianxin dev set&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===NN LM===&lt;br /&gt;
&lt;br /&gt;
* Character-based NNLM (6700 chars, 7gram), 500M data training done.&lt;br /&gt;
:* Inconsistent WER patterns were found on the Tencent test sets&lt;br /&gt;
:* Probably need another test set for further investigation.&lt;br /&gt;
&lt;br /&gt;
* Investigate MS RNN LM training&lt;br /&gt;
&lt;br /&gt;
==QA==&lt;br /&gt;
&lt;br /&gt;
===FST-based matching===&lt;br /&gt;
:* Word-based FST matching takes 1-2 seconds with 1,600 patterns; Huilan's implementation takes &amp;lt;1 second.&lt;br /&gt;
:* Use the Thrax toolkit to compile grammars into FSTs&lt;br /&gt;
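The idea of compiling all patterns into one shared matching structure can be illustrated with a word-level prefix trie, a stand-in for a union-of-patterns FST acceptor (the patterns below are invented examples, not the real 1,600-pattern set):&lt;br /&gt;

```python
def build_trie(patterns):
    """Compile word-sequence patterns into a shared prefix trie,
    analogous to the union of pattern acceptors in an FST."""
    root = {}
    for pat in patterns:
        node = root
        for word in pat.split():
            node = node.setdefault(word, {})
        node[None] = pat  # accepting state remembers the matched pattern
    return root

def match(trie, sentence):
    """Return the pattern exactly matching the sentence, or None."""
    node = trie
    for word in sentence.split():
        if word not in node:
            return None
        node = node[word]
    return node.get(None)
```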
&lt;br /&gt;
* Investigate determinization of G embedding &lt;br /&gt;
:* Refer to the new Kaldi code&lt;/div&gt;</summary>
		<author><name>Cslt</name></author>	</entry>

	</feed>