<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="http://cslt.org/mediawiki/skins/common/feed.css?303"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="zh-cn">
		<id>http://cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=2014-02-14</id>
		<title>2014-02-14 - Revision history</title>
		<link rel="self" type="application/atom+xml" href="http://cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=2014-02-14"/>
		<link rel="alternate" type="text/html" href="http://cslt.org/mediawiki/index.php?title=2014-02-14&amp;action=history"/>
		<updated>2026-04-14T05:01:52Z</updated>
		<subtitle>Revision history for this page on the wiki</subtitle>
		<generator>MediaWiki 1.23.3</generator>

	<entry>
		<id>http://cslt.org/mediawiki/index.php?title=2014-02-14&amp;diff=9171&amp;oldid=prev</id>
		<title>Cslt at 02:33, 14 February 2014 (Fri)</title>
		<link rel="alternate" type="text/html" href="http://cslt.org/mediawiki/index.php?title=2014-02-14&amp;diff=9171&amp;oldid=prev"/>
				<updated>2014-02-14T02:33:16Z</updated>
		
		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class='diff diff-contentalign-left'&gt;
				&lt;col class='diff-marker' /&gt;
				&lt;col class='diff-content' /&gt;
				&lt;col class='diff-marker' /&gt;
				&lt;col class='diff-content' /&gt;
				&lt;tr style='vertical-align: top;'&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black; text-align: center;&quot;&gt;Revision as of 02:33, 14 February 2014 (Fri)&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 35:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 35:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Ready for training 100M data&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Ready for training 100M data&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Ready for training word sense &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Ready for training word sense &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Investigating Senna toolkit from . Intending to implement POS tagging based on word vectors. &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Investigating Senna toolkit from &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;NEC&lt;/ins&gt;. Intending to implement POS tagging based on word vectors. &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Cslt</name></author>	</entry>

	<entry>
		<id>http://cslt.org/mediawiki/index.php?title=2014-02-14&amp;diff=9170&amp;oldid=prev</id>
		<title>Cslt: Created page with "== AM development ==  === Sparse DNN ===  * Optimal Brain Damage(OBD).   # GA-based block sparsity  === Efficient DNN training ===  # Asymmetric window: Great improveme..."</title>
		<link rel="alternate" type="text/html" href="http://cslt.org/mediawiki/index.php?title=2014-02-14&amp;diff=9170&amp;oldid=prev"/>
				<updated>2014-02-14T02:32:08Z</updated>
		
		<summary type="html">&lt;p&gt;Created page with "== AM development ==  === Sparse DNN ===  * Optimal Brain Damage(OBD).   # GA-based block sparsity  === Efficient DNN training ===  # Asymmetric window: Great improveme..."&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;== AM development ==&lt;br /&gt;
&lt;br /&gt;
=== Sparse DNN ===&lt;br /&gt;
&lt;br /&gt;
* Optimal Brain Damage (OBD). &lt;br /&gt;
&lt;br /&gt;
# GA-based block sparsity&lt;br /&gt;
&lt;br /&gt;
=== Efficient DNN training ===&lt;br /&gt;
&lt;br /&gt;
# Asymmetric window: Great improvement on the training set (WER 34% to 24%); however, the improvement is lost on the test set. Overfitting? &lt;br /&gt;
&lt;br /&gt;
=== Multilingual training ===&lt;br /&gt;
&lt;br /&gt;
# Pure Chinese training reached 4.9%&lt;br /&gt;
# Chinese + English reduced to 7.9%&lt;br /&gt;
# The English phone set should distinguish beginning phones from ending phones&lt;br /&gt;
# Should set up a multilingual network structure that shares the low layers but separates the languages at the high layers&lt;br /&gt;
&lt;br /&gt;
===Engine optimization===&lt;br /&gt;
&lt;br /&gt;
* Decoder RT dropped below 0.2: HCLG + MKL + icc&lt;br /&gt;
* Investigating LOUDS FST. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Adaptation ===&lt;br /&gt;
&lt;br /&gt;
* Using a linear hidden transform reduces WER from 14% to 11%.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Word to Vector==&lt;br /&gt;
&lt;br /&gt;
* Tested a training toolkit from Stanford University, which can incorporate global information into word2vector training&lt;br /&gt;
* C++ implementation (instead of Python) for data pre-processing&lt;br /&gt;
* Ready for training 100M data&lt;br /&gt;
* Ready for training word sense &lt;br /&gt;
* Investigating Senna toolkit from . Intending to implement POS tagging based on word vectors. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==LM development==&lt;br /&gt;
&lt;br /&gt;
===NN LM===&lt;br /&gt;
&lt;br /&gt;
* Word-based and character-based NNLM using Google word2vector completed&lt;br /&gt;
* Character-based NNLM completed (6000 characters, 7gram)&lt;br /&gt;
&lt;br /&gt;
===3T Sogou LM===&lt;br /&gt;
&lt;br /&gt;
* Split the data into 24 subsets, train a 3-gram model for each subset, prune with 1e-9&lt;br /&gt;
* Merge completed with equal weights&lt;br /&gt;
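&lt;br /&gt;
The equal-weight merge above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual pipeline: the function name and the toy tables are hypothetical, and an n-gram missing from a sub-model simply contributes zero, whereas a real LM toolkit would fall back to that sub-model's back-off estimate. Pruning is left out.&lt;br /&gt;

```python
from collections import defaultdict


def merge_equal_weights(models):
    """Linearly interpolate n-gram probability tables with equal weights.

    `models` is a list of dicts mapping an n-gram tuple to a probability.
    Simplification (assumption): an n-gram missing from a sub-model
    contributes 0 instead of that sub-model's back-off estimate.
    """
    if not models:
        raise ValueError("need at least one model")
    weight = 1.0 / len(models)  # equal weight per sub-model, e.g. 1/24
    merged = defaultdict(float)
    for model in models:
        for ngram, prob in model.items():
            merged[ngram] += weight * prob
    return dict(merged)


# Hypothetical toy tables standing in for two of the 24 sub-models.
sub_models = [
    {("a", "b", "c"): 0.4},
    {("a", "b", "c"): 0.2, ("a", "b", "d"): 0.6},
]
merged = merge_equal_weights(sub_models)
```

With 24 sub-models each weight would be 1/24; the toy example uses two tables only to keep the arithmetic easy to follow.&lt;br /&gt;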
&lt;br /&gt;
&lt;br /&gt;
==Embedded development==&lt;br /&gt;
&lt;br /&gt;
* CLG embedded decoder is almost done. Online compiler is in progress.&lt;br /&gt;
* Zhiyong is working on layer-by-layer DNN training.&lt;br /&gt;
&lt;br /&gt;
==Speech QA==&lt;br /&gt;
&lt;br /&gt;
* Use N-best to expand match in QA. Better performance was obtained.&lt;br /&gt;
:* 1-best matches 96/121 &lt;br /&gt;
:* 10-best matches 102/121&lt;br /&gt;
* Use N-best to recover errors in entity check. &lt;br /&gt;
:* Design a non-entity pattern to discover the possible place of an entity&lt;br /&gt;
:* By this position range, search entities within the N-best result&lt;br /&gt;
* Use Pinyin to recover errors in entity check. Future work.&lt;br /&gt;
:* Design a non-entity pattern to discover the possible place of an entity (as above)&lt;br /&gt;
:* Match the Pinyin strings of all the entities, and then match the Pinyin strings with the entity Pinyin&lt;br /&gt;
:* Keep the best-matching entity based on Pinyin, subject to a threshold&lt;br /&gt;
:* A bit worse than the original test. &lt;br /&gt;
:* A possible problem is that the LM is too strong, which leads to unmatched Pinyin strings in acoustic space&lt;br /&gt;
:* Liu Rong will provide a weak LM to support the research.&lt;br /&gt;
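&lt;br /&gt;
The Pinyin matching step above might look roughly like the sketch below. It is an illustration only: the function name, the entity table and the 0.8 threshold are hypothetical, and the similarity score comes from Python's standard difflib.SequenceMatcher, which stands in for whatever matching measure is actually used. Pinyin strings are assumed to be already available for both the recognized span and the entities.&lt;br /&gt;

```python
from difflib import SequenceMatcher


def best_entity_by_pinyin(span_pinyin, entity_pinyin, threshold=0.8):
    """Return the entity whose Pinyin best matches `span_pinyin`.

    `entity_pinyin` maps an entity name to its Pinyin string.
    Returns None when no entity reaches `threshold` (assumed value).
    """
    best_entity, best_score = None, 0.0
    for entity, pinyin in entity_pinyin.items():
        # ratio() is a similarity in [0, 1]; 1.0 means identical strings.
        score = SequenceMatcher(None, span_pinyin, pinyin).ratio()
        if score > best_score:
            best_entity, best_score = entity, score
    if best_score >= threshold:
        return best_entity
    return None


# Hypothetical entity list with pre-computed Pinyin strings.
entities = {"Beijing": "bei jing", "Nanjing": "nan jing"}

hit = best_entity_by_pinyin("bei jin", entities)     # close to "bei jing"
miss = best_entity_by_pinyin("shang hai", entities)  # nothing similar enough
```

Here the first query is kept as "Beijing", while the second falls below the threshold and is rejected, which is the behaviour the threshold is meant to provide.&lt;br /&gt;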
&lt;br /&gt;
* Investigate some errors in entity-based LM.&lt;br /&gt;
:* Still some errors exist&lt;br /&gt;
:* Running entity-based LM with a small entity list&lt;/div&gt;</summary>
		<author><name>Cslt</name></author>	</entry>

	</feed>