<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="http://cslt.org/mediawiki/skins/common/feed.css?303"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
		<id>http://cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=2014-08-22</id>
		<title>2014-08-22 - Revision history</title>
		<link rel="self" type="application/atom+xml" href="http://cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=2014-08-22"/>
		<link rel="alternate" type="text/html" href="http://cslt.org/mediawiki/index.php?title=2014-08-22&amp;action=history"/>
		<updated>2026-04-15T10:02:36Z</updated>
		<subtitle>Revision history for this page on the wiki</subtitle>
		<generator>MediaWiki 1.23.3</generator>

	<entry>
		<id>http://cslt.org/mediawiki/index.php?title=2014-08-22&amp;diff=10721&amp;oldid=prev</id>
		<title>Cslt: Created page with "==Resource Building==  == Leftover questions==  * Investigating LOUDS FST.  * CLG embedded decoder plus online compiler. * DNN-GMM co-training * NN LM  == AM developmen..."</title>
		<link rel="alternate" type="text/html" href="http://cslt.org/mediawiki/index.php?title=2014-08-22&amp;diff=10721&amp;oldid=prev"/>
				<updated>2014-08-22T02:13:24Z</updated>
		
		<summary type="html">&lt;p&gt;Created page with "==Resource Building==  == Leftover questions==  * Investigating LOUDS FST.  * CLG embedded decoder plus online compiler. * DNN-GMM co-training * NN LM  == AM developmen..."&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;==Resource Building==&lt;br /&gt;
&lt;br /&gt;
== Leftover questions==&lt;br /&gt;
&lt;br /&gt;
* Investigating LOUDS FST. &lt;br /&gt;
* CLG embedded decoder plus online compiler.&lt;br /&gt;
* DNN-GMM co-training&lt;br /&gt;
* NN LM&lt;br /&gt;
&lt;br /&gt;
== AM development ==&lt;br /&gt;
&lt;br /&gt;
=== Sparse DNN ===&lt;br /&gt;
* WSJ sparse DNN obtains no further improvement&lt;br /&gt;
&lt;br /&gt;
===Noise training===&lt;br /&gt;
&lt;br /&gt;
:* Noisy training journal paper almost done.&lt;br /&gt;
&lt;br /&gt;
===Dropout &amp;amp; Rectification &amp;amp; Convolutive network===&lt;br /&gt;
&lt;br /&gt;
* After changing the learning rate to 0.001, the training process can be started: &lt;br /&gt;
*# check the dropout probability &lt;br /&gt;
*# check the learning rate&lt;br /&gt;
*# continue training&lt;br /&gt;
&lt;br /&gt;
* Rectification&lt;br /&gt;
# Rectification by itself failed when the weights grew large.&lt;br /&gt;
# Adding an L1 penalty makes training possible, but performance is very poor.&lt;br /&gt;
# Try setting a maximum value on the rectifier output&lt;br /&gt;
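A minimal sketch of the capped rectifier mentioned above; the cap value of 6.0 is an assumed hyperparameter, not one taken from these notes:&lt;br /&gt;

```python
import numpy as np

def clipped_relu(x, cap=6.0):
    """Rectifier with an upper bound on the activation, so that large
    weights cannot push outputs unbounded. The cap of 6.0 is an
    assumed value for illustration."""
    return np.minimum(np.maximum(x, 0.0), cap)

# Negative inputs are zeroed, values above the cap are clipped.
vals = clipped_relu(np.array([-2.0, 1.5, 9.0]))
```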
&lt;br /&gt;
* Convolutive network&lt;br /&gt;
# Test more configurations &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Denoising &amp;amp; Farfield ASR===&lt;br /&gt;
&lt;br /&gt;
* Lasso-based dereverberation obtained reasonable results:&lt;br /&gt;
:# spectrum-based lasso outperforms fbank-based lasso.&lt;br /&gt;
:# temporal-frequency lasso outperforms temporal-only lasso.&lt;br /&gt;
:# using 200 frames to estimate utterance-level lasso coefficients is feasible, with only marginal performance degradation.&lt;br /&gt;
:# lasso can solve the problem of dynamic reverberation.&lt;br /&gt;
:# Static reverberation still needs to be investigated. &lt;br /&gt;
:# The 1/3 paper has been checked into CVS. &lt;br /&gt;
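As a rough, numpy-only illustration of the lasso fits used above, here is a tiny ISTA solver for the standard lasso objective; the matrix sizes, penalty, and iteration count are assumptions for the sketch, not the actual dereverberation setup:&lt;br /&gt;

```python
import numpy as np

def soft_threshold(v, lam):
    # Proximal operator of the L1 norm.
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def lasso_ista(A, b, lam=0.05, n_iter=500):
    """Minimal ISTA solver for 0.5*||Ax - b||^2 + lam*||x||_1."""
    step = 1.0 / (np.linalg.norm(A, 2) ** 2)  # safe step size: 1/L
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T.dot(A.dot(x) - b)
        x = soft_threshold(x - step * grad, lam * step)
    return x

# Recover a sparse coefficient vector from noiseless observations.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 20))
x_true = np.zeros(20)
x_true[3] = 2.0
x_hat = lasso_ista(A, A.dot(x_true))
```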
&lt;br /&gt;
===VAD===&lt;br /&gt;
&lt;br /&gt;
* Found some problems in Puqiang's speech data: some files are labelled incorrectly.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Speech rate training===&lt;br /&gt;
&lt;br /&gt;
* Append an additional dimension to the feature vector, indicating the rate of speech (ROS) &lt;br /&gt;
* The ROS is computed as words per second&lt;br /&gt;
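The two steps above can be sketched as follows; the function name and array layout are illustrative assumptions:&lt;br /&gt;

```python
import numpy as np

def append_ros(features, num_words, duration_sec):
    """Append a rate-of-speech (ROS) dimension to every frame of an
    utterance. ROS is words per second, as in the notes; the interface
    here is an assumption for illustration."""
    ros = num_words / float(duration_sec)
    col = np.full((features.shape[0], 1), ros)
    return np.hstack([features, col])

# 100 frames of 40-dim features; 12 words spoken in 4 seconds.
feats = np.zeros((100, 40))
out = append_ros(feats, 12, 4.0)
```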
&lt;br /&gt;
===Scoring===&lt;br /&gt;
&lt;br /&gt;
* Refined the acoustic model with the AMIDA database; the problem was solved by training on both WSJ and AMIDA data.&lt;br /&gt;
&lt;br /&gt;
===Confidence===&lt;br /&gt;
&lt;br /&gt;
* Knowledge prepared&lt;br /&gt;
* First experiment done: combining lattice-based confidence with DNN confidence. &lt;br /&gt;
* A further step will add ROS.&lt;br /&gt;
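One simple way to combine the two confidence scores is linear interpolation; this sketch and its weight are assumptions for illustration, not the experiment's actual method:&lt;br /&gt;

```python
def combine_confidence(lattice_conf, dnn_conf, weight=0.5):
    """Linear interpolation of a lattice-based confidence score and a
    DNN-based confidence score, both in [0, 1]. The interpolation
    weight is a hypothetical hyperparameter."""
    return weight * lattice_conf + (1.0 - weight) * dnn_conf

# Weight the lattice-based score more heavily.
score = combine_confidence(0.8, 0.6, weight=0.7)
```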
&lt;br /&gt;
&lt;br /&gt;
===Embedded decoder===&lt;br /&gt;
&lt;br /&gt;
* Chatting LM released (80k)&lt;br /&gt;
* Training two smaller networks, 500x4+600 and 400x4+500: ongoing&lt;br /&gt;
* Build a new graph with the MPE3 AM and the chatting LM.&lt;br /&gt;
&lt;br /&gt;
==LM development==&lt;br /&gt;
&lt;br /&gt;
===Domain specific LM===&lt;br /&gt;
&lt;br /&gt;
====G determinization problem solved====&lt;br /&gt;
&lt;br /&gt;
====NUM tag LM====&lt;br /&gt;
&lt;br /&gt;
27h JS test:  20.16 vs 20.19&lt;br /&gt;
2h  JS test:  17.48 vs 17.49&lt;br /&gt;
&lt;br /&gt;
* Analysis of the tag LM's properties: (1) a random NUM should obtain better performance; (2) other words are not seriously impacted.&lt;br /&gt;
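The NUM tag substitution behind this LM can be sketched as follows; the regular expression and token format are illustrative assumptions:&lt;br /&gt;

```python
import re

NUM_RE = re.compile(r"\d+(\.\d+)?")

def tag_numbers(tokens):
    """Replace numeric tokens with a shared NUM class token before LM
    training, the idea behind the tag LM (illustrative sketch)."""
    return ["NUM" if NUM_RE.fullmatch(t) else t for t in tokens]

# Integers and decimals collapse onto the single NUM class.
tagged = tag_numbers(["call", "110", "for", "3.5", "hours"])
```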
&lt;br /&gt;
&lt;br /&gt;
==Word2Vector==&lt;br /&gt;
&lt;br /&gt;
===W2V based doc classification===&lt;br /&gt;
&lt;br /&gt;
* Initial results with the variational Bayesian GMM obtained. Performance is not as good as the conventional GMM.&lt;br /&gt;
* Interest group set up; readings scheduled every Thursday&lt;br /&gt;
* Non-linear inter-language transform, English-Spanish-Czech: wv model training done, transform model under investigation&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==RNN LM==&lt;br /&gt;
&lt;br /&gt;
* New toolkit from Thomas obtained&lt;br /&gt;
* Need more investigation on the toolkit&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Speaker ID==&lt;br /&gt;
&lt;br /&gt;
* Second model done&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Translation==&lt;br /&gt;
* Training failed due to running out of memory &lt;br /&gt;
* Re-trained the model with a limit on the number of iterations; it has reached the 8th iteration&lt;/div&gt;</summary>
		<author><name>Cslt</name></author>	</entry>

	</feed>