<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="http://cslt.org/mediawiki/skins/common/feed.css?303"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="zh-cn">
		<id>http://cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=2014-03-14</id>
		<title>2014-03-14 - Revision history</title>
		<link rel="self" type="application/atom+xml" href="http://cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=2014-03-14"/>
		<link rel="alternate" type="text/html" href="http://cslt.org/mediawiki/index.php?title=2014-03-14&amp;action=history"/>
		<updated>2026-04-15T00:49:05Z</updated>
		<subtitle>Revision history for this page on the wiki</subtitle>
		<generator>MediaWiki 1.23.3</generator>

	<entry>
		<id>http://cslt.org/mediawiki/index.php?title=2014-03-14&amp;diff=9364&amp;oldid=prev</id>
		<title>Cslt: /* Speech QA */</title>
		<link rel="alternate" type="text/html" href="http://cslt.org/mediawiki/index.php?title=2014-03-14&amp;diff=9364&amp;oldid=prev"/>
				<updated>2014-03-14T02:38:30Z</updated>
		
		<summary type="html">&lt;p&gt;‎&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Speech QA&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table class='diff diff-contentalign-left'&gt;
				&lt;col class='diff-marker' /&gt;
				&lt;col class='diff-content' /&gt;
				&lt;col class='diff-marker' /&gt;
				&lt;col class='diff-content' /&gt;
				&lt;tr style='vertical-align: top;'&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black; text-align: center;&quot;&gt;←Older revision&lt;/td&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black; text-align: center;&quot;&gt;Revision as of 02:38, 14 March 2014 (Fri)&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 151:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 151:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;:* Adding song names and singer names improves performance in most cases&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;:* Adding song names and singer names improves performance in most cases&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;:* There are indeed some exceptions in the figure: (a) higher WER does not necessarily reduce QA accuracy; (b) adding entity names does not improve QA accuracy &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;:* There are indeed some exceptions in the figure: (a) higher WER does not necessarily reduce QA accuracy; (b) adding entity names does not improve QA accuracy &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;:* The results on [[Music_QA_wer.pdf]]&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;:* The results on [[&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;媒体文件:&lt;/ins&gt;Music_QA_wer.pdf]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Cslt</name></author>	</entry>

	<entry>
		<id>http://cslt.org/mediawiki/index.php?title=2014-03-14&amp;diff=9363&amp;oldid=prev</id>
		<title>02:38, 14 March 2014 (Fri) Cslt</title>
		<link rel="alternate" type="text/html" href="http://cslt.org/mediawiki/index.php?title=2014-03-14&amp;diff=9363&amp;oldid=prev"/>
				<updated>2014-03-14T02:38:06Z</updated>
		
		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class='diff diff-contentalign-left'&gt;
				&lt;col class='diff-marker' /&gt;
				&lt;col class='diff-content' /&gt;
				&lt;col class='diff-marker' /&gt;
				&lt;col class='diff-content' /&gt;
				&lt;tr style='vertical-align: top;'&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black; text-align: center;&quot;&gt;←Older revision&lt;/td&gt;
				&lt;td colspan='2' style=&quot;background-color: white; color:black; text-align: center;&quot;&gt;Revision as of 02:38, 14 March 2014 (Fri)&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 151:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 151:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;:* Adding song names and singer names improves performance in most cases&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;:* Adding song names and singer names improves performance in most cases&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;:* There are indeed some exceptions in the figure: (a) higher WER does not necessarily reduce QA accuracy; (b) adding entity names does not improve QA accuracy &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;:* There are indeed some exceptions in the figure: (a) higher WER does not necessarily reduce QA accuracy; (b) adding entity names does not improve QA accuracy &amp;#160;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt;&amp;#160;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color:black; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;:* The results on [[Music_QA_wer.pdf]]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;&amp;#160;&lt;/td&gt;&lt;td style=&quot;background-color: #f9f9f9; color: #333333; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #e6e6e6; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Cslt</name></author>	</entry>

	<entry>
		<id>http://cslt.org/mediawiki/index.php?title=2014-03-14&amp;diff=9361&amp;oldid=prev</id>
		<title>Cslt: Created page with “==Resource Building== * Current text resource has been re-arranged and listed  == AM development ==  === Sparse DNN ===  * Optimal Brain Damage(OBD).   # GA-based block...”</title>
		<link rel="alternate" type="text/html" href="http://cslt.org/mediawiki/index.php?title=2014-03-14&amp;diff=9361&amp;oldid=prev"/>
				<updated>2014-03-14T02:34:17Z</updated>
		
		<summary type="html">&lt;p&gt;Created page with “==Resource Building== * Current text resource has been re-arranged and listed  == AM development ==  === Sparse DNN ===  * Optimal Brain Damage(OBD).   # GA-based block...”&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;==Resource Building==&lt;br /&gt;
* Current text resource has been re-arranged and listed&lt;br /&gt;
&lt;br /&gt;
== AM development ==&lt;br /&gt;
&lt;br /&gt;
=== Sparse DNN ===&lt;br /&gt;
&lt;br /&gt;
* Optimal Brain Damage(OBD). &lt;br /&gt;
&lt;br /&gt;
# GA-based block sparsity&lt;br /&gt;
&lt;br /&gt;
=== Efficient DNN training ===&lt;br /&gt;
&lt;br /&gt;
# Asymmetric window: large improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set. Overfitting? &lt;br /&gt;
&lt;br /&gt;
===Multi GPU training===&lt;br /&gt;
* Error encountered&lt;br /&gt;
&lt;br /&gt;
===GMM - DNN co-training===&lt;br /&gt;
* Initial DNN test done&lt;br /&gt;
:* tri4b -&amp;gt; DNN (org)&lt;br /&gt;
:* DNN alignment -&amp;gt; tri4b&lt;br /&gt;
:* tri4b alignment -&amp;gt; DNN (re-train)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  model/testcase              |  test_dev93(cv)       |     test_eval92&lt;br /&gt;
    --------------------------------------------------------------&lt;br /&gt;
    8400-80000(org)           |    7.41               |      4.13&lt;br /&gt;
    --------------------------------------------------------------&lt;br /&gt;
    re-train (Keep state #)   |    7.20               |      4.24&lt;br /&gt;
    --------------------------------------------------------------&lt;br /&gt;
    re-train (Free state #)   |    7.29               |      4.31&lt;br /&gt;
    --------------------------------------------------------------&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Multilanguage training===&lt;br /&gt;
&lt;br /&gt;
# Pure Chinese training reached 4.9%&lt;br /&gt;
# Chinese + English training degraded this to 7.9%&lt;br /&gt;
# The English phone set should distinguish beginning phones from ending phones&lt;br /&gt;
# Should set up a multilingual network structure that shares the low layers but separates the languages at the high layers&lt;br /&gt;
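&lt;br /&gt;
A minimal sketch of the structure proposed in the last item, showing only the forward pass: the hidden layers are shared, and each language gets its own softmax output layer. The ReLU activation, the random initial weights, the layer sizes, the language keys zh and en, and the class name MultilingualNet are all illustrative assumptions, not details taken from these notes.&lt;br /&gt;
&lt;pre&gt;
```python
import math
import random


def dense(inputs, weights, biases):
    """Affine layer: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases (an assumed initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultilingualNet:
    """Shared hidden layers followed by one softmax layer per language."""

    def __init__(self, n_in, hidden_sizes, outputs_per_language, seed=0):
        rng = random.Random(seed)
        self.shared = []
        size = n_in
        for n_hidden in hidden_sizes:
            self.shared.append(init_layer(rng, size, n_hidden))
            size = n_hidden
        # One language-specific output layer on top of the shared stack.
        self.heads = {lang: init_layer(rng, size, n_out)
                      for lang, n_out in outputs_per_language.items()}

    def forward(self, features, lang):
        hidden = features
        for weights, biases in self.shared:
            # ReLU hidden units (assumed; the notes do not name an activation).
            hidden = [max(0.0, v) for v in dense(hidden, weights, biases)]
        weights, biases = self.heads[lang]
        return softmax(dense(hidden, weights, biases))


# Hypothetical sizes: 4 input features, two shared layers, 5 and 3 targets.
net = MultilingualNet(n_in=4, hidden_sizes=[8, 8],
                      outputs_per_language={"zh": 5, "en": 3})
frame = [0.2, -0.1, 0.4, 0.0]
print(len(net.forward(frame, "zh")), len(net.forward(frame, "en")))
```
&lt;/pre&gt;
In training, frames from either language would update the shared layers, while only the matching language's output layer receives a gradient.&lt;br /&gt;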
&lt;br /&gt;
===Noise training===&lt;br /&gt;
&lt;br /&gt;
* Train on the WSJ database with data corrupted by various noise types&lt;br /&gt;
:* Almost all training conditions are completed&lt;br /&gt;
:* Interesting results with multi-condition training (white + cafe) tested on park/station&lt;br /&gt;
&lt;br /&gt;
===AMR compression re-training===&lt;br /&gt;
* WeChat uses AMR compression, so our AM requires adaptation&lt;br /&gt;
* Test AMR &amp; non-AMR models &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
model                   wav     amr&lt;br /&gt;
&lt;br /&gt;
xent baseline           4.47&lt;br /&gt;
wav_mpe baseline        4.20    36.77&lt;br /&gt;
&lt;br /&gt;
amr_mpe_lr_1e-5         6.27    8.95&lt;br /&gt;
amr_mpe_lr_1e-4         7.58    8.68&lt;br /&gt;
&lt;br /&gt;
amr_xEnt_lr_1e-5        6.89    7.99&lt;br /&gt;
amr_xEnt_lr_1e-4        6.61    7.28&lt;br /&gt;
amr_xEnt_lr_0.08        5.72    6.20&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* Prepare to do adaptation on 1700h &lt;br /&gt;
* Prepare to do mixing xEnt test&lt;br /&gt;
&lt;br /&gt;
===GFbank===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* Finished the first round of gfbank training &amp;amp; test&lt;br /&gt;
* The same gmm model (mfcc feature) was used to get the alignment&lt;br /&gt;
* Train fbank &amp; gfbank models based on the mfcc alignment&lt;br /&gt;
* Clean training, noisy test&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
                clean   5dB     10dB    15dB    20dB    25dB&lt;br /&gt;
gfbank          4.22    73.03   39.20   16.41   8.36    5.60&lt;br /&gt;
gfbank_80       4.36    74.41   42.94   18.13   8.59    5.85&lt;br /&gt;
fbank_zmy       3.97    74.78   44.57   18.80   8.54    5.30&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* gfbank + fbank 80 dim training/test&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
===Engine optimization===&lt;br /&gt;
&lt;br /&gt;
* Investigating LOUDS FST. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Word to Vector==&lt;br /&gt;
&lt;br /&gt;
* Improved word vectors with multiple senses&lt;br /&gt;
:* Almost impossible with the toolkit&lt;br /&gt;
:* Could pre-train vectors and then do clustering&lt;br /&gt;
&lt;br /&gt;
* Word-vector-based keyword extraction&lt;br /&gt;
:* Prepared 500+ articles in 7 categories&lt;br /&gt;
:* Fixed a problem in retrieving article words&lt;br /&gt;
&lt;br /&gt;
* Word vectors based on classification&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==LM development==&lt;br /&gt;
&lt;br /&gt;
===NN LM===&lt;br /&gt;
&lt;br /&gt;
* Character-based NNLM (6700 chars, 7gram), 500M data training done.&lt;br /&gt;
:* boundary-involved char NNLM training done&lt;br /&gt;
:* Test on rescoring&lt;br /&gt;
&lt;br /&gt;
* Investigate MS RNN LM training&lt;br /&gt;
&lt;br /&gt;
===3T Sogou LM===&lt;br /&gt;
&lt;br /&gt;
*3T + Tencent LM combination:&lt;br /&gt;
:* Combine the 3T vocabulary (110k words) and the Tencent vocabulary (80k words)&lt;br /&gt;
:* Re-segmentation&lt;br /&gt;
:* Compute PPL with the 3T and Tencent LMs&lt;br /&gt;
:* Compute the best mixing weights&lt;br /&gt;
:* The computed mixing weight is wrong ....&lt;br /&gt;
:* Mixing the two with equal weights (0.5/0.5) performs better than either individual LM&lt;br /&gt;
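&lt;br /&gt;
A minimal sketch of how a mixing weight for two linearly interpolated LMs could be estimated, assuming per-word probabilities from each LM on held-out text are available. The EM update, the function names, and the toy probabilities are illustrative assumptions, not the procedure or data used in the experiment above.&lt;br /&gt;
&lt;pre&gt;
```python
import math


def perplexity(probs):
    """Perplexity of a token sequence given its per-token probabilities."""
    log_sum = sum(math.log(p) for p in probs)
    return math.exp(-log_sum / len(probs))


def mix(p_a, p_b, lam):
    """Linearly interpolate two per-token probability streams."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(p_a, p_b)]


def best_mix_weight(p_a, p_b, iterations=50, lam=0.5):
    """EM estimate of the weight on model A that fits the held-out tokens."""
    for _ in range(iterations):
        # E-step: posterior that each token was generated by model A.
        post = [lam * a / (lam * a + (1.0 - lam) * b)
                for a, b in zip(p_a, p_b)]
        # M-step: the new weight is the mean posterior.
        lam = sum(post) / len(post)
    return lam


# Toy held-out probabilities, made up for illustration only.
p_a = [0.10, 0.02, 0.30, 0.05]
p_b = [0.05, 0.20, 0.10, 0.05]
lam = best_mix_weight(p_a, p_b)
print(round(lam, 3), round(perplexity(mix(p_a, p_b, lam)), 2))
```
&lt;/pre&gt;
Comparing the perplexity at the estimated weight against the equal-weight (0.5/0.5) mixture is one way to sanity-check a computed weight.&lt;br /&gt;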
&lt;br /&gt;
*3T + QA model combination&lt;br /&gt;
&lt;br /&gt;
==QA Matching==&lt;br /&gt;
&lt;br /&gt;
* FST-based matching&lt;br /&gt;
:* Investigating why OpenFst union does not lead to a determinizable graph&lt;br /&gt;
:* Test the pattern label&lt;br /&gt;
&lt;br /&gt;
* TF/IDF weight&lt;br /&gt;
:* Code is done; TF/IDF weights can be used now.&lt;br /&gt;
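&lt;br /&gt;
For reference, a sketch of one common TF/IDF formulation: term frequency normalised by document length, multiplied by log(N/df). Whether the QA matcher uses this exact variant is not stated in these notes; the function name tfidf_weights and the toy questions are hypothetical.&lt;br /&gt;
&lt;pre&gt;
```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {term: weight} dict per tokenised document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        term_freq = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return weights


# Toy tokenised questions, made up for illustration only.
questions = [
    ["play", "a", "song", "by", "this", "singer"],
    ["who", "is", "the", "singer"],
    ["play", "the", "next", "song"],
]
for question, weights in zip(questions, tfidf_weights(questions)):
    best = max(weights, key=weights.get)
    print(question, best, round(weights[best], 3))
```
&lt;/pre&gt;
Terms that occur in every document get weight zero under this variant, so only the distinguishing words contribute to a match score.&lt;br /&gt;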
&lt;br /&gt;
==Embedded development==&lt;br /&gt;
&lt;br /&gt;
* CLG embedded decoder is almost done. Online compiler is in progress.&lt;br /&gt;
* English scoring is under way&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Speech QA==&lt;br /&gt;
&lt;br /&gt;
* N-best with entity LM was analyzed&lt;br /&gt;
:* WER vs QA accuracy is done&lt;br /&gt;
:* The figure shows that WER and QA accuracy are correlated: QA accuracy generally drops as WER rises&lt;br /&gt;
:* Adding song names and singer names improves performance in most cases&lt;br /&gt;
:* There are indeed some exceptions in the figure: (a) higher WER does not necessarily reduce QA accuracy; (b) adding entity names does not improve QA accuracy &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* Class LM QA&lt;br /&gt;
* Use QA LM as the baseline&lt;br /&gt;
* Tag singer names and song names&lt;br /&gt;
* Build tag LM&lt;br /&gt;
* Use graph integration to resolve the tags&lt;br /&gt;
* Adjust the in-tag weight&lt;br /&gt;
* A smaller weight produces more recognized entities&lt;br /&gt;
* Check whether the recognized songs/singers are correct&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
1, non-merge&lt;br /&gt;
    BaseLine:&lt;br /&gt;
           qa-singer-song&lt;br /&gt;
    songs       41&lt;br /&gt;
    singers     23&lt;br /&gt;
&lt;br /&gt;
2, HCLG-merge&lt;br /&gt;
    Weight means the multiplier of the sub-graph entry. &lt;br /&gt;
  (1) LM:1e-5&lt;br /&gt;
    weight  0.00000001  0.0001   0.001  0.01    1   10&lt;br /&gt;
    songs      20        20      21    19       9    4&lt;br /&gt;
    singers    13        13      13    13       2    2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Cslt</name></author>	</entry>

	</feed>