Difference between revisions of "ASR:2015-05-04"

Latest revision as of 00:49, 11 May 2015 (Monday)

Speech Processing

AM development

Environment

grid-15 often does not work

RNN AM

details at http://liuc.cslt.org/pages/rnnam.html
Test monophone on RNN using dark-knowledge --Chao Liu
run on WSJ with MPE --Chao Liu
run bi-directional RNN --Chao Liu
modify code --Zhiyuan

Mic-Array

Change the prediction from fbank to spectrum features
investigate the alpha parameter in the time domain and the frequency domain
ALPHA>=0, using data generated by reverber toolkit
consider theta
compute EER with kaldi
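
For reference, the equal error rate (EER) is the operating point at which the false-acceptance rate equals the false-rejection rate. The sketch below is a minimal, self-contained illustration of that definition in plain Python; it is independent of Kaldi's own tooling, and the function name `compute_eer` and the convention that higher scores indicate target trials are assumptions made for this example.

```python
def compute_eer(target_scores, nontarget_scores):
    """Return (eer, threshold) for two lists of trial scores.

    Assumption: a higher score means "more likely a target trial".
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    best = None  # (abs(FAR - FRR), eer estimate, threshold)
    # Try every observed score as a decision threshold (accept if score >= threshold).
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejections: target trials that fall below the threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        # False acceptances: non-target trials at or above the threshold.
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        candidate = (abs(far - frr), (far + frr) / 2.0, threshold)
        if best is None or candidate[0] < best[0]:
            best = candidate
    return best[1], best[2]
```

With target scores `[0.9, 0.8, 0.7, 0.4]` and non-target scores `[0.6, 0.3, 0.2, 0.1]`, the two error rates meet at a threshold of 0.6, giving an EER of 0.25. The quadratic scan is kept for readability; a sorted sweep would be preferable for large trial lists.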

RNN-DAE(Deep based Auto-Encode-RNN)

HOLD --Zhiyong Zhang
http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=zhangzy&step=view_request&cvssid=261

Speaker ID

DNN-based sid --Yiye Lin
http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=zhangzy&step=view_request&cvssid=327

Ivector&Dvector based ASR

hold --Tian Lan

Cluster the speakers into speaker classes, then use the distance or the posterior probability as the metric
Directly use the dark-knowledge strategy for the ivector training.
http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?step=view_request&cvssid=340
A smaller ivector dimension gives better performance
Augmenting the hidden layer works better than augmenting the input layer
train on WSJ (test base: dev93 + eval92)

Dark knowledge

Ensemble using the 100h dataset to construct different structures -- Mengyuan

http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=zxw&step=view_request&cvssid=264 --Zhiyong Zhang

adaptation for chinglish under investigation --Mengyuan Zhao

Try to improve the Chinglish performance as far as possible

unsupervised training with wsj contributes to aurora4 model --Xiangyu Zeng

test large database with AMIDA
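
As background for the dark-knowledge items above: the general idea is to train a student model on a teacher's temperature-softened output distribution rather than on hard labels. The sketch below shows only that generic soft-target computation in plain Python; the function names and the temperature value are assumptions for illustration and do not describe the group's actual training recipe.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits divided by a temperature; higher values flatten the output."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_target_cross_entropy(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of the student's softened output against the teacher's soft targets."""
    teacher = softmax_with_temperature(teacher_logits, temperature)
    student = softmax_with_temperature(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(teacher, student))
```

The loss is smallest when the student reproduces the teacher's softened distribution, which is what lets the student learn from the relative probabilities the teacher assigns to incorrect classes.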

bilingual recognition

http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=zxw&step=view_request&cvssid=359 --Zhiyuan Tang and Mengyuan

Text Processing

RNN LM

rnn

test the perplexity (ppl) and code the character LM (hold)

lstm+rnn

check how the lstm-rnnlm code initializes and updates the learning rate (hold)
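
For reference, perplexity (ppl) is the exponential of the average negative log-probability that a language model assigns to each token of a test text. The sketch below computes it from a list of natural-log token probabilities; the function name `perplexity` and its input format are assumptions for this example, not the interface of any particular RNN LM toolkit.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    average_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(average_nll)
```

A model that assigns probability 1/4 to every token has a perplexity of 4, so lower values indicate a better fit to the test text.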

W2V based document classification

write a technical report on document classification using CNN --yiqiao
adapt the CNN to address the low-resource problem

Translation

similar-pair method for English words using a translation model.

Sparse code in NLP

modify the objective function
sup-sampling method to handle low-frequency words
learn binary vector

online learning

using sampling method

relation classifier

the majority of errors fall in the "others" class

@@ Line 3: / Line 3: @@
 ==== Environment ====
-* grid-11 often shut down automatically, too slow computation speed.
+* grid-15 often does not work
-* New grid-13 added, using gpu970
-* To update the wiki enviroment infomation
 ==== RNN AM====
 * details at http://liuc.cslt.org/pages/rnnam.html
-* Test monophone on RNN using dark-knowledge
+* Test monophone on RNN using dark-knowledge --Chao Liu
-* run using wsj,MPE
+* run using wsj,MPE  --Chao Liu
+* run bi-directon --Chao Liu
+* modify code --Zhiyuan
 ==== Mic-Array ====
@@ Line 17: / Line 17: @@
 * ALPHA>=0, using data generated by reverber toolkit
 * consider theta
+* compute EER with kaldi
 ====RNN-DAE(Deep based Auto-Encode-RNN)====
@@ Line 27: / Line 28: @@
 ===Ivector&Dvector based ASR===
+*  hold     --Tian Lan
 :* Cluster the speakers to speaker-classes, then using the distance or the posterior-probability as the metric
 :* Direct using the dark-knowledge strategy to do the ivector training.
@@ Line 35: / Line 37: @@
 ===Dark knowledge===
-:* Ensemble
+:* Ensemble using 100h dataset to construct diffrernt structures -- Mengyuan
 ::*http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=zxw&step=view_request&cvssid=264 --Zhiyong Zhang
 :* adaptation for chinglish under investigation  --Mengyuan Zhao
@@ Line 43: / Line 45: @@
 ===bilingual recognition===
-:* http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=zxw&step=view_request&cvssid=359 --Zhiyuan Tang
+:* http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=zxw&step=view_request&cvssid=359 --Zhiyuan Tang and Mengyuan
 ==Text Processing==
-===tag LM===
-* similar word extension in FST
-:* will check the formula using Bayes and experiment
-:* add similarity weight
 ====RNN LM====
 *rnn
-:* test the ppl and code the character-lm
+:* test the ppl and code the character-lm(hold)
 *lstm+rnn
 :* check the lstm-rnnlm code about how to Initialize and update learning rate.(hold)
 ====W2V based document classification====
-* result about norm model [http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=lr&step=view_request&cvssid=355]
+* make a technical report about document classification using CNN --yiqiao
-* try CNN model
+* CNN adapt to resolve the low resource problem
 ===Translation===
-* v5.0 demo released
+* similar-pair method in English word using translation model.
-:* cut the dict and use new segment-tool
-===Sparse NN in NLP===
-* test the drop-out model and the performance gets a little improvement, need some result:
-* test the order feature ,need some result:
-* large dimension result:http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=lr&step=view_request&cvssid=344
-:* sparse-nn on 1000 dimension(le-6,0.705236) is better than 200 dimension(le-12,0.694678).
+===Sparse code in NLP===
+* modify the objective function
+* sup-sampling method to solve the low frequence word
+* learn binary vector
 ===online learning===
-* modified the listNet SGD
+* using sampling method
 ===relation classifier===
-* check the CNN code and contact the author of paper
+* majority of error in others class
+*
