"ASR:2015-03-09": difference between revisions
From cslt Wiki
Revision as of 07:26, 11 March 2015
Speech Processing
AM development
Environment
- grid-11 often shuts down automatically; computation speed is too slow.
- GPU is being repaired. --Xuewei
RNN AM
- details at http://liuc.cslt.org/pages/rnnam.html
- single-state triphone based RNN? --Liu Chao
Mic-Array
- reproduce environment for interspeech
- alpha parameter in Lasso
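As a reference for the alpha parameter mentioned above, its effect can be sketched with the soft-thresholding operator at the heart of Lasso coordinate descent (a minimal numpy illustration, not the group's actual mic-array code):

```python
import numpy as np

def soft_threshold(z, alpha):
    """Soft-thresholding: the proximal operator of alpha * ||.||_1.
    It is the per-coordinate update inside Lasso coordinate descent;
    a larger alpha shrinks more coefficients exactly to zero."""
    return np.sign(z) * np.maximum(np.abs(z) - alpha, 0.0)

z = np.array([-2.0, -0.3, 0.1, 0.8, 3.0])
sparse = soft_threshold(z, 0.5)  # small entries are zeroed out
```

Raising alpha trades reconstruction accuracy for a sparser solution, which is the tuning question noted above.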
Dropout & Maxout & rectifier
- HOLD
- Need to solve the too small learning-rate problem
- 20h small-scale sparse DNN with rectifier. --Mengyuan
- 20h small-scale sparse DNN with Maxout/rectifier based on weight-magnitude pruning. --Mengyuan Zhao
Convolutive network
- Convolutive network (DAE)
- HOLD
- Technical report writing --Mian Wang, Yiye Lin, Shi Yin, Mengyuan Zhao
- reproduce experiments -- Yiye
DNN-DAE (Deep Auto-Encoder DNN)
- HOLD
- Technical report to draft. --Xiangyu Zeng, Shi Yin, Mengyuan Zhao, and Zhiyong Zhang
- http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=zhangzy&step=view_request&cvssid=318
RNN-DAE (Deep Auto-Encoder based RNN)
- HOLD --Zhiyong
- http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=zhangzy&step=view_request&cvssid=261
Speech rate training
- http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=zhangzy&step=view_request&cvssid=268
- Technical report to draft. --Xiangyu Zeng, Shi Yin
- Prepare for NCMSSC
Neural network visualization
- http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=zhangzy&step=view_request&cvssid=324
- Technical report writing --Mian Wang
Speaker ID
- DNN-based sid --Yiye
- http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=zhangzy&step=view_request&cvssid=327
Ivector based ASR
- The smaller the i-vector dimension, the better the performance.
- Augmenting a hidden layer with the i-vector works better than augmenting the input layer.
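The two augmentation points compared above can be sketched as follows (a toy forward pass with made-up shapes and random weights, purely to illustrate where the i-vector is concatenated; the real systems are trained DNN acoustic models):

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(frame_feats, ivector, augment_at="hidden"):
    """Toy 2-layer network; hypothetical dimensions for illustration.

    augment_at="input":  concatenate the i-vector to the acoustic features.
    augment_at="hidden": concatenate it to the first hidden activation,
    which the notes report works better than input-layer augmentation.
    """
    feat_dim, ivec_dim, hid_dim, out_dim = frame_feats.size, ivector.size, 8, 4

    if augment_at == "input":
        x = np.concatenate([frame_feats, ivector])
        W1 = rng.standard_normal((hid_dim, feat_dim + ivec_dim)) * 0.1
        h = np.maximum(W1 @ x, 0.0)                # ReLU hidden layer
    else:
        W1 = rng.standard_normal((hid_dim, feat_dim)) * 0.1
        h = np.maximum(W1 @ frame_feats, 0.0)
        h = np.concatenate([h, ivector])           # augment the hidden layer
        hid_dim += ivec_dim

    W2 = rng.standard_normal((out_dim, hid_dim)) * 0.1
    logits = W2 @ h
    e = np.exp(logits - logits.max())
    return e / e.sum()                             # softmax posteriors

frame = rng.standard_normal(13)   # e.g. a 13-dim MFCC frame (assumed)
ivec = rng.standard_normal(5)     # low-dimensional i-vector (assumed)
p = forward(frame, ivec, augment_at="hidden")
```

The hidden-layer variant lets the lower layers stay purely acoustic while the speaker information conditions only the upper layers.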
Text Processing
LM development
Domain specific LM
- LM2.X
- mix the sougou2T-lm with KN discounting (done)
- train a large LM using the 25w-dict (hanzhenglong/wxx)
- v2.0c: filter out useless words (next week)
- set up the test set for new words (hold)
- prepare the wiki data: entity list
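The mixing step above amounts to linearly interpolating a domain LM with the general sougou2T-lm. A minimal sketch of that interpolation, using maximum-likelihood unigram models as a stand-in for the real KN-smoothed n-gram models (the corpora and the weight lam here are invented for illustration):

```python
from collections import Counter

def unigram_lm(tokens):
    """ML unigram model — a toy stand-in for a KN-smoothed n-gram LM."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def interpolate(p_domain, p_general, lam=0.7):
    """p(w) = lam * p_domain(w) + (1 - lam) * p_general(w)."""
    vocab = set(p_domain) | set(p_general)
    return {w: lam * p_domain.get(w, 0.0) + (1 - lam) * p_general.get(w, 0.0)
            for w in vocab}

p_dom = unigram_lm("speech recognition model training speech".split())
p_gen = unigram_lm("news speech weather news".split())
p_mix = interpolate(p_dom, p_gen, lam=0.7)
```

In practice the weight would be tuned on a held-out domain set (e.g. via SRILM-style mixture tuning) rather than fixed.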
Tag LM
- Tag LM (JT)
- error check
- similar-word extension in FST
- add the experiments to the tag-LM paper
RNN LM
- rnn
- discuss the rnn-lstm lm
- lstm+rnn
- check how the lstm-rnnlm code initializes and updates the learning rate (hold)
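For the learning-rate question above, a common convention (the rnnlm-toolkit-style schedule — an assumption here, not necessarily what the group's lstm-rnnlm code implements) can be sketched as:

```python
def next_lr(lr, prev_ppl, cur_ppl, halving, min_improvement=1.003):
    """One epoch of an rnnlm-style schedule (assumed, for illustration):
    keep the learning rate while dev perplexity improves by at least
    min_improvement x; once improvement stalls, halve the rate every
    subsequent epoch."""
    if prev_ppl / cur_ppl < min_improvement:
        halving = True
    if halving:
        lr /= 2.0
    return lr, halving

lr, halving = 0.1, False
lr, halving = next_lr(lr, prev_ppl=120.0, cur_ppl=100.0, halving=halving)
# large improvement: lr unchanged
lr, halving = next_lr(lr, prev_ppl=100.0, cur_ppl=99.9, halving=halving)
# improvement stalled: halving starts
```

The initial rate and the improvement threshold are the parameters worth checking in the actual code.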
Word2Vector
W2V based doc classification
- data preparation (hold)
Knowledge vector
- make a report on Monday
Translation
- v5.0 demo released
- trim the dictionary and use the new segmentation tool
Sparse NN in NLP
- prepare the ACL paper
QA
improve fuzzy match
- add synonym similarity using the MERT-4 method (hold)
online learning
- data is ready; prepare the ACL paper
context framework
- code for demo
- demo is done
- new intern will install SEMPRE