ASR:2015-03-09
From cslt Wiki
Latest revision as of 01:11, 16 March 2015 (Monday)
Speech Processing
AM development
Environment
- grid-11 often shuts down automatically; its computation speed is too slow.
- The GPU is being repaired. --Xuewei
RNN AM
- details at http://liuc.cslt.org/pages/rnnam.html
- triphone one-state-based RNN? --Liu Chao
Mic-Array
- reproduce the environment for Interspeech
- tune the alpha (regularization strength) parameter in Lasso (see the sketch below)
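The note only names the alpha parameter, so here is a minimal sketch of sweeping it with scikit-learn's Lasso; the data, dimensions, and alpha grid are placeholders, not the Mic-Array setup:

```python
# Hedged sketch: larger Lasso alpha trades fit for sparsity.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score

rng = np.random.RandomState(0)
X = rng.randn(200, 40)                            # placeholder features
y = X[:, :3].sum(axis=1) + 0.1 * rng.randn(200)   # placeholder targets

for alpha in [0.001, 0.01, 0.1, 1.0]:
    model = Lasso(alpha=alpha, max_iter=10000)
    cv_r2 = cross_val_score(model, X, y, cv=5).mean()   # R^2 by default
    n_active = np.count_nonzero(model.fit(X, y).coef_)
    print(f"alpha={alpha:<6} CV R^2={cv_r2:.3f} nonzero weights={n_active}")
```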
Dropout & Maxout & rectifier
- HOLD
- Need to solve the too small learning-rate problem
- 20h small-scale sparse DNN with rectifier. --Mengyuan
- 20h small-scale sparse DNN with Maxout/rectifier based on weight-magnitude pruning (see the sketch below). --Mengyuan Zhao
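As a companion to the weight-magnitude-pruning item, a minimal sketch of the pruning step itself; the keep ratio and layer shape are assumptions, not values from the experiments:

```python
# Minimal sketch of weight-magnitude pruning: keep only the largest-magnitude
# fraction of a layer's weights and zero out the rest.
import numpy as np

def magnitude_prune(W, keep_ratio=0.2):
    """Return (pruned weights, binary mask) keeping the top-|w| fraction."""
    k = max(1, int(W.size * keep_ratio))
    threshold = np.sort(np.abs(W), axis=None)[-k]   # k-th largest magnitude
    mask = (np.abs(W) >= threshold).astype(W.dtype)
    return W * mask, mask

W = np.random.randn(512, 1024)                      # placeholder DNN layer
W_sparse, mask = magnitude_prune(W, keep_ratio=0.2)
print("fraction kept:", mask.mean())                # ~0.2
```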
Convolutive network
- Convolutive network (DAE)
- HOLD
- Technical report writing, Mian Wang, Yiye Lin, Shi Yin, Mengyuan Zhao
- reproduce experiments -- Yiye
DNN-DAE (deep auto-encoder DNN; see the sketch after this list)
- HOLD
- Technical report to draft. --Xiangyu Zeng, Shi Yin, Mengyuan Zhao, Zhiyong Zhang
- http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=zhangzy&step=view_request&cvssid=318
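The report does not spell out the DNN-DAE configuration here; the following is only an illustrative denoising-autoencoder training step in PyTorch, with the 440-dimensional feature (e.g. 40 filterbanks x 11 frames) and layer sizes invented for the sketch:

```python
# Illustrative denoising auto-encoder: map noisy features to clean targets.
# All dimensions are assumptions, not the report's actual configuration.
import torch
import torch.nn as nn

dae = nn.Sequential(
    nn.Linear(440, 1024), nn.Sigmoid(),
    nn.Linear(1024, 1024), nn.Sigmoid(),
    nn.Linear(1024, 440),               # reconstruct the clean features
)
opt = torch.optim.SGD(dae.parameters(), lr=0.01)

noisy = torch.randn(32, 440)            # placeholder noisy feature batch
clean = torch.randn(32, 440)            # placeholder clean targets
loss = nn.functional.mse_loss(dae(noisy), clean)
opt.zero_grad()
loss.backward()
opt.step()
print(float(loss))
```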
RNN-DAE (deep auto-encoder RNN)
- HOLD --Zhiyong
- http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=zhangzy&step=view_request&cvssid=261
Speech rate training
- http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=zhangzy&step=view_request&cvssid=268
- Technical report to draft. --Xiangyu Zeng, Shi Yin
- Prepare for NCMSSC
Neural network visualization
- http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=zhangzy&step=view_request&cvssid=324
- Technical report writing. --Mian Wang
Speaker ID
- DNN-based SID --Yiye
- http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=zhangzy&step=view_request&cvssid=327
I-vector based ASR
- The smaller the i-vector dimension, the better the performance.
- Augmenting a hidden layer with the i-vector works better than augmenting the input layer (see the sketch below).
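A sketch of the two augmentation points being compared; the network shape, dimensions, and output size are invented for illustration:

```python
# Sketch: append the i-vector at a hidden layer (found better above)
# versus at the input layer. All sizes are placeholders.
import torch
import torch.nn as nn

feat_dim, ivec_dim, hid, n_states = 440, 100, 1024, 3000

class HiddenAugmentDNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.l1 = nn.Linear(feat_dim, hid)
        self.l2 = nn.Linear(hid + ivec_dim, hid)   # i-vector joins here
        self.out = nn.Linear(hid, n_states)

    def forward(self, x, ivec):
        h = torch.sigmoid(self.l1(x))
        h = torch.sigmoid(self.l2(torch.cat([h, ivec], dim=1)))
        return self.out(h)

# The input-layer variant would instead feed torch.cat([x, ivec], dim=1)
# into the first layer.
model = HiddenAugmentDNN()
logits = model(torch.randn(4, feat_dim), torch.randn(4, ivec_dim))
print(logits.shape)   # torch.Size([4, 3000])
```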
Text Processing
LM development
Domain specific LM
- LM2.X
- train a large LM using the 25w (250k-entry) dictionary (hanzhenglong/wxx)
- v2.0c: filter out useless words (next week)
- set up the test set for new words (hold)
- prepare the wiki data: entity list.
tag LM
- Tag LM (JT)
- error check
- similar word extension in FST
- add the experiments to the tag-LM paper
RNN LM
- rnn
- the inputs and outputs are word embeddings, with extra token information such as NER tags added (see the sketch after this list)
- map words to characters and train a character-level LM
- lstm+rnn
- check the lstm-rnnlm code for how the learning rate is initialized and updated (hold)
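For the embedding-plus-NER idea above, a minimal sketch of the input construction; vocabulary size, tag inventory, and model sizes are all made up:

```python
# Sketch: RNN LM input = word embedding concatenated with a one-hot NER tag.
import torch
import torch.nn as nn

vocab, emb_dim, n_tags, hid = 10000, 128, 5, 256
embed = nn.Embedding(vocab, emb_dim)
rnn = nn.RNN(emb_dim + n_tags, hid, batch_first=True)
proj = nn.Linear(hid, vocab)                       # next-word logits

words = torch.randint(0, vocab, (8, 20))           # (batch, time) word ids
tags = nn.functional.one_hot(
    torch.randint(0, n_tags, (8, 20)), n_tags).float()
x = torch.cat([embed(words), tags], dim=-1)        # embedding + NER info
h, _ = rnn(x)
logits = proj(h)
print(logits.shape)                                # (8, 20, 10000)
```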
Word2Vector
W2V based doc classification
- data preparation (hold)
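Since only data preparation is noted, here is a generic sketch of the technique the heading names (average word vectors, then a linear classifier); the embeddings are random stand-ins for trained word2vec vectors:

```python
# Sketch: W2V-based doc classification via averaged word vectors.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
emb = {w: rng.randn(50) for w in "good bad movie food great awful".split()}

def doc_vec(doc):
    """Average the embeddings of in-vocabulary words in the document."""
    vecs = [emb[w] for w in doc.split() if w in emb]
    return np.mean(vecs, axis=0) if vecs else np.zeros(50)

docs = ["good movie", "great food", "bad movie", "awful food"]
labels = [1, 1, 0, 0]                               # toy labels
clf = LogisticRegression().fit(np.stack([doc_vec(d) for d in docs]), labels)
print(clf.predict([doc_vec("great movie")]))        # class of an unseen doc
```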
Knowledge vector
- make a report on Monday
Translation
- v5.0 demo released
- prune the dictionary and use the new segmentation tool
Sparse NN in NLP
- prepare the ACL paper
- check the code to find the problem
- increase the dimension
- use different test sets
QA
improve fuzzy match
- add synonym similarity using the MERT-4 method (hold); see the sketch below
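The MERT-4 tuning itself is not sketched; this toy example only shows mixing exact-token overlap with synonym-expanded overlap, with hand-set weights standing in for MERT-tuned ones and a toy synonym table:

```python
# Toy sketch: fuzzy question match with synonym expansion.
def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

SYNONYMS = {"buy": {"purchase"}, "cheap": {"inexpensive"}}   # toy table

def expand(tokens):
    out = set(tokens)
    for t in tokens:
        out |= SYNONYMS.get(t, set())
    return out

def fuzzy_score(q1, q2, w_exact=0.7, w_syn=0.3):
    """Weights are placeholders; MERT would tune them on held-out data."""
    return w_exact * jaccard(q1, q2) + w_syn * jaccard(expand(q1), expand(q2))

q1 = "buy cheap phone".split()
q2 = "purchase inexpensive phone".split()
print(fuzzy_score(q1, q2))   # higher than plain jaccard(q1, q2) = 0.2
```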
online learning
- data is ready; prepare the ACL paper
- prepare the SogouQ data and test it with the current online-learning method (see the sketch below)
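The SogouQ setup is not described here, so this sketch only shows a generic per-example (online) logistic-regression update over a simulated stream:

```python
# Sketch: online SGD for logistic regression, one example at a time.
import numpy as np

dim, lr = 100, 0.1
w = np.zeros(dim)

def online_update(x, y):
    """Single SGD step on one (features, label) pair as it streams in."""
    global w
    p = 1.0 / (1.0 + np.exp(-w @ x))   # predicted probability
    w += lr * (y - p) * x              # gradient step on the log-loss

rng = np.random.RandomState(1)
for _ in range(1000):                  # stand-in for the query stream
    x = rng.randn(dim)
    y = float(x[0] > 0)                # synthetic label
    online_update(x, y)
print("weight on the informative feature:", w[0])
```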
framework
- extract the modules
- extract the context module, search module, entity-recognition module, and common module
- define the inference step of each module
- compose the modules (see the sketch below)
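A hypothetical sketch of the module split and composition described above; every class name and the shared-state interface are invented for illustration:

```python
# Hypothetical module framework: each module exposes one inference step
# over a shared state dict, and the pipeline composes them in order.
from abc import ABC, abstractmethod

class Module(ABC):
    @abstractmethod
    def infer(self, query: str, state: dict) -> dict:
        """Run this module's inference step and return the updated state."""

class EntityModule(Module):
    def infer(self, query, state):
        state["entities"] = [t for t in query.split() if t.istitle()]
        return state

class SearchModule(Module):
    def infer(self, query, state):
        state["results"] = ["hit for " + e for e in state.get("entities", [])]
        return state

def run_pipeline(query, modules):
    """Compose modules: feed the shared state through each in turn."""
    state = {}
    for m in modules:
        state = m.infer(query, state)
    return state

print(run_pipeline("tell me about Tsinghua", [EntityModule(), SearchModule()]))
```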
leftover problem
- the new intern will install SEMPRE