"ASR:2015-07-20": Difference between revisions
From cslt Wiki
Latest revision as of 08:20, 22 July 2015 (Wednesday)
Speech Processing
AM development
Environment
- grid-14 is under repair
- prepare to buy a server
RNN AM
- hold
- morpheme RNN --zhiyuan
- train using the large 1400h dataset --mengyuan
Mic-Array
- hold
- compute EER with kaldi
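
For reference, the equal error rate (EER) named above is the operating point where the false-acceptance rate equals the false-rejection rate. The following is a minimal sketch in plain Python, not the Kaldi tooling; the function name `compute_eer` and the score lists are hypothetical, and it assumes higher scores mean "more likely a target trial".

```python
def compute_eer(target_scores, nontarget_scores):
    """Return (eer, threshold) for two lists of trial scores.

    Assumption: a trial is accepted when its score >= threshold.
    The EER is approximated at the candidate threshold where the
    false-acceptance and false-rejection rates are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best = None
    for t in thresholds:
        # False rejection: target trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: non-target trials scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Made-up scores, purely for illustration.
    eer, thr = compute_eer([0.9, 0.8, 0.7, 0.3], [0.6, 0.4, 0.2, 0.1])
    print(f"EER={eer:.3f} at threshold={thr}")
```

The scan over every observed score is quadratic and only suitable for small trial lists; sorting once and sweeping would be the usual choice for large evaluations.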
Data selection unsupervised learning
- hold
- acoustic feature based submodular using Pinan dataset --zhiyong
- write code to speed up --zhiyong
RNN-DAE (deep auto-encoder based RNN)
- hold
- deliver to mengyuan
Speaker ID
- DNN-based sid --Lantian
Ivector&Dvector based ASR
- hold --Tian Lan
- Cluster the speakers into speaker classes, then use the distance or the posterior probability as the metric
- dark knowledge using i-vector
- train on WSJ (test base: dev93 + eval92)
- hold
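
The speaker-class idea in the list above can be sketched as follows. This is only an illustration under stated assumptions, not the group's implementation: it uses plain k-means on toy speaker vectors (standing in for i-vectors or d-vectors), squared Euclidean distance as the distance metric, and a softmax over negative squared distances as the posterior. All function names are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Cluster speaker vectors into k speaker classes.

    Assumption: centroids are initialised from the first k vectors,
    which keeps the sketch deterministic.
    """
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            idx = min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            clusters[idx].append(v)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a class is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def class_posteriors(vector, centroids):
    """Softmax over negative squared distances to each class centroid.

    Assumption: equal class priors and unit-variance isotropic classes.
    """
    neg = [-sq_dist(vector, c) for c in centroids]
    m = max(neg)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in neg]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    toy = [[0, 0], [10, 10], [0, 1], [10, 11]]  # made-up speaker vectors
    cents = kmeans(toy, k=2)
    print(cents)
    print(class_posteriors([0, 0.4], cents))
```

Either the raw distances to the centroids or the posterior vector could then serve as the speaker-class feature; which works better is not stated in the source.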
Dark knowledge
- hold
- test random last output layer when train MPE --zhiyuan,mengyuan
language vector
- train using language vector with the dataset of 1400h_CN + 100h_EN--mengyuan
- hold
- write a paper--zhiyuan
rectifier
- hold
- rectifier RNN
monophone
- triphone is transferred to monophone
audio embedding
- audio embedding --Wei Xu
Text Processing
RNN LM
- character-lm rnn(hold)
- lstm+rnn
- check how the lstm-rnnlm code initializes and updates the learning rate. (hold)
Neural Based Document Classification
- (hold)
Order representation
- Nested Dropout
- semi-linear --> neural based auto-encoder.
- modify the objective function(hold)
Balance Representation
- Find error signal
Recommendation
- Reproduce baseline.
- LDA matrix dissolve.
- LDA (Text classification & Recommendation System) --> AAAI
DSSM based QA
- Demo Release. (English done.)
- Chinese Model start.
RNN based QA
- Read Source Code.
Seq to Seq(09-15)
- Review papers.(Reported in 07-08)
- Reproduce baseline.
Text Group Intern Project
Buddhist Process
(hold)
RNN Poem Process
- Read Paper & Source Code.
RNN Document Vector
(hold)
Image Baseline
- Demo Release.
- Paper Report.
- Read CNN Paper.