Difference between revisions of "Zhiyong Zhang"
From cslt Wiki
Revision as of 11:06, 12 January 2015 (Monday)
Task To Do
- 1, RNN speech recognition (Tied-context-dependent-state and End-to-End)
- 2, Real-environment noise cancellation (DNN-DAE/CNN-DAE/RNN-DAE: echo or reverberation)
- 3, Integrate the class information to HCLG fst for speech recognition
- 4, Multi-Mode features based VAD
- 5, DNN-based language identification and speaker identification
- 6, Distant speech recognition
- 7, Voice conversation
- 8, Unbounded activation functions (Rectifier/Maxout/Pnorm): go-through search method.
- 9, Sparse DNN
- 10, Neural network visualization
Technical Report To Write
- 1, DNN-DAE based noise cancellation
- 2, Speech Rate DNN speech recognition
- 3, CNN+fbank feature combination
- 4, Uyghur low-resource acoustic model enhancement
- 5, Uyghur 20h database release
- 6,
Papers To Read
- 1, Learned-Norm pooling for deep feedforward and recurrent neural networks
Task schedules
Summary
--------------------------------------------------------------------------
Priority | Task name  | Status     | Notes
--------------------------------------------------------------------------
1        | Bi-Softmax | ■■■□□□□□□□ | 1400h AM training and problem fixing
--------------------------------------------------------------------------
2        | RNN+DAE    | □□□□□□□□□□ |
--------------------------------------------------------------------------
Speech Recognition
Multi-lingual AM training
Bi-Softmax
- Use two distinct softmax layers, one for English data and one for Chinese data.
- Tested on 100h-Ch + 100h-En; better performance observed.
- Now testing the source code on the 1400h_8k data, but the decoding results are strange and need further investigation.
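The Bi-Softmax idea above can be illustrated with a minimal sketch: hidden layers shared by both languages, followed by a separate softmax output layer per language, selected by the language of the input. This is not the code under test on this page. The class name `BiSoftmaxNet`, the single ReLU hidden layer, the layer sizes, and the `"zh"`/`"en"` keys are all assumptions chosen for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def affine(weights, bias, x):
    """Compute W x + b for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def random_layer(rng, n_out, n_in):
    """Small random weights and zero bias (hypothetical initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class BiSoftmaxNet:
    """Shared hidden layer with one softmax output layer per language."""

    def __init__(self, n_in, n_hidden, n_out_zh, n_out_en, seed=0):
        rng = random.Random(seed)
        # Hidden layer shared by Chinese and English data.
        self.hidden = random_layer(rng, n_hidden, n_in)
        # Two distinct output layers, each with its own target set.
        self.heads = {
            "zh": random_layer(rng, n_out_zh, n_hidden),
            "en": random_layer(rng, n_out_en, n_hidden),
        }

    def forward(self, x, lang):
        """Return posteriors over the targets of the given language."""
        h = [max(0.0, v) for v in affine(*self.hidden, x)]  # ReLU (assumed)
        return softmax(affine(*self.heads[lang], h))


net = BiSoftmaxNet(n_in=4, n_hidden=8, n_out_zh=5, n_out_en=3)
frame = [0.1, 0.2, 0.3, 0.4]
p_zh = net.forward(frame, "zh")  # distribution over the Chinese targets
p_en = net.forward(frame, "en")  # distribution over the English targets
```

In such a setup each training frame would update only the shared layers and the output layer of its own language, so the two target sets never compete inside one softmax.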