|
|
(34 intermediate revisions by 4 users not shown) |
Line 1: |
Line 1: |
− | =Task To Do=
| |
− | *End-to-End speech recognition
| |
− | :*Zhiyuan Tang/Mengyuan Zhao/Zhiyong Zhang
| |
| | | |
− | *Real environment noise cancellation
| + | =Tasks at hand= |
− | :*(RNN-DAE: echo or reverberation)
| + | |
− | :*Mengyuan Zhao/Zhiyong Zhang
| + | |
| | | |
− | *Integrate the class information to HCLG fst for speech recognition
| + | ==Speech Recognition== |
| | | |
− | *Distant speech recognition
| + | ===joint learning=== |
− | :*(Reverberation, Multi-microphones)*
| + | * Hang Luo, Zhiyuan Tang |
− | :*(Lasso),Xuewei Zhang
| + | |
| | | |
− | *Voice conversation
| + | ===visualization=== |
− | :*xx
| + | * Ying Shi, Zhiyuan Tang |
| | | |
− | *Unbound activation function(Rectifier/Maxout/Pnorm) go-through searching method
| + | ==Speaker Recognition== |
− | :*Zhiyong Zhang
| + | *Lantian Li, Yixiang Chen |
| | | |
− | *Sparse DNN
| |
− | :*Zhiyuan Tang
| |
| | | |
− | *DNN training GPU parallelization, nnet2 optimization
| + | =Tasks Done= |
| | | |
− | *Momentum-like Hessian-Free acceleration
| + | =Technical Reports to write= |
| | | |
− | *Correlation based SEONE cluster
| + | =Papers to write= |
| | | |
− | *NN GPU parallel training
| + | =Patents to write= |
− | :*Sheng Su
| + | |
| | | |
− | *Audio Embedding
| + | =Patents done= |
| | | |
− | *Activation value normalize through time
| + | =Projects= |
− | :* For bigger learning rate
| + | |
| | | |
− | *Mix-training Balance decision tree
| |
− | :* Zhiyong Zhang
| |
| | | |
− | *RNN training accelerating
| + | ------------------------------ |
− | | + | [[task previous]] |
− | *
| + | |
− | | + | |
− | | + | |
− | =Task DONE=
| + | |
− | *Multi-Mode features based VAD*
| + | |
− | :*Shi Yin, DONE
| + | |
− | | + | |
− | *DNN based Language identification and Speaker identification*
| + | |
− | :*Xuewei Zhang/Zhiyuan Tang
| + | |
− | | + | |
− | *Neural network visualization*
| + | |
− | :*Mian Wang,DONE
| + | |
− | | + | |
− | *Dark knowledge*
| + | |
− | :*Mengyuan Zhao, Xiangyu Zeng, Zhiyong Zhang, Chao Liu
| + | |
− | | + | |
− | *Normal RNN speech recognition*
| + | |
− | :*Mengyuan Zhao
| + | |
− | | + | |
− | =Technical Report To Write=
| + | |
− | 1, DNN-DAE based noise cancellation -- Xiangyu Zeng / Mengyuan Zhao / Zhiyong Zhang --DONE
| + | |
− | 2, Speech Rate DNN speech recognition --Shi Yin/Xiangyu Zeng --DONE
| + | |
− | 3, CNN+fbank feature combination --Mian Wang /Yiye Lin /Mengyuan Zhao /Shi Yin
| + | |
− | 4, Uyghur low-resource acoustic model enhancement -- Shi Yin / Mengyuan Zhao / Zhiyong Zhang --DONE
| + | |
− | 5, Uyghur 20h database release --Kaer /Shi Yin --DONE
| + | |
− | 6, Dark-Knowledge Transfer
| + | |
− | *: Xiangyu Zeng/ Mengyuan Zhao / Zhiyong Zhang
| + | |
− | | + | |
− | =Paper to Write=
| + | |
− | | + | |
− | =Project=
| + | |
− | * Xiaomi TV
| + | |
− | :*Mengyuan Zhao/Zhiyong Zhang
| + | |
− | :*TAG-lm & Domain-specific general lm
| + | |
− | | + | |
− | *Chinese-English mix-training
| + | |