(18 intermediate revisions by 4 users not shown) |
Line 1: |
Line 1: |
− | =Task To Do=
| |
− | ==Speech Recognition==
| |
− | *End-to-End speech recognition
| |
− | :*Zhiyuan Tang/Mengyuan Zhao/Zhiyong Zhang
| |
| | | |
− | *Integrate the class information to HCLG fst for speech recognition
| + | =Tasks at hand= |
| | | |
− | *Distant speech recognition
| + | ==Speech Recognition== |
− | :*RNN-DAE: echo or reverberation
| + | |
− | ::*Xuewei Zhang/Zhiyuan Tang/Mengyuan Zhao/Zhiyong Zhang
| + | |
− | :*Reverberation
| + | |
− | ::*Multi-microphones
| + | |
− | ::*(Lasso), Xuewei Zhang
| + | |
− | | + | |
− | *Voice conversation
| + | |
− | | + | |
− | *Sparse DNN
| + | |
− | :*Zhiyuan Tang
| + | |
− | | + | |
− | *Correlation based SENONE cluster
| + | |
− | | + | |
− | *NN Multi-GPU parallel training
| + | |
− | :*Multi-Machine
| + | |
− | ::*Sheng Su
| + | |
− | :*Multi-GPU on one Machine
| + | |
− | ::*Sheng Su
| + | |
− | :* nnet3 code test
| + | |
− | | + | |
− | *Audio Embedding
| + | |
− | :*Ke Ning
| + | |
− | | + | |
− | *RNN training accelerating
| + | |
− | | + | |
− | *Data selection
| + | |
− | :*Zhiyong Zhang
| + | |
− | :*Sub-modular data selection
| + | |
− | :*Objective-function loss training self-adaptation
| + | |
− | | + | |
− | *Decoder
| + | |
− | :*Confidence output for task-required
| + | |
− | | + | |
− | | + | |
− | ==Speaker Verification== | + | |
− | *binary code
| + | |
− | :*Lantian Li
| + | |
− | | + | |
− | *RNN-ivector
| + | |
− | :*Lantian Li
| + | |
− | | + | |
− | *DNN clustering
| + | |
− | :*Lantian Li
| + | |
− | | + | |
− | =Task DONE=
| + | |
− | *Multi-Mode features based VAD
| + | |
− | :* Shi Yin
| + | |
− | | + | |
− | *DNN based Language identification and Speaker identification
| + | |
− | :* Xuewei Zhang/Zhiyuan Tang
| + | |
− | | + | |
− | *Neural network visualization
| + | |
− | :* Mian Wang, DONE
| + | |
| | | |
− | *Dark knowledge
| + | ===joint learning=== |
− | :* Mengyuan Zhao, Xiangyu Zeng, Zhiyong Zhang, Chao Liu
| + | * Hang Luo, Zhiyuan Tang |
| | | |
− | *Normal RNN speech recognition
| + | ===visualization=== |
− | :* Mengyuan Zhao
| + | * Ying Shi, Zhiyuan Tang |
| | | |
− | *Momentum-like Hessian-Free acceleration
| + | ==Speaker Recognition== |
− | :* Zhiyong Zhang
| + | *Lantian Li, Yixiang Chen |
| | | |
− | *Activation value normalization through time --Batch Normalization
| |
− | :* Zhiyong Zhang
| |
| | | |
− | *Mix-training Balance decision tree
| + | =Tasks Done= |
− | :* Zhiyong Zhang
| + | |
| | | |
| + | =Technical Reports to write= |
| | | |
− | *20-h Chinese data-set release
| + | =Papers to write= |
− | :* Xuewei Zhang
| + | |
| | | |
− | *Unbound activation function (Rectifier/Maxout/Pnorm) go-through searching method
| + | =Patents to write= |
− | :* nnet3 test --Xuewei Zhang
| + | |
| | | |
− | =Technical Report To Write= | + | =Patents done= |
− | 1, DNN-DAE based noise cancellation -- Xiangyu Zeng / Mengyuan Zhao / Zhiyong Zhang --DONE
| + | |
− | 2, Speech Rate DNN speech recognition --Shi Yin/Xiangyu Zeng --DONE
| + | |
− | 3, CNN+fbank feature combination --Mian Wang /Yiye Lin /Mengyuan Zhao /Shi Yin
| + | |
− | 4, Uyghur low-resource acoustic model enhancement -- Shi Yin / Mengyuan Zhao / Zhiyong Zhang --DONE
| + | |
− | 5, Uyghur 20h database release --Kaer /Shi Yin --DONE
| + | |
− | 6, Dark-Knowledge Transfer
| + | |
− | *: Xiangyu Zeng/ Mengyuan Zhao / Zhiyong Zhang
| + | |
| | | |
− | =Paper to Write= | + | =Projects= |
| | | |
− | =Project=
| |
− | * Xiaomi TV
| |
− | :*Mengyuan Zhao/Zhiyong Zhang
| |
− | :*TAG-lm & Domain-specific general lm
| |
| | | |
− | *Chinese-English mix-training
| + | ------------------------------ |
| + | [[task previous]] |