"2014-09-22": Difference between revisions
From cslt Wiki
Latest revision as of 02:24, 22 September 2014
Resource Building
Leftover questions
- Investigating LOUDS FST.
- CLG embedded decoder plus online compiler.
- DNN-GMM co-training
- NN LM
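As a note on the LOUDS FST investigation above: LOUDS packs a tree's level-order degree sequence into a bit vector. A minimal sketch of the encoding (the dict-based tree and the function name are illustrative, not the decoder's actual data structures):

```python
from collections import deque

def louds_encode(tree, root):
    # Encode a tree (dict: node -> list of children) as a LOUDS bit string:
    # a "10" super-root prefix, then, for each node in level order, one '1'
    # per child followed by a terminating '0'.
    bits = ["10"]
    queue = deque([root])
    while queue:
        node = queue.popleft()
        children = tree.get(node, [])
        bits.append("1" * len(children) + "0")
        queue.extend(children)
    return "".join(bits)

# a has children b, c; b has child d
print(louds_encode({"a": ["b", "c"], "b": ["d"]}, "a"))  # -> "101101000"
```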
AM development
Sparse DNN
- Investigating layer-based DNN training
Noise training
- First draft of the noisy training journal paper
- Check abnormal behavior with large sigma (Yinshi, Liuchao)
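Noisy training corrupts clean features with additive noise at a chosen sigma; the abnormal behavior above shows up at large sigma. A toy stand-in for the real noise-mixing pipeline (function name and data layout are hypothetical):

```python
import random

def corrupt(features, sigma, seed=0):
    # Additive zero-mean Gaussian noise with standard deviation sigma,
    # applied independently to every feature dimension of every frame.
    rng = random.Random(seed)
    return [[x + rng.gauss(0.0, sigma) for x in frame] for frame in features]

clean = [[1.0, 2.0], [3.0, 4.0]]
noisy = corrupt(clean, sigma=0.1)
```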
Drop out & Rectification & convolutive network
- Drop out
- No performance improvement found yet.
- [1]
- Rectification
- Dropout NA problem was caused by large magnitude of weights
- Convolutive network
- Test more configurations
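For reference, the dropout scheme being tested zeroes units at random during training; the inverted variant below scales survivors so test time needs no change (a generic sketch, not the lab's training code):

```python
import random

def dropout(activations, p_drop, train=True, seed=0):
    # Inverted dropout: zero each unit with probability p_drop during
    # training and scale survivors by 1/(1 - p_drop); identity at test time.
    # Keeping weight magnitudes bounded avoids the blow-up noted above.
    if not train or p_drop == 0.0:
        return list(activations)
    rng = random.Random(seed)
    keep = 1.0 - p_drop
    return [a / keep if rng.random() >= p_drop else 0.0 for a in activations]
```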
Denoising & Farfield ASR
- Lasso-based de-reverberation is done with the REVERBERATION toolkit
- Start to compose the experiment section for the SL paper.
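The Lasso step above amounts to L1-regularized regression, which drives most filter taps to exactly zero. A generic coordinate-descent sketch of the Lasso objective itself (not the toolkit's implementation):

```python
def lasso_cd(X, y, lam, n_iter=200):
    # Cyclic coordinate descent for min_w (1/2n)||y - Xw||^2 + lam*||w||_1.
    # Each coordinate update is a closed-form soft-thresholding step.
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            # residual with feature j's contribution removed
            r = [y[i] - sum(w[k] * X[i][k] for k in range(d) if k != j)
                 for i in range(n)]
            rho = sum(X[i][j] * r[i] for i in range(n)) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            if z == 0.0:
                w[j] = 0.0
            elif rho > lam:
                w[j] = (rho - lam) / z
            elif rho < -lam:
                w[j] = (rho + lam) / z
            else:
                w[j] = 0.0
    return w
```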
VAD
- Noise model training done. Under testing.
- Need to investigate the performance reduction in babble noise. Call Jia.
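For contrast with the model-based VAD above, the classic energy-threshold baseline looks like this; babble noise is exactly where a fixed threshold breaks down, since its energy resembles speech (threshold value and frame layout are illustrative):

```python
import math

def energy_vad(frames, threshold):
    # Flag a frame as speech when its mean log-energy exceeds a fixed
    # threshold; the epsilon guards against log(0) on silent frames.
    decisions = []
    for frame in frames:
        energy = sum(s * s for s in frame) / max(len(frame), 1)
        decisions.append(math.log(energy + 1e-10) > threshold)
    return decisions
```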
Speech rate training
- Some interesting results with the simple speech rate change algorithm were obtained on the WSJ db
- The ROS model seems superior to the normal model on faster speech
- Need to check distribution of ROS on WSJ
- Suggestion: extract speech data of different ROS and construct a new test set
- Suggestion: use the Tencent training data
- Suggestion: remove silence when computing ROS
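The silence-exclusion suggestion above can be made concrete: compute ROS as phones per second over speech regions only. A small sketch assuming a forced alignment given as (label, duration) segments (the input format is hypothetical):

```python
def rate_of_speech(alignment, sil_labels=("sil", "sp")):
    # ROS = phones per second over speech regions only: silence segments
    # are excluded from both the phone count and the total duration.
    speech_time = sum(dur for lab, dur in alignment if lab not in sil_labels)
    n_phones = sum(1 for lab, _ in alignment if lab not in sil_labels)
    return n_phones / speech_time if speech_time > 0 else 0.0

ali = [("sil", 0.50), ("ax", 0.10), ("b", 0.08),
       ("aw", 0.12), ("t", 0.10), ("sil", 0.40)]
ros = rate_of_speech(ali)  # 4 phones over 0.40 s of speech, ~10 phones/sec
```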
Scoring
- Pitch & rhythm done.
- Harmonics hold
Confidence
- Basic confidence by using lattice-based posterior + DNN posterior + ROS done
- 23% detection error achieved by balanced model
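The fusion above combines three cues into one score; a logistic combination is one common way to do it. The weights and bias below are illustrative placeholders, not the trained balanced model:

```python
import math

def confidence(lattice_post, dnn_post, ros,
               weights=(1.5, 1.5, -0.2), bias=-1.0):
    # Logistic fusion of lattice posterior, DNN posterior and ROS.
    # In practice the weights would be learned on held-out data.
    z = (weights[0] * lattice_post + weights[1] * dnn_post
         + weights[2] * ros + bias)
    return 1.0 / (1.0 + math.exp(-z))
```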
LM development
Domain specific LM
- Domain-specific counts dumped
- N-gram generation is ongoing
NUM tag LM:
- HCLG union seems better than G union, when integrating grammar + LM (25->23)
- Boost specific words like wifi if TAG model does not work for a particular word.
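The count-dump then n-gram-generation pipeline above reduces, in its simplest form, to counting and normalizing; a toy MLE bigram version (real systems add smoothing and ARPA export):

```python
from collections import Counter

def bigram_lm(sentences):
    # Dump unigram/bigram counts from domain text, then convert them into
    # MLE bigram probabilities p(w2 | w1) = c(w1, w2) / c(w1).
    uni, bi = Counter(), Counter()
    for sent in sentences:
        toks = ["<s>"] + sent + ["</s>"]
        uni.update(toks[:-1])
        bi.update(zip(toks[:-1], toks[1:]))
    return {pair: count / uni[pair[0]] for pair, count in bi.items()}
```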
Word2Vector
W2V based doc classification
- Initial results with the variational Bayesian GMM obtained. Performance is not as good as the conventional GMM.
- Non-linear inter-language transform: English-Spanish-Czech: wv model training done, transform model under investigation
- probably over-fitting with the MLP training
- SSA-based local linear mapping still running
- Knowledge vector started
- document obtained from wiki
- Character to word conversion
- Design the transform model
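A common baseline for the W2V-based doc classification above is to average a document's word vectors and assign the nearest class centroid by cosine similarity; a minimal sketch (names and the plain-dict vector store are illustrative):

```python
def doc_vector(tokens, wv):
    # Average the word vectors of a document's in-vocabulary tokens.
    vecs = [wv[t] for t in tokens if t in wv]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return num / den if den else 0.0

def classify(tokens, wv, centroids):
    # Nearest-centroid classification of the averaged document vector.
    d = doc_vector(tokens, wv)
    if d is None:
        return None
    return max(centroids, key=lambda c: cosine(d, centroids[c]))
```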
RNN LM
- Prepare WSJ database
- Trained model 10000 x 4 + 320 + 10000
- Better performance obtained (4.16 -> 3.47)
- gigaword sampling for Chinese data
Speaker ID
- Second model done
Emotion detection
- Delivered to Sinovoice
Translation
- v3.0 demo released
- still slow
QA
- Huilan framework design done
- Investigate better framework