Difference between revisions of "2014-09-29"
From cslt Wiki
Revision as of 02:38, 29 September 2014 (Mon)
Speech Processing
AM development
Sparse DNN
- Investigating layer-based DNN training
Noise training
- First draft of the noisy training journal paper
- Check abnormal behavior with large sigma (Yinshi, Liuchao)
Drop out & Rectification & convolutive network
- Drop out
- No performance improvement found yet.
- [1]
- Rectification
- Dropout NA problem was caused by large magnitude of weights
- Convolutive network
- Test more configurations
Denoising & Farfield ASR
- Lasso-based de-reverberation is done with the REVERBERATION toolkit
- Start to compose the experiment section for the SL paper.
VAD
- Noise model training done. Under testing.
- Need to investigate the performance reduction in babble noise. Call Jia.
Speech rate training
- Some interesting results were obtained on the WSJ database with the simple speech rate change algorithm
- The ROS model seems superior to the normal one on faster speech
- Need to check the distribution of ROS on WSJ
- Suggest extracting speech data with different ROS to construct a new test set
- Suggest using the Tencent training data
- Suggest removing silence when computing ROS
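
The last suggestion can be illustrated with a minimal sketch, assuming ROS is measured in phones per second over a phone-level alignment. The segment format, the function name `rate_of_speech`, and the silence labels are hypothetical and are not taken from the actual setup:

```python
from typing import Iterable, Set, Tuple

# Hypothetical silence labels; the real alignment may use different ones.
SILENCE_LABELS: Set[str] = {"sil", "sp"}

def rate_of_speech(
    segments: Iterable[Tuple[str, float, float]],
    silence_labels: Set[str] = SILENCE_LABELS,
) -> float:
    """Return phones per second, ignoring silence segments.

    Each segment is (label, start_time, end_time) in seconds.
    """
    n_phones = 0
    speech_time = 0.0
    for label, start, end in segments:
        if label in silence_labels:
            continue  # silence adds neither phones nor duration
        n_phones += 1
        speech_time += end - start
    if speech_time <= 0.0:
        return 0.0  # no speech found; avoid division by zero
    return n_phones / speech_time

# Toy alignment: 3 phones over 0.75 s of speech, 1.25 s of silence.
example = [
    ("sil", 0.0, 0.5),
    ("k", 0.5, 0.75),
    ("ae", 0.75, 1.0),
    ("t", 1.0, 1.25),
    ("sil", 1.25, 2.0),
]
ros = rate_of_speech(example)
```

On this toy alignment, counting silence in the duration would give 1.5 phones per second, while excluding it gives 4.0, which is why the silence handling matters when comparing utterances by ROS.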
Scoring
- Pitch & rhythm done.
- Harmonics on hold
Confidence
- Basic confidence measure using lattice-based posterior + DNN posterior + ROS done
- 23% detection error achieved by balanced model
Speaker ID
- GMM-based test program delivered
- Implementing GMM registration program
Emotion detection
- Sinovoice is implementing the server
Text Processing
LM development
Domain specific LM
- N-gram generation is ongoing
- Memory usage checked; baidu_hi done
NUM tag LM:
- Maxi's work is released.
- Yuanbin continues the tag LM work.
- Add NER to the tag LM.
- Boost specific words like wifi if TAG model does not work for a particular word.
Word2Vector
W2V based doc classification
- Initial results with variational Bayesian GMM obtained. Performance is not as good as the conventional GMM.
- Non-linear inter-language transform: English-Spanish-Czech: word vector (wv) model training done, transform model under investigation
- SSA-based local linear mapping still running.
- Number of k-means classes changed to 2.
- Knowledge vector started
- document obtained from wiki
- formula obtained
- Character to word conversion
- Read more papers.
- Prepare to train.
- Google word vector training
- Improve the sampling method
RNN LM
- Prepare WSJ database
- Trained model 10000 x 4 + 320 + 10000
- Better performance obtained (4.16-3.47)
- gigaword sampling for Chinese data
Translation
- v3.0 demo released
- still slow
- Cut unimportant vocabulary.
QA
- liangshan_v1 performance 74.3%.
- New framework and GA method are done
- Add the SEMPRE tool to the framework