Difference between revisions of "2014-10-13"
From cslt Wiki
Revision as of 07:41, 13 October 2014 (Mon)
Speech Processing
AM development
Sparse DNN
- Performance improvement found when pruned slightly
- Experiments show that
- Suggest to use TIMIT / AURORA 4 for training
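The notes do not say how the network is pruned. As a minimal sketch, assuming simple magnitude-based pruning (an assumption, not necessarily the method used here), slight pruning could look like the following; `prune_by_magnitude` and its `fraction` parameter are hypothetical names.

```python
def prune_by_magnitude(weights, fraction):
    """Zero out roughly `fraction` of the smallest-magnitude weights.

    `weights` is a weight matrix given as a list of rows.
    Ties at the threshold are all pruned, so slightly more than
    `fraction` of the weights may be removed.
    """
    flat = sorted(abs(w) for row in weights for w in row)
    k = int(len(flat) * fraction)  # number of weights to remove
    if k == 0:
        # Nothing to prune: return an unchanged copy.
        return [list(row) for row in weights]
    threshold = flat[k - 1]  # largest magnitude that still gets pruned
    return [[0.0 if abs(w) <= threshold else w for w in row]
            for row in weights]


layer = [[0.5, -0.01], [0.02, -0.9]]
pruned = prune_by_magnitude(layer, 0.5)
```

A small `fraction` corresponds to the "pruned slightly" setting; the original matrix is left untouched so pruned and unpruned models can be compared.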
Noise training
- First draft of the noisy training journal paper
- Check abnormal behavior with large sigma (Yinshi, Liuchao)
Drop out & Rectification & convolutive network
- Drop out
- No performance improvement found yet.
- [1]
- Rectification
- Dropout NA problem was caused by large magnitude of weights
- Convolutive network
- Test more configurations
- Zhiyong will work on CNN
- Recurrent neural network
- investigate CURRENNT for AM
Denoising & Farfield ASR
- Lasso-based de-reverberation is done with the REVERBERATION toolkit
- Start to compose the experiment section for the SL paper.
VAD
- problems found at the beginning part of speech (0-0.02s?)
- Noise model training done. Under testing.
- Need to investigate the performance reduction in babble noise. Call Jia.
Speech rate training
- Some interesting results were obtained on the WSJ database with the simple speech rate change algorithm
- The ROS model seems to be superior to the normal one on faster speech
- Need to check the distribution of ROS on WSJ
- Suggest extracting speech data with different ROS to construct a new test set
- Suggest using the Tencent training data
- Suggest removing silence when computing ROS
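The last suggestion can be sketched as follows, assuming ROS is measured as speech units per second over a time-aligned segmentation. The `(label, start, end)` input format and the silence labels `sil` and `sp` are assumptions for illustration, not taken from these notes.

```python
def rate_of_speech(segments, silence_labels=("sil", "sp")):
    """Return units per second, ignoring silence segments.

    `segments` is a list of (label, start_sec, end_sec) tuples,
    e.g. from a forced alignment (format assumed here).
    """
    speech = [(label, start, end) for label, start, end in segments
              if label not in silence_labels]
    duration = sum(end - start for _, start, end in speech)
    if duration <= 0:
        return 0.0  # no speech found in the utterance
    return len(speech) / duration


alignment = [("sil", 0.0, 0.5), ("a", 0.5, 0.75),
             ("b", 0.75, 1.0), ("sil", 1.0, 2.0)]
ros = rate_of_speech(alignment)
```

If silence were included, the same utterance would yield a much lower rate, which is why leading and trailing silence can distort the ROS distribution.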
Low-resource language AM training
- Results on CVSS[3]
- Use Chinese NN as initial NN, change the last layer
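The initialization above can be sketched as below, assuming each layer is stored as an `[outputs][inputs]` weight matrix and the new output layer is randomly initialized. The function name, the data layout and the initialization range are hypothetical; a real setup would use the training toolkit's own model format.

```python
import random


def init_from_source(source_layers, num_target_outputs, seed=0):
    """Copy all hidden layers and replace the output layer.

    `source_layers` is a list of weight matrices, each a list of rows
    shaped [outputs][inputs] (layout assumed for illustration).
    """
    rng = random.Random(seed)
    # Deep-copy the hidden layers of the source (e.g. Chinese) network.
    hidden = [[list(row) for row in layer] for layer in source_layers[:-1]]
    # The new output layer reads from the same last hidden layer.
    in_dim = len(source_layers[-1][0])
    new_output = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                  for _ in range(num_target_outputs)]
    return hidden + [new_output]


source = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],   # 2 inputs -> 3 hidden
          [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]]     # 3 hidden -> 2 outputs
target = init_from_source(source, num_target_outputs=4)
```

Only the output layer changes size, since the target language has its own set of output states; the copied hidden layers are then fine-tuned on the low-resource data.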
Scoring
- global scoring done.
- Pitch & rhythm done, need testing
- Harmonics hold
Confidence
- experiments done, need more data
- Basic confidence by using lattice-based posterior + DNN posterior + ROS done
- 23% detection error achieved by balanced model
Speaker ID
- Add VAD to system
- GMM-based test program delivered
- GMM registration program done
Emotion detection
- Zhang Weiwei is learning the code
- Sinovoice is implementing the server
Text Processing
LM development
Domain specific LM
- N-gram generation is ongoing
- Look into the memory and baidu_hi: done
NUM tag LM:
- maxi work is released.
- Yuanbin continues the tag LM work.
- Add NER to the tag LM.
- Boost specific words like wifi if the TAG model does not work for a particular word.
Word2Vector
W2V based doc classification
- Initial results with the variational Bayesian GMM obtained. Performance is not as good as the conventional GMM.
- Non-linear inter-language transform: English-Spanish-Czech: wv model training done, transform model under investigation
- SSA-based local linear mapping still running.
- Number of k-means classes changed to 2.
- Knowledge vector started
- format the data
- Character to word conversion
- prepare the task: word similarity
- prepare the dict.
- Google word vector train
- improve the sampling method
RNN LM
- RNN
- LSTM+RNN
- Install the tool and prepare the WSJ data
- Prepare the baseline.
Translation
- v3.0 demo released
- still slow
- Re-segment the words using the new dictionary.
- Check the new data.
QA
- search method:
- Add VSM and BM25 to improve the search and the answer-selection strategy.
- Segment words at minimum granularity for the Lucene index and the bag-of-words method.
- The new intern will install SEMPRE
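For reference, BM25 ranking over bag-of-words documents can be sketched as below. This is a standalone illustration, not the Lucene implementation or the QA system's code; it uses the non-negative IDF variant, and the parameter values `k1=1.5` and `b=0.75` are common defaults assumed here.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against `query` with BM25."""
    n = len(docs)
    if n == 0:
        return []
    # Average document length; fall back to 1.0 if all docs are empty.
    avgdl = sum(len(d) for d in docs) / n or 1.0
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [["speech", "recognition", "model"],
        ["language", "model"],
        ["weather", "today"]]
scores = bm25_scores(["speech", "model"], docs)
```

The token lists stand in for the output of the word segmenter; finer-grained segmentation changes the term statistics and therefore the ranking.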