Sinovoice-2014-01-20
Latest revision as of 10:12, 20 January 2014 (Monday)
=DNN training=
==Environment setting==
* Account re-arrangement done on the SGE cluster; work must no longer be done as root.
* Changed the NFS server to 40 server processes, hoping to improve disk-read throughput.
* Agreed to withdraw root/sudo privileges.
* Agreed to create a RAID-0 array from another three 3 TB disks (a configuration sketch for the NFS and RAID changes follows this list).
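A minimal sketch of the two changes above, assuming a Debian-style NFS service and three spare disks at /dev/sdb, /dev/sdc, /dev/sdd (the file locations and device names are assumptions, not taken from the actual servers):

<pre>
# Raise the kernel NFS server to 40 threads (takes effect immediately).
rpc.nfsd 40
# Persist the thread count across reboots (Debian/Ubuntu path; RHEL uses /etc/sysconfig/nfs).
echo 'RPCNFSDCOUNT=40' >> /etc/default/nfs-kernel-server

# Build a RAID-0 array from the three 3 TB disks and format it (device names assumed).
mdadm --create /dev/md0 --level=0 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd
mkfs.ext4 /dev/md0
</pre>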
==Corpora==
* Changed the data labeling strategy: gender and noise length will not be labelled for the next several corpora.
* Automatic labeling (a decoding sketch follows this list):
:* Xiaoming will work with Zhiyong to find out how to generate transcriptions with confidence scores attached.
:* The first step is to investigate the raw accuracy on the domain-dependent test set, and then decide whether automatic labeling is appropriate.
* Xiao Na will prepare 300 hours of telephone speech data (Sinovoice recordings) to improve the 8k model.
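A minimal sketch of confidence-scored automatic labeling using standard Kaldi tools, assuming an existing tri4b system; the directory names, threshold, and scale are hypothetical:

<pre>
# Decode the unlabeled audio with the current model.
steps/decode.sh --nj 20 exp/tri4b/graph data/unlabeled exp/tri4b/decode_unlabeled

# Produce a CTM whose last field is a per-word confidence
# (--acoustic-scale is illustrative, roughly 1/LM-scale).
lattice-to-ctm-conf --acoustic-scale=0.0833 \
  "ark:gunzip -c exp/tri4b/decode_unlabeled/lat.*.gz|" unlabeled.ctm

# Keep only utterances whose average word confidence clears a tunable threshold.
awk '{s[$1]+=$6; n[$1]++} END {for (u in s) if (s[u]/n[u] >= 0.9) print u}' \
  unlabeled.ctm > keep.list
</pre>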
==470 hour 8k training==
* MPE training done (a training sketch follows the table):

| Model     | CE          | MPE1        | MPE2        | MPE3        | MPE4        |
|-----------|-------------|-------------|-------------|-------------|-------------|
| 4k states | 23.27/22.85 | 21.35/18.87 | 21.18/18.76 | 21.07/18.54 | 20.93/18.32 |
| 8k states | 22.16/22.22 | 20.55/18.03 | 20.36/17.94 | 20.32/17.78 | 20.29/17.80 |
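For reference, a hedged sketch of how four MPE iterations like those above would be run with Kaldi's nnet1 sequence-training scripts; the data/exp paths are assumptions about this setup, not the actual recipe:

<pre>
# Alignments and denominator lattices from the CE-trained DNN (exp/dnn assumed).
steps/nnet/align.sh --nj 20 data/train_8k data/lang exp/dnn exp/dnn_ali
steps/nnet/make_denlats.sh --nj 20 --acwt 0.1 data/train_8k data/lang exp/dnn exp/dnn_denlats

# Four MPE iterations, matching the MPE1-MPE4 columns in the table.
steps/nnet/train_mpe.sh --num-iters 4 --acwt 0.1 \
  data/train_8k data/lang exp/dnn exp/dnn_ali exp/dnn_denlats exp/dnn_mpe
</pre>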
==6000 hour 16k training==
* Feature extraction done; solved several problems in the data: (1) short waves, (2) unmatched file lengths, (3) unmatched sample rates (a screening sketch follows this list).
* Training has reached tri4b, with a quick increase of states/pdfs.
* DNN training will start on Tuesday.
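A small screening sketch for the three data problems above, assuming sox/soxi is installed and a Kaldi-style wav.scp with plain wav paths; the duration threshold and paths are hypothetical:

<pre>
while read utt wav; do
  dur=$(soxi -D "$wav" 2>/dev/null)    # soxi prints nothing on files it cannot parse
  rate=$(soxi -r "$wav" 2>/dev/null)
  if [ -z "$dur" ]; then echo "UNREADABLE $utt"; continue; fi
  if awk "BEGIN{exit !($dur < 0.1)}"; then echo "TOO_SHORT $utt"; fi
  if [ "$rate" != "16000" ]; then echo "BAD_RATE $utt $rate"; fi
done < data/train_16k/wav.scp
</pre>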
=DNN Decoder=
* Sinovoice decoder: some errors in FST building; many triphones were lost after composing with C. Possibly a problem in cdgen? (A graph-building sketch follows this list.)
* Kaldi decoder:
:* A minor difference between the CLG and HCLG results was found; debugging the problem.
:* The CLG real-time factor is comparable to HCLG's, roughly 0.3-0.4 on CSLT's grid-2.
:* Additional optimization of pdf pre-computing will be investigated.
:* Code delivery today.
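For the C-composition and CLG/HCLG issues above, a sketch of the standard Kaldi graph-building steps (following the layout of Kaldi's mkgraph.sh; all file names are assumptions) showing where triphone context enters and where the two graphs diverge:

<pre>
# LG: lexicon composed with the grammar.
fsttablecompose lang/L_disambig.fst lang/G.fst | fstdeterminizestar --use-log=true | \
  fstminimizeencoded > LG.fst

# CLG: compose in triphone context -- the "C composing" step where triphones were lost.
fstcomposecontext --context-size=3 --central-position=1 \
  --read-disambig-syms=lang/phones/disambig.int \
  --write-disambig-syms=disambig_ilabels.int ilabels < LG.fst > CLG.fst

# HCLG: additionally compose in H (transition-ids -> context-dependent phones).
make-h-transducer --disambig-syms-out=disambig_tid.int ilabels tree final.mdl > Ha.fst
fsttablecompose Ha.fst CLG.fst | fstdeterminizestar --use-log=true | \
  fstrmsymbols disambig_tid.int | fstrmepslocal | fstminimizeencoded > HCLGa.fst
add-self-loops --self-loop-scale=0.1 --reorder=true final.mdl < HCLGa.fst > HCLG.fst
</pre>

Decoding the same utterances through CLG (with on-the-fly H) and through HCLG and diffing the best paths should help localize the minor difference noted above.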