Difference between revisions of "Sinovoice-2014-01-20"
From cslt Wiki
(Created page with "=Project management= * Xiaoming and Xiao Na were added to the mailing list * Potential Huawei conference-transcribing project was discussed =DNN training= ==Environme...")
Revision as of 07:43, 20 January 2014 (Monday)
DNN training
Environment setting
- Cluster accounts rearranged
- Withdraw root/sudo privileges
- Changed the NFS server to 40 processes, in the hope of increasing disk read speed
- Create a RAID-0 array from 3 or 4 3 TB disks
Corpora
- Change the data labeling strategy: do not label gender and the length of noise in the rest of the corpora.
- Automatic labeling
  - Xiaoming will work with Zhiyong to work out how to generate transcriptions with embedded confidence scores.
  - The first step is to measure the raw accuracy on the domain-dependent test, and then judge the quality of automatic labeling.
470 hour 8k training
- MPE training done
Model | CE | MPE1 | MPE2 | MPE3 | MPE4 |
---|---|---|---|---|---|
4k states | 23.27/22.85 | 21.35/18.87 | 21.18/18.76 | 21.07/18.54 | 20.93/18.32 |
8k states | 22.16/22.22 | 20.55/18.03 | 20.36/17.94 | 20.32/17.78 | 20.29/17.80 |
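The gain from MPE over the CE baseline can be read off the table as a relative reduction. The sketch below is only an illustration: the notes do not name the metric or the two test sets behind each "a/b" pair, so it assumes both numbers are error rates in percent (lower is better). The function name `relative_reduction` is made up for this example.

```python
# Values copied from the table above; each pair is (first number, second number).
# Assumption: both are error rates in percent, so lower is better.
results = {
    "4k states": {"CE": (23.27, 22.85), "MPE4": (20.93, 18.32)},
    "8k states": {"CE": (22.16, 22.22), "MPE4": (20.29, 17.80)},
}


def relative_reduction(baseline, value):
    """Relative reduction of `value` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline


# For each model, compute the CE -> MPE4 reduction for both numbers of the pair.
reductions = {
    model: tuple(
        relative_reduction(ce, mpe)
        for ce, mpe in zip(scores["CE"], scores["MPE4"])
    )
    for model, scores in results.items()
}

for model, (first, second) in reductions.items():
    print(f"{model}: {first:.2f}% / {second:.2f}% relative reduction")
```

Under that assumption, the first number of each pair drops by roughly 8 to 10 percent relative, and the second by close to 20 percent relative.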
6000 hour 16k training
- Feature extraction done. Three problems in the data were solved: (1) short wave (2) unmatched file length (3) unmatched sample rate
- Training has reached tri4b, with a quick increase in states/pdfs
- DNN training could start on Tuesday
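The notes do not say how the data problems were detected. As a rough illustration only, a pre-check for two of them (very short files and an unexpected sample rate) could look like the sketch below, which uses Python's standard `wave` module. The function name `check_wav`, the 16 kHz target (inferred from "16k" in the section title), and the 0.1 s minimum duration are all assumptions. The "unmatched file length" problem is not covered because the notes do not define it.

```python
import wave

EXPECTED_RATE = 16000  # assumption: "16k" in the section title means 16 kHz
MIN_SECONDS = 0.1      # hypothetical threshold for a "too short" file


def check_wav(path, expected_rate=EXPECTED_RATE, min_seconds=MIN_SECONDS):
    """Return a list of problem labels for one WAV file (empty if none)."""
    problems = []
    # Read the sample rate and frame count from the WAV header.
    with wave.open(path, "rb") as wav_file:
        rate = wav_file.getframerate()
        frames = wav_file.getnframes()
    if rate != expected_rate:
        problems.append("sample_rate")
    # Duration in seconds is frames divided by the file's own sample rate.
    if rate > 0 and frames / rate < min_seconds:
        problems.append("too_short")
    return problems
```

Calling `check_wav("utt001.wav")` (a hypothetical file name) returns an empty list for a file that passes both checks, so files with a non-empty result can be set aside before feature extraction.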
DNN Decoder
- Sinovoice decoder: some errors in FST building. Many triphones are lost after graph building. Problems in cdgen?
- Kaldi decoder:
  - A minor difference between the CLG and HCLG results was found. Debugging is in progress.
  - CLG RT is comparable to the HCLG RT, 0.3-0.4 on CSLT grid-2.
  - Additional optimization of pdf pre-computing will be investigated.
  - Code will be delivered today.