Latest revision as of 10:12, 20 January 2014 (Mon)

DNN training

Environment setting

Accounts were re-arranged on the SGE cluster. Root is no longer used for work.
Changed the NFS server to 40 processes, hoping to increase disk read throughput.
Agreed to withdraw root/sudo privileges.
Agreed to create a RAID-0 array from another three 3T disks.

Corpora

Changed the data labeling strategy: gender and noise length will not be labeled for the next several corpora.
Automatic labeling:

Xiaoming will work with Zhiyong to find out how to generate transcriptions that retain confidence scores.
The first step is to measure the raw accuracy on the domain-dependent test, and then decide whether automatic labeling is appropriate.

Xiao Na will prepare 300h of telephone speech data (Sinovoice recordings), which will be used to improve the 8k model.
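The automatic labeling idea above (keeping transcriptions together with their confidence scores) could be sketched roughly as follows. This is only an illustration: the `Hypothesis` record, the `select_confident` helper, the score range of 0 to 1, and the 0.9 threshold are all assumptions, not anything decided in the meeting.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One automatically generated transcription (hypothetical record layout)."""
    utt_id: str
    text: str
    confidence: float  # assumed to lie in [0, 1]


def select_confident(hyps, threshold=0.9):
    """Keep only hypotheses whose confidence is at or above the threshold."""
    return [h for h in hyps if h.confidence >= threshold]


# Toy data, invented purely for illustration.
hyps = [
    Hypothesis("utt001", "hello world", 0.95),
    Hypothesis("utt002", "noisy segment", 0.42),
    Hypothesis("utt003", "good morning", 0.90),
]

kept = select_confident(hyps, threshold=0.9)
print([h.utt_id for h in kept])  # ['utt001', 'utt003']
```

A suitable threshold would presumably be chosen only after the raw accuracy on the domain-dependent test has been measured.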

470 hour 8k training

MPE training done

Model	CE	MPE1	MPE2	MPE3	MPE4
4k states	23.27/22.85	21.35/18.87	21.18/18.76	21.07/18.54	20.93/18.32
8k states	22.16/22.22	20.55/18.03	20.36/17.94	20.32/17.78	20.29/17.80
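To read the table, it can help to turn the CE and MPE4 columns into relative reductions. The sketch below assumes the figures are error rates in percent (lower is better) and that the two numbers in each cell come from two test sets; the notes do not state either point, so treat both as assumptions.

```python
# CE and MPE4 figures copied from the table above.
results = {
    "4k states": {"CE": (23.27, 22.85), "MPE4": (20.93, 18.32)},
    "8k states": {"CE": (22.16, 22.22), "MPE4": (20.29, 17.80)},
}


def relative_reduction(baseline, improved):
    """Relative reduction in percent, assuming lower values are better."""
    return 100.0 * (baseline - improved) / baseline


for model, r in results.items():
    reductions = [relative_reduction(b, i) for b, i in zip(r["CE"], r["MPE4"])]
    print(model, " / ".join(f"{x:.1f}%" for x in reductions))
```

Under these assumptions, the second figure in each cell improves by roughly 20% relative for both model sizes, while the first improves by roughly 8-10%.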

6000 hour 16k training

Feature extraction done. Several problems in the data were solved: (1) short wave files, (2) unmatched file lengths, (3) unmatched sample rates.
Training has reached tri4b, with a quick increase in the number of states/pdfs.
DNN training will start on Tuesday.
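Two of the data problems listed above (short files and unmatched sample rates) can be screened for with a small script. The following is a minimal sketch using Python's standard `wave` module; the `check_wav` helper, the 16000 Hz target, and the 0.5 s minimum duration are assumptions for illustration, not the checks actually used for this corpus.

```python
import wave


def check_wav(path, expected_rate=16000, min_seconds=0.5):
    """Return a list of problems found in one WAV file (empty if none)."""
    problems = []
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
        duration = w.getnframes() / float(rate)
    if rate != expected_rate:
        problems.append(f"sample rate {rate} != {expected_rate}")
    if duration < min_seconds:
        problems.append(f"too short: {duration:.2f}s")
    return problems
```

Files for which `check_wav` returns a non-empty list could then be removed or resampled before feature extraction.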

DNN Decoder

Sinovoice decoder: some errors in FST building. Many triphones were lost after composing with C. Possibly a problem in cdgen?
Kaldi decoder:

A minor difference between the CLG and HCLG results was found; debugging is in progress.
The CLG RT is comparable to that of HCLG, roughly 0.3-0.4 on CSLT grid-2.
Additional optimization of pdf pre-computing will be investigated.
Code will be delivered today.
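Assuming "RT" above refers to the real-time factor (decoding time divided by audio duration), the 0.3-0.4 figure can be reproduced from timing logs with a calculation like the one below. The function name and the example timings are hypothetical.

```python
def real_time_factor(decode_seconds, audio_seconds):
    """Real-time factor: decoding time divided by audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return decode_seconds / audio_seconds


# Invented example: 35 s of decoding for 100 s of audio.
print(real_time_factor(35.0, 100.0))  # 0.35
```

A value below 1.0 means decoding runs faster than real time.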

@@ Line 1 / Line 1 @@
-=Project management=
-* Xiaoming and Xiao Na were added into the mail list
-* Potential Huawei conference-transcribing project was discussed
 =DNN training=
 ==Environment setting==
-* New disk space (3T) was created and mounted at /nfs/disk1
+* Accounts re-arrangement done on the SGE cluster. NO ROOT TO WORK.
-* Jobs with 100 threads work fine on the cluster
+* Changed NFS server to 40 processes, hope to increase disk reading.
+* Agree to withdraw root/sudo privilege.
+* Agree to create a RAID-0 with another 3 3T disks
 ==Corpora==
-* 60 hour data were cut this week
+* Changed the data labeling strategy: gender and noise length will not be labelled for the following several corpora.
-* Just send out to vendors for labeling
+* Automatic labeling
-* Waiting for out-source platform construction
+:* Xiaoming will work with Zhiyong to discover how to generate transcriptions with confidence score held.
-* We assume 60 hour data per week in future
+:* The first step is to investigate the raw accuracy on the domain-dependent test, and then decide if it is appropriate to use automatic labeling
+* Xiao Na will prepare 300h telephone speech data (Sinovoice recording). This will be used to improve the 8k model.
 ==470 hour 8k training==
-* CE training done
+* MPE training done
-* MPE training partially done
 {| class="wikitable"
 ! Model !! CE !! MPE1!! MPE2 !! MPE3 !! MPE4
 |-
-|4k states||23.27/22.85 || 21.35/18.87 || 21.18/18.76 || 21.07/18.54
+|4k states ||23.27/22.85 || 21.35/18.87 || 21.18/18.76 || 21.07/18.54 || 20.93/18.32
 |-
-|8k states ||22.16/22.22 || - ||20.36/17.94 || - ||
+|8k states ||22.16/22.22 || 20.55/18.03 ||20.36/17.94  || 20.32/17.78 || 20.29/17.80
 |-
 |}
@@ Line 33 / Line 31 @@
 ==6000 hour 16k training==
-* Audio files ready. Files with incorrect sampling rates were removed
+* Feature extraction done: solved several problems in the data: (1) short wave (2) unmatched file length (3) unmatched sample rate.
-* Lexicon and LM were ready
+* Training has gone to tri4b, quick increase of states/pdfs.
-* Making MFCC features
+* DNN training will be started on Tuesday.
-* Initial model (6 iterations etc) can be delivered before the spring holiday
 =DNN Decoder=
-* Initial trail of DNN decoder based on the Sinovoice code was failed, largely due to FST compiler
-* Change the strategy to an integrated approach: use the sinovoice system to control connections, and use Kaldi base for asr engine
+* Sinovoice decoder: some errors in FST building. Many triphones were lost after C composing. Problems in cdgen?
-* Xiaoming will do some investigation on the Sinovoice FST compiler, while Liu Chao will focus on the Kaldi-based decoder
+* Kaldi decoder:
+:* A minor difference between CLG/HCLG results was found. Debugging into the problem.
+:* CLG RT is comparable to the HCLG, roughly 0.3-0.4 in CSLT grid-2.
+:* Additional optimization on pdf-pre-computing will be investigated.
+:* Code deliver today.

Difference between revisions of “Sinovoice-2014-01-20”
