Data sharing
- LM count files still undelivered!
DNN progress
Experiments
- sparse DNN: sticky training (retrain the nnet while keeping the sparseness)
Zero out small weight values (test set: 1900):
| threshold          | 0    | 0.01 | 0.03 | 0.05 | 0.08 | 0.1  | 0.2  | 0.3   |
|--------------------|------|------|------|------|------|------|------|-------|
| shrinkage%         | 0.0  | 4.3  | 12.7 | 20.9 | 32.5 | 39.5 | 66.4 | 81.6  |
| WER without sticky | 7.55 | 7.60 | 7.62 | 7.66 | 7.72 | 7.87 | 9.46 | 53.23 |
| WER with sticky    | 7.55 | 7.57 | 7.60 | 7.60 | 7.63 | 7.64 |      |       |
The conclusion is that with the L2 retraining, most of the DNN performance is recovered. We are still waiting for the results with extremely sparse networks.
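To make the "sticky" scheme concrete: weights below the threshold are zeroed once, and the resulting mask is re-applied after every update so the pruned weights stay at zero during retraining. Below is a minimal NumPy sketch of this idea; the plain-SGD update and all names are illustrative assumptions, not the actual training code.

```python
import numpy as np

def prune(W, threshold):
    """Zero all weights whose magnitude falls below the threshold.
    Returns the pruned matrix and the "sticky" mask of survivors."""
    mask = np.abs(W) >= threshold
    return W * mask, mask

def sticky_sgd_step(W, grad_W, mask, lr=0.01):
    """One plain-SGD step that keeps pruned weights at zero ("sticky")."""
    W = W - lr * grad_W
    return W * mask  # re-apply the mask so zeroed weights stay zero

# Example: prune at threshold 0.1 and report the shrinkage percentage,
# i.e. the fraction of weights zeroed (the "shrinkage%" row above).
W = 0.1 * np.random.randn(1024, 1024)
W, mask = prune(W, threshold=0.1)
print("shrinkage%: {:.1f}".format(100.0 * (1.0 - mask.mean())))
```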
- fixed-point DNN forwarding
Building on the fixed-point FST and NN work, and on the sparse-NN results, we are developing a fast NN decoder suitable for embedded devices. This work has just started.
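For illustration, a common way to do fixed-point NN forwarding is to quantize weights and activations to int16 with a fixed number of fractional bits and accumulate in int32. The sketch below shows this idea; the Q8 format, layer shape, and all names are assumptions, not the actual implementation.

```python
import numpy as np

Q = 8  # number of fractional bits; scale factor is 2**Q

def to_fixed(x, q=Q):
    """Quantize a float array to int16 fixed point with q fractional bits."""
    return np.clip(np.round(x * (1 << q)), -32768, 32767).astype(np.int16)

def fixed_affine(x_q, W_q, b_q, q=Q):
    """y = W x + b in fixed point: int16 operands, int32 accumulator,
    then rescale by 2**q so the output stays in the same Q format."""
    acc = W_q.astype(np.int32) @ x_q.astype(np.int32)
    return ((acc >> q) + b_q).astype(np.int16)

# Compare against the float reference on random data.
W, b, x = np.random.randn(512, 256), np.random.randn(512), np.random.randn(256)
y_fixed = fixed_affine(to_fixed(x), to_fixed(W), to_fixed(b)) / float(1 << Q)
print(np.max(np.abs(y_fixed - (W @ x + b))))  # quantization error
```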
Tencent exps
The 1000-hour experiments were finished this week; the performance is as follows:
| Test Set | Old Baseline | New Baseline | DNN  |
|----------|--------------|--------------|------|
| 1900     | 8.4          | 6.8          | 4.3  |
| 2044     | 22.4         | 15.7         | 12.7 |
| online1  | 35.6         | 32.7         | 25.8 |
| online2  | 29.6         | 27.3         | 22.1 |
| map      | 24.5         | 15.8         | 13.4 |
| notepad  | 16           | 8.1          | 5.6  |
| general  | 36           | 25.1         | 19.3 |
| speedup  | 26.8         | 14           |      |
Next plans: 6000-hour model training, plus other DNN-related techniques (sequence discriminative training, alignment, pre-training).
GPU & CPU merge
- in progress.
Kaldi/HTK merge
- HTK2Kaldi: on hold.
- Kaldi2HTK: on hold; second priority.
The above work is probably not very necessary, since Tencent will fully migrate to the hybrid DNN approach and therefore HTK will never be used.
Embedded progress
- Status:
- checked the reference and changed the compilation options.
- the large-scale AM training based on the Tencent 400h data is done.
- the random output problem is fixed.
| Test Set | #utt | PS default      | Tencent         |
|----------|------|-----------------|-----------------|
| cw       | 993  | 8.01 (RT: 0.07) | 7.61 (RT: 0.40) |
| hfc      | 986  | 6.69 (RT: 0.07) | 5.48 (RT: 0.40) |
| zz       | 984  | 12.73 (RT: 0.07)| 5.91 (RT: 0.40) |
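For readability: each cell above is presumably the error rate (in %) with the decoder's real-time factor (RT) in parentheses, where RT is decoding time divided by audio duration. A minimal sketch of both metrics (the function names are hypothetical, not the actual scoring scripts):

```python
def wer(num_errors, num_ref_words):
    """WER% = (substitutions + deletions + insertions) / #reference words."""
    return 100.0 * num_errors / num_ref_words

def rt_factor(decode_seconds, audio_seconds):
    """RT < 1 means faster than real time; 0.07 is roughly 14x real time."""
    return decode_seconds / audio_seconds

print(wer(80, 1000), rt_factor(70.0, 1000.0))  # -> 8.0 0.07
```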
- To be done:
- large-scale parallel training.
- NN-based engine (dynamic and static).