Difference between revisions of "2013-05-24"
From cslt Wiki
Latest revision as of 14:08, 1 June 2013 (Saturday)
Data sharing
- LM count files still undelivered!
DNN progress
Experiments
- sparse DNN: sticky training (retrain the nnet while keeping the sparsity)
Zero out small values (test set: 1900):
threshold | 0 | 0.01 | 0.03 | 0.05 | 0.08 | 0.1 | 0.2 | 0.3 |
---|---|---|---|---|---|---|---|---|
shrinkage% | 0.0 | 4.3 | 12.7 | 20.9 | 32.5 | 39.5 | 66.4 | 81.6 |
without sticky: WER | 7.55 | 7.60 | 7.62 | 7.66 | 7.72 | 7.87 | 9.46 | 53.23 |
with sticky: WER | 7.55 | 7.57 | 7.60 | 7.60 | 7.63 | 7.64 | 8.35 | 9.51 |
The conclusion is that with the L2 retrain, the DNN performance is largely recovered. The extremely sparse case (threshold 0.3) with sticky training is striking: WER is 9.51, against 53.23 without sticky training. This suggests the network can be made sparse. However, this result is only for the 1900 test set; tests on other sets are needed.
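As a rough illustration of the sticky-training idea above, the following sketch prunes weights below a threshold and then applies an SGD update that keeps the pruned positions at zero. It is not the code used in these experiments: the function names, the strict `<` comparison, the learning rate, and reading "L2 retrain" as L2 weight decay are all assumptions made for the example.

```python
def prune_mask(weights, threshold):
    """Build a 0/1 mask: 0 where |w| < threshold (pruned), 1 elsewhere."""
    return [[0.0 if abs(w) < threshold else 1.0 for w in row] for row in weights]


def shrinkage_percent(mask):
    """Percentage of weights that the mask sets to zero."""
    total = sum(len(row) for row in mask)
    zeros = sum(1 for row in mask for m in row if m == 0.0)
    return 100.0 * zeros / total


def sticky_update(weights, grads, mask, lr=0.1, l2=0.0):
    """One SGD step with optional L2 decay (assumed form of the 'L2 retrain').

    Multiplying by the mask after the step keeps pruned weights at zero,
    so the sparsity pattern 'sticks' during retraining.
    """
    return [
        [(w - lr * (g + l2 * w)) * m for w, g, m in zip(w_row, g_row, m_row)]
        for w_row, g_row, m_row in zip(weights, grads, mask)
    ]


# Hypothetical toy layer; real gradients would come from backpropagation.
toy_weights = [[0.5, -0.02], [0.005, -0.3]]
toy_grads = [[1.0, 1.0], [1.0, 1.0]]
toy_mask = prune_mask(toy_weights, threshold=0.03)
toy_retrained = sticky_update(toy_weights, toy_grads, toy_mask, lr=0.1)
```

Without the mask multiplication, the first gradient step would move the pruned weights away from zero and the shrinkage would be lost.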
- fixed-point DNN forwarding
Building on the fixed-point FST and NN work and the sparse NN results, we are working on a fast NN decoder suitable for embedded devices. The work has just started.
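To make the fixed-point forwarding idea concrete, here is a minimal sketch of one affine layer computed with integer multiply-accumulate. The symmetric 8-bit scheme, the single scale per tensor, and all function names are assumptions for illustration; the report does not describe the actual quantization used.

```python
def quantize(values, bits=8):
    """Symmetric linear quantization of a flat list; returns (ints, scale)."""
    qmax = (1 << (bits - 1)) - 1          # 127 for 8 bits
    peak = max(abs(v) for v in values) or 1.0
    scale = peak / qmax
    return [int(round(v / scale)) for v in values], scale


def quantize_matrix(matrix, bits=8):
    """Quantize a matrix with one shared scale (an assumed simplification)."""
    flat = [w for row in matrix for w in row]
    q_flat, scale = quantize(flat, bits)
    width = len(matrix[0])
    q_rows = [q_flat[i:i + width] for i in range(0, len(q_flat), width)]
    return q_rows, scale


def fixed_point_affine(q_weights, w_scale, q_inputs, x_scale, bias):
    """Affine layer: integer accumulation, one float rescale per output."""
    outputs = []
    for q_row, b in zip(q_weights, bias):
        acc = sum(qw * qx for qw, qx in zip(q_row, q_inputs))  # integers only
        outputs.append(acc * w_scale * x_scale + b)
    return outputs


# Hypothetical toy layer for comparison against the float result.
layer_w = [[0.5, -0.25], [1.0, 0.75]]
layer_b = [0.0, 0.1]
layer_x = [1.0, -0.5]
q_w, w_scale = quantize_matrix(layer_w)
q_x, x_scale = quantize(layer_x)
fixed_out = fixed_point_affine(q_w, w_scale, q_x, x_scale, layer_b)
float_out = [sum(w * x for w, x in zip(row, layer_x)) + b
             for row, b in zip(layer_w, layer_b)]
```

The inner loop touches only integers, which is the part that matters on an embedded CPU without fast floating point. Combined with a sparse weight matrix, zero entries could also be skipped, although this sketch does not do so.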
Tencent exps
This week's 1000-hour experiments have finished. The results are as follows:
test set | old baseline | new baseline | DNN |
---|---|---|---|
1900 | 8.4 | 6.8 | 4.3 |
2044 | 22.4 | 15.7 | 12.7 |
online1 | 35.6 | 32.7 | 25.8 |
online2 | 29.6 | 27.3 | 22.1 |
map | 24.5 | 15.8 | 13.4 |
notepad | 16 | 8.1 | 5.6 |
general | 36 | 25.1 | 19.3 |
speedup | 26.8 | 14 | - |
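For reading the table, the relative improvement of the DNN over the new baseline can be computed as below. This is only a convenience sketch: it assumes the table values are error rates in percent (the table itself does not state the metric), and the dictionary and function names are made up for the example. The speedup row is left out because it has no DNN value.

```python
# (new baseline, DNN) pairs copied from the table above.
RESULTS = {
    "1900": (6.8, 4.3),
    "2044": (15.7, 12.7),
    "online1": (32.7, 25.8),
    "online2": (27.3, 22.1),
    "map": (15.8, 13.4),
    "notepad": (8.1, 5.6),
    "general": (25.1, 19.3),
}


def relative_reduction(baseline, system):
    """Relative error reduction in percent: 100 * (baseline - system) / baseline."""
    return 100.0 * (baseline - system) / baseline


reductions = {
    name: relative_reduction(base, dnn) for name, (base, dnn) in RESULTS.items()
}

for name, value in reductions.items():
    print(f"{name}: {value:.1f}% relative reduction")
```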
Next steps:
- 6000-hour model training, plus other DNN-related techniques (sequence discriminative training, alignment, pretraining)
GPU & CPU merge
- in progress.
Kaldi/HTK merge
- HTK2Kaldi: on hold.
- Kaldi2HTK: on hold, second priority.
The above work is probably unnecessary, since Tencent will fully migrate to the hybrid DNN approach and HTK will therefore no longer be used.
Embedded progress
- Status:
- checked the reference and changed the compiler options.
- the large-scale AM training based on the Tencent 400h data is done.
- the random output problem is fixed.
Test Set | #utt | PS default | Tencent |
---|---|---|---|
cw | 993 | 8.01(RT: 0.07) | 7.61(RT: 0.40) |
hfc | 986 | 6.69(RT: 0.07) | 5.48(RT: 0.40) |
zz | 984 | 12.73(RT: 0.07) | 5.91(RT: 0.40) |
- To be done:
- large-scale parallel training.
- NN-based engine (dynamic and static).