Differences between revisions of “Sinovoice-2016-6-2”

From cslt Wiki

(8 intermediate revisions by the same user not shown)
Line 17: Line 17:
  * PingAn
  :*100h User data done
- ==Model Training==
- ===Deletion Error Problem===
- * Add one noise phone to alleviate the silence over-training, looks OK.
- * Omit sil accuracy in discriminative training
- * H smoothing of XEnt and MPE, no significant effect.
- * Add one silence arc from start-state to end-state
  ===Big-Model Training===
  ====16k====
  * Done!
- ====8k=====
- =====Project=====
+ =====8k Project=====
  * PingAn
Line 75: Line 68:
  * LiaoNingYiDong:
- =======================================================================
- |     AM / config                     |     LNYD      |  LNYD re-tag  |
- -----------------------------------------------------------------------
- | tdnn 7-2048 xEnt                    |     21.51     |               |
- | tdnn 7-2048 MPE                     |     20.09     |               |
- | tdnn 7-2048 MPE adapt-LNYD          |     17.92     |     16.29     |
- -----------------------------------------------------------------------
- | tdnn 7-1024 xEnt                    |     21.72     |               |
- | tdnn 7-1024 MPE                     |     20.99     |               |
- | cnn 7-1024 xEnt 600.mdl             |     21.03     |               |
- | cnn 7-1024 MPE 12.mdl               |     19.80     |               |
- | cnn 7-1024 MPE adapt-LNYD 41.mdl    |     17.96     |     15.93     |
- -----------------------------------------------------------------------
- | spn 7-1024 xEnt                     |     21.70     |               |
- | spn 7-1024 MPE-1000H 23.mdl         |     19.97     |               |
- | spn 7-1024 MPE adapt-LNYD           |     18.67     |               |
- | spn cnn 7-1024 xEnt 300.mdl         |     22.26     |               |
- =======================================================================
+ :* Done.
  ===Embedding===
Line 118: Line 94:
  ===Problem===
- * Pingan & Yueyu too much deletion error.
- :* TDNN deletion error rate > DNN deletion error rate
- :* TDNN Silence scale is too sensitive for different test cases.
- * cmvn causes performance reduction.
+ * Deletion error
+ :* Solved by adding a noise phone(spn) to alleviate the silence over-training.
+ :* Need to re-train our current model.
+ * TDNN silence scale is too sensitive for different test cases.
+ :* After adding "spn", maybe there's no need to tune silence-scale so carefully.
+ * Stream-mode decoding(cmvn) causes performance reduction.
  ==SiaSun Robot==

Latest revision as of 05:11, 2 June 2016 (Thursday)

Data

  • 16K LingYun
      • 2000h data ready
      • 4300h real-env data to label
  • YueYu
      • Total 250h(190h-YueYu + 60h-English)
      • Add 60h YueYu
      • CER: 75%->76%
  • WeiYu
      • 8k more data
      • 50h for training
      • 120h labeled ready
  • PingAn
      • 100h User data done

Big-Model Training

16k

  • Done!


8k Project
  • PingAn
 =========================================================================================
 |     AM / config                     |      all      |    KeHu wer   ||  KeHu no-ins  |
 -----------------------------------------------------------------------------------------
 | tdnn 7-2048 xEnt                    |     16.45     |     36.49     ||     25.18     |
 | tdnn 7-2048 MPE                     |     15.22     |     32.77     ||     23.48     |
 | tdnn 7-2048 MPE adapt-PABX          |     14.67     |     31.33     ||     22.76     |
 -----------------------------------------------------------------------------------------
 | tdnn 7-1024 xEnt                    |     16.60     |     35.91     ||     25.58     |
 | tdnn 7-1024 MPE 2e-6                |     15.67     |     32.77     ||     26.09     |
 | tdnn 7-1024 MPE 2e-5 1.mdl          |     15.54     |     32.77     ||     26.29     |
 | tdnn 7-1024 MPE 1e-5 4.mdl          |     15.76     |     33.55     ||     27.20     |
 | tdnn 7-1024 MPE adapt-PABX          |     14.80     |     30.48     ||     22.56     |
 -----------------------------------------------------------------------------------------
 | spn 7-1024 xEnt                     |     16.49     |     36.23     ||     24.59     |
 | spn 7-1024 xEnt xEnt-PA_user 101.mdl|     16.19     |     33.22     ||     22.69     |
 | spn 7-1024 xEnt xEnt-PA_user mpe    |     15.24     |     32.77     ||     21.65     |
 | spn 7-1024 MPE-1000H 23.mdl         |     15.29     |     33.09     ||     21.65     |
 | spn 7-1024 MPE adapt-PA_all 29.mdl  |     15.11     |     33.42     ||     21.84     |
 | spn 7-1024 MPE adapt-PA_user 2e-5   |     15.31     |     31.79     ||     20.14     |
 | spn 7-1024 MPE adapt-PA_user Hs 2e-5|     15.32     |     32.24     ||     20.93     |
 =========================================================================================
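
The "KeHu wer" and "KeHu no-ins" columns above differ in how errors are counted. As a rough sketch, assuming "no-ins" means the error rate with insertions left out (the report does not define it), the Python below aligns a reference and a hypothesis by edit distance and reports substitutions, deletions, and insertions separately. The function names are illustrative and are not taken from the project's scoring tools.

```python
def error_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) from a Levenshtein alignment."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # cost[i][j] = (total_errors, subs, dels, ins) for ref[:i] versus hyp[:j]
    cost = [[None] * cols for _ in range(rows)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, rows):
        cost[i][0] = (i, 0, i, 0)  # every reference token deleted
    for j in range(1, cols):
        cost[0][j] = (j, 0, 0, j)  # every hypothesis token inserted
    for i in range(1, rows):
        for j in range(1, cols):
            t, s, d, n = cost[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                best = (t, s, d, n)          # match, no new error
            else:
                best = (t + 1, s + 1, d, n)  # substitution
            t, s, d, n = cost[i - 1][j]
            best = min(best, (t + 1, s, d + 1, n))  # deletion
            t, s, d, n = cost[i][j - 1]
            best = min(best, (t + 1, s, d, n + 1))  # insertion
            cost[i][j] = best
    _, subs, dels, ins = cost[-1][-1]
    return subs, dels, ins


def wer(ref, hyp, count_insertions=True):
    """Error rate in percent; count_insertions=False mimics an assumed 'no-ins' score."""
    if not ref:
        raise ValueError("reference must not be empty")
    subs, dels, ins = error_counts(ref, hyp)
    errors = subs + dels + (ins if count_insertions else 0)
    return 100.0 * errors / len(ref)
```

Comparing the deletion count against the other two error types per test set is one way to check whether a model change, such as adding the noise phone, actually reduces deletions rather than just shifting errors around.
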
 =====================================================================================================
 |     LM / config                           |     KeHu      |KeHu check_zxm_recheck|  KeHu final   |
 -----------------------------------------------------------------------------------------------------
 |     baseline                              |     20.14     |         19.40        |     18.26     |
 -----------------------------------------------------------------------------------------------------
 |bank+baoxian.chart+word.w0.9               |       -       |         19.27        |     17.88     |
 |bank+baoxian+guojiadianwang.chart+word.w0.9|       -       |         19.02        |     17.94     |
 |bank+baoxian+guojiadianwang_w0.9           |       -       |         19.02        |       -       |
 |bank+baoxian_w0.9                          |       -       |         19.20        |       -       |
 |baoxian+bank_w0.9                          |       -       |         19.02        |       -       |
 |baoxian+user200h.chart+word.w0.9           |       -       |         19.20        |       -       |
 |baoxian+user200h_w0.8                      |       -       |         19.02        |       -       |
 |baoxian+user200h_w0.9                      |       -       |         19.08        |     18.01     |
 |baoxian+user200h.w0.9w0.9                  |       -       |         19.08        |       -       |
 |baoxian+user200h.chart1e-7.w0.9w0.1        |       -       |           -          |     23.82     |
 |baoxian+user200h.chart1e-7.w0.9w0.9        |       -       |           -          |     18.26     |
 =====================================================================================================
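
The LM configuration names above (for example "bank+baoxian_w0.9") look like mixtures of domain language models with a weight. As a minimal sketch, assuming "w0.9" denotes a linear interpolation weight on the first component (the report does not say which tool or which component the weight applies to), interpolation of two toy unigram distributions can be written as follows. The function and the toy vocabularies are hypothetical.

```python
def interpolate(p_a, p_b, weight):
    """Linearly mix two probability tables: weight * p_a + (1 - weight) * p_b."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    vocab = set(p_a) | set(p_b)
    return {
        w: weight * p_a.get(w, 0.0) + (1.0 - weight) * p_b.get(w, 0.0)
        for w in vocab
    }


# Toy distributions, purely illustrative (not data from the report).
p_general = {"bank": 0.5, "loan": 0.5}
p_domain = {"bank": 0.2, "policy": 0.8}
mixed = interpolate(p_general, p_domain, 0.9)
```

Because both inputs sum to one, the mixture also sums to one. A real n-gram mixture applies the same formula per history, which is why small weight changes such as w0.8 versus w0.9 move the error rate only slightly in the table.
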


  • LiaoNingYiDong:
      • Done.

Embedding

  • Model training finished; performance, speed, and size all look OK.
  • Needs further testing on an embedded device.

Character LM

  • Except Sogou-2T, 9-gram has been done.
  • Adding a word boundary tag to Character-LM training: done
      • 9-gram
      • Except Weibo & Sogou-2T
      • 1e-7(13M) wer17.91 compared with 1e-7(no-boundary,71M) 13.4
      • 1e-8(54M) wer17.54
  • Prepare specific domain vocabulary
      • Dianxin/Baoxian/Dianli
  • DT lm training
      • ReFr
  • Merge Character-LM & word-LM
      • Union
      • Compose, success.
      • 2-step decoding: first, character-based LM. Then, word-based LM.
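
The word boundary tag mentioned above can be pictured as an extra token in the character sequence. The sketch below is only an illustration: the tag string "<wb>" and the choice to place it after each word are assumptions, since the report does not describe the actual tagging scheme.

```python
def to_tagged_chars(words, boundary="<wb>"):
    """Split segmented words into character tokens, appending a boundary tag after each word."""
    tokens = []
    for word in words:
        tokens.extend(word)      # one token per character
        tokens.append(boundary)  # hypothetical end-of-word marker
    return tokens


def to_plain_chars(words):
    """Character tokens without boundary information (the 'no-boundary' case)."""
    return [ch for word in words for ch in word]
```

For example, `to_tagged_chars(["中国", "银行"])` yields the tokens 中, 国, `<wb>`, 银, 行, `<wb>`. The tagged sequence is longer for the same text, so an n-gram of fixed order spans fewer characters than in the no-boundary case.
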

Problem

  • Deletion error
      • Solved by adding a noise phone(spn) to alleviate the silence over-training.
      • Need to re-train our current model.
  • TDNN silence scale is too sensitive for different test cases.
      • After adding "spn", maybe there's no need to tune silence-scale so carefully.
  • Stream-mode decoding(cmvn) causes performance reduction.
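
One plausible reason stream-mode CMVN behaves differently is that it can only use statistics from frames seen so far, while offline CMVN normalizes with statistics of the whole utterance. The sketch below contrasts the two on a single feature dimension. It is a generic illustration under that assumption, not the decoder used in the project, and the function names are made up for this example.

```python
import math


def batch_cmvn(frames):
    """Normalize with the mean and variance of the whole utterance (offline mode)."""
    n = len(frames)
    mean = sum(frames) / n
    var = sum((x - mean) ** 2 for x in frames) / n
    std = math.sqrt(var) or 1.0  # avoid dividing by zero on constant input
    return [(x - mean) / std for x in frames]


def streaming_cmvn(frames):
    """Normalize each frame with statistics of the frames seen so far (stream mode)."""
    out = []
    total = total_sq = 0.0
    for count, x in enumerate(frames, start=1):
        total += x
        total_sq += x * x
        mean = total / count
        var = max(total_sq / count - mean * mean, 0.0)
        std = math.sqrt(var) or 1.0
        out.append((x - mean) / std)
    return out
```

The two outputs agree on the last frame, where the running statistics have caught up with the utterance statistics, but differ most on the earliest frames. A model trained on per-utterance normalized features therefore sees mismatched inputs at the start of a stream.
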

SiaSun Robot

  • Beam-forming algorithm test
  • NN-model based beam-forming

SID

Digit

  • Engine Package