Xinsong-beamforming-result
From cslt Wiki
Far-field SNR: 14 dB (restaurant recordings). The training data, recorded on 2016.4.22, has 250 sentences; the test data has 11 sentences. All data are 16 kHz, 16-bit.
AM: 10000h, 7*2048, MPE
LM: 1e-7, 5-gram
ID3: test data that were also included in the DAE training
baseline:
-------------------------------------------------------------------
system   | test random WER | test ID3 WER (within training set)
-------------------------------------------------------------------
c1       | 39.74           | 36.89
c2       | 30.77           | 34.95
c3       | 34.62           | 37.38
c4       | 42.31           | 36.89
near     | 21.79           | 4.37
-------------------------------------------------------------------

beamforming:
-------------------------------------------------------------------------------------------
system                         | test random WER | test ID3 WER (within training set)
-------------------------------------------------------------------------------------------
DS_post                        | 26.92           | 29.13
SD_post                        | 23.08           | 26.70 (second)
MVDR_post                      | 26.92           | 28.64
sino beamforming_from_xiaoming | 26.92           | 31.07
-------------------------------------------------------------------------------------------

Four-channel cnn_tdnn model: 160 fbank
-----------------------------------------------------------------------------------------------------------------
model                                                | test random WER | test ID3 WER (within training set)
-----------------------------------------------------------------------------------------------------------------
tdnn_dae_1*1024_tdnn_lr_0.008                        | 38.46           | 10.19
cnn_tdnn_daa_1*128_cnn_1*1024_tdnn_lr_0.008          | 39.74           | 5.34
cnn_tdnn_dae_1*128_cnn_1*512_tdnn_lr_0.008           | 48.72           | 7.28
cnn_tdnn_dae_1*64_cnn_1*1024_tdnn_lr_0.008           | 39.74           | 7.28
cnn_tdnn_dae_2*64_cnn_1*1024_tdnn_lr_0.008           | 50.00           | 25.24
cnn_tdnn_dae_2*64_cnn_1*1024_tdnn_lr_0.008_nopooling | 44.87           | 23.30
cnn_tdnn_dae_1*32_cnn_1*1024_tdnn_lr_0.008           | 50.00           | 8.25
cnn_tdnn_dae_2*32_cnn_1*1024_tdnn_lr_0.008           | 46.15           | 33.98
cnn_tdnn_dae_2*32_cnn_1*1024_tdnn_lr_0.008_nopooling | 47.44           | 28.16
cnn_tdnn_dae_1*16_cnn_1*1024_tdnn_lr_0.008           | 46.15           | 8.25
cnn_tdnn_dae_2*16_cnn_1*1024_tdnn_lr_0.008           | 62.82           | 17.96
cnn_tdnn_dae_2*16_cnn_1*1024_tdnn_lr_0.008_nopooling | 50.00           | 22.82
cnn_tdnn_dae_2*16_cnn_1*256_tdnn_lr_0.008            | 48.72           | 16.99
cnn_tdnn_dae_2*16_cnn_1*256_tdnn_lr_0.008_nopooling  | 48.72           | 19.90
-----------------------------------------------------------------------------------------------------------------

Single channel cnn_tdnn: 40 Fbanks
-----------------------------------------------------------------------------------------------------------------
model                                                | test random WER | test ID3 WER (within training set)
-----------------------------------------------------------------------------------------------------------------
cnn_tdnn_mix_dae_1*128_cnn_1*1024_tdnn_lr_0.008_c1   | 25.64           | 9.22
cnn_tdnn_mix_dae_1*128_cnn_1*1024_tdnn_lr_0.008_c2   | 28.21           | 7.28
cnn_tdnn_mix_dae_1*128_cnn_1*1024_tdnn_lr_0.008_c3   | 20.51           | 4.37 (first)
cnn_tdnn_mix_dae_1*128_cnn_1*1024_tdnn_lr_0.008_c4   | 21.79           | 5.34
-----------------------------------------------------------------------------------------------------------------
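
For readers comparing the WER figures above, the following is a minimal sketch of how a word error rate and a relative WER reduction are commonly computed. It assumes the standard edit-distance definition of WER and whitespace-separated tokens; this page does not say which scoring tool or tokenization was used, so the function names `word_error_rate` and `relative_wer_reduction` are illustrative only and not part of the original experiments.

```python
def word_error_rate(reference, hypothesis):
    """Return WER in percent: edit distance / number of reference tokens.

    Assumption: tokens are separated by whitespace. For character-level
    scoring, pass strings whose characters are already space-separated.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer, system_wer):
    """Relative WER reduction in percent with respect to the baseline."""
    return 100.0 * (baseline_wer - system_wer) / baseline_wer


if __name__ == "__main__":
    # Toy sentences for illustration only, not taken from the test set.
    print(word_error_rate("a b c d", "a x c"))
    # Table figures: baseline c3 (34.62) vs. single-channel cnn_tdnn c3 (20.51).
    print(round(relative_wer_reduction(34.62, 20.51), 2))
```

The relative reduction is useful here because the absolute WERs of the two test sets differ widely, so a ratio against the matching baseline channel gives a fairer comparison than the raw difference.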