2013-11-15

Data sharing

LM count files still undelivered!

AM development

Sparse DNN

Optimal Brain Damage(OBD).

Basic OBD done, with the ICASSP paper submitted.
Online OBD running

Trying 3 configurations: batch size = 256, batch size = 13000 (10 prunings), and whole data.
The current results show that the performance follows the order: Acc(whole data) > Acc(256) > Acc(13000).
Investigate an intermediate update schedule, e.g., updating twice per iteration.
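
For reference, the pruning step in OBD can be sketched as below. This is a minimal illustration of the standard OBD saliency, 0.5 * h_ii * w_i^2, and not the implementation used in these experiments. The function name obd_prune_mask, the prune_fraction parameter, and the assumption that a diagonal Hessian estimate is already available are all hypothetical.

```python
import numpy as np

def obd_prune_mask(weights, hessian_diag, prune_fraction):
    """Return a boolean mask that is False for the lowest-saliency weights.

    weights        -- weight matrix of one layer
    hessian_diag   -- diagonal Hessian estimate, same shape as weights (assumed given)
    prune_fraction -- fraction of weights to remove in this pruning step
    """
    # Standard OBD saliency: estimated loss increase when a weight is set to zero.
    saliency = 0.5 * hessian_diag * weights ** 2
    n_prune = int(prune_fraction * weights.size)
    mask = np.ones(weights.shape, dtype=bool)
    if n_prune > 0:
        # Indices (into the flattened array) of the least salient weights.
        prune_idx = np.argsort(saliency, axis=None)[:n_prune]
        mask.flat[prune_idx] = False
    return mask

weights = np.array([[1.0, -0.1], [0.5, 2.0]])
hessian_diag = np.ones_like(weights)
mask = obd_prune_mask(weights, hessian_diag, prune_fraction=0.5)
pruned_weights = weights * mask
```

In an online setting, the mask would presumably be recomputed every so many mini-batches rather than once over the whole data; how often is exactly what the configurations above vary.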

Noisy training

An ICASSP paper submitted.

Simulated Annealing training.

Rejected with small noises.
Using just the clean speech, it is still rejected. This is a bit strange.
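
To make "rejected" concrete, here is a sketch of the usual Metropolis acceptance test used in simulated annealing. The notes do not say which acceptance rule or temperature schedule is used in these experiments, so the rule, the function name accept_candidate, and its parameters are assumptions for illustration only.

```python
import math
import random

def accept_candidate(current_loss, candidate_loss, temperature, rng=random):
    """Metropolis criterion: always accept improvements, sometimes accept worse candidates."""
    if candidate_loss <= current_loss:
        return True
    if temperature <= 0:
        # At zero temperature the rule degenerates to greedy acceptance.
        return False
    # A worse candidate is accepted with probability exp(-delta / T).
    accept_prob = math.exp(-(candidate_loss - current_loss) / temperature)
    return rng.random() < accept_prob
```

Under this rule, a candidate trained on clean speech would be rejected only if its loss is higher than the current model's and the temperature is low, which is one thing to check when the rejection looks strange.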

Noise concentrated training

Using pure noise (no silence, narrow SNR band). Most of the results are as expected.
Need to check the case with car-noise 20/25 dB training and white-noise 20 dB test.

Noise-adding modification

Need to re-implement the noise-adding so that noise is added to the waveform before the fbank computation.
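
A minimal sketch of waveform-level noise adding at a target SNR, to be applied before any feature extraction. The function name add_noise_at_snr and the choice to tile the noise to the speech length are assumptions, not the planned implementation.

```python
import numpy as np

def add_noise_at_snr(speech, noise, snr_db):
    """Mix noise into a speech waveform so that the result has the given SNR in dB."""
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    # Repeat the noise if it is shorter than the speech, then crop to the same length.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[:len(speech)]
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if noise_power == 0:
        return speech.copy()
    # Scale the noise so that speech_power / scaled_noise_power = 10^(snr_db / 10).
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + gain * noise

rng = np.random.default_rng(0)
speech = np.sin(np.linspace(0, 100, 16000))
noise = rng.normal(size=4000)
noisy = add_noise_at_snr(speech, noise, snr_db=20)
```

The fbank features would then be computed from the mixed waveform, so the noise passes through the same filterbank and log compression as the speech.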

Tencent exps

N/A

LM development

NN LM

Results show better performance with NN rescoring.

scale   2044    map     notetp3  record1900  general  online1  online2  speedup
0.5     28.69   34.52   20.56    14.53       45.52    41.3     34.48    33.53
0.6     28.3    34.28   20.67    14.05       45.34    40.73    33.81    32.71
0.7     27.84   33.81   20.18    13.74       45.13    40.29    33.17    31.86
0.8     27.58   33.87   19.16    13.53       44.92    40       32.82    31.74
0.9     27.86   33.92   19.05    13.41       44.9     39.65    32.5     31.89
0.95    27.79   34.07   19.05    13.56       44.83    39.76    32.41    31.68
0.96    27.9    34.1    18.83    13.53       44.83    39.79    32.43    31.68
0.97    27.94   34.15   18.83    13.47       44.82    39.78    32.44    31.89
0.99    28.02   34.2    19       13.49       44.86    39.82    32.47    32.01

QA LM

QA model training is done. Tested on the Sogou Q text.

Data       lexicon  size   size2  PPL      PPL2
Q (10G)    150k     1.5G   800M   301.64   317.19
QA (100G)  110k     4.5G   1G     287.134  315.695
QA (100G)  88k      4.5G   1G     559.029  626.146
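
For reference, perplexity is the exponential of the negative average log-probability per token. The sketch below assumes natural-log token probabilities; the toolkit and log base used to produce the numbers above are not stated in these notes, and the function name perplexity is illustrative.

```python
import math

def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities assigned by the LM."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    avg_neg_log_prob = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_neg_log_prob)

# A model that assigns probability 0.25 to every token has perplexity 4.
uniform_ppl = perplexity([math.log(0.25)] * 4)
```

Because out-of-vocabulary handling changes which tokens are scored, perplexities obtained with different lexicon sizes are not directly comparable unless the test tokens are treated the same way.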
