2013-09-27

Revision as of 02:11, 27 September 2013

Data sharing

  • LM count files still undelivered!

DNN progress

Sparse DNN

  • Optimal Brain Damage-based sparsity is ongoing; we are preparing the algorithm (a minimal sketch follows the report link below).
  • An interesting investigation is to drop out 50% of the weights after each iteration and then retrain without sticky.

Report on [http://192.168.0.50:3000/series/?q=&action=view&series[]=91&series[]=91.0&series[]=91.1&series[]=91.2&series[]=91.3&series[]=91.4&series[]=91.5&series[]=91.6&series[]=91.7&series[]=91.8&series[]=91.9 graph]
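
A minimal sketch of the OBD pruning step, assuming a diagonal Hessian estimate is available (the function name and the Fisher-style H_diag stand-in are illustrative, not the actual recipe; the 50% setting mirrors the drop-out idea above):

  import numpy as np

  def obd_prune(W, H_diag, sparsity=0.5):
      """Zero out the lowest-saliency weights of one layer.

      W        -- weight matrix
      H_diag   -- diagonal Hessian estimate, same shape as W (e.g. an
                  empirical Fisher approximation from squared gradients)
      sparsity -- fraction of weights to remove
      """
      # OBD saliency: second-order estimate of the loss increase when
      # deleting w_i, namely s_i = 0.5 * h_ii * w_i**2
      saliency = 0.5 * H_diag * W ** 2
      threshold = np.quantile(saliency, sparsity)
      mask = saliency > threshold
      # keep the mask so pruned weights can stay zero during retraining
      return W * mask, mask

  # toy usage: prune half the weights of a random 512x256 layer
  rng = np.random.default_rng(0)
  W = rng.normal(size=(512, 256))
  H = rng.uniform(0.1, 1.0, size=W.shape)   # stand-in for a real estimate
  W_sparse, mask = obd_prune(W, H, sparsity=0.5)
  print("pruned fraction: %.2f" % (1.0 - mask.mean()))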

FBank features

  • CMN shows a similar impact on MFCC and FBank. Since each MFCC dimension summarizes many random channels, the mean and covariance of the dimensions are less random. This leads to two possible effects: on one hand, the dimensions are relatively stable, so CMVN does not contribute much; on the other hand, the mean and variance estimates are more accurate, so CMVN gives more reliable results. This means CMVN yields an unpredictable performance improvement for MFCC and FBank, depending on the data set.

Performance chart
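
For reference, a minimal per-utterance CMVN sketch (an assumed implementation, for illustration only):

  import numpy as np

  def cmvn(feats, norm_vars=True, eps=1e-8):
      """feats: (num_frames, dim) MFCC or FBank matrix for one utterance."""
      out = feats - feats.mean(axis=0)           # CMN: remove per-dim mean
      if norm_vars:                              # CVN: scale to unit variance
          out = out / np.sqrt(feats.var(axis=0) + eps)
      return out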

  • Chose various FBank dimensions while keeping the LDA output dimension at 100. FB30 seems the best.

Performance chart

  • Chose FBank 40 and tested various LDA output dimensions. The results show LDA is still helpful, and an output dimension of 200 is sufficient.

Performance chart
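
A rough sketch of the splice-plus-LDA step under assumed settings (the splice context, the toy labels, and the scikit-learn API are stand-ins for the actual pipeline):

  import numpy as np
  from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

  def splice(feats, context=4):
      """Stack +/-context frames: (T, d) -> (T, d * (2*context + 1))."""
      T = len(feats)
      padded = np.pad(feats, ((context, context), (0, 0)), mode="edge")
      return np.hstack([padded[i:i + T] for i in range(2 * context + 1)])

  # FBank 40 with toy per-frame state labels standing in for an alignment
  feats = np.random.randn(2000, 40)
  labels = np.random.randint(0, 300, size=2000)
  lda = LinearDiscriminantAnalysis(n_components=200)   # output dim 200
  projected = lda.fit_transform(splice(feats), labels)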

  • We need to investigate a non-linear discriminative approach that is simple but loses less information.
  • We can also test a simple 'same-dimension DCT'. If its performance is still worse than FB, we confirm that the problem is due to noisy channel accumulation (see the sketch after this list).
  • We need to investigate Gammatone filter banks (GFB): the same idea as FB, keeping as much information as possible. It may also be possible to combine FB and GFB to pursue better performance.
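
Sketches of the two front-end ideas above, under stated assumptions (function names and parameters are illustrative, not the actual experiment scripts):

  import numpy as np
  from scipy.fftpack import dct

  def same_dim_dct(fbank):
      """Orthogonal DCT-II over the filter axis, keeping all coefficients.

      If FB40 -> 40-dim DCT still underperforms plain FB40, the loss is
      not from truncating coefficients, pointing instead at noisy
      channel accumulation.
      """
      return dct(fbank, type=2, axis=-1, norm="ortho")

  def gammatone_ir(fc, fs, duration=0.05, order=4):
      """4th-order gammatone impulse response centred at fc (Hz):
      g(t) = t**(order-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t),
      with bandwidth b tied to the ERB at fc (Glasberg & Moore).
      """
      t = np.arange(int(duration * fs)) / fs
      erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)
      b = 1.019 * erb
      return t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)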

Tencent exps

N/A

DNN Confidence estimation

  • Lattice-based confidence shows better performance with DNN than before.
  • Accumulated DNN confidence is done; the confidence values are much more reasonable.
  • Preparing MLP/DNN-based confidence integration.
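
Our reading of "accumulated DNN confidence" is the aligned-state posteriors accumulated over a word's frames; a hypothetical sketch (names and interface are illustrative, not the actual system):

  import numpy as np

  def word_confidence(frame_posteriors, state_seq):
      """Geometric mean of aligned-state posteriors over a word span.

      frame_posteriors -- (T, num_states) DNN softmax outputs
      state_seq        -- length-T aligned state id per frame
      """
      T = len(state_seq)
      logp = np.log(frame_posteriors[np.arange(T), state_seq] + 1e-10)
      return float(np.exp(logp.mean()))   # confidence in (0, 1]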

Noisy training

  • We trained a model with a random-noise approach, which samples half of the training data and adds 15 dB white noise. We hope this random-noise learning will improve performance on noisy data while keeping the model's discriminative power on clean speech.

Performance chart
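
A sketch of the mixing step, assuming "15 dB" means white noise added at 15 dB SNR (the helper and its interface are hypothetical); the random-noise recipe applies this to the sampled half of the utterances, and the online variant mentioned below would redraw the noise each epoch:

  import numpy as np

  def add_white_noise(speech, snr_db=15.0, seed=None):
      """Mix white noise into a waveform at the given SNR (dB)."""
      rng = np.random.default_rng(seed)
      noise = rng.standard_normal(len(speech))
      p_speech = np.mean(speech ** 2)
      p_noise = np.mean(noise ** 2)
      # scale so that 10*log10(p_speech / p_scaled_noise) == snr_db
      scale = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
      return speech + scale * noise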

  • The results are largely consistent with our expectation: performance on noisy data was greatly improved, while performance on clean speech was not hurt much.
  • We look forward to noisy training that introduces noises randomly online during training.
  • Car-noise training: it shows limited impact from the car noise.

Performance chart