2014-05-16


Resource Building

  • Maxi onboard
  • Release management should be started: Zhiyong (+)
  • Blaster 0.1 & vivian 0.0 system release

Leftover questions

  • Asymmetric window: great improvement on the training set (WER 34% → 24%), but the gain is lost on the test set. Overfitting? (A window sketch follows this list.)
  • Multi-GPU training: error encountered
  • Multilingual training
  • Investigating LOUDS FST.
  • CLG embedded decoder plus online compiler.
  • DNN-GMM co-training
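
A minimal sketch of what "asymmetric window" refers to, assuming numpy: an analysis window with a slow rise and a fast decay instead of a symmetric bell. The 300/100 split and the half-Hanning construction are illustrative assumptions; the exact window used in the experiment is not recorded here.

 import numpy as np

 def asymmetric_window(n_rise, n_fall):
     # Hypothetical asymmetric window: the rising half of a long Hanning
     # window glued to the falling half of a short one.
     rise = np.hanning(2 * n_rise)[:n_rise]   # slow attack
     fall = np.hanning(2 * n_fall)[n_fall:]   # fast decay
     return np.concatenate([rise, fall])

 # 400-sample frame (25 ms at 16 kHz) with a 300/100 rise/fall split
 win = asymmetric_window(300, 100)
 frame = np.random.randn(400)                 # stand-in for one speech frame
 windowed = frame * win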

AM development

Sparse DNN

  • GA-based block sparsity (+++); a sketch of the block-mask idea follows this list.
  • Found a paper from 2000 with similar ideas.
  • Try to get a student working on high-performance computing to do the optimization.
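
A minimal sketch of block sparsity with a GA-style binary mask, assuming numpy: each bit of the chromosome keeps or zeroes one block of a weight matrix, and a GA would evolve these masks against a validation score. The block size and shapes are illustrative.

 import numpy as np

 def apply_block_mask(W, mask, bs):
     # Zero out bs x bs weight blocks wherever the mask bit is 0.
     W = W.copy()
     for i in range(mask.shape[0]):
         for j in range(mask.shape[1]):
             if mask[i, j] == 0:
                 W[i*bs:(i+1)*bs, j*bs:(j+1)*bs] = 0.0
     return W

 rng = np.random.default_rng(0)
 W = rng.standard_normal((512, 512))              # one DNN weight matrix
 bs = 32                                          # illustrative block size
 # chromosome: one bit per block, roughly half the blocks kept
 mask = (rng.random((512 // bs, 512 // bs)) < 0.5).astype(int)
 W_sparse = apply_block_mask(W, mask, bs)
 print("kept fraction:", mask.mean())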

Noise training

  • More with-clean training completed; two conditions remain.

GFbank

  • 8k training: GFBank Sinovoice 1400h MPE stream
  • 16k training: GFBank Sinovoice 6000h MPE1 stream; worse than the 1700h system (10.18 → 11.11)
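
For reference, GFbank features are built on a gammatone filterbank with ERB-spaced center frequencies. A minimal sketch of that spacing (Glasberg & Moore ERB-rate scale), assuming numpy; the channel count and band edges are illustrative:

 import numpy as np

 def erb_space(f_low, f_high, n):
     # n center frequencies, equally spaced on the ERB-rate scale (Hz in/out).
     erb = lambda f: 21.4 * np.log10(4.37e-3 * f + 1.0)    # Hz -> ERB rate
     inv = lambda e: (10.0 ** (e / 21.4) - 1.0) / 4.37e-3  # ERB rate -> Hz
     return inv(np.linspace(erb(f_low), erb(f_high), n))

 # e.g. 23 channels below the 4 kHz Nyquist frequency of 8k audio
 print(erb_space(100.0, 3800.0, 23))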


Multilingual ASR

  • Test of the sharing scheme:
  • With decision-tree sharing, an xEnt improvement was obtained, but no MPE improvement (Chinese slightly worse, English slightly better).


English model

                     mic          tel
pure English         VoxForge     Fisher
Chinese English      Shujutang    converted from Shujutang

Denoising & Farfield ASR

  • Baseline: decoding far-field speech with the close-talk model: WER 92.65%.
  • Will investigate a DAE (denoising autoencoder) model; a sketch follows.
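
A minimal sketch of the DAE idea for far-field compensation, assuming numpy: a one-hidden-layer network trained to map far-field feature frames back to parallel close-talk frames. Shapes, data, and the plain gradient-descent loop are illustrative stand-ins; the real model would be a deeper DNN trained on actual parallel recordings.

 import numpy as np

 rng = np.random.default_rng(0)
 dim, hid, n = 40, 256, 1000                  # feature dim, hidden size, frames
 clean = rng.standard_normal((n, dim))        # stand-in close-talk features
 far = clean + 0.3 * rng.standard_normal((n, dim))  # stand-in far-field input

 W1 = 0.1 * rng.standard_normal((dim, hid)); b1 = np.zeros(hid)
 W2 = 0.1 * rng.standard_normal((hid, dim)); b2 = np.zeros(dim)
 lr = 1e-3
 for _ in range(100):                         # minimize MSE(clean, DAE(far))
     h = np.tanh(far @ W1 + b1)               # encoder
     out = h @ W2 + b2                        # decoder: predicted clean frame
     err = out - clean
     gW2 = h.T @ err / n; gb2 = err.mean(0)
     dh = (err @ W2.T) * (1 - h ** 2)         # backprop through tanh
     gW1 = far.T @ dh / n; gb1 = dh.mean(0)
     W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2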

Kaiser Window

Window function test with 23 Mel channels, 8k WSJ database:

window function    %WER                   ins    del    sub
kaiser             278 / 5643 = 4.93      39     15     224
povey              265 / 5643 = 4.70      34     14     217

Window function test with 30 Mel channels, 8k WSJ database:

window function    %WER                   ins    del    sub
kaiser             270 / 5643 = 4.78      38     17     215
povey              283 / 5643 = 5.02      36     24     223
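
For reference, a minimal numpy sketch of the two windows compared above: numpy's Kaiser window and Kaldi's "povey" window (a Hanning window raised to the power 0.85). The Kaiser beta value is an assumption; the one used in the experiment is not recorded here.

 import numpy as np

 N = 200                                    # 25 ms frame at 8 kHz

 def povey_window(n):
     # Kaldi's default "povey" window: Hanning raised to the power 0.85.
     i = np.arange(n)
     return (0.5 - 0.5 * np.cos(2 * np.pi * i / (n - 1))) ** 0.85

 kaiser = np.kaiser(N, beta=8.0)            # beta = 8.0 is an assumption
 povey = povey_window(N)
 print(np.abs(kaiser - povey).max())        # how different the two shapes are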

VAD

  • DNN-based VAD (24.77) showed better performance than energy-based VAD (45.73); a sketch of the energy baseline follows.
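
A minimal sketch of an energy-based VAD baseline, assuming numpy: per-frame log energy is compared against a threshold derived from the quietest frames. The percentile noise-floor estimate and the 10 dB margin are illustrative assumptions; the DNN VAD replaces this rule with a per-frame speech/non-speech classifier.

 import numpy as np

 def energy_vad(x, fs, frame_ms=25, shift_ms=10, margin_db=10.0):
     # One speech/non-speech decision per frame.
     flen, shift = int(fs * frame_ms / 1000), int(fs * shift_ms / 1000)
     n = 1 + (len(x) - flen) // shift
     frames = np.stack([x[i*shift : i*shift + flen] for i in range(n)])
     e = 10.0 * np.log10(np.mean(frames ** 2, axis=1) + 1e-10)
     floor = np.percentile(e, 10)           # estimate the noise floor
     return e > floor + margin_db           # speech if well above the floor

 fs = 8000
 x = np.random.randn(fs * 2) * 0.01         # stand-in audio: 2 s of "noise"
 x[fs//2 : fs] += np.random.randn(fs // 2)  # louder middle segment = "speech"
 print(energy_vad(x, fs).astype(int))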


Scoring

  • Online scoring done?
  • Checked into GitLab?

Word to Vector

  • Paper submitted

LM development

Domain specific LM

  • Prepare English lexicon

NN LM

  • Character-based NNLM (6,700 characters, 7-gram); training on 500M of data done. (A sketch of the setup follows this list.)
  • Inconsistent WER patterns were found on the Tencent test sets;
  • probably need another test set for investigation.
  • Investigate MS RNN LM training
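
A minimal sketch of the character NNLM setup, assuming numpy: a Bengio-style feed-forward model in which the embeddings of the six previous characters predict the next one (a 7-gram means six characters of context). Layer sizes are illustrative, and only the forward pass is shown.

 import numpy as np

 V, d, h, ctx = 6700, 100, 500, 6           # vocab, embedding, hidden, context
 rng = np.random.default_rng(0)
 E = 0.01 * rng.standard_normal((V, d))     # character embedding table
 W1 = 0.01 * rng.standard_normal((ctx * d, h))
 W2 = 0.01 * rng.standard_normal((h, V))

 def next_char_logprobs(context_ids):
     # log P(next char | previous 6 chars) over the whole vocabulary.
     x = E[context_ids].reshape(-1)         # concatenate the 6 embeddings
     z = np.tanh(x @ W1) @ W2
     return z - z.max() - np.log(np.exp(z - z.max()).sum())  # log-softmax

 print(next_char_logprobs(np.array([5, 17, 42, 3, 9, 81])).shape)  # (6700,)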

QA

FST-based matching

  • Word-based FST matching takes 1-2 seconds with 1,600 patterns; Huilan's implementation takes <1 second. (A matching sketch follows this list.)
  • THRAX toolkit for grammar-to-FST compilation
  • Investigate determinization of the G embedding
  • Refer to the new Kaldi code
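
For reference, a minimal pure-Python sketch of the idea behind FST pattern matching: compile all patterns into one trie (a deterministic acceptor over word sequences) and walk the query through it, rather than matching 1,600 patterns one at a time. THRAX/OpenFst would build an equivalent automaton with weights and output labels; the pattern set here is illustrative.

 def build_trie(patterns):
     # One deterministic acceptor over word sequences (dict of dicts).
     root = {}
     for pid, words in enumerate(patterns):
         node = root
         for w in words:
             node = node.setdefault(w, {})
         node["<final>"] = pid              # mark an accepting state
     return root

 def match(trie, query):
     # Return ids of patterns appearing as contiguous word sequences in query.
     hits = []
     for start in range(len(query)):
         node = trie
         for w in query[start:]:
             if w not in node:
                 break
             node = node[w]
             if "<final>" in node:
                 hits.append(node["<final>"])
     return hits

 patterns = [["how", "old"], ["how", "are", "you"]]  # illustrative patterns
 print(match(build_trie(patterns), "hi how are you".split()))  # -> [1]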