"2014-09-22": Difference between revisions
From cslt Wiki
Latest revision as of 02:24, 22 September 2014
Resource Building
Leftover questions
- Investigating LOUDS FST.
- CLG embedded decoder plus online compiler.
- DNN-GMM co-training
- NN LM
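As a note on the LOUDS FST investigation above: LOUDS packs a tree's level-order degree sequence into a bit vector. A minimal sketch of the encoding (the dict-based tree and the function name are illustrative, not the decoder's actual data structures):

```python
from collections import deque

def louds_encode(tree, root):
    # Encode a tree (dict: node -> list of children) as a LOUDS bit string:
    # a "10" super-root prefix, then, for each node in level order, one '1'
    # per child followed by a terminating '0'.
    bits = ["10"]
    queue = deque([root])
    while queue:
        node = queue.popleft()
        children = tree.get(node, [])
        bits.append("1" * len(children) + "0")
        queue.extend(children)
    return "".join(bits)

# a has children b, c; b has child d
print(louds_encode({"a": ["b", "c"], "b": ["d"]}, "a"))  # -> "101101000"
```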
AM development
Sparse DNN
- Investigating layer-based DNN training
Noise training
- First draft of the noisy training journal paper
- Check abnormal behavior with large sigma (Yinshi, Liuchao)
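Noisy training corrupts clean features with additive noise at a chosen sigma; the abnormal behavior above shows up at large sigma. A toy stand-in for the real noise-mixing pipeline (function name and data layout are hypothetical):

```python
import random

def corrupt(features, sigma, seed=0):
    # Additive zero-mean Gaussian noise with standard deviation sigma,
    # applied independently to every feature dimension of every frame.
    rng = random.Random(seed)
    return [[x + rng.gauss(0.0, sigma) for x in frame] for frame in features]

clean = [[1.0, 2.0], [3.0, 4.0]]
noisy = corrupt(clean, sigma=0.1)
```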
Drop out & Rectification & convolutive network
- Drop out
- No performance improvement found yet.
- [1]
- Rectification
- Dropout NA problem was caused by large magnitude of weights
- Convolutive network
- Test more configurations
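For reference, the dropout scheme being tested zeroes units at random during training; the inverted variant below scales survivors so test time needs no change (a generic sketch, not the lab's training code):

```python
import random

def dropout(activations, p_drop, train=True, seed=0):
    # Inverted dropout: zero each unit with probability p_drop during
    # training and scale survivors by 1/(1 - p_drop); identity at test time.
    # Keeping weight magnitudes bounded avoids the blow-up noted above.
    if not train or p_drop == 0.0:
        return list(activations)
    rng = random.Random(seed)
    keep = 1.0 - p_drop
    return [a / keep if rng.random() >= p_drop else 0.0 for a in activations]
```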
Denoising & Farfield ASR
- Lasso-based de-reverberation is done with the REVERBERATION toolkit
- Start to compose the experiment section for the SL paper.
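The Lasso step above amounts to L1-regularized regression, which drives most filter taps to exactly zero. A generic coordinate-descent sketch of the Lasso objective itself (not the toolkit's implementation):

```python
def lasso_cd(X, y, lam, n_iter=200):
    # Cyclic coordinate descent for min_w (1/2n)||y - Xw||^2 + lam*||w||_1.
    # Each coordinate update is a closed-form soft-thresholding step.
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            # residual with feature j's contribution removed
            r = [y[i] - sum(w[k] * X[i][k] for k in range(d) if k != j)
                 for i in range(n)]
            rho = sum(X[i][j] * r[i] for i in range(n)) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            if z == 0.0:
                w[j] = 0.0
            elif rho > lam:
                w[j] = (rho - lam) / z
            elif rho < -lam:
                w[j] = (rho + lam) / z
            else:
                w[j] = 0.0
    return w
```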
VAD
- Noise model training done. Under testing.
- Need to investigate the performance reduction in babble noise. Call Jia.
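For contrast with the model-based VAD above, the classic energy-threshold baseline looks like this; babble noise is exactly where a fixed threshold breaks down, since its energy resembles speech (threshold value and frame layout are illustrative):

```python
import math

def energy_vad(frames, threshold):
    # Flag a frame as speech when its mean log-energy exceeds a fixed
    # threshold; the epsilon guards against log(0) on silent frames.
    decisions = []
    for frame in frames:
        energy = sum(s * s for s in frame) / max(len(frame), 1)
        decisions.append(math.log(energy + 1e-10) > threshold)
    return decisions
```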
Speech rate training
- Some interesting results with the simple speech rate change algorithm were obtained on the WSJ db
- The ROS model seems superior to the normal model on faster speech
- Need to check distribution of ROS on WSJ
- Suggestion: extract speech data of different ROS and construct a new test set
- Suggestion: use the Tencent training data
- Suggestion: remove silence when computing ROS
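The silence-exclusion suggestion above can be made concrete: compute ROS as phones per second over speech regions only. A small sketch assuming a forced alignment given as (label, duration) segments (the input format is hypothetical):

```python
def rate_of_speech(alignment, sil_labels=("sil", "sp")):
    # ROS = phones per second over speech regions only: silence segments
    # are excluded from both the phone count and the total duration.
    speech_time = sum(dur for lab, dur in alignment if lab not in sil_labels)
    n_phones = sum(1 for lab, _ in alignment if lab not in sil_labels)
    return n_phones / speech_time if speech_time > 0 else 0.0

ali = [("sil", 0.50), ("ax", 0.10), ("b", 0.08),
       ("aw", 0.12), ("t", 0.10), ("sil", 0.40)]
ros = rate_of_speech(ali)  # 4 phones over 0.40 s of speech, ~10 phones/sec
```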
Scoring
- Pitch & rhythm done.
- Harmonics hold
Confidence
- Basic confidence by using lattice-based posterior + DNN posterior + ROS done
- 23% detection error achieved by balanced model
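The fusion above combines three cues into one score; a logistic combination is one common way to do it. The weights and bias below are illustrative placeholders, not the trained balanced model:

```python
import math

def confidence(lattice_post, dnn_post, ros,
               weights=(1.5, 1.5, -0.2), bias=-1.0):
    # Logistic fusion of lattice posterior, DNN posterior and ROS.
    # In practice the weights would be learned on held-out data.
    z = (weights[0] * lattice_post + weights[1] * dnn_post
         + weights[2] * ros + bias)
    return 1.0 / (1.0 + math.exp(-z))
```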
LM development
Domain specific LM
- Domain-specific counts dumped
- N-gram generation is ongoing
NUM tag LM:
- HCLG union seems better than G union, when integrating grammar + LM (25->23)
- Boost specific words like wifi if TAG model does not work for a particular word.
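The count-dump then n-gram-generation pipeline above reduces, in its simplest form, to counting and normalizing; a toy MLE bigram version (real systems add smoothing and ARPA export):

```python
from collections import Counter

def bigram_lm(sentences):
    # Dump unigram/bigram counts from domain text, then convert them into
    # MLE bigram probabilities p(w2 | w1) = c(w1, w2) / c(w1).
    uni, bi = Counter(), Counter()
    for sent in sentences:
        toks = ["<s>"] + sent + ["</s>"]
        uni.update(toks[:-1])
        bi.update(zip(toks[:-1], toks[1:]))
    return {pair: count / uni[pair[0]] for pair, count in bi.items()}
```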
Word2Vector
W2V based doc classification
- Initial results with the variational Bayesian GMM obtained. Performance is not as good as the conventional GMM.
- Non-linear inter-language transform: English-Spanish-Czech: wv model training done, transform model under investigation
- probably over-fitting with the MLP training
- SSA-based local linear mapping still running
- Knowledge vector started
- document obtained from wiki
- Character to word conversion
- Design the transform model
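A common baseline for the W2V-based doc classification above is to average a document's word vectors and assign the nearest class centroid by cosine similarity; a minimal sketch (names and the plain-dict vector store are illustrative):

```python
def doc_vector(tokens, wv):
    # Average the word vectors of a document's in-vocabulary tokens.
    vecs = [wv[t] for t in tokens if t in wv]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return num / den if den else 0.0

def classify(tokens, wv, centroids):
    # Nearest-centroid classification of the averaged document vector.
    d = doc_vector(tokens, wv)
    if d is None:
        return None
    return max(centroids, key=lambda c: cosine(d, centroids[c]))
```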
RNN LM
- Prepare WSJ database
- Trained model 10000 x 4 + 320 + 10000
- Better performance obtained (4.16 -> 3.47)
- gigaword sampling for Chinese data
Speaker ID
- Second model done
Emotion detection
- Delivered to Sinovoice
Translation
- v3.0 demo released
- still slow
QA
- Huilan framework design done
- Investigate better framework