Difference between revisions of "2014-10-13"

(Created page with "==Speech Processing == === AM development === ==== Sparse DNN ==== * Performance improvement found when pruned slightly * Experiments show that * Suggest to use TI...")

Lr (talk | contribs)
Text Processing
Line 110: Line 110:
  * Knowledge vector started
- :* document obtained from wiki
+ :* format the data
- :* formula obtained
+
  * Character to word conversion
- :* read more paper .
+ :* prepare the task: word similarity
- :* prepare to train .
+ :* prepare the dict.
  * Google word vector train
Line 121: Line 120:
  ===RNN LM===
-
+ *rnn
- * Prepare WSJ database
+ *lstm+rnn
- * Trained model 10000 x 4 + 320 + 10000
+ : install the tool and prepare the data of wsj
- * Better performance obtained (4.16-3.47)
+ : prepare the baseline.
- * gigaword sampling for Chinese data
+
+
  ===Translation===
  * v3.0 demo released
  :* still slow
- :* cut the vocabulary that is not important .
+ :* re-segment the word using new dictionary.
+ :* check new data.
  ===QA===
- * liangshan_v1 performance 74.3%.
+ * search method:
- * New framework and GA method is done
+ :* add the vsm and BM25 to improve the search. and the strategy of selecting the answer
- * add SEMPRE tool to framework
+ :* segment the word using minimum granularity for lucene index and bag-of-words method.
+ * new inter will install SEMPRE
Revision as of 07:41, 13 October 2014 (Monday)

Speech Processing

AM development

Sparse DNN

  • Performance improvement found when pruned slightly
  • Experiments show that
  • Suggest using TIMIT / AURORA 4 for training
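
The report does not say how the DNN was pruned. As a rough illustration only, the sketch below assumes simple magnitude-based pruning, where the smallest weights of a layer are set to zero. The function name `prune_by_magnitude` and the toy weight matrix are hypothetical.

```python
def prune_by_magnitude(weights, fraction):
    """Zero out roughly `fraction` of the weights with the smallest magnitude.

    `weights` is a list of rows (one weight matrix). This is an assumed
    pruning rule for illustration, not the method used in the experiments.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    magnitudes = sorted(abs(w) for row in weights for w in row)
    n_prune = int(len(magnitudes) * fraction)
    if n_prune == 0:
        return [list(row) for row in weights]
    # Every weight at or below this magnitude is removed; ties at the
    # threshold can prune slightly more than the requested fraction.
    threshold = magnitudes[n_prune - 1]
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


layer = [[0.8, -0.05, 0.3], [0.01, -0.6, 0.02]]
pruned = prune_by_magnitude(layer, 0.5)
sparsity = sum(w == 0.0 for row in pruned for w in row) / 6
```

"Pruned slightly" would correspond to a small `fraction`; the value 0.5 here is only chosen to make the toy example easy to read.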

Noise training

  • First draft of the noisy training journal paper
  • Check abnormal behavior with large sigma (Yinshi, Liuchao)

Drop out & Rectification & convolutive network

  • Drop out
  • No performance improvement found yet.
  • [1]
  • Rectification
  • Dropout NA problem was caused by large magnitude of weights
  • Convolutive network
  1. Test more configurations
  • Zhiyong will work on CNN
  • Recurrent neural network
  • investigate CURRENNT for AM


Denoising & Farfield ASR

  • Lasso-based de-reverberation is done with the REVERBERATION toolkit
  • Start to compose the experiment section for the SL paper.

VAD

  • problems found at the beginning part of speech (0-0.02s?)
  • Noise model training done. Under testing.
  • Need to investigate the performance reduction in babble noise. Call Jia.


Speech rate training

  • Some interesting results with the simple speech rate change algorithm were obtained on the WSJ db

[2]

  • The ROS model seems superior to the normal one on faster speech
  • Need to check the distribution of ROS on WSJ
  • Suggest extracting speech data with different ROS values to construct a new test set
  • Suggest using Tencent training data
  • Suggest removing silence when computing ROS
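
The last suggestion can be sketched as follows. This assumes ROS is measured in phones per second over a phone-level alignment, and that silence segments carry the labels `sil` or `sp`; both the unit and the labels are assumptions, not details given in the report.

```python
SILENCE_LABELS = {"sil", "sp"}  # assumed silence labels, adjust to the aligner


def rate_of_speech(alignment, skip_silence=True):
    """Return phones per second for one utterance.

    `alignment` is a list of (label, start_sec, end_sec) tuples. When
    `skip_silence` is True, silence segments are counted neither as phones
    nor as elapsed time.
    """
    n_phones = 0
    duration = 0.0
    for label, start, end in alignment:
        is_silence = label in SILENCE_LABELS
        if is_silence and skip_silence:
            continue
        if not is_silence:
            n_phones += 1
        duration += end - start
    if duration <= 0.0:
        raise ValueError("alignment contains no usable segments")
    return n_phones / duration


alignment = [
    ("sil", 0.0, 0.5),
    ("h", 0.5, 0.75),
    ("e", 0.75, 1.0),
    ("l", 1.0, 1.25),
    ("o", 1.25, 1.5),
    ("sil", 1.5, 2.0),
]
ros_without_silence = rate_of_speech(alignment)
ros_with_silence = rate_of_speech(alignment, skip_silence=False)
```

In this toy utterance, leading and trailing silence halve the measured rate, which is why silence handling matters when comparing ROS across utterances.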

low resource language AM training

  • Results on CVSS[3]
  • Use Chinese NN as initial NN, change the last layer
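
A minimal sketch of the initialization idea above, assuming a network stored as a list of weight matrices (rows are output units). The hidden layers are copied from the source-language network and only the output layer is re-created for the new target set. The function name, the uniform initialization range, and the toy sizes are all hypothetical.

```python
import random


def adapt_to_new_language(source_layers, n_new_outputs, seed=0):
    """Copy all hidden layers and replace the last layer with a fresh one."""
    rng = random.Random(seed)
    # Hidden layers are reused as the starting point for further training.
    hidden = [[list(row) for row in layer] for layer in source_layers[:-1]]
    # The new output layer keeps the input width of the old one.
    n_inputs = len(source_layers[-1][0])
    new_output = [
        [rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
        for _ in range(n_new_outputs)
    ]
    return hidden + [new_output]


# Toy source network: 3 inputs -> 2 hidden units -> 4 output targets.
source_net = [
    [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]],
]
target_net = adapt_to_new_language(source_net, n_new_outputs=5)
```

After this step the whole network would normally be fine-tuned on the low-resource language data; that training loop is outside the scope of the sketch.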

Scoring

  • global scoring done.
  • Pitch & rhythm done, need testing
  • Harmonics hold


Confidence

  • experiments done, need more data
  • Basic confidence by using lattice-based posterior + DNN posterior + ROS done
  • 23% detection error achieved by balanced model

Speaker ID

  • Add VAD to system
  • GMM-based test program delivered
  • GMM registration program done

Emotion detection

  • Zhang Weiwei is learning the code
  • Sinovoice is implementing the server


Text Processing

LM development

Domain specific LM

  • n-gram generation is ongoing
  • look at the memory and baidu_hi: done

NUM tag LM:

  • Maxi's work is released.
  • Yuanbin continues the tag LM work.
  • Add NER to the tag LM.
  • Boost specific words like wifi if the TAG model does not work for a particular word.


Word2Vector

W2V based doc classification

  • Initial results with variational Bayesian GMM obtained. Performance is not as good as the conventional GMM.
  • Non-linear inter-language transform: English-Spanish-Czech: wv model training done, transform model under investigation
  • SSA-based local linear mapping is still running.
  • k-means classes changed to 2.
  • Knowledge vector started
  • format the data
  • Character to word conversion
  • prepare the task: word similarity
  • prepare the dict.
  • Google word vector train
  • improve the sampling method
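
For the word similarity task mentioned in this list, word vectors are commonly compared with cosine similarity. The sketch below assumes that measure; the report does not state which one is used, and the toy vectors are made up for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


# Hypothetical 2-dimensional word vectors.
vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [2.0, 0.0]}
sim_ab = cosine_similarity(vectors["a"], vectors["b"])
sim_ac = cosine_similarity(vectors["a"], vectors["c"])
```

Vectors pointing in the same direction score 1 regardless of length, so the measure ignores vector norm and compares direction only.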

RNN LM

  • rnn
  • lstm+rnn
  • install the tool and prepare the WSJ data
  • prepare the baseline.

Translation

  • v3.0 demo released
  • still slow
  • re-segment words using the new dictionary.
  • check new data.

QA

  • search method:
  • add VSM and BM25 to improve the search and the answer-selection strategy
  • segment words at minimum granularity for the Lucene index and the bag-of-words method.
  • new intern will install SEMPRE
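
As a rough illustration of the BM25 ranking mentioned above, the sketch below scores pre-segmented documents against a query. It uses the common Okapi BM25 form with the non-negative IDF variant and default parameters `k1=1.5`, `b=0.75`; these choices and the toy documents are assumptions, and the project's Lucene configuration may differ.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document; docs and query are token lists."""
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for d in docs:
        doc_freq.update(set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            df = doc_freq[term]
            idf = math.log(1.0 + (n_docs - df + 0.5) / (df + 0.5))
            length_norm = k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / (tf[term] + length_norm)
        scores.append(score)
    return scores


docs = [
    ["speech", "recognition", "model"],
    ["language", "model", "training"],
    ["question", "answer", "search"],
]
scores = bm25_scores(["answer", "search"], docs)
best = max(range(len(docs)), key=lambda i: scores[i])
```

Segmenting at minimum granularity, as the list suggests, would change the token lists passed in as `docs` and `query` but not the scoring itself.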