Revision as of 07:54, 9 September 2015 (Wed)

Speech Processing

AM development

Environment

grid-12 GPU is transferred to grid-18

RNN AM

train monophone RNN --zhiyuan

decode using 5-gram
the batch training method

train using large dataset--mengyuan

MPE training has a NaN problem

write code to tune learning rate--zhiyong

has completed Nesterov/Adagrad/Adagrad-max
training is still unstable
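
A minimal sketch of two of the update rules named above (Adagrad and Nesterov momentum), written in plain Python on a toy quadratic objective. The function names, the toy objective, and the default hyperparameters are illustrative assumptions, not the group's actual code; Adagrad-max is not defined in these notes and is left out.

```python
import math

def adagrad_step(w, grad, accum, lr=0.1, eps=1e-8):
    """One Adagrad update: each parameter's step is scaled by its accumulated squared gradients."""
    new_accum = [a + g * g for a, g in zip(accum, grad)]
    new_w = [wi - lr * g / (math.sqrt(a) + eps)
             for wi, g, a in zip(w, grad, new_accum)]
    return new_w, new_accum

def nesterov_step(w, velocity, grad_fn, lr=0.1, momentum=0.9):
    """One Nesterov momentum update: the gradient is evaluated at the look-ahead point."""
    lookahead = [wi + momentum * vi for wi, vi in zip(w, velocity)]
    grad = grad_fn(lookahead)
    new_v = [momentum * vi - lr * g for vi, g in zip(velocity, grad)]
    new_w = [wi + vi for wi, vi in zip(w, new_v)]
    return new_w, new_v

def quadratic_grad(w):
    """Gradient of the toy objective f(w) = sum(w_i^2) (an assumption for illustration)."""
    return [2.0 * wi for wi in w]
```

Logging the gradient norm per step alongside such updates may help locate where the instability begins.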

Mic-Array

hold
compute EER with Kaldi
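
For reference, a minimal sketch of how an equal error rate can be computed from target and non-target trial scores. This is not Kaldi's implementation; the function name and the threshold sweep are assumptions chosen for clarity.

```python
def compute_eer(target_scores, nontarget_scores):
    """Return the error rate at the threshold where miss and false-alarm rates are closest."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Miss rate: target trials scored below the threshold (rejected).
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False-alarm rate: non-target trials at or above the threshold (accepted).
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(frr - far)
        if gap < best_gap:
            best_gap, eer = gap, (frr + far) / 2.0
    return eer
```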

Data selection unsupervised learning

hold
acoustic feature based submodular using Pingan dataset --zhiyong
write code to speed up --zhiyong
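
One common form of submodular data selection is greedy maximization of a facility-location objective over pairwise similarities. The sketch below assumes that objective and a precomputed, non-negative similarity matrix; neither is confirmed by these notes, and the function name is hypothetical.

```python
def greedy_facility_location(similarity, budget):
    """Greedily pick up to `budget` items maximizing sum_i max_{j in S} similarity[i][j]."""
    n = len(similarity)
    selected = []
    best_cover = [0.0] * n  # current best similarity of each item to the selected set
    for _ in range(min(budget, n)):
        best_gain, best_item = -1.0, None
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain of adding item j to the selected set.
            gain = sum(max(similarity[i][j] - best_cover[i], 0.0) for i in range(n))
            if gain > best_gain:
                best_gain, best_item = gain, j
        selected.append(best_item)
        best_cover = [max(best_cover[i], similarity[i][best_item]) for i in range(n)]
    return selected
```

The inner loop recomputes every marginal gain at each step, which is the part a speed-up (for example lazy evaluation of gains) would usually target.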

RNN-DAE(Deep based Auto-Encode-RNN)

RNN-DAE performs worse than DNN-DAE because the training dataset is small
extract real room impulse responses to generate reverberant WSJ data, and then train RNN-DAE
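
Generating reverberant data from a room impulse response amounts to convolving the clean waveform with that response. A minimal sketch in plain Python follows; the function names are hypothetical, and truncating the output to the clean length is an assumption made so that existing frame alignments stay usable.

```python
def convolve(signal, impulse_response):
    """Full linear convolution of a clean signal with a room impulse response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def add_reverb(clean, rir):
    """Reverberate `clean` with `rir`, truncated to the length of the clean signal."""
    return convolve(clean, rir)[:len(clean)]
```

For real utterances an FFT-based convolution would be much faster; the direct loop is kept here only to make the operation explicit.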

Ivector&Dvector based ASR

Cluster the speakers into speaker clusters

hold

dark knowledge

performs much worse than the baseline (EER: baseline 29%, dark knowledge 48%)
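
As a reminder of the general idea, dark knowledge training usually fits a student model to temperature-softened teacher posteriors. The sketch below shows only that soft-target loss; the function names and the temperature value are assumptions, and it does not reproduce the i-vector setup evaluated here.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits divided by a temperature; higher temperature gives a flatter distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def soft_target_cross_entropy(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between softened teacher posteriors and softened student posteriors."""
    teacher = softmax_with_temperature(teacher_logits, temperature)
    student = softmax_with_temperature(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(teacher, student))
```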

RNN ivector

hold

binary ivector done

language vector

hold
train using language vector with the dataset of 1400h_CN + 100h_EN--mengyuan
write a paper--zhiyuan
RNN language vector

hold

multi-GPU

multi-stream training --Sheng Su

two GPUs work well, but training diverges with four GPUs

solve the problem of buffer--Mengyuan, Sheng Su

Neural picture style transfer

reproduced the result of the paper "A Neural Algorithm of Artistic Style" --Zhiyuan, Xuewei
GPU memory limits us to the Inception network with the SGD optimizer (the VGG network with the default L-BFGS optimizer gives better results but consumes much more memory)

Text Processing

RNN LM

character-lm rnn(hold)
lstm+rnn

check how the lstm-rnnlm code initializes and updates the learning rate. (hold)

Neural Based Document Classification

(hold)

RNN Rank Task

Test.
Paper: RNN Rank Net.

(hold)
Output rank information.

Graph RNN

Entity path embedded to entity.

(hold)

RNN Word Segment

Set bounds for word segmentation.

(hold)

Seq to Seq(09-15)

Review papers.
Reproduce baseline. (08-03 <--> 08-17)

Order representation

Nested Dropout

semi-linear --> neural based auto-encoder.

modify the objective function(hold)
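
For context, nested dropout orders the units of a representation by keeping a random-length prefix and dropping everything after it. A minimal sketch of the mask sampling follows; the function name, the default `keep_prob`, and the truncated geometric sampling loop are assumptions rather than the code used in this work.

```python
import random

def nested_dropout_mask(num_units, keep_prob=0.9, rng=random):
    """Return a 0/1 mask that keeps a prefix of units; the cut index is geometrically distributed."""
    cut = 1  # always keep at least the first unit
    while cut < num_units and rng.random() < keep_prob:
        cut += 1
    return [1] * cut + [0] * (num_units - cut)
```

Multiplying an auto-encoder's code layer by such a mask during training encourages earlier units to carry more of the information.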

Balance Representation

Find error signal

Recommendation

Reproduce baseline.

LDA matrix dissolve.
LDA (Text classification & Recommendation System) --> AAAI

RNN based QA

Read Source Code.
Attention based QA.
Coding.

RNN Poem Process

Seq based BP.

(hold)

Text Group Intern Project

Buddhist Process

(hold)

RNN Poem Process

Done by Haichao Yu & Chaoyuan Zuo. Mentor: Tianyi Luo.

RNN Document Vector

(hold)

Image Baseline

Demo Release.
Paper Report.

Read CNN Paper.

Text Intuitive Idea

Trace Learning

(Hold)

Match RNN

(Hold)

financial group

model research

RNN

online model, update everyday
modify cost function and learning method
add more features

rule combination

rule analysis

basic rule

classical tenth model

display

bug fixed

index calculation
buy rule fixed

document

data

data api
download data

@@ Line 3: / Line 3: @@
 ==== Environment ====
-* grid-12 sometimes does not work
+* grid-12 GPU is transferred to grid-18
 ==== RNN AM====
 *train monophone RNN --zhiyuan
+:* decode using 5-gram
+:* the train method of batch
 *train using large dataset--mengyuan
+:* MPE has NAN problem
 *write code to tune learning rate--zhiyong
+:* has completed Nestrov/Adagrad/Adagrad-max
+:* has unstable phenomenon
 ==== Mic-Array ====
@@ Line 22: / Line 27: @@
 ====RNN-DAE(Deep based Auto-Encode-RNN)====
-* hold
+* RNN-DAE has worse performance than DNN-DAE because training dataset is small
-* deliver to mengyuan, xuewei
+* extract real room impulse to generate WSJ reverberation data, and then train RNN-DAE
-:* http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=zhangzy&step=view_request&cvssid=261
 ===Ivector&Dvector based ASR===
-* Cluster the speakers to speaker-classes, then using the distance or the posterior-probability as the metric
+* Cluster the speakers to speaker-cluster
-:*hold
+:* hold
-* dark-konowlege using i-vector
+* dark knowledge
+:* has much worse performance than baseline (EER: base 29%  dark knowledge 48%)
 * RNN ivector
-* binary ivector
+:* hold
+* binary ivector done
 ===language vector===
 * hold
 * train using language vector with the dataset of 1400h_CN + 100h_EN--mengyuan
-:* hold
 * write a paper--zhiyuan
 * RNN language vector
@@ Line 45: / Line 49: @@
 ===multi-GPU====
 * multi-stream training --Sheng Su
-* solve the problem of buffer--Mengyuan
+:* two GPUs work well, but four GPUs divergent
+* solve the problem of buffer--Mengyuan, Sheng Su
+===Neutral picture style transfer===
+* reproduced the result of the paper "A neutral algorithm of artistic style" --Zhiyuan, Xuewei
+* while subject to the GPU's memory, limited to inception net with sgd optimizer (VGG network with the default L-BFGS optimizer consumes very much memory, which is better)
 ==Text Processing==

Differences between revisions of "ASR:2015-09-09"


