Difference between revisions of "NLP Status Report 2017-5-31"

Latest revision as of 09:00, 31 May 2017 (Wed)

Date

People

Last Week

This Week

2017/5/31

Jiyuan Zhang

Aodong LI

code double-attention model with final_attn = alpha * attn_ch + beta * attn_en
baseline bleu = 43.87
experiments with random initialized embedding:

alpha	beta	result (bleu)
1	1	43.50
4/3	2/3	43.58 (w/o retrained)
2/3	4/3	41.22 (w/o retrained)
2/3	4/3	42.36 (w/ retrained)

experiments with constant initialized embedding:

alpha	beta	result (bleu)
1	1	45.41
4/3	2/3	45.79
2/3	4/3	45.32

1.4~1.9 BLEU score improvement
This model is similar to multi-source neural translation but uses fewer resources
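The reported 1.4~1.9 BLEU improvement can be checked against the baseline with a few lines of arithmetic. This is only a sanity check on the numbers in the tables above; the variable names are illustrative and the gains are taken relative to the baseline of 43.87 using the constant-initialized results.

```python
# BLEU scores copied from the report above.
baseline_bleu = 43.87
constant_init_bleu = {
    ("1", "1"): 45.41,
    ("4/3", "2/3"): 45.79,
    ("2/3", "4/3"): 45.32,
}

# Gain of each (alpha, beta) setting over the baseline, rounded to 2 decimals.
gains = {
    setting: round(score - baseline_bleu, 2)
    for setting, score in constant_init_bleu.items()
}

for (alpha, beta), gain in gains.items():
    print(f"alpha={alpha}, beta={beta}: +{gain:.2f} BLEU")

smallest_gain = min(gains.values())
largest_gain = max(gains.values())
```

The smallest and largest gains bracket the range quoted in the report.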

Test the model on big data
Explore different attention merge strategies
Explore hierarchical model
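A minimal sketch of the weighted merge final_attn = alpha * attn_ch + beta * attn_en from the report. It assumes that attn_ch and attn_en are attention context vectors of the same dimension, presumably computed over the Chinese and English inputs. The helper names (`softmax`, `context_vector`, `merge_attention`) and the toy encoder states are hypothetical and not taken from the project code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_vector(scores, states):
    """Attention-weighted sum of encoder states (one state per position)."""
    weights = softmax(scores)
    dim = len(states[0])
    return [sum(w * h[d] for w, h in zip(weights, states)) for d in range(dim)]


def merge_attention(attn_ch, attn_en, alpha=1.0, beta=1.0):
    """Element-wise alpha * attn_ch + beta * attn_en, as in the report."""
    if len(attn_ch) != len(attn_en):
        raise ValueError("both attention vectors must have the same size")
    return [alpha * c + beta * e for c, e in zip(attn_ch, attn_en)]


# Toy encoder states (assumed values, for illustration only).
ch_states = [[1.0, 0.0], [0.0, 1.0]]
en_states = [[2.0, 2.0], [0.0, 0.0], [1.0, 1.0]]

# Equal scores give uniform attention weights over each input.
attn_ch = context_vector([0.0, 0.0], ch_states)
attn_en = context_vector([0.0, 0.0, 0.0], en_states)

# One of the settings tried in the report: alpha = 4/3, beta = 2/3.
final_attn = merge_attention(attn_ch, attn_en, alpha=4 / 3, beta=2 / 3)
print(final_attn)
```

All (alpha, beta) pairs in the tables sum to 2, so the merged vector keeps the same overall scale across settings and only the balance between the two inputs changes.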

Shiyue Zhang

found a dropout bug, fixed it, and reran the baseline: baseline 35.21, baseline (outproj=emb) 35.24
tried several embed set models, all failed
embedded other words into the model's embedding space (trained on the training data, not the big data), then used them directly in baseline (outproj=emb)

30000	50000	70000	90000
35.24	34.52	33.73	33.16
4564 (6666)	4535	4469	4426

m-nmt is running

train word2vec on the big data and compare it with word2vec from the training data
test m-nmt model, increase vocab size and test
review zh-uy/uy-zh related works, start to write paper

Shipan Ren

wrote documentation for the tf_translate project
read a neural machine translation paper
read the tf_translate code
ran and tested the tf_translate code

@@ Line 2: / Line 2: @@
 !Date !! People !! Last Week !! This Week
 |-
-| rowspan="6"|2017/5/22
+| rowspan="6"|2017/5/31
 |Jiyuan Zhang ||
 ||
 |-
 |Aodong LI ||
+* code double-attention model with '''final_attn = alpha * attn_ch + beta * attn_en'''
+* baseline bleu = '''43.87'''
+* experiments with '''random''' initialized embedding:
+{| class="wikitable"
+|-
+! alpha
+! beta
+! result (bleu)
+|-
+| 1
+| 1
+| 43.50
+|-
+| 4/3
+| 2/3
+| 43.58 (w/o retrained)
+|-
+| 2/3
+| 4/3
+| 41.22 (w/o retrained)
+|-
+| 2/3
+| 4/3
+| 42.36 (w/ retrained)
+|}
+* experiments with '''constant''' initialized embedding:
+{| class="wikitable"
+|-
+! alpha
+! beta
+! result (bleu)
+|-
+| 1
+| 1
+| '''45.41'''
+|-
+| 4/3
+| 2/3
+| '''45.79'''
+|-
+| 2/3
+| 4/3
+| '''45.32'''
+|}
+* 1.4~1.9 BLEU score improvement
+* This model is similar to multi-source neural translation but uses fewer resources
 ||
+* Test the model on big data
+* Explore different attention merge strategies
+* Explore hierarchical model
 |-
 |Shiyue Zhang ||
@@ Line 15: / Line 62: @@
 * tried several embed set models, failed
 * embedded other words to model embedding space (trained on train data not big data), and then directly used in baseline(outproj=emb)
-** 30000   35.24
+{| class="wikitable"
-**
+|-
+! 30000
+! 50000
+! 70000
+! 90000
+|-
+| 35.24
+| 34.52
+| 33.73
+| 33.16
+|-
+| 4564 (6666)
+| 4535
+| 4469
+| 4426
+|}
+* m-nmt is running
 ||
+* get word2vec on big data, and compare with word2vec from train data
+* test m-nmt model, increase vocab size and test
+* review zh-uy/uy-zh related works, start to write paper
 |-
 |Shipan Ren ||
+* wrote documentation for the tf_translate project
+* read a neural machine translation paper
+* read the tf_translate code
+* ran and tested the tf_translate code
 ||
