NLP Status Report 2017-6-5
From cslt Wiki
{| class="wikitable"
!Date !! People !! Last Week !! This Week
|-
| rowspan="6"|2017/6/5
|Jiyuan Zhang ||
||
|-
|Aodong LI ||
* Only make the English encoder's embedding constant -- 45.98 (an illustrative sketch of the frozen vs. finetuned setup follows this list)
* Only initialize the English encoder's embedding and then finetune it -- 46.06
* Share the attention mechanism and then directly add them -- 46.20
* big data baseline BLEU = '''30.83'''
* Model with three fixed embeddings:
** Shrink the output vocab from 30000 to 20000; best result is 31.53
** Train the model with batch size 40; best result so far is 30.63
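A minimal PyTorch sketch of the two embedding settings compared above (frozen vs. initialize-then-finetune); this is not the actual training code, and the file name and dimensions are placeholders:
<pre>
# Illustrative sketch: load pretrained English embeddings, then either freeze or finetune them.
import torch
import torch.nn as nn

pretrained = torch.load("en_word2vec.pt")   # assumed (vocab_size, emb_dim) float tensor

# "make the English encoder's embedding constant": initialize from word2vec and freeze
frozen_emb = nn.Embedding.from_pretrained(pretrained, freeze=True)

# "initialize the embedding and then finetune it": same initialization, but trainable
tuned_emb = nn.Embedding.from_pretrained(pretrained, freeze=False)
</pre>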
+ | |||
+ | || | ||
+ | * test more checkpoints on model trained with batch = 40 | ||
+ | * train model with reverse output | ||
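A hypothetical preprocessing step for the reverse-output experiment; the report does not name the data files, so the paths below are placeholders:
<pre>
# Reverse the token order of every target-side sentence so the model is
# trained to generate the output right-to-left.
with open("train.tgt") as fin, open("train.tgt.rev", "w") as fout:
    for line in fin:
        fout.write(" ".join(reversed(line.split())) + "\n")
</pre>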
|-
|Shiyue Zhang ||
* trained word2vec on big data and used it directly in NMT, but got quite poor performance (an illustrative sketch follows this list)
* trained the M-NMT model and got BLEU = 36.58 (+1.34 over the NMT baseline), but found that the EOS entry in the memory has a big influence on the result (a decomposition check follows the table):
{| class="wikitable"
|-
! Model !! BLEU, 1/2/3/4-gram precisions, BP
|-
| NMT || 35.24, 57.7/39.8/31.9/27.0 BP=0.939
|-
| MNMT (EOS=1) || 35.27, 60.0/41.3/33.1/28.0 BP=0.907
|-
| MNMT (EOS=0.2) || 36.40, 59.1/40.8/32.6/27.4 BP=0.951
|-
| MNMT (EOS=0) || 36.58, 58.4/40.4/32.1/27.0 BP=0.968
|}
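The scores above follow the usual multi-bleu format: corpus BLEU, 1/2/3/4-gram precisions, and the brevity penalty (BP). A quick arithmetic check of the MNMT (EOS=0) row, using only numbers from the table (small rounding differences remain):
<pre>
import math
# BLEU = BP * exp(mean of the log n-gram precisions), n = 1..4
precisions = [0.584, 0.404, 0.321, 0.270]   # MNMT (EOS=0) row
bp = 0.968
print(round(100 * bp * math.exp(sum(math.log(p) for p in precisions) / 4.0), 2))
# ~36.6, vs. the reported 36.58
</pre>
The same decomposition shows why EOS in the memory matters: MNMT (EOS=1) has the highest precisions but the lowest BP (0.907), which cancels most of that gain.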
* tried to tackle UNK with the 36.58 M-NMT model: increased the vocab to 50000, got BLEU = 35.63, 58.6/40.0/31.6/26.4 BP=0.953 (not good; unclear why)
* training uy-zh, 50% zh-uy, and 25% zh-uy
* training the memory without EOS
* reviewing related papers
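An illustrative sketch of the word2vec step, assuming gensim as the toolkit (the report does not say which implementation or settings were used); the corpus path and vector size are placeholders:
<pre>
# gensim (>=4.0 API): train word2vec on a tokenized corpus, one sentence per line.
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

sentences = LineSentence("bigdata.tok.txt")
model = Word2Vec(sentences, vector_size=512, window=5, min_count=5, workers=8)
model.wv.save_word2vec_format("bigdata.w2v.txt")   # vectors later used to initialize NMT
</pre>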
||
* solve the EOS problem
* find a way to tackle UNK
* write the paper
|-
|Shipan Ren ||