Difference between revisions of "NLP Status Report 2017-6-5"
From cslt Wiki
Revision as of 05:45, 5 June 2017 (Monday)
Date | People | Last Week | This Week |
---|---|---|---|
2017/6/5 | Jiyuan Zhang | | |
 | Aodong LI | Keep only the English encoder's embedding constant: 45.98; only initialize the English encoder's embedding and then fine-tune it: 46.06; share the attention mechanism and then directly add them: 46.20; shrink the output vocab from 30000 to 20000: best result is 31.53; train the model with batch size 40: best result so far is 30.63 | test more checkpoints on the model trained with batch = 40; train the model with reversed output |
 | Shiyue Zhang | | |
 | Shipan Ren | | |
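
The "shrink output vocab" and "reverse output" items above could be illustrated roughly as follows. This is only a minimal sketch under stated assumptions, not the code used for the reported experiments: the names `build_vocab`, `apply_vocab`, `reverse_target` and the `<unk>` symbol are hypothetical, and the toy sentences are invented for illustration.

```python
from collections import Counter

UNK = "<unk>"  # hypothetical unknown-token symbol (assumption)


def build_vocab(sentences, max_size):
    """Keep the max_size most frequent target tokens."""
    counts = Counter(tok for sent in sentences for tok in sent)
    return {tok for tok, _ in counts.most_common(max_size)}


def apply_vocab(sentence, vocab):
    """Replace every out-of-vocabulary token with UNK."""
    return [tok if tok in vocab else UNK for tok in sentence]


def reverse_target(sentence):
    """Reverse the token order of a target sentence."""
    return sentence[::-1]


# Toy target-side data (invented for illustration only).
targets = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["a", "cat", "ran"],
]

vocab = build_vocab(targets, max_size=3)             # shrink the output vocab
shrunk = [apply_vocab(s, vocab) for s in targets]    # map rare tokens to UNK
reversed_targets = [reverse_target(s) for s in shrunk]  # reversed output
```

In a real setup `max_size` would be 20000 rather than 3, and the reversed sentences would be used as training targets, with the decoded output flipped back before scoring.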