Latest revision as of 07:41, 4 September 2017 (Monday)

NLP Schedule

Members

Current Members

Yang Feng (冯洋)
Jiyuan Zhang （张记袁）
Aodong Li (李傲冬)
Andi Zhang (张安迪)
Shiyue Zhang (张诗悦)
Li Gu (古丽)
Peilun Xiao (肖培伦)
Shipan Ren (任师攀)
Jiayu Guo (郭佳雨)

Former Members

Chao Xing (邢超) : FreeNeb
Rong Liu (刘荣) : Youku (优酷)
Xiaoxi Wang (王晓曦) : Turing Robot (图灵机器人)
Xi Ma (马习) : graduate student at Tsinghua University
Tianyi Luo (骆天一) : PhD candidate at the University of California, Santa Cruz
Qixin Wang (王琪鑫) : MA candidate at the University of California
DongXu Zhang (张东旭): --
Yiqiao Pan (潘一桥) : MA candidate at the University of Sydney
Shiyao Li （李诗瑶） : BUPT
Aiting Liu (刘艾婷) : BUPT

Work Progress

Daily Report


Time Off Table

Date	Person	start	leave	hours	status
2017/04/02	Andy Zhang	9:30	18:30	8	preparing EMNLP
2017/04/02	Peilun Xiao
2017/04/03	Andy Zhang	9:30	18:30	8	preparing EMNLP
2017/04/03	Peilun Xiao
2017/04/04	Andy Zhang	9:30	18:30	8	preparing EMNLP
2017/04/04	Peilun Xiao
2017/04/05	Andy Zhang	9:30	18:30	8	preparing EMNLP
2017/04/05	Peilun Xiao
2017/04/06	Andy Zhang	9:30	18:30	8	preparing EMNLP
2017/04/06	Peilun Xiao
2017/04/07	Andy Zhang	9:30	18:30	8	preparing EMNLP
2017/04/07	Peilun Xiao
2017/04/08	Andy Zhang	9:30	18:30	8	preparing EMNLP
2017/04/08	Peilun Xiao
2017/04/09	Andy Zhang	9:30	18:30	8	preparing EMNLP
2017/04/09	Peilun Xiao
2017/04/10	Andy Zhang	9:30	18:30	8	preparing EMNLP
2017/04/10	Peilun Xiao
2017/04/11	Andy Zhang	9:30	18:30	8	preparing EMNLP
2017/04/11	Peilun Xiao
2017/04/12	Andy Zhang	9:30	18:30	8	preparing EMNLP
2017/04/12	Peilun Xiao
2017/04/13	Andy Zhang	9:30	18:30	8	preparing EMNLP
2017/04/13	Peilun Xiao
2017/04/14	Andy Zhang	9:30	18:30	8	preparing EMNLP
2017/04/14	Peilun Xiao
2017/04/15	Andy Zhang	9:00	15:00	6	preparing EMNLP
2017/04/15	Peilun Xiao
2017/04/18	Aodong Li	11:00	20:00	8	Pick up new task in news generation and do literature review
2017/04/19	Aodong Li	11:00	20:00	8	Literature review
2017/04/20	Aodong Li	12:00	20:00	8	Literature review
2017/04/21	Aodong Li	12:00	20:00	8	Literature review
2017/04/24	Aodong Li	11:00	20:00	8	Adjust literature review focus
2017/04/25	Aodong Li	11:00	20:00	8	Literature review
2017/04/26	Aodong Li	11:00	20:00	8	Literature review
2017/04/27	Aodong Li	11:00	20:00	8	Try to reproduce sc-lstm work
2017/04/28	Aodong Li	11:00	20:00	8	Transfer to new task in machine translation and do literature review
2017/04/30	Aodong Li	11:00	20:00	8	Literature review
2017/05/01	Aodong Li	11:00	20:00	8	Literature review
2017/05/02	Aodong Li	11:00	20:00	8	Literature review and code review
2017/05/06	Aodong Li	14:20	17:20	3	Code review
2017/05/07	Aodong Li	13:30	22:00	8	Code review and experiment started, but version discrepancy encountered
2017/05/08	Aodong Li	11:30	21:00	8	Code review and version discrepancy solved
2017/05/09	Aodong Li	13:00	22:00	9	Code review and experiment. Experiment details: small data; the 1st and 2nd translators use the same training data; the 2nd translator uses randomly initialized embeddings. Results (BLEU): baseline 43.87; best result of our model 42.56.
2017/05/10	Shipan Ren	9:00	20:00	11	Entry procedures; machine translation paper reading
2017/05/10	Aodong Li	13:30	22:00	8	Experiment setting: small data; the 1st and 2nd translators use different training data, counting 22000 and 22017 respectively; the 2nd translator uses randomly initialized embeddings. Results (BLEU): baseline 36.67 (36.67 is the model at 4750 updates; to avoid overfitting, the model at 3000 updates, whose BLEU is 34.96, was used to generate the 2nd translator's training data); best result of our model 29.81. This may suggest that whether the 2nd translator uses the same training data as the 1st translator or different data does not influence its performance; if anything, using the same data may be better, at least from these results. However, the training data is smaller than for yesterday's model, which has to be taken into account. Code the 2nd translator with constant embedding.
2017/05/11	Shipan Ren	10:00	19:30	9.5	Configure environment; run tf_translate code; read machine translation paper
2017/05/11	Aodong Li	13:00	21:00	8	Experiment setting: small data; the 1st and 2nd translators use the same training data; the 2nd translator uses constant, untrainable embeddings imported from the 1st translator's decoder. Results (BLEU): baseline 43.87; best result of our model 43.48. The experiments suggest that this kind of series (cascade) model impairs the final performance because information is lost as it flows through the network from end to end; the decoder's smaller vocabulary compared with the encoder's (9000+ -> 6000+) illustrates this. The experiment was intended to find a mapping that corrects meaning shift using the 2nd translator, but whether the mapping is learned is obscured by the smaller-vocabulary effect. Literature review on hierarchical machine translation.
2017/05/12	Aodong Li	13:00	21:00	8	Code double decoding model and read multilingual MT paper
2017/05/13	Shipan Ren	10:00	19:00	9	read machine translation paper; learned the LSTM model and the seq2seq model
2017/05/14	Aodong Li	10:00	20:00	9	Code double decoding model and experiment. Experiment details: small data; the 2nd translator uses concat(Chinese, machine translated English) as training data; the 2nd translator uses randomly initialized embeddings. Results (BLEU): baseline 43.87; best result of our model 43.53. Next: the 2nd translator uses trained constant embeddings.
2017/05/15	Shipan Ren	9:30	19:00	9.5	understand the difference between the LSTM model and the GRU model; read the implementation code of the seq2seq model
2017/05/17	Shipan Ren	9:30	19:30	10	read neural machine translation paper; read tf_translate code
2017/05/17	Aodong Li	13:30	24:00	9	code and debug double-decoder model; alter the size of the 2017/05/14 model and will try it after NIPS
2017/05/18	Shipan Ren	10:00	19:00	9	read neural machine translation paper; read tf_translate code
2017/05/18	Aodong Li	12:30	21:00	8	train double-decoder model on small data set but encounter decode bugs
2017/05/19	Aodong Li	12:30	20:30	8	debug double-decoder model; the model performs well on the development set but badly on the test data, and I want to figure out the reason
2017/05/21	Aodong Li	10:30	18:30	8	Experiment details: hidden_size = 700 (previously 500), emb_size = 510 (previously 310); small data; the 2nd translator uses concat(Chinese, machine translated English) as training data; the 2nd translator uses randomly initialized embeddings. Results (BLEU): baseline 43.87; best result of our model 45.21. However, only one checkpoint outperforms the baseline; the other results are mostly below 43.1. Debug double-decoder model.
2017/05/22	Aodong Li	14:00	22:00	8	the double-decoder model without joint loss generalizes very badly; I am trying the double-decoder model with joint loss
2017/05/23	Aodong Li	13:00	21:30	8	Experiment 1: hidden_size = 700, emb_size = 510, learning_rate = 0.0005 (previously 0.001); small data; the 2nd translator uses concat(Chinese, machine translated English) as training data; the 2nd translator uses randomly initialized embeddings. Results (BLEU): baseline 43.87; best result of our model 42.19. Overfitting? Overall, the 2nd translator performs worse than the baseline. Experiment 2: hidden_size = 500, emb_size = 310, learning_rate = 0.001; small data; double-decoder model with joint loss, i.e. final loss = 1st decoder's loss + 2nd decoder's loss. Results (BLEU): baseline 43.87; best result of our model 39.04. The 1st decoder's output is generally better than the 2nd decoder's. The reason may be that the second decoder only learns from the first decoder's hidden states, because their states are almost the same. DISCOVERY: the double-decoder model without joint loss generalizes very badly because the gap between teacher forcing (training) and beam search (decoding) propagates and amplifies errors toward the output end, which breaks the model at decoding time. Next: try to train the double-decoder model without joint loss but with beam search on the 1st decoder.
2017/05/24	Aodong Li	13:00	21:30	8	code double-attention one-decoder model; code double-decoder model
2017/05/24	Shipan Ren	10:00	20:00	10	read neural machine translation paper; read tf_translate code
2017/05/25	Shipan Ren	9:30	18:30	9	write document of tf_translate project; read neural machine translation paper; read tf_translate code
2017/05/25	Aodong Li	13:00	22:00	9	code and debug double attention model
2017/05/27	Shipan Ren	9:30	18:30	9	read tf_translate code; write document of tf_translate project
2017/05/28	Aodong Li	15:00	22:00	7	Experiment details: hidden_size = 500, emb_size = 310, learning_rate = 0.001; small data; the 2nd translator uses both Chinese and machine translated English as training data; Chinese and English use different encoders and different attention; final_attn = attn_1 + attn_2; the 2nd translator uses randomly initialized embeddings. Results (BLEU): baseline 43.87. When decoding with final_attn = attn_1 + attn_2, best result of our model 43.50; with final_attn = 2/3 * attn_1 + 4/3 * attn_2, 41.22; with final_attn = 4/3 * attn_1 + 2/3 * attn_2, 43.58.
2017/05/30	Aodong Li	15:00	21:00	6	Experiment 1: hidden_size = 500, emb_size = 310, learning_rate = 0.001; small data; the 2nd translator uses both Chinese and machine translated English as training data; Chinese and English use different encoders and different attention; final_attn = 2/3 * attn_1 + 4/3 * attn_2; the 2nd translator uses randomly initialized embeddings. Results (BLEU): baseline 43.87; best result of our model 42.36. Experiment 2: final_attn = 2/3 * attn_1 + 4/3 * attn_2; the 2nd translator uses constant initialized embeddings. Results (BLEU): baseline 43.87; best result of our model 45.32. Experiment 3: final_attn = attn_1 + attn_2; the 2nd translator uses constant initialized embeddings. Results (BLEU): baseline 43.87; best result of our model 45.41, and it seems more stable.
2017/05/31	Shipan Ren	10:00	19:30	9.5	run and test tf_translate code; write document of tf_translate project
2017/05/31	Aodong Li	12:00	20:30	8.5	Experiment 1: final_attn = 4/3 * attn_1 + 2/3 * attn_2; the 2nd translator uses constant initialized embeddings. Results (BLEU): baseline 43.87; best result of our model 45.79. Keeping only the English word embedding at the encoder constant and training all other embeddings and parameters achieves an even higher BLEU score of 45.98, and the results are stable. The quality of the English embedding at the encoder plays a pivotal role in this model. Preparation of big data.
2017/06/01	Aodong Li	13:00	24:00	11	English encoder's embedding kept constant: 45.98; English encoder's embedding only initialized and then fine-tuned: 46.06; attention mechanism shared and the attentions directly added: 46.20. Run double-attention model on large data.
2017/06/02	Aodong Li	13:00	22:00	9	Baseline BLEU on large data is 30.83 with 30000 output vocab; our best result is 31.53 with 20000 output vocab
2017/06/03	Aodong Li	13:00	21:00	8	Train the model with batch size 40 and concat(attn_1, attn_2); the best result of the model with batch size 40 and add(attn_1, attn_2) is 30.52
2017/06/05	Aodong Li	10:00	19:00	8	Prepare for APSIPA paper
2017/06/06	Aodong Li	10:00	19:00	8	Prepare for APSIPA paper
2017/06/07	Aodong Li	10:00	19:00	8	Prepare for APSIPA paper
2017/06/08	Aodong Li	10:00	19:00	8	Prepare for APSIPA paper
2017/06/09	Aodong Li	10:00	19:00	8	Prepare for APSIPA paper
2017/06/12	Aodong Li	10:00	19:00	8	Prepare for APSIPA paper
2017/06/13	Aodong Li	10:00	19:00	8	Prepare for APSIPA paper
2017/06/14	Aodong Li	10:00	19:00	8	Prepare for APSIPA paper
2017/06/15	Aodong Li	10:00	19:00	8	Prepare for APSIPA paper; read paper about MT involving grammar
2017/06/16	Aodong Li	10:00	19:00	8	Prepare for APSIPA paper; read paper about MT involving grammar
2017/06/19	Aodong Li	10:00	19:00	8	Completed APSIPA paper; took new task in style translation
2017/06/20	Aodong Li	10:00	19:00	8	Tried synonym substitution
2017/06/21	Aodong Li	10:00	19:00	8	Tried post-editing, such as synonym substitution, but this didn't work
2017/06/22	Aodong Li	10:00	19:00	8	Trained a GRU language model to determine similar words
2017/06/23	Shipan Ren	10:00	21:00	11	read neural machine translation paper; read and ran tf_translate code
2017/06/23	Aodong Li	10:00	19:00	8	Trained a GRU language model to determine similar words; this didn't work because semantics are not captured
2017/06/26	Shipan Ren	10:00	21:00	11	read paper: LSTM Neural Networks for Language Modeling; read and ran ViVi_NMT code
2017/06/26	Aodong Li	10:00	19:00	8	Tried to figure out new ways to change the text style
2017/06/27	Shipan Ren	10:00	20:00	10	read the TensorFlow API; debugged ViVi_NMT and tried to upgrade the code to tensorflow1.0
2017/06/27	Aodong Li	10:00	19:00	8	Trained a seq2seq model to solve this problem; semantics are stored in a fixed-length vector by an encoder, and a decoder generates sequences from this vector
2017/06/28	Shipan Ren	10:00	19:00	9	debugged ViVi_NMT and tried to upgrade the code to tensorflow1.0 (on server); installed tensorflow0.1 and tensorflow1.0 on my PC and debugged ViVi_NMT
2017/06/28	Aodong Li	10:00	19:00	8	Cross-domain seq2seq w/o attention and w/ attention models didn't work because of overfitting
2017/06/29	Shipan Ren	10:00	20:00	10	read the TensorFlow API; debugged ViVi_NMT and tried to upgrade the code to tensorflow1.0 (on server)
2017/06/29	Aodong Li	10:00	19:00	8	Read style transfer papers
2017/06/30	Shipan Ren	10:00	24:00	14	debugged ViVi_NMT and tried to upgrade the code to tensorflow1.0 (on server); accomplished this task; found that the new version takes less time, has lower complexity, and gets a better BLEU than before
2017/06/30	Aodong Li	10:00	19:00	8	Read style transfer papers
2017/07/03	Shipan Ren	9:00	21:00	12	ran two versions of the code on small data sets (Chinese-English); tested these checkpoints
2017/07/04	Shipan Ren	9:00	21:00	12	recorded experimental results; found that version 1.0 of the code saves training time and has lower complexity, and that the two versions have similar BLEU values; found that the BLEU is still good when the model is overfitting (reason: the test set and the training set of the small data set are similar in content and style)
2017/07/05	Shipan Ren	9:00	21:00	12	ran two versions of the code on big data sets (Chinese-English); read NMT papers
2017/07/06	Shipan Ren	9:00	21:00	12	an out-of-memory (OOM) error occurred when version 0.1 of the code was trained on the large data set, but version 1.0 worked; reason: improper allocation of resources by the tensorflow0.1 version exhausts memory; after many attempts, version 0.1 worked
2017/07/07	Shipan Ren	9:00	21:00	12	tested these checkpoints and recorded experimental results; the version 1.0 code took 0.06 seconds less per step than the version 0.1 code
2017/07/08	Shipan Ren	9:00	21:00	12	downloaded the WMT2014 data set; used the English-French data set to run the code and found that the translation is not good (reason: no data preprocessing was done)
2017/07/10	Shipan Ren	9:00	20:00	11	trained translation models using the tf1.0 baseline and the tf0.1 baseline respectively; dataset: zh-en small
2017/07/11	Shipan Ren	9:00	20:00	11	tested these checkpoints; found that the new version takes less time; found that the two versions have similar complexity and BLEU values; found that the BLEU is still good when the model is overfitting (reason: the test set and the training set of the small data set are similar in content and style)
2017/07/12	Shipan Ren	9:00	20:00	11	trained translation models using the tf1.0 baseline and the tf0.1 baseline respectively; dataset: zh-en big
2017/07/13	Shipan Ren	9:00	20:00	11	an OOM (out of memory) error occurred when version 0.1 was trained on the large data set, but version 1.0 worked; reason: improper allocation of resources by the tensorflow0.1 framework exhausts memory; after 4 attempts (just entering the same command), version 0.1 worked
2017/07/14	Shipan Ren	9:00	20:00	11	tested these checkpoints; found that the new version takes less time; found that the two versions have similar complexity and BLEU values
2017/07/17	Shipan Ren	9:00	20:00	11	downloaded the WMT2014 data sets and processed them
2017/07/18	Shipan Ren	9:00	20:00	11	processed data
2017/07/18	Jiayu Guo	8:30	22:00	14	read model code.
2017/07/19	Shipan Ren	9:00	20:00	11	processed data
2017/07/19	Jiayu Guo	9:00	22:00	13	read papers on BLEU.
2017/07/20	Shipan Ren	9:00	20:00	11	processed data
2017/07/20	Jiayu Guo	9:00	22:00	13	read papers on attention mechanism.
2017/07/21	Shipan Ren	9:00	20:00	11	trained translation models using the tf1.0 baseline and the tf0.1 baseline respectively; dataset: WMT2014 en-de
2017/07/21	Jiayu Guo	10:00	23:00	13	process document
2017/07/24	Shipan Ren	9:00	20:00	11	tested these checkpoints of the en-de dataset; found that the new version takes less time; found that the two versions have similar complexity and BLEU values
2017/07/24	Jiayu Guo	9:00	22:00	13	read model code.
2017/07/25	Shipan Ren	9:00	20:00	11	trained translation models using the tf1.0 baseline and the tf0.1 baseline respectively; dataset: WMT2014 en-fr
2017/07/25	Jiayu Guo	9:00	23:00	14	process document
2017/07/26	Shipan Ren	9:00	20:00	11	read papers about memory-augmented nmt
2017/07/26	Jiayu Guo	10:00	24:00	14	process document
2017/07/27	Shipan Ren	9:00	20:00	11	read papers about memory-augmented nmt
2017/07/27	Jiayu Guo	10:00	24:00	14	process document
2017/07/28	Shipan Ren	9:00	20:00	11	read memory-augmented nmt code
2017/07/28	Jiayu Guo	9:00	24:00	15	process document
2017/07/31	Shipan Ren	9:00	20:00	11	read memory-augmented nmt code
2017/07/31	Jiayu Guo	10:00	23:00	13	split ancient language text into single words
2017/08/01	Shipan Ren	9:00	20:00	11	tested these checkpoints of the en-fr dataset; found that the new version takes less time; found that the two versions have similar complexity and BLEU values
2017/08/01	Jiayu Guo	10:00	23:00	13	run seq2seq_model
2017/08/02	Shipan Ren	9:00	20:00	11	looked up the performance (BLEU values) of other models; datasets: WMT2014 en-de and en-fr
2017/08/02	Jiayu Guo	10:00	23:00	13	process document
2017/08/03	Shipan Ren	9:00	20:00	11	looked up the performance (BLEU values) of other seq2seq models; datasets: WMT2014 en-de and en-fr
2017/08/03	Jiayu Guo	10:00	23:00	13	process document
2017/08/04	Shipan Ren	9:00	20:00	11	learn Moses
2017/08/04	Jiayu Guo	10:00	23:00	13	search for new data (Songshu)
2017/08/07	Shipan Ren	9:00	20:00	11	installed and built Moses on the server
2017/08/07	Jiayu Guo	9:00	22:00	13	process document
2017/08/08	Shipan Ren	9:00	20:00	11	trained a statistical machine translation model and tested it; dataset: zh-en small; checked that Moses works normally
2017/08/08	Jiayu Guo	10:00	21:00	11	read about TensorFlow
2017/08/09	Shipan Ren	9:00	20:00	11	wrote automation scripts to process data, train the model, and test the model; toolkit: Moses
2017/08/09	Jiayu Guo	10:00	23:00	13	run the model on data whose ancient-language content was split into single characters.
2017/08/10	Shipan Ren	9:00	20:00	11	trained statistical machine translation models and tested them; datasets: zh-en big, WMT2014 en-de, WMT2014 en-fr
2017/08/10	Jiayu Guo	9:00	23:00	13	process data of Songshu; read papers on CNN
2017/08/11	Shipan Ren	9:00	20:00	11	collate experimental results; compare our baseline model with Moses
2017/08/11	Jiayu Guo	9:00	20:00	11	test results.
2017/08/14	Shipan Ren	9:00	20:00	11	read paper about THUMT
2017/08/14	Jiayu Guo	10:00	23:00	13	learn about Graphic Model of LSTM-Projected BPTT; search for data available for translation (Twenty-four-Shi)
2017/08/15	Shipan Ren	9:00	20:00	11	read THUMT manual and learn how to use it
2017/08/15	Jiayu Guo	11:00	23:30	12	run model with data including Shiji and Zizhitongjian.
2017/08/16	Shipan Ren	9:00	20:00	11	trained translation models and tested them; toolkit: THUMT; dataset: zh-en small; checked that THUMT works normally
2017/08/16	Jiayu Guo	10:00	23:00	10	checkpoint-100000 translation model, BLEU: 11.11. Example 1: source: 在秦者名错，与张仪争论,於是惠王使错将伐蜀，遂拔，因而守之。 target: 在秦国的名叫司马错，曾与张仪发生争论，秦惠王采纳了他的意见，于是司马错率军攻蜀国，攻取后，又让他做了蜀地郡守。 trans: 当时秦国的人都很欣赏他的建议，与张仪一起商议，所以吴王派使者率军攻打蜀地，一举攻，接着又下令守城。 Example 2: source: 神大用则竭，形大劳则敝，形神离则死。 target: 精神过度使用就会衰竭，形体过度劳累就会疲惫，神形分离就会死亡。 trans: 精神过度就可衰竭,身体过度劳累就会疲惫，地形也就会死。 Example 3: source: 今天子接千岁之统，封泰山，而余不得从行，是命也夫，命也夫！ target: 现天子继承汉朝千年一统的大业，在泰山举行封禅典礼而我不能随行，这是命啊，是命啊！ trans: 现在天子可以继承帝位的成就爵位，爵位至泰山，而我却未能执行先帝的命运。 Findings: 1. Using Zizhitongjian data only (6,000 pairs), BLEU is at most 6. 2. Using Zizhitongjian data only (12,000 pairs), BLEU is at most 7. 3. Using Shiji and Zizhitongjian data (43,0000 pairs), BLEU is about 9. 4. Using Shiji and Zizhitongjian data (43,0000 pairs) with the ancient language text split character by character, BLEU is at most 11.11.
2017/08/17	Shipan Ren	9:00	20:00	11	wrote automation scripts to process data, train the model, and test the model; trained translation models and tested them; toolkit: THUMT; dataset: zh-en big
2017/08/17	Jiayu Guo	13:00	23:00	10	read source code.
2017/08/18	Shipan Ren	9:00	20:00	11	tested translation models using a single reference and multiple references; organized all the experimental results (our baseline system, Moses, THUMT)
2017/08/18	Jiayu Guo	13:00	22:00	9	read source code.
2017/08/21	Shipan Ren	10:00	22:00	12	read the released information of other translation systems
2017/08/21	Jiayu Guo	9:30	21:30	12	read the source code and learn tensorflow
2017/08/22	Shipan Ren	10:00	22:00	12	cleaned up the code
2017/08/22	Jiayu Guo	9:00	22:00	12	read the source code
2017/08/23	Shipan Ren	10:00	21:00	11	wrote the documents
2017/08/23	Jiayu Guo	9:00	22:00	11	read the source code and learn tensorflow
2017/08/24	Shipan Ren	10:00	20:00	10	wrote the documents
2017/08/24	Jiayu Guo	9:10	22:00	10.5	read the source code and learn tensorflow
2017/08/25	Shipan Ren	10:00	20:00	10	check experimental results
2017/08/25	Jiayu Guo	8:50	22:00	10.5	read the source code and learn tensorflow
2017/08/28	Shipan Ren	10:00	20:00	10	wrote the paper of ViVi_NMT(version 1.0)
2017/08/28	Jiayu Guo	8:10	21:00	11	read the source code and learn tensorflow
2017/08/29	Shipan Ren	10:00	20:00	10	wrote the paper of ViVi_NMT(version 1.0)
2017/08/29	Jiayu Guo	11:00	21:00	10	read the source code and learn tensorflow
2017/08/30	Shipan Ren	10:00	20:00	10	wrote the paper of ViVi_NMT(version 1.0)
2017/08/30	Jiayu Guo	11:30	21:00	9	learn VV model
2017/08/31	Shipan Ren	10:00	20:00	10	wrote the paper of ViVi_NMT(version 1.0)
2017/08/31	Jiayu Guo	10:00	20:00	10	clean up the code
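
The attention settings logged between 2017/05/28 and 2017/06/01 (final_attn = attn_1 + attn_2, 2/3 * attn_1 + 4/3 * attn_2, 4/3 * attn_1 + 2/3 * attn_2) all amount to a weighted sum of two attention context vectors, with weights that add up to 2. The sketch below only illustrates that arithmetic on plain Python lists; the function name `combine_attention`, the weight parameters `w1` and `w2`, and the toy vectors are assumptions made for illustration, not code from tf_translate or ViVi_NMT, and the log does not state which encoder each of attn_1 and attn_2 belongs to.

```python
from typing import List, Sequence


def combine_attention(attn_1: Sequence[float], attn_2: Sequence[float],
                      w1: float = 1.0, w2: float = 1.0) -> List[float]:
    """Return w1 * attn_1 + w2 * attn_2, element by element."""
    if len(attn_1) != len(attn_2):
        raise ValueError("context vectors must have the same length")
    return [w1 * a + w2 * b for a, b in zip(attn_1, attn_2)]


# Toy context vectors (hypothetical values, not taken from the experiments).
attn_1 = [0.2, 0.4, 0.6]
attn_2 = [1.0, 0.0, 0.5]

plain_sum = combine_attention(attn_1, attn_2)                     # attn_1 + attn_2
favor_1 = combine_attention(attn_1, attn_2, w1=4 / 3, w2=2 / 3)   # 4/3, 2/3
favor_2 = combine_attention(attn_1, attn_2, w1=2 / 3, w2=4 / 3)   # 2/3, 4/3
```

Keeping the weights as explicit parameters makes it easy to try other ratios at decoding time without retraining, which matches how the 2017/05/28 entry compares several weightings for the same trained model.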

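The 2017/08/09 and 2017/08/16 entries report that splitting the ancient language text character by character raised BLEU from about 9 to 11.11. A minimal sketch of such a preprocessing step is shown below, assuming each character (including punctuation) becomes one space-separated token; the helper name `split_into_characters` is hypothetical, and whether the original pipeline kept punctuation as tokens is not stated in the log.

```python
def split_into_characters(sentence: str) -> str:
    """Put a single space between all non-whitespace characters."""
    return " ".join(ch for ch in sentence if not ch.isspace())


# Source sentence taken from the 2017/08/16 entry.
source = "神大用则竭，形大劳则敝，形神离则死。"
segmented = split_into_characters(source)
tokens = segmented.split(" ")
print(segmented)
print(len(tokens))
```

The segmented string can then be fed to a seq2seq vocabulary builder that tokenizes on whitespace, so every character gets its own vocabulary entry.
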
Date	Yang Feng	Jiyuan Zhang

Past progress

nlp-progress 2016/05-07

nlp-progress 2016/04

Difference between revisions of "Schedule"

@@ Line 9 / Line 9 @@
 * Aodong Li (李傲冬)
 * Andi Zhang (张安迪)
-* Ziwei Bai （白子薇）
-* Aiting Liu (刘艾婷)
-* Shiyao Li （李诗瑶）
 * Shiyue Zhang (张诗悦)
+* Li Gu (古丽)
+* Peilun Xiao (肖培伦)
+* Shipan Ren (任师攀)
+* Jiayu Guo (郭佳雨)
 ===Former Members===
@@ Line 23 / Line 24 @@
 * '''DongXu Zhang (张东旭)''': --
 * '''Yiqiao Pan (潘一桥)'''  ： MA candidate in University of Sydney
+* '''Shiyao Li （李诗瑶）''' :  BUPT
+* '''Aiting Liu (刘艾婷)'''  :  BUPT
 ==Work Progress==
@@ Line 31 / Line 33 @@
 ! Date !! Person  !! start!! leave !! hours ||status
 |-
-| rowspan="4"|2016/09/01
+| rowspan="2"|2017/04/02
-|Aodong Li ||  ||   ||  ||
+|Andy Zhang||9:30 ||18:30 ||8 ||
+*preparing EMNLP
 |-
-|Ziwei Bai ||  ||   ||  ||
+|Peilun Xiao || || || ||
 |-
-|Andy Zhang|| 9:30 || 18:00  || 7.5 || read papers about neural turing machine & memory networks
+| rowspan="2"|2017/04/03
+|Andy Zhang||9:30 ||18:30 ||8 ||
+*preparing EMNLP
 |-
-|Shiyao Li || 11:40 || 18:00  || 5 || learn word2vec using tensorflow
+|Peilun Xiao || || || ||
 |-
-| rowspan="4"|2016/09/02
+| rowspan="2"|2017/04/04
-|Aodong Li ||  ||   ||  ||
+|Andy Zhang||9:30 ||18:30 ||8 ||
+*preparing EMNLP
 |-
-|Ziwei Bai ||  ||   ||  ||
+|Peilun Xiao || || || ||
 |-
-|Andy Zhang|| 9:30 || 18:30  || 8 || continued reading papers got yesterday, mainly focus on the implementation of the models.
+| rowspan="2"|2017/04/05
+|Andy Zhang||9:30 ||18:30 ||8 ||
+*preparing EMNLP
 |-
-|Shiyao Li || 9:30 || 18:30  || 8 || learn LSTM using tensorflow and get to know some applications about generating text and sequence
+|Peilun Xiao || || || ||
 |-
-| rowspan="4"|2016/09/05
+| rowspan="2"|2017/04/06
-|Aodong Li || 12:50 ||  20:00 || 7 || Reconsider the rare word mechanism and do some experiments to try to reproduce the paper's method
+|Andy Zhang||9:30 ||18:30 ||8 ||
+*preparing EMNLP
 |-
-|Ziwei Bai ||  ||   ||  ||
+|Peilun Xiao || || || ||
 |-
-|Andy Zhang ||  ||   ||  ||
+| rowspan="2"|2017/04/07
+|Andy Zhang||9:30 ||18:30 ||8 ||
+*preparing EMNLP
 |-
-|Shiyao Li ||  ||   ||  ||
+|Peilun Xiao || || || ||
 |-
-| rowspan="5"|2016/09/06
+| rowspan="2"|2017/04/08
-|Aodong Li || 15:40 || 19:50 || 5 ||
+|Andy Zhang||9:30 ||18:30 ||8 ||
-*Do the baseline experiment but the performance is poor, so I may conduct it wrong
+*preparing EMNLP
-*Help Lantian for ICASSP with codes
 |-
-|Ziwei Bai ||  ||   ||  ||
+|Peilun Xiao || || || ||
 |-
-|Andy Zhang ||  ||   ||  ||
+| rowspan="2"|2017/04/09
+|Andy Zhang||9:30 ||18:30 ||8 ||
+*preparing EMNLP
 |-
-|Shiyao Li ||  ||   ||  ||
+|Peilun Xiao || || || ||
 |-
-|Shiyue Zhang || 14:30 || 20:00 || 5 ||
+| rowspan="2"|2017/04/10
-*Getting familiar with ASR engine and prepare for the document
+|Andy Zhang||9:30 ||18:30 ||8 ||
+*preparing EMNLP
 |-
-| rowspan="5"|2016/09/07
+|Peilun Xiao || || || ||
-|Aodong Li || 14:10 || 20:50 || 6.5 ||
-*Try different settings for rare word baseline experiment but still got wrong performance
-*Read the paper "Pointing the unknown words"
 |-
-|Ziwei Bai ||  ||   ||  ||
+| rowspan="2"|2017/04/11
+|Andy Zhang||9:30 ||18:30 ||8 ||
+*preparing EMNLP
 |-
-|Andy Zhang ||  ||   ||  ||
+|Peilun Xiao || || || ||
 |-
-|Shiyao Li ||  ||   ||  ||
+| rowspan="2"|2017/04/12
+|Andy Zhang||9:30 ||18:30 ||8 ||
+*preparing EMNLP
 |-
-|Shiyue Zhang || 9:30|| 19:30 ||8.5 ||
+|Peilun Xiao || || || ||
-*Read ASR code and write the document
 |-
-| rowspan="5"|2016/09/08
+| rowspan="2"|2017/04/13
-|Aodong Li ||  ||  ||  ||
+|Andy Zhang||9:30 ||18:30 ||8 ||
+*preparing EMNLP
 |-
-|Ziwei Bai ||  ||   ||  ||
+|Peilun Xiao || || || ||
 |-
-|Andy Zhang ||  ||   ||  ||
+| rowspan="2"|2017/04/14
+|Andy Zhang||9:30 ||18:30 ||8 ||
+*preparing EMNLP
 |-
-|Shiyao Li ||  ||   ||  ||
+|Peilun Xiao || || || ||
 |-
-|Shiyue Zhang || 9:30|| 16:30 || 5 ||
+| rowspan="2"|2017/04/15
-*Read the paper "Neural Machine Translation By Jointly Learning"
+|Andy Zhang||9:00 ||15:00 ||6 ||
+*preparing EMNLP
 |-
-| rowspan="5"|2016/09/09
+|Peilun Xiao || || || ||
-|Aodong Li ||  ||  ||  ||
 |-
-|Ziwei Bai ||  ||   ||  ||
+| rowspan="1"|2017/04/18
+|Aodong Li||11:00 ||20:00 ||8 ||
+*Pick up new task in news generation and do literature review
 |-
-|Andy Zhang ||  ||   ||  ||
+| rowspan="1"|2017/04/19
+|Aodong Li||11:00 ||20:00 ||8 ||
+*Literature review
 |-
-|Shiyao Li ||  ||   ||  ||
+| rowspan="1"|2017/04/20
+|Aodong Li||12:00 ||20:00 ||8 ||
+*Literature review
 |-
-|Shiyue Zhang || 14:00|| 19:30 || 5 ||
+| rowspan="1"|2017/04/21
-*Read the paper "Memory Networks" and finish the document
+|Aodong Li||12:00 ||20:00 ||8 ||
+*Literature review
 |-
-| rowspan="5"|2016/09/10
+| rowspan="1"|2017/04/24
-|Aodong Li ||  ||   || ||
+|Aodong Li||11:00 ||20:00 ||8 ||
+*Adjust literature review focus
 |-
-|Ziwei Bai || 11:00 || 20:00  || 8 || test the chatting model 60-160
+| rowspan="1"|2017/04/25
+|Aodong Li||11:00 ||20:00 ||8 ||
+*Literature review
 |-
-|Andy Zhang ||  ||   ||  ||
+| rowspan="1"|2017/04/26
+|Aodong Li||11:00 ||20:00 ||8 ||
+*Literature review
 |-
-|Shiyao Li ||  ||   ||  ||
+| rowspan="1"|2017/04/27
+|Aodong Li||11:00 ||20:00 ||8 ||
+*Try to reproduce sc-lstm work
 |-
-|Shiyue Zhang || || || ||
+| rowspan="1"|2017/04/28
+|Aodong Li||11:00 ||20:00 ||8 ||
+*Transfer to new task in machine translation and do literature review
 |-
-| rowspan="5"|2016/09/12
+| rowspan="1"|2017/04/30
-|Aodong Li ||  ||   || ||
+|Aodong Li||11:00 ||20:00 ||8 ||
+*Literature review
 |-
-|Ziwei Bai || 9:15 || 18:15  || 8 || prepare for the slide for Chatting Model(CDSSM & NRM)
+| rowspan="1"|2017/05/01
+|Aodong Li||11:00 ||20:00 ||8 ||
+*Literature review
 |-
-|Andy Zhang || 9:15 || 18:30  || 8+ || prepared for the paper sharing
+| rowspan="1"|2017/05/02
-|-A
+|Aodong Li||11:00 ||20:00 ||8 ||
-|Shiyao Li || 9:15 || 18:15  || 8 || read the paper Finding the Middle Ground-A Model for Planning Satisficing Answers
+*Literature review and code review
 |-
-|Shiyue Zhang || 9:00 || 20:00 || 9.5||
+| rowspan="1"|2017/05/06
-*Review the two papers I've read last week and ask questions
+|Aodong Li||14:20 ||17:20||3 ||
-*Read the paper "Recurrent Neural Network Grammars"
+*Code review
 |-
-| rowspan="5"|2016/09/13
+| rowspan="1"|2017/05/07
-|Aodong Li ||  ||   || ||
+|Aodong Li||13:30 ||22:00||8 ||
+*Code review and experiment started, but version discrepancy encountered
 |-
-|Ziwei Bai || 9:20 || 18:20  || 8 ||
+| rowspan="1"|2017/05/08
-*install scrapy & beautifulSoup
+|Aodong Li||11:30 ||21:00 ||8 ||
-*learn scrapy
+*Code review and version discrepancy solved
 |-
-|Andy Zhang || 9:20 || 18:20  ||8  ||
+| rowspan="1"|2017/05/09
-*continue preparing for the paper sharing
+|Aodong Li||13:00 ||22:00 ||9 ||
-*set up env for MemN2N code lua version(for language model)
+*Code review and experiment
-|-A
+*details about experiment:
-|Shiyao Li ||  ||   ||  ||
+  small data,
+  1st and 2nd translator uses the same training data,
+  2nd translator uses '''random initialized embedding'''
+*results (BLEU):
+  BASELINE: 43.87
+   best result of our model: 42.56
 |-
-|Shiyue Zhang || 9:00 || 17:30 || 7.5||
+| rowspan="1"|2017/05/10
-*Review the paper "Recurrent Neural Network Grammars" and ask questions
+|Shipan Ren || 9:00 || 20:00 || 11 ||
-*Try to run the code of RNNG
+*Entry procedures
+*Machine Translation paper reading
 |-
-| rowspan="5"|2016/09/14
+| rowspan="1"|2017/05/10
-|Aodong Li ||  ||   || ||
+|Aodong Li || 13:30 || 22:00 || 8 ||
+*experiment setting:
+  small data,
+  1st and 2nd translator uses different training data, counting 22000 and 22017 respectively
+  2nd translator uses '''random initialized embedding'''
+*results (BLEU):
+  BASELINE: 36.67 (36.67 is the model at 4750 updates, but we use model at 3000 updates to
+                     prevent the case of overfitting, to generate the 2nd translator's training data, for
+                     which the BLEU is 34.96)
+  best result of our model: 29.81
+  This may suggest that whether the 2nd translator uses the same training data as the 1st translator or different
+                     data does not influence its performance; if anything, using the same data may
+                     be better, at least from these results. However, the training data is smaller
+                     than for yesterday's model, which has to be taken into account.
+*code 2nd translator with constant embedding
 |-
-|Ziwei Bai || 9:20 || 18:00 || 7+ ||try using scrapy download company public announcement from sina(170MB)
+| rowspan="1"|2017/05/11
+|Shipan Ren || 10:00 || 19:30 || 9.5 ||
+*Configure environment
+*Run tf_translate code
+*Read Machine Translation paper
 |-
-|Andy Zhang || 9:20 ||  18:20 || 8 ||
+| rowspan="1"|2017/05/11
-*tried to finish running the source code of MemN2N but it crashed
+|Aodong Li || 13:00 ||  21:00|| 8 ||
-*tried to install torch on the server without root but failed
+*experiment setting:
-*will finish running the source code at home
+  small data,
+  1st and 2nd translator uses the same training data,
+  2nd translator uses '''constant untrainable embedding''' imported from 1st translator's decoder
+*results (BLEU):
+  BASELINE: 43.87
+  best result of our model: 43.48
+  Experiments suggest that this kind of series or cascade model impairs the final performance
+                      due to information loss as the information flows through the network from
+                      end to end. The decoder's smaller vocabulary size compared to the encoder's illustrates
+                      this (9000+ -> 6000+).
+  The intention of this experiment is looking for a map to solve meaning shift using 2nd translator,
+                      but result of whether the map is learned or not is obscured by the smaller vocab size
+                      phenomenon.
+*literature review on hierarchical machine translation
 |-
-|Shiyao Li || 9:20 || 6:20  ||  8||
+| rowspan="1"|2017/05/12
-*finish the ppt for paper sharing and pre-share it with Leader Feng
+|Aodong Li||13:00 ||21:00 ||8 ||
+*Code double decoding model and read multilingual MT paper
+|-
+| rowspan="1"|2017/05/13
+|Shipan Ren || 10:00 || 19:00 || 9 ||
+*read machine translation paper
+*learned the LSTM model and the seq2seq model
+|-
+| rowspan="1"|2017/05/14
+|Aodong Li || 10:00 || 20:00 || 9 ||
+*Code double decoding model and experiment
+*details about experiment:
+  small data,
+  2nd translator uses as training data the concat(Chinese, machine translated English),
+  2nd translator uses '''random initialized embedding'''
+*results (BLEU):
+  BASELINE: 43.87
+  best result of our model: 43.53
+*NEXT: 2nd translator uses '''trained constant embedding'''
+|-
+| rowspan="1"|2017/05/15
+|Shipan Ren || 9:30 || 19:00 || 9.5 ||
+* understood the difference between the LSTM model and the GRU model
+* read the implementation code of the seq2seq model
+|-
+| rowspan="2"|2017/05/17
+|Shipan Ren || 9:30 || 19:30 || 10 ||
+* read neural machine translation paper
+* read tf_translate code
+|-
+|Aodong Li || 13:30 || 24:00 || 9||
+* code and debug double-decoder model
+* altered the size of the 2017/05/14 model; will try it after NIPS
+|-
+| rowspan="2"|2017/05/18
+|Shipan Ren || 10:00 || 19:00 || 9 ||
+* read neural machine translation paper
+* read tf_translate code
+|-
+|Aodong Li || 12:30 || 21:00 || 8 ||
+* trained the double-decoder model on the small data set but encountered decoding bugs
+|-
+| rowspan="1"|2017/05/19
+|Aodong Li || 12:30 || 20:30 || 8 ||
+* debug double-decoder model
+* the model performs well on the development set but badly on the test data; I want to figure out the reason.
+|-
+| rowspan="1"|2017/05/21
+|Aodong Li || 10:30 || 18:30 || 8 ||
+*details about experiment:
+  hidden_size = 700 (500 in prior)
+  emb_size = 510 (310 in prior)
+  small data,
+  2nd translator uses as training data the concat(Chinese, machine translated English),
+  2nd translator uses '''random initialized embedding'''
+*results (BLEU):
+  BASELINE: 43.87
+  best result of our model: '''45.21'''
+  But only one checkpoint outperforms the baseline; the other results are generally below 43.1
+* debug double-decoder model
+|-
+| rowspan="1"|2017/05/22
+|Aodong Li || 14:00 || 22:00 || 8 ||
+*double-decoder without joint loss generalizes very badly
+*I'm trying the double-decoder model with joint loss
+|-
+| rowspan="1"|2017/05/23
+|Aodong Li || 13:00 || 21:30 || 8 ||
+*details about experiment 1:
+  hidden_size = 700
+  emb_size = 510
+  learning_rate = 0.0005 (0.001 in prior)
+  small data,
+  2nd translator uses as training data the concat(Chinese, machine translated English),
+  2nd translator uses '''random initialized embedding'''
+*results (BLEU):
+  BASELINE: 43.87
+  best result of our model: '''42.19'''
+  Overfitting? Overall, the 2nd translator performs worse than the baseline
+*details about experiment 2:
+  hidden_size = 500
+  emb_size = 310
+  learning_rate = 0.001
+  small data,
+  double-decoder model with joint loss, i.e. final loss = 1st decoder's loss + 2nd
+  decoder's loss
+*results (BLEU):
+  BASELINE: 43.87
+  best result of our model: '''39.04'''
+  The 1st decoder's output is generally better than 2nd decoder's output. The reason may be that
+  the second decoder only learns from the first decoder's hidden states because their states are
+  almost the same.
+*DISCOVERY:
+  The reason why double-decoder without joint loss generalizes very badly is that the gap between
+  the teacher forcing mechanism (training process) and the beam search mechanism (decoding process)
+  propagates and expands the error to the output end, which destroys the model when decoding.
+*next:
+  Try to train double-decoder model without joint loss but with beam search on 1st decoder.
+|-
+| rowspan="1"|2017/05/24
+|Aodong Li || 13:00 || 21:30 || 8 ||
+*code double-attention one-decoder model
+*code double-decoder model
 |-
-|Shiyue Zhang ||||  || ||
+| rowspan="1"|2017/05/24
+|Shipan Ren || 10:00 || 20:00 || 10 ||
+*read neural machine translation paper
+*read tf_translate code
 |-
-| rowspan="5"|2016/09/15
-|Aodong Li ||  ||   || ||
+| rowspan="2"|2017/05/25
+|Shipan Ren || 9:30 || 18:30 || 9 ||
+*write document of tf_translate project
+*read neural machine translation paper
+*read tf_translate code
 |-
-|Ziwei Bai || 10:30 || 20:00|| 8.5||
+|Aodong Li || 13:00 || 22:00 || 9 ||
-*try using scrapy download  'data learn' and 'blog' from sina(115MB)
+* code and debug double attention model
-*realize renew automatically everyday
 |-
-|Andy Zhang ||  ||   ||  ||
+| rowspan="1"|2017/05/27
+|Shipan Ren || 9:30 || 18:30 || 9 ||
+*read tf_translate code
+*write document of tf_translate project
 |-
-|Shiyao Li||||  || ||
+| rowspan="1"|2017/05/28
+|Aodong Li || 15:00 || 22:00 || 7 ||
+*details about experiment:
+  hidden_size = 500
+  emb_size = 310
+  learning_rate = 0.001
+  small data,
+  2nd translator uses as training data both Chinese and machine translated English
+  Chinese and English use different encoders and different attention
+  '''final_attn = attn_1 + attn_2'''
+  2nd translator uses '''random initialized embedding'''
+*results (BLEU):
+  BASELINE: 43.87
+  when decoding:
+    final_attn = attn_1 + attn_2, best result of our model: '''43.50'''
+    final_attn = 2/3attn_1 + 4/3attn_2, best result of our model: '''41.22'''
+    final_attn = 4/3attn_1 + 2/3attn_2, best result of our model: '''43.58'''
 |-
-|Shiyue Zhang ||||  || ||
+| rowspan="1"|2017/05/30
+|Aodong Li || 15:00 || 21:00 || 6 ||
+*details about experiment 1:
+  hidden_size = 500
+  emb_size = 310
+  learning_rate = 0.001
+  small data,
+  2nd translator uses as training data both Chinese and machine translated English
+  Chinese and English use different encoders and different attention
+  '''final_attn = 2/3attn_1 + 4/3attn_2'''
+  2nd translator uses '''random initialized embedding'''
+*results (BLEU):
+  BASELINE: 43.87
+  best result of our model: '''42.36'''
+* details about experiment 2:
+  '''final_attn = 2/3attn_1 + 4/3attn_2'''
+  2nd translator uses '''constant initialized embedding'''
+*results (BLEU):
+  BASELINE: 43.87
+  best result of our model: '''45.32'''
+* details about experiment 3:
+  '''final_attn = attn_1 + attn_2'''
+  2nd translator uses '''constant initialized embedding'''
+*results (BLEU):
+  BASELINE: 43.87
+  best result of our model: '''45.41''' and it seems more stable
 |-
-| rowspan="5"|2016/09/17
+| rowspan="2"|2017/05/31
-|Aodong Li ||  ||   || ||
+|Shipan Ren || 10:00 || 19:30 || 9.5 ||
+*run and test tf_translate code
+*write document of tf_translate project
 |-
-|Ziwei Bai ||  || || ||
+|Aodong Li || 12:00 || 20:30 || 8.5 ||
+* details about experiment 1:
+  '''final_attn = 4/3attn_1 + 2/3attn_2'''
+  2nd translator uses '''constant initialized embedding'''
+*results (BLEU):
+  BASELINE: 43.87
+  best result of our model: '''45.79'''
+* Making only the English word embedding at the encoder constant, while training all the other embeddings and parameters, achieves an even higher BLEU score of 45.98, and the results are stable.
+* The quality of the English embedding at the encoder plays a pivotal role in this model.
+* Preparation of big data.
 |-
-|Andy Zhang ||  ||   ||  ||
+| rowspan="1"|2017/06/01
+|Aodong Li || 13:00 || 24:00 || 11 ||
+* Only make the English encoder's embedding constant -- 45.98
+* Only initialize the English encoder's embedding and then finetune it -- 46.06
+* Share the attention mechanism and then directly add them -- 46.20
+* Run double-attention model on large data
 |-
-|Shiyao Li||||  || ||
+| rowspan="1"|2017/06/02
+|Aodong Li || 13:00 || 22:00 || 9 ||
+* Baseline bleu on large data is 30.83 with '''30000''' output vocab
+* Our best result is 31.53 with '''20000''' output vocab
 |-
-|Shiyue Zhang ||10:00 || 20:00  || 9 ||
+| rowspan="1"|2017/06/03
-* help Lantian, Wangyang and Zhiyuan to check papers
+|Aodong Li || 13:00 || 21:00 || 8 ||
+* Train the model with 40 batch size and with concat(attn_1, attn_2)
+* the best result of model with 40 batch size and with add(attn_1, attn_2) is 30.52
 |-
-| rowspan="5"|2016/09/18
+| rowspan="1"|2017/06/05
-|Aodong Li ||  ||   || ||
+|Aodong Li || 10:00 || 19:00 || 8 ||
+* Prepare for APSIPA paper
 |-
-|Ziwei Bai || 9:30 ||18:30 ||8 ||
+| rowspan="1"|2017/06/06
-*try using scrapy download  'Wall news' & 'zqsb'(400M)
+|Aodong Li || 10:00 || 19:00 || 8 ||
+* Prepare for APSIPA paper
 |-
-|Andy Zhang || 9:30 || 18:30  || 8 ||
+| rowspan="1"|2017/06/07
-*reviewed paper that is to be shared tomorrow
+|Aodong Li || 10:00 || 19:00 || 8 ||
-*tried to set the env for a c++ code but failed
+* Prepare for APSIPA paper
 |-
-|Shiyao Li|| 9:30 || 6:30 || 8 ||
+| rowspan="1"|2017/06/08
-*review the non-parametric model written by Maoning Wang and learn about Latex
+|Aodong Li || 10:00 || 19:00 || 8 ||
-*help Wangyang check his paper's language problems
+* Prepare for APSIPA paper
 |-
-|Shiyue Zhang || 14:30 ||  20:00|| 5 ||
+| rowspan="1"|2017/06/09
-* help Wang Yang and Tang Zhiyuan to check papers
+|Aodong Li || 10:00 || 19:00 || 8 ||
-* try to understand and run the code of RNNG
+* Prepare for APSIPA paper
 |-
-| rowspan="5"|2016/09/19
+| rowspan="1"|2017/06/12
-|Aodong Li ||  ||   || ||
+|Aodong Li || 10:00 || 19:00 || 8 ||
+* Prepare for APSIPA paper
 |-
-|Ziwei Bai || 9:20 ||18:30 ||8 ||
+| rowspan="1"|2017/06/13
-*continue download news(2G)
+|Aodong Li || 10:00 || 19:00 || 8 ||
+* Prepare for APSIPA paper
 |-
-|Andy Zhang || 9:20 || 18:30  || 8 ||
+| rowspan="1"|2017/06/14
-*set the env for rnng code
+|Aodong Li || 10:00 || 19:00 || 8 ||
-*shared a paper
+* Prepare for APSIPA paper
 |-
-|Shiyao Li|| 9:20 || 18:30 || 8 ||
+| rowspan="1"|2017/06/15
-*share a paper
+|Aodong Li || 10:00 || 19:00 || 8 ||
-*keep understanding what Maoning Wang has written about Chapter 8
+* Prepare for APSIPA paper
+* Read paper about MT involving grammar
 |-
-|Shiyue Zhang || 9:00 || 19:30 || 9 ||
+| rowspan="1"|2017/06/16
-* Try to compile the rnng code, ask Zhiyong for help
+|Aodong Li || 10:00 || 19:00 || 8 ||
+* Prepare for APSIPA paper
+* Read paper about MT involving grammar
 |-
-| rowspan="5"|2016/09/20
+| rowspan="1"|2017/06/19
-|Aodong Li ||  ||   || ||
+|Aodong Li || 10:00 || 19:00 || 8 ||
+* Completed APSIPA paper
+* Took new task in style translation
 |-
-|Ziwei Bai || 9:10 || 15:40|| 5.5||
+| rowspan="1"|2017/06/20
-*modify the spider
+|Aodong Li || 10:00 || 19:00 || 8 ||
+* Tried synonym substitution
 |-
-|Andy Zhang || 9:10 || 18:10  || 8||
+| rowspan="1"|2017/06/21
-*continue work on the env setting of rnng, had small progress, aborted it
+|Aodong Li || 10:00 || 19:00 || 8 ||
-*find several open sources of NTM
+* Tried post-editing such as synonym substitution, but this didn't work
-*read the source code of MemN2N in lua(torch)
 |-
-|Shiyao Li||  ||  ||  ||
+| rowspan="1"|2017/06/22
+|Aodong Li || 10:00 || 19:00 || 8 ||
+* Trained a GRU language model to determine similar words
 |-
-|Shiyue Zhang || 14:00 || 19:30 || 5.5 ||
+| rowspan="2"|2017/06/23
-* Compile the code successfully
+|Shipan Ren || 10:00 || 21:00 || 11 ||
-* Try to run the code
+* read neural machine translation paper
+* read and run tf_translate code
 |-
-| rowspan="5"|2016/09/21
+|Aodong Li || 10:00 || 19:00 || 8 ||
-|Aodong Li ||  ||   || ||
+* Trained a GRU language model to determine similar words
+* This didn't work because semantics are not captured
 |-
-|Ziwei Bai || 13:00 ||22:00 ||8 ||
+| rowspan="2"|2017/06/26
-*spider QA(failed in zhihu)
+|Shipan Ren || 10:00 || 21:00 || 11 ||
+* read paper：LSTM Neural Networks for Language Modeling
+* read and run ViVi_NMT code
 |-
-|Andy Zhang || 9:10 || 12:20  ||3 ||
+|Aodong Li || 10:00 || 19:00 || 8 ||
-*read the code of MemN2N in lua(torch)
+* Tried to figure out new ways to change the text style
 |-
-|Shiyao Li||  ||  ||  ||
+| rowspan="2"|2017/06/27
+|Shipan Ren || 10:00 || 20:00 || 10 ||
+* read the API of tensorflow
+* debugged ViVi_NMT and tried to upgrade code version to tensorflow1.0
 |-
-|Shiyue Zhang || 9:30 || 19:30 || 8.5 ||
+|Aodong Li || 10:00 || 19:00 || 8 ||
-* Run the code sucessfully using a fragment of data
+* Trained seq2seq model to solve this problem
-* Try to format the whole dataset
+* An encoder stores the semantics in a fixed-length vector, and a decoder generates sequences from this vector
 |-
-| rowspan="5"|2016/09/22
+| rowspan="2"|2017/06/28
-|Aodong Li ||  ||   || ||
+|Shipan Ren || 10:00 || 19:00 || 9 ||
+* debugged ViVi_NMT and tried to upgrade code version to tensorflow1.0 (on server)
+* installed tensorflow0.1 and tensorflow1.0 on my pc and debugged ViVi_NMT
 |-
-|Ziwei Bai || 10:20 || 22:20 ||11 ||
+|Aodong Li || 10:00 || 19:00 || 8 ||
-*sprder QA
+* Cross-domain seq2seq w/o attention and w/ attention models didn't work because of overfitting
 |-
-|Andy Zhang || 9:30 || 18:30  ||8 ||read source code of end2end memory network
+| rowspan="2"|2017/06/29
+|Shipan Ren || 10:00 || 20:00 || 10 ||
+* read the API of tensorflow
+* debugged ViVi_NMT and tried to upgrade code version to tensorflow1.0 (on server)
 |-
-|Shiyao Li||  ||  ||  ||
+|Aodong Li || 10:00 || 19:00 || 8 ||
+* Read style transfer papers
 |-
-|Shiyue Zhang || 9:20 || 20:20 || 9 ||
+| rowspan="2"|2017/06/30
-*format the full version of data
+|Shipan Ren || 10:00 || 24:00 || 14 ||
-*debug the discriminative model code
+* debugged ViVi_NMT and tried to upgrade code version to tensorflow1.0 (on server)
+* accomplished this task
+* found the new version saves time, has lower complexity, and gets a better BLEU than before
 |-
-| rowspan="5"|2016/09/23
+|Aodong Li || 10:00 || 19:00 || 8 ||
-|Aodong Li ||  ||   || ||
+* Read style transfer papers
 |-
-|Ziwei Bai || 9:30 ||23:30 ||13||
+| rowspan="1"|2017/07/03
-*program script to test the chatting model
+|Shipan Ren || 9:00 || 21:00 || 12 ||
-*spider QA from tianya (unsolved question)
+* run two versions of the code on small data sets (Chinese-English)
-*spider QA from zhihu.sogou.com(still run, but slow)
+* tested these checkpoints
-*filter the deteset for word vector
 |-
-|Andy Zhang || 9:20 || 18:20  ||8 ||
+| rowspan="1"|2017/07/04
-*managed to understand the source code
+|Shipan Ren || 9:00 || 21:00 || 12 ||
+* recorded experimental results
+* found that version 1.0 of the code saves training time and has lower complexity, and that the two versions of the code have similar BLEU values
+* found that the BLEU is still good when the model is overfitting
+* reason: the test set and training set are similar in content and style on the small data set
 |-
-|Shiyao Li||  ||  ||  ||
+| rowspan="1"|2017/07/05
+|Shipan Ren || 9:00 || 21:00 || 12 ||
+* run two versions of the code on big data sets (Chinese-English)
+* read NMT papers
 |-
-|Shiyue Zhang || 14:30 || 20:00 || 5.5||
+| rowspan="1"|2017/07/06
-*run discriminative model, find there is one step missing, and ask author for missing code
+|Shipan Ren || 9:00 || 21:00 || 12 ||
-*run generative model successfully, but need to run again on newly formated data
+* an out-of-memory (OOM) error occurred when version 0.1 of the code was trained on the large data set, but version 1.0 worked
+* reason: improper allocation of resources by the tensorflow0.1 version exhausts memory
+* after many retries, version 0.1 worked
 |-
-| rowspan="5"|2016/09/24
+| rowspan="1"|2017/07/07
-|Aodong Li ||  ||   || ||
+|Shipan Ren || 9:00 || 21:00 || 12 ||
+* tested these checkpoints and recorded experimental results
+* the version 1.0 code took 0.06 seconds less per step than the version 0.1 code
 |-
-|Ziwei Bai || 12:30 ||20:30 ||7||
+| rowspan="1"|2017/07/08
+|Shipan Ren || 9:00 || 21:00 || 12 ||
+* downloaded the wmt2014 data set
+* used the English-French data set to run the code and found that the translation is not good
+* reason: no data preprocessing was done
 |-
-|Andy Zhang ||  ||   || ||
+| rowspan="1"|2017/07/10
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* trained translation models using the tf1.0 baseline and the tf0.1 baseline respectively
+* dataset: zh-en small
 |-
-|Shiyao Li||  ||  ||  ||
+| rowspan="1"|2017/07/11
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* tested these checkpoints
+* found the new version takes less time
+* found these two versions have similar complexity and bleu values
+* found that the BLEU is still good when the model is overfitting
+* (reason: the test set and the training set of the small data set are similar in content and style)
 |-
-|Shiyue Zhang ||  || ||||
+| rowspan="1"|2017/07/12
-|}
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* trained translation models using the tf1.0 baseline and the tf0.1 baseline respectively
+* dataset: zh-en big
-===Monthly Summary===
+|-
+| rowspan="1"|2017/07/13
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* an OOM (out-of-memory) error occurred when version 0.1 was trained on the large data set, but version 1.0 worked
+    reason: improper allocation of resources by the tensorflow0.1 framework exhausts memory
+* I tried 4 times (just entering the same command), and version 0.1 worked
-{| class="wikitable"
-!People!! Summary
 |-
-|Yang Feng ||
+| rowspan="1"|2017/07/14
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* tested these checkpoints
+* found the new version takes less time
+* found these two versions have similar complexity and bleu values
 |-
-|Jiyuan Zhang ||
+| rowspan="1"|2017/07/17
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* downloaded the WMT2014 data sets and processed them
 |-
-|Aodong Li ||
+| rowspan="1"|2017/07/18
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* processed data
 |-
-|Ziwei Bai ||
+| rowspan="1"|2017/07/18
+|Jiayu Guo || 8:30|| 22:00 || 14 ||
+* read model code.
 |-
-|Andy Zhang ||
+| rowspan="1"|2017/07/19
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* processed data
 |-
-|Shiyao Li ||
+| rowspan="1"|2017/07/19
+|Jiayu Guo || 9:00|| 22:00 || 13 ||
+* read papers of bleu.
 |-
-|Shiyue Zhang ||
+| rowspan="1"|2017/07/20
-|}
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* processed data
+|-
+| rowspan="1"|2017/07/20
+|Jiayu Guo || 9:00|| 22:00 || 13 ||
+* read papers of attention mechanism.
+|-
+| rowspan="1"|2017/07/21
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* trained translation models using the tf1.0 baseline and the tf0.1 baseline respectively
+* dataset:WMT2014 en-de
+|-
+| rowspan="1"|2017/07/21
+|Jiayu Guo || 10:00|| 23:00 || 13 ||
+* process document
+|-
+| rowspan="1"|2017/07/24
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* tested these checkpoints of en-de dataset
+* found the new version takes less time
+* found these two versions have similar complexity and bleu values
+|-
+| rowspan="1"|2017/07/24
+|Jiayu Guo || 9:00|| 22:00 || 13 ||
+* read model code.
+|-
+| rowspan="1"|2017/07/25
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* trained translation models using the tf1.0 baseline and the tf0.1 baseline respectively
+* dataset:WMT2014 en-fr datasets
+|-
+| rowspan="1"|2017/07/25
+|Jiayu Guo || 9:00|| 23:00 || 14 ||
+* process document
+|-
+| rowspan="1"|2017/07/26
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* read papers about memory-augmented nmt
+|-
+| rowspan="1"|2017/07/26
+|Jiayu Guo || 10:00|| 24:00 || 14 ||
+* process document
+|-
+| rowspan="1"|2017/07/27
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* read papers about memory-augmented nmt
+|-
+| rowspan="1"|2017/07/27
+|Jiayu Guo || 10:00|| 24:00 || 14 ||
+* process document
+|-
+| rowspan="1"|2017/07/28
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* read memory-augmented nmt code
+|-
+| rowspan="1"|2017/07/28
+|Jiayu Guo || 9:00|| 24:00 || 15 ||
+* process document
+|
+|-
+| rowspan="1"|2017/07/31
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* read memory-augmented nmt code
+|-
+| rowspan="1"|2017/07/31
+|Jiayu Guo || 10:00|| 23:00 || 13 ||
+* split the ancient-language text into single words
+|
+|-
+| rowspan="1"|2017/08/1
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* tested these checkpoints of en-fr dataset
+* found the new version takes less time
+* found these two versions have similar complexity and bleu values
+|-
+| rowspan="1"|2017/08/1
+|Jiayu Guo || 10:00|| 23:00 || 13 ||
+* run seq2seq_model
+|
+|-
+| rowspan="1"|2017/08/2
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* looked for the performance(the bleu value) of other models
+* datasets:WMT2014 en-de and en-fr
+|-
+| rowspan="1"|2017/08/2
+|Jiayu Guo || 10:00|| 23:00 || 13 ||
+* process document
+|-
+| rowspan="1"|2017/08/3
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* looked for the performance(the bleu value) of other seq2seq models
+* datasets:WMT2014 en-de and en-fr
+|-
+| rowspan="1"|2017/08/3
+|Jiayu Guo || 10:00|| 23:00 || 13 ||
+* process document
+|-
+| rowspan="1"|2017/08/4
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* learn moses
+|-
+| rowspan="1"|2017/08/4
+|Jiayu Guo || 10:00|| 23:00 || 13 ||
+* search new data(Songshu)
+|-
+| rowspan="1"|2017/08/7
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* installed and built Moses on the server
+|-
+| rowspan="1"|2017/08/7
+|Jiayu Guo || 9:00|| 22:00 || 13 ||
+* process document
+|-
+| rowspan="1"|2017/08/8
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* train statistical machine translation model and test it
+* dataset:zh-en small
+* test if moses can work normally
+|-
+| rowspan="1"|2017/08/8
+|Jiayu Guo || 10:00|| 21:00 || 11 ||
+* read tensorflow
+|-
+| rowspan="1"|2017/08/9
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* coded automation scripts to process data, train the model, and test the model
+* toolkit: Moses
+|-
+| rowspan="1"|2017/08/9
+|Jiayu Guo || 10:00|| 23:00 || 13 ||
+* ran the model on data in which the ancient text was split into single characters.
+|-
+| rowspan="1"|2017/08/10
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* trained statistical machine translation models and tested them
+* dataset:zh-en big,WMT2014 en-de,WMT2014 en-fr
+|-
+| rowspan="1"|2017/08/10
+|Jiayu Guo || 9:00|| 23:00 || 13 ||
+* process data of Songshu
+* read papers of CNN
+|-
+| rowspan="1"|2017/08/11
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* collate experimental results
+* compare our baseline model with Moses
+|-
+| rowspan="1"|2017/08/11
+|Jiayu Guo || 9:00|| 20:00 || 11 ||
+* test results.
+|-
+| rowspan="1"|2017/08/14
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* read paper about THUMT
+|-
+| rowspan="1"|2017/08/14
+|Jiayu Guo || 10:00|| 23:00 || 13 ||
+* learn about Graphic Model of LSTM-Projected BPTT
+* search for data available for translation (Twenty-four-Shi)
+|-
+| rowspan="1"|2017/08/15
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* read THUMT manual and learn how to use it
+|-
+| rowspan="1"|2017/08/15
+|Jiayu Guo || 11:00|| 23:30 || 12 ||
+* ran the model with data including Shiji and Zizhitongjian.
+|-
+| rowspan="1"|2017/08/16
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* train translation models and test them
+* toolkit: THUMT
+* dataset:zh-en small
+* test if THUMT can work normally
+|-
+| rowspan="1"|2017/08/16
+|Jiayu Guo || 10:00|| 23:00 || 10||
+* checkpoint-100000 translation model, BLEU: 11.11
+*source:在秦者名错，与张仪争论,於是惠王使错将伐蜀，遂拔，因而守之。
+*target:在秦国的名叫司马错，曾与张仪发生争论，秦惠王采纳了他的意见，于是司马错率军攻蜀国，攻取后，又让他做了蜀地郡守。
+*trans：当时秦国的人都很欣赏他的建议，与张仪一起商议，所以吴王派使者率军攻打蜀地，一举攻，接着又下令守城 。
+*source:神大用则竭，形大劳则敝，形神离则死 。
+*target:精神过度使用就会衰竭，形体过度劳累就会疲惫，神形分离就会死亡。
+*trans: 精神过度就可衰竭,身体过度劳累就会疲惫，地形也就会死。
+*source:今天子接千岁之统，封泰山，而余不得从行，是命也夫，命也夫！
+*target:现天子继承汉朝千年一统的大业，在泰山举行封禅典礼而我不能随行，这是命啊，是命啊！
+*trans: 现在天子可以继承帝位的成就爵位，爵位至泰山，而我却未能执行先帝的命运。
+*1. Using Zizhitongjian data only (6,000 pairs), we get a BLEU of 6 at most.
+*2. Using Zizhitongjian data only (12,000 pairs), we get a BLEU of 7 at most.
+*3. Using Shiji and Zizhitongjian data (430,000 pairs), we get a BLEU of about 9.
+*4. Using Shiji and Zizhitongjian data (430,000 pairs) with the ancient-language text split character by character, we get a BLEU of 11.11 at most.
+|-
+| rowspan="1"|2017/08/17
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* code automation scripts to process data,train model and test model
+* train translation models and test them
+* toolkit: THUMT
+* dataset:zh-en big
+|-
+| rowspan="1"|2017/08/17
+|Jiayu Guo || 13:00|| 23:00 || 10 ||
+* read source code.
+|-
+| rowspan="1"|2017/08/18
+|Shipan Ren || 9:00 || 20:00 || 11 ||
+* tested translation models using a single reference and multiple references
+* organized all the experimental results (our baseline system, Moses, THUMT)
+|-
+| rowspan="1"|2017/08/18
+|Jiayu Guo || 13:00|| 22:00 || 9 ||
+* read source code.
+|-
+| rowspan="1"|2017/08/21
+|Shipan Ren || 10:00 || 22:00 || 12 ||
+* read the released information of other translation systems
+|-
+| rowspan="1"|2017/08/21
+|Jiayu Guo || 9:30 || 21:30 || 12 ||
+* read the source code and learn tensorflow
+|-
+| rowspan="1"|2017/08/22
+|Shipan Ren || 10:00 || 22:00 || 12 ||
+* cleaned up the code
+|-
+| rowspan="1"|2017/08/22
+|Jiayu Guo || 9:00 || 22:00 || 12 ||
+* read the source code
+|-
+| rowspan="1"|2017/08/23
+|Shipan Ren || 10:00 || 21:00 || 11 ||
+* wrote the documents
+|-
+| rowspan="1"|2017/08/23
+|Jiayu Guo || 9:00 || 22:00 || 11 ||
+* read the source code and learn tensorflow
+|-
+| rowspan="1"|2017/08/24
+|Shipan Ren || 10:00 || 20:00 || 10 ||
+* wrote the documents
+|-
+| rowspan="1"|2017/08/24
+|Jiayu Guo || 9:10 || 22:00 || 10.5 ||
+* read the source code and learn tensorflow
+|-
+| rowspan="1"|2017/08/25
+|Shipan Ren || 10:00 || 20:00 || 10 ||
+* check experimental results
+|-
+| rowspan="1"|2017/08/25
+|Jiayu Guo || 8:50 || 22:00 || 10.5 ||
+* read the source code and learn tensorflow
+|-
+| rowspan="1"|2017/08/28
+|Shipan Ren || 10:00 || 20:00 || 10 ||
+* wrote the paper of ViVi_NMT(version 1.0)
+|-
+| rowspan="1"|2017/08/28
+|Jiayu Guo || 8:10 || 21:00 || 11 ||
+* read the source code and learn tensorflow
+|-
+| rowspan="1"|2017/08/29
+|Shipan Ren || 10:00 || 20:00 || 10 ||
+* wrote the paper of ViVi_NMT(version 1.0)
+|-
+| rowspan="1"|2017/08/29
+|Jiayu Guo || 11:00 || 21:00 || 10 ||
+* read the source code and learn tensorflow
+|-
+| rowspan="1"|2017/08/30
+|Shipan Ren || 10:00 || 20:00 || 10 ||
+* wrote the paper of ViVi_NMT(version 1.0)
+|-
+| rowspan="1"|2017/08/30
+|Jiayu Guo || 11:30 || 21:00 || 9 ||
+* learn VV model
+|-
+| rowspan="1"|2017/08/31
+|Shipan Ren || 10:00 || 20:00 || 10 ||
+* wrote the paper of ViVi_NMT(version 1.0)
+|-
+| rowspan="1"|2017/08/31
+|Jiayu Guo || 10:00 || 20:00 || 10 ||
+* clean up the code
+|-
+}
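The attention combinations recorded in the 2017/05/28 to 2017/06/03 entries (adding, weighted adding, and concatenating attn_1 and attn_2) can be sketched roughly as follows. This is a minimal illustration, not the code used in the experiments: the dot-product scoring, the function names, and the plain-list vectors are all assumptions, and the log does not state which encoder attn_1 and attn_2 each refer to.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(query: Sequence[float],
                      states: Sequence[Sequence[float]]) -> List[float]:
    """Context vector for one encoder.

    Assumption: dot-product scoring between the decoder query and each
    encoder state; the log does not say which scoring function was used.
    """
    scores = [sum(q * h for q, h in zip(query, state)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    return [sum(w * state[i] for w, state in zip(weights, states))
            for i in range(dim)]


def combine_add(attn_1: Sequence[float], attn_2: Sequence[float],
                w_1: float = 1.0, w_2: float = 1.0) -> List[float]:
    """final_attn = w_1 * attn_1 + w_2 * attn_2 (element-wise)."""
    return [w_1 * a + w_2 * b for a, b in zip(attn_1, attn_2)]


def combine_concat(attn_1: Sequence[float],
                   attn_2: Sequence[float]) -> List[float]:
    """final_attn = concat(attn_1, attn_2); doubles the context size."""
    return list(attn_1) + list(attn_2)
```

With these hypothetical helpers, `combine_add(attn_1, attn_2)` corresponds to the plain sum, `combine_add(attn_1, attn_2, 4/3, 2/3)` to the 4/3 and 2/3 weighting, and `combine_concat(attn_1, attn_2)` to the concat variant. Note that the weighted sums keep the context dimension unchanged, while concatenation requires the decoder input size to grow accordingly.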
 ===Time Off Table===
 {| class="wikitable"
-! Date !! Yang Feng !! Jiyuan Zhang !! Aodong Li !! Ziwei Bai !! Andy Zhang !! Shiyao Li !! Shiyue Zhang
+! Date !! Yang Feng !! Jiyuan Zhang
 |-
-|2016/09/08 || 8h || ||  ||  || ||  ||
 |}
 ==Past progress==
+[[nlp-progress 2017/03]]
+[[nlp-progress 2017/02]]
+[[nlp-progress 2017/01]]
+[[nlp-progress 2016/12]]
+[[nlp-progress 2016/11]]
+[[nlp-progress 2016/10]]
+[[nlp-progress 2016/09]]
 [[nlp-progress 2016/08]]