Difference between revisions of "Xingchao work"
From cslt Wiki
Revision as of 04:56, 3 October 2014 (Friday)
Paper Recommendation
Pre-Trained Multi-View Word Embedding.[1]
Learning Word Representation Considering Proximity and Ambiguity.[2]
Continuous Distributed Representations of Words as Input of LSTM Network Language Model.[3]
WikiRelate! Computing Semantic Relatedness Using Wikipedia.[4]
Japanese-Spanish Thesaurus Construction Using English as a Pivot.[5]
Chaos Work
SSA Model
Build 2-dimension SSA-Model.
Start at : 2014-09-30 <--> End at : 2014-10-02 <--> Result is : 27.83% 46.53% 2 classify
Test 25,50-dimension SSA-Model for transform
Start at : 2014-10-02 <--> End at : 2014-10-03 <--> Result is : 11.96% 27.43% 50 classify
Test All-Belong SSA model for transform
Start at : 2014-10-02
SEMPRE Research
Work Schedule
Download SEMPRE toolkit. Start at : 2014-09-30
Semantic Parsing via Paraphrasing [6]
Knowledge Vector
Pre-process corpus.
Start at : 2014-09-30 <--> End at : 2014-10-03
Use toolkit Wikipedia_Extractor [7].
Result : The original corpus is about 47 GB; after preprocessing it is about 17.8 GB.
Analyze the corpus and train word2vec on Wikipedia.
Start at : 2014-10-03
Moses translation model
Pre-process corpus: remove sentences that contain rarely seen words.
Start at : 2014-09-30 <--> End at : 2014-10-02 <--> Result : The original corpus has 8973724 lines; the clean corpus (after removing sentences that contain words occurring fewer than 10 times) has 6415723 lines.
Train Model.
Start at : 2014-10-02
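The corpus cleaning step above (dropping every sentence that contains a rarely seen word) can be sketched roughly as follows. This is only an illustration, not the script actually used: the function name `filter_rare_sentences`, the whitespace tokenization, and the single counting pass over the original corpus are all assumptions.

```python
from collections import Counter


def filter_rare_sentences(sentences, min_count=10):
    """Keep sentences whose tokens all occur at least min_count times.

    Assumptions (not from the work log): tokens are split on whitespace,
    and frequencies are counted once over the original corpus.
    """
    # Count how often each token appears in the whole corpus.
    counts = Counter(tok for s in sentences for tok in s.split())
    # Drop a sentence as soon as it contains one token below the threshold.
    return [
        s for s in sentences
        if all(counts[tok] >= min_count for tok in s.split())
    ]


if __name__ == "__main__":
    # Toy corpus; a small threshold is used so the effect is visible.
    corpus = ["the cat sat", "the dog sat", "the cat ran", "a zebra sat"]
    clean = filter_rare_sentences(corpus, min_count=2)
    print(len(corpus), "->", len(clean), "sentences")
```

With the counts reported above, the same filter at threshold 10 would keep 6415723 of 8973724 lines, roughly 71.5% of the corpus.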
Non Linear Transform Testing
Work Schedule
Re-train best MSE for test data.
Start at : 2014-10-01 <--> End at : 2014-10-02 <--> Result : Performance is inconsistent with expectations. Best result for Non-Linear is 1e-2.