(57 intermediate revisions by the same user not shown)
Line 1:
− ==Paper Recommendation==
+ =Chaos Work=
− Pre-Trained Multi-View Word Embedding. [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/3/3c/Pre-Trained_Multi-View_Word_Embedding.pdf]
+ [[SLT]]
− Learning Word Representation Considering Proximity and Ambiguity. [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/b/b0/Learning_Word_Representation_Considering_Proximity_and_Ambiguity.pdf]
− Continuous Distributed Representations of Words as Input of LSTM Network Language Model. [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/5/5a/Continuous_Distributed_Representations_of_Words.pdf]
− WikiRelate! Computing Semantic Relatedness Using Wikipedia. [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/c/cb/WikiRelate%21_Computing_Semantic_Relatedness_Using_Wikipedia.pdf]
− Japanese-Spanish Thesaurus Construction Using English as a Pivot. [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/e/e8/Japanese-Spanish_Thesaurus_Construction.pdf]
− ==Chaos Work==
− ===SSA Model===
− Build a 2-dimensional SSA model.
− Start at: 2014-09-30 <--> End at: 2014-10-02 <--> Result:
− 27.83% 46.53% (2-class classification)
− Test 25- and 50-dimension SSA models for the transform.
− Start at: 2014-10-02 <--> End at: 2014-10-03 <--> Result:
− 11.96% 27.43% (50-class classification)
− Test the All-Belong SSA model for the transform.
− Start at: 2014-10-02
− ===SEMPRE Research===
− ====Work Schedule====
− Download the SEMPRE toolkit.
− Start at: 2014-09-30
− ====Related Papers====
− Semantic Parsing via Paraphrasing. [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/8/85/Semantic_Parsing_via_Paraphrasing.pdf]
− ===Knowledge Vector===
− Pre-process the corpus.
− Start at: 2014-09-30
− Use the Wikipedia_Extractor toolkit [http://medialab.di.unipi.it/wiki/Wikipedia_Extractor]; waiting.
− End at: 2014-10-03 <--> Result:
− The original corpus is about 47 GB; after preprocessing it is about 17.8 GB.
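For reference only (not part of the original log): an extraction run of this kind could be driven from Python roughly as below. The dump file and output directory are hypothetical placeholders, and the WikiExtractor.py flags follow that toolkit's documented usage as best recalled, so treat this as a sketch rather than the procedure actually used.
<pre>
# Sketch: drive Wikipedia_Extractor from Python to strip wiki markup
# from an XML dump. Paths are hypothetical placeholders.
import subprocess

DUMP = "enwiki-latest-pages-articles.xml.bz2"  # hypothetical dump file
OUT = "extracted"                              # hypothetical output dir

# WikiExtractor.py reads the compressed dump and writes plain-text
# articles into OUT, split into files of at most 100 MB (-b 100M).
subprocess.check_call(
    ["python", "WikiExtractor.py", "-o", OUT, "-b", "100M", DUMP])
</pre>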
− Analyze the corpus and train word2vec on Wikipedia.
− Start at: 2014-10-03
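A minimal word2vec training run over the extracted text might look like the following sketch. It assumes the gensim 4.x API rather than whatever tool the log's author actually used, and the file name and hyperparameters are illustrative assumptions, not values taken from the log.
<pre>
# Sketch: train word2vec on the preprocessed Wikipedia text with gensim.
# Corpus path and hyperparameters are illustrative assumptions.
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

# LineSentence streams one whitespace-tokenized sentence per line.
sentences = LineSentence("wiki_preprocessed.txt")  # hypothetical path

model = Word2Vec(
    sentences,
    vector_size=100,  # embedding dimensionality
    window=5,         # context window
    min_count=10,     # drop rare words
    sg=1,             # skip-gram
    workers=4)
model.save("wiki_word2vec.model")
</pre>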
− ===Moses translation model===
− Pre-process the corpus: remove sentences that contain rarely seen words.
− Start at: 2014-09-30 <--> End at: 2014-10-02 <--> Result:
− Original corpus: 8,973,724 lines; after cleaning (sentences containing words that occur fewer than 10 times removed): 6,033,397 lines.
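A cleaning pass of this kind is easy to sketch. The two-pass filter below is a guess at the procedure (count global word frequencies, then drop any sentence containing a word seen fewer than 10 times); the file names are hypothetical.
<pre>
# Sketch: drop sentences that contain any word occurring fewer than
# MIN_COUNT times in the whole corpus. File names are hypothetical.
from collections import Counter

MIN_COUNT = 10
SRC = "corpus.raw.txt"
DST = "corpus.clean.txt"

# Pass 1: global word frequencies.
freq = Counter()
with open(SRC, encoding="utf-8") as f:
    for line in f:
        freq.update(line.split())

# Pass 2: keep only sentences whose every word is frequent enough.
kept = 0
with open(SRC, encoding="utf-8") as fin, \
     open(DST, "w", encoding="utf-8") as fout:
    for line in fin:
        if all(freq[w] >= MIN_COUNT for w in line.split()):
            fout.write(line)
            kept += 1
print("kept", kept, "sentences")
</pre>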
− Train the model.
− Start at: 2014-10-02 <--> End at: 2014-10-05
− Tune the model.
− Start at: 2014-10-05
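The training and tuning steps are only named above, not shown. As a non-authoritative sketch, the commands below follow the public Moses baseline tutorial, driven from Python; the mosesdecoder checkout, corpus stem, language pair, language-model file, and dev set are all assumptions.
<pre>
# Sketch: run Moses phrase-table training and MERT tuning via its
# standard scripts. Paths and language pair are hypothetical; the
# flags mirror the Moses baseline tutorial.
import subprocess

MOSES = "/home/user/mosesdecoder"   # hypothetical checkout
CORPUS = "corpus/clean"             # stem of the cleaned bitext
LM = "lm/corpus.blm.en"             # pre-built binary language model

# Training: word alignment, phrase extraction, reordering model.
subprocess.check_call([
    MOSES + "/scripts/training/train-model.perl",
    "-root-dir", "train",
    "-corpus", CORPUS, "-f", "zh", "-e", "en",
    "-alignment", "grow-diag-final-and",
    "-reordering", "msd-bidirectional-fe",
    "-lm", "0:3:" + LM + ":8",
    "-external-bin-dir", MOSES + "/tools"])

# Tuning: MERT over a held-out dev set.
subprocess.check_call([
    MOSES + "/scripts/training/mert-moses.pl",
    "dev.zh", "dev.en",
    MOSES + "/bin/moses", "train/model/moses.ini",
    "--mertdir", MOSES + "/bin/"])
</pre>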
− ===Non-Linear Transform Testing===
− ====Work Schedule====
− Re-train for the best MSE on the test data.
− Start at: 2014-10-01 <--> End at: 2014-10-02 <--> Result:
− Performance is inconsistent with expectations. The best result for the non-linear transform is 1e-2.
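The log does not say what the non-linear transform was, so the following is only a generic sketch in the spirit of this experiment: a one-hidden-layer network trained by gradient descent to map one embedding space to another under an MSE objective. All shapes and data are assumptions, and the log's "1e-2" (which may be a learning rate or an error value) is used here as a learning rate purely for illustration.
<pre>
# Sketch: a one-hidden-layer non-linear map between two embedding
# spaces, trained by minimizing MSE. Dimensions, random stand-in
# data, and the learning rate are all assumptions.
import numpy as np

rng = np.random.RandomState(0)
n, d_in, d_h, d_out = 1000, 50, 100, 50
X = rng.randn(n, d_in)             # stand-in source embeddings
Y = rng.randn(n, d_out)            # stand-in target embeddings

W1 = rng.randn(d_in, d_h) * 0.01
b1 = np.zeros(d_h)
W2 = rng.randn(d_h, d_out) * 0.01
b2 = np.zeros(d_out)
lr = 1e-2                          # cf. the "1e-2" noted in the log

for step in range(500):
    H = np.tanh(X @ W1 + b1)       # hidden layer
    P = H @ W2 + b2                # predicted target embeddings
    G = 2.0 * (P - Y) / n          # d(MSE)/dP
    # Backpropagate through the two layers.
    gW2, gb2 = H.T @ G, G.sum(0)
    GH = (G @ W2.T) * (1 - H**2)   # through the tanh
    gW1, gb1 = X.T @ GH, GH.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

print("final MSE:", np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - Y) ** 2))
</pre>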