Text-2014-08-21
From cslt Wiki
GloVe:
1. Word vectors covered in the paper:
   a) Skip-gram
   b) CBOW ==> both are available in word2vec
   c) vLBL
   d) ivLBL ==> both are described in the paper "Learning word embeddings efficiently with noise-contrastive estimation"
   e) HPCA ==> described in the paper "Word Embeddings through Hellinger PCA"
2. The different tasks:
   a) Word analogies
   b) Word similarity ==> evaluation sets: WordSim-353, MC, RG, SCWS, RW
   c) Named entity recognition ==> evaluation sets: CoNLL-2003, ACE Phase 2, ACE-2003
3. Work to be done:
   a) Find different tasks
   b) Compare the performance of the various word vectors
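A minimal sketch of how the comparison in item 3 might be run for two of the tasks listed above (word similarity and word analogies). It assumes each set of word vectors has already been loaded into a plain dict from word to list of floats. The function names and the toy vectors and ratings are hypothetical and exist only for illustration; they are not taken from any of the evaluation sets. Word similarity is scored here as the Spearman rank correlation between cosine similarity and human ratings, and analogies are solved with the common vector-offset method; both are assumptions about the setup, not something recorded in these notes.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def average_ranks(values):
    """1-based ranks of values; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def word_similarity_score(vectors, rated_pairs):
    """Correlate model similarity with human ratings, skipping out-of-vocabulary pairs."""
    model, human = [], []
    for w1, w2, rating in rated_pairs:
        if w1 in vectors and w2 in vectors:
            model.append(cosine(vectors[w1], vectors[w2]))
            human.append(rating)
    return spearman(model, human)

def solve_analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' with the word closest to b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Hypothetical 2-dimensional toy vectors and ratings, for illustration only.
toy_vectors = {
    "man": [0.0, 1.0], "woman": [0.0, -1.0],
    "king": [1.0, 1.0], "queen": [1.0, -1.0],
    "apple": [-1.0, 0.0],
}
toy_ratings = [("king", "queen", 8.0), ("man", "woman", 7.0), ("king", "apple", 1.0)]

if __name__ == "__main__":
    print("similarity score:", word_similarity_score(toy_vectors, toy_ratings))
    print("man : woman :: king :", solve_analogy(toy_vectors, "man", "woman", "king"))
```

To compare several kinds of word vectors, the same two functions could be called once per vector set on the same evaluation files, so that the scores are directly comparable. Pairs with out-of-vocabulary words are skipped here; how such pairs are handled affects the score and should be kept identical across models.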
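Feedback from Jietong (捷通):
1. Using only Lucene for extraction and algorithmic matching, an accuracy above 85% can be reached, provided there are enough templates.
Build the Reading List module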
recorded by Chao Xing