Difference between revisions of "Hulan-2014-11-06"

From cslt Wiki


Revision as of 08:40, 6 November 2014 (Thursday)


Dialog system

Algorithm

Spelling mistakes

retrain the n-gram model

improve Lucene search

our VSM method

results of Lucene and the VSM variants
method	Lucene	vsm_idf(haiguan)	vsm_idf(baidu)	vsm_idf(train)	vsm_idf(calculate)
Accuracy	0.6628	0.6228	0.6197	0.5827	0.5426
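
A minimal sketch of an IDF-weighted VSM ranker of the kind compared above. The function names, the smoothed IDF variant, and the toy data are assumptions made for illustration, not the implementation used in this report.

```python
import math
from collections import Counter


def build_idf(docs):
    """Smoothed IDF per term: log((N + 1) / (df + 1)) + 1 (an assumed variant)."""
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    return {t: math.log((n + 1) / (c + 1)) + 1.0 for t, c in df.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector; terms missing from the IDF table are dropped."""
    tf = Counter(tokens)
    return {t: f * idf[t] for t, f in tf.items() if t in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def rank(query, docs, idf):
    """Return document indices ordered by descending cosine similarity."""
    q = tfidf_vector(query, idf)
    scores = [(cosine(q, tfidf_vector(d, idf)), i) for i, d in enumerate(docs)]
    return [i for _, i in sorted(scores, key=lambda s: (-s[0], s[1]))]
```

Passing a different `idf` dictionary to `rank` is one way to model the change of IDF source between the table columns; how each of those IDF tables was actually built is not stated here.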

Lucene top-N

top10 (82.95%), top20 (86.34%), top50 (90.23%), top100 (94.11%), top200 (96.18%), top1000 (97.31%), top2000 (97.87%), top5000 (98.75%), top10000 (99.06%)
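
A sketch of how top-N figures like these could be computed, assuming each percentage is the share of test queries whose correct standard question appears among the first N Lucene results. That reading, the function name, and the input layout are assumptions.

```python
def top_n_hit_rate(ranked_lists, gold, cutoffs):
    """Fraction of queries whose gold id appears in the first n ranked ids.

    ranked_lists: one ranked list of candidate ids per query (hypothetical layout)
    gold:         the correct id for each query, in the same order
    cutoffs:      the N values to evaluate, e.g. [10, 20, 50]
    """
    total = len(gold)
    rates = {}
    for n in cutoffs:
        hits = sum(1 for ranked, g in zip(ranked_lists, gold) if g in ranked[:n])
        rates[n] = hits / total if total else 0.0
    return rates
```

Under this reading, the gap between top-1 accuracy and the top-50 figure bounds how much a reranker over the first 50 candidates could gain.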

Lucene optimization (liurong)

rewrite the method that selects the 50 standard questions so that they do not come from the same template.
check the word segmentation for templates.
boost the query keywords using IDF.

boosting keywords in Lucene
method	Default	idf_train	idf_train_norm	idf_baidu	idf_baidu_norm
Accuracy	0.66228	0.651629	0.57644	0.647869	0.65288
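
One way to boost query keywords by IDF is to emit the `term^boost` syntax of Lucene's classic query parser. The sketch below builds such a query string; the helper name, the default weight for unknown terms, and the reading of the "_norm" variants as division by the largest weight are assumptions, and it does not escape query-parser special characters.

```python
def boosted_query(tokens, idf, normalize=False, default_idf=1.0):
    """Build a query string of the form 'term^boost term^boost ...'.

    tokens:    segmented query keywords
    idf:       dict mapping term -> IDF weight (from any source)
    normalize: if True, divide every weight by the largest one (assumed
               meaning of the "_norm" columns)
    """
    weights = [idf.get(t, default_idf) for t in tokens]
    if normalize and weights:
        top = max(weights)
        if top > 0:
            weights = [w / top for w in weights]
    return " ".join(f"{t}^{w:.2f}" for t, w in zip(tokens, weights))
```

For example, `boosted_query(["reset", "password"], {"reset": 2.0, "password": 4.0})` yields `reset^2.00 password^4.00`.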

TF-IDF formula

score(q,d) = coord(q,d) * query_boost * query_norm * sum over t in q of (idf(t)^2 * tf(t,d) * term_boost(t) * norm(t,d)) [1]
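
The formula can be sketched in plain Python as follows. The component definitions (tf as the square root of the term frequency, idf as 1 + ln(numDocs / (docFreq + 1)), norm as 1 / sqrt(document length), coord as the fraction of query terms matched) follow Lucene's classic similarity as I understand it, and query_norm is simplified; treat them as assumptions rather than an exact reproduction of Lucene's scoring.

```python
import math


def classic_score(query_terms, doc_tokens, doc_freq, num_docs,
                  query_boost=1.0, term_boost=None):
    """Score one document against a query with the TF-IDF formula above."""
    term_boost = term_boost or {}
    terms = list(dict.fromkeys(query_terms))  # unique terms, order kept
    if not terms or not doc_tokens:
        return 0.0

    # idf(t) = 1 + ln(numDocs / (docFreq + 1))  (assumed classic definition)
    idf = {t: 1.0 + math.log(num_docs / (doc_freq.get(t, 0) + 1)) for t in terms}

    # query_norm = 1 / sqrt(sum of squared term weights)  (simplified)
    sum_sq = sum((idf[t] * term_boost.get(t, 1.0)) ** 2 for t in terms)
    query_norm = 1.0 / math.sqrt(sum_sq) if sum_sq > 0 else 0.0

    # norm(t,d): length normalisation only, no index-time boosts
    length_norm = 1.0 / math.sqrt(len(doc_tokens))

    matched = [t for t in terms if t in doc_tokens]
    coord = len(matched) / len(terms)

    total = sum(
        idf[t] ** 2
        * math.sqrt(doc_tokens.count(t))   # tf(t,d)
        * term_boost.get(t, 1.0)
        * length_norm
        for t in matched
    )
    return coord * query_boost * query_norm * total
```

Because idf enters squared, raising term_boost for a keyword that already has a high IDF amplifies it further, which may be relevant when reading the boosting results.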

add the new keyword values from the proMe method

Multi-Scene Recognition

knowledge structure

structure the default answer using attributes of the entity.

Knowledge Management and labeling system

prepare the interface and function.

plan to do

plan to discuss

add triple search to the QA engine

Retrieved from "http://cslt.org/mediawiki/index.php?title=Hulan-2014-11-06&oldid=12325"