140915 - Xiaoxi Wang
From cslt Wiki
Last week:
Fixed a minor bug in LM v2.1.4 and v2.1.5.
Extracted corpora from baiduzhidao using cross-entropy selection and trained an LM on them.
Built a small (90k) vocabulary for the LM; its performance will be tested later (some tasks are assigned to Zhang Chengcheng).
A new LM trained on more baiduzhidao data is also ready for testing.
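The report does not say how the cross-entropy extraction was done, so the following is only a minimal sketch of one common approach: score each candidate sentence by its per-token cross entropy under a model trained on in-domain text, and keep the sentences that score below a threshold. The add-one-smoothed unigram model, the function names, the threshold, and the toy data are all assumptions for illustration; the input is assumed to be already word-segmented and separated by spaces.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Count tokens over whitespace-tokenized sentences."""
    counts = Counter(tok for s in sentences for tok in s.split())
    return counts, sum(counts.values())


def cross_entropy(sentence, counts, total, vocab_size):
    """Per-token cross entropy in bits under an add-one-smoothed unigram model."""
    tokens = sentence.split()
    if not tokens:
        return float("inf")  # empty lines are never selected
    log_prob = sum(
        math.log2((counts[t] + 1) / (total + vocab_size)) for t in tokens
    )
    return -log_prob / len(tokens)


def select_sentences(candidates, in_domain, threshold):
    """Keep candidates whose cross entropy is at or below the threshold."""
    counts, total = train_unigram(in_domain)
    vocab_size = len(counts) + 1  # one extra slot for unseen tokens
    return [
        s for s in candidates
        if cross_entropy(s, counts, total, vocab_size) <= threshold
    ]


# Hypothetical toy data, not taken from the actual corpora.
in_domain = ["open the wifi settings", "restart the modem", "the wifi is slow"]
candidates = ["restart the wifi", "zzz qqq xxx"]
selected = select_sentences(candidates, in_domain, threshold=3.5)
```

In practice an n-gram LM would normally replace the unigram model, and the threshold would be tuned on held-out data.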
This week:
Fix the bug that prevents LM v2.1.x from decoding most English words (e.g. wifi, modem) correctly.
Add weibo data to the training corpora and train more LMs.