Jt-chinese

From cslt Wiki

data and model

  • train
      • size: 62M
      • 8k sentences from jt (about dianxin)
  • dev
      • 1000 rows from the train data
  • dict
      • chn_150576.txt (about 150k entries)
  • model
Train Set Environment
Parameters  hidden  class  direct  bptt  bptt_block  threads  direct-order  rand_seed  nwords  time  iter
set1        320     300    2000    5     20          1        3             1          10000   31h   8
  • ppl
      • dev: 86-66 (ppl)
  • learning rate
      • 0.1 to 0.00625
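
The learning-rate range above is consistent with repeated halving (0.1 divided by 2 four times gives 0.00625). The sketch below only illustrates that arithmetic; the halving rule itself and the helper name `halving_schedule` are assumptions, not something recorded on this page.

```python
def halving_schedule(start, floor):
    """Return learning rates from `start` down to `floor`, halving each step.

    Assumption: the rate is simply halved at each step; the condition that
    triggers a halving during real training is not modelled here.
    """
    rates = [start]
    # Small tolerance so that a value equal to `floor` is still included.
    while rates[-1] / 2 >= floor * (1 - 1e-9):
        rates.append(rates[-1] / 2)
    return rates


rates = halving_schedule(0.1, 0.00625)
for step, rate in enumerate(rates):
    print(step, rate)
```

Under this assumption the end points 0.1 and 0.00625 are four halvings apart.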

sampling data from rnnlm

  • different sizes of sampled data
  • socer_ngram: 88.147, rnnlm: 76.59, combine(0.5): 75.35
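
As a rough illustration of what sampling text from a language model involves, the sketch below draws sentences word by word from next-word distributions. `TOY_MODEL` is a hypothetical hand-written bigram table standing in for the trained rnnlm, and `sample_sentence` is an illustrative helper; neither comes from the experiments on this page.

```python
import random

# Hypothetical toy next-word distributions standing in for the RNNLM.
# Keys are the previous word, values map each candidate next word to its probability.
TOY_MODEL = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"b": 0.5, "</s>": 0.5},
    "b": {"a": 0.3, "</s>": 0.7},
}


def sample_sentence(model, rng, max_len=20):
    """Sample one sentence, stopping at </s> or after max_len words."""
    words, prev = [], "<s>"
    for _ in range(max_len):
        dist = model[prev]
        nxt = rng.choices(list(dist), weights=list(dist.values()))[0]
        if nxt == "</s>":
            break
        words.append(nxt)
        prev = nxt
    return words


rng = random.Random(0)  # fixed seed so the run is repeatable
corpus = [sample_sentence(TOY_MODEL, rng) for _ in range(5)]
for sentence in corpus:
    print(" ".join(sentence))
```

A sampled corpus of this kind, written one sentence per line, can then be used as training text for an n-gram model.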
Different sizes of sampled data
size   mix0     mix0.3  mix0.5  mix0.7  time
50M    105.457  86.7    87.5    89.7    0.5h
100M   96.13    86.71   87.19   88.98   1.5h
150M   103.95   86.59   86.93   88.46   2h
200M   92.99    86.54   86.79   88.16   2.5h
250M   92.44    86.55   86.77   88.07   3h
300M   101.898  86.50   86.66   87.85   3.5h
350M   98.8898  86.417  86.52   87.63   4h
500M   98.21    86.17   86.119  86.99   6h
1000M  87.226   85.83   85.54   86.10   10h
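
The mix columns look like perplexities of two models combined with different weights. A minimal sketch of that kind of computation is given here, assuming plain linear interpolation of per-word probabilities; which model receives the mix weight is not stated on this page, and the probability lists are made-up toy values, not data from these experiments.

```python
import math


def perplexity(probs):
    """Perplexity of a word sequence given the probability assigned to each word."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))


def interpolate(p_a, p_b, weight_a):
    """Linear interpolation: weight_a * p_a + (1 - weight_a) * p_b, word by word."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(p_a, p_b)]


# Hypothetical per-word probabilities from two models on the same four-word text.
p_ngram = [0.1, 0.02, 0.3, 0.05]
p_rnnlm = [0.05, 0.08, 0.2, 0.1]

for weight in (0.0, 0.3, 0.5, 0.7, 1.0):
    mixed = interpolate(p_rnnlm, p_ngram, weight)
    print(f"weight={weight:.1f}  ppl={perplexity(mixed):.3f}")
```

With a weight of 0 or 1 the result reduces to the perplexity of a single model, which makes a convenient sanity check before comparing intermediate weights.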