Difference between revisions of "Jt-chinese"
From cslt Wiki
Latest revision as of 06:58, 1 December 2014 (Mon)
data and model
- train
- size: 62M
- 8k sentences from jt (about dianxin)
- dev
- 1000 rows from the training data
- dict
- chn_150576.txt (15w, i.e. about 150k words)
- model
| Parameters | hidden | class | direct | bptt | bptt_block | threads | direct-order | rand_seed | nwords | time | iter |
|---|---|---|---|---|---|---|---|---|---|---|---|
| set1 | 320 | 300 | 2000 | 5 | 20 | 1 | 3 | 1 | 10000 | 31h | 8 |
- ppl
- dev:86-66(ppl)
- learning rate
- 0.1-0.00625
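The learning-rate range above is consistent with a rate that starts at 0.1 and is halved four times (0.1 / 2^4 = 0.00625). The page does not say how the rate was decayed, so the halving rule below is an assumption, and `halving_schedule` is a hypothetical helper used only to show the arithmetic.

```python
def halving_schedule(start, stop):
    """Return the learning rates from start down to stop, halving at each step.

    Assumption: the rate is halved each time it is reduced; the source only
    gives the end points 0.1 and 0.00625.
    """
    rates = [start]
    # Small relative tolerance so the final value is kept despite float rounding.
    while rates[-1] / 2 >= stop * (1 - 1e-9):
        rates.append(rates[-1] / 2)
    return rates


rates = halving_schedule(0.1, 0.00625)
for step, rate in enumerate(rates):
    print(f"step {step}: learning rate {rate:g}")
```

Under this assumption the schedule has five values, so the rate would have been reduced four times over the course of training.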
sampling data from rnnlm
- different sizes of sampled data
- socer_ngram: 88.147, rnnlm: 76.59, combine(0.5): 75.35
| size | mix0 | mix0.3 | mix0.5 | mix0.7 | time |
|---|---|---|---|---|---|
| 50M | 105.457 | 86.7 | 87.5 | 89.7 | 0.5h |
| 100M | 96.13 | 86.71 | 87.19 | 88.98 | 1.5h |
| 150M | 103.95 | 86.59 | 86.93 | 88.46 | 2h |
| 200M | 92.99 | 86.54 | 86.79 | 88.16 | 2.5h |
| 250M | 92.44 | 86.55 | 86.77 | 88.07 | 3h |
| 300M | 101.898 | 86.50 | 86.66 | 87.85 | 3.5h |
| 350M | 98.8898 | 86.417 | 86.52 | 87.63 | 4h |
| 500M | 98.21 | 86.17 | 86.119 | 86.99 | 6h |
| 1000M | 87.226 | 85.83 | 85.54 | 86.10 | 10h |
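The `combine(0.5)` figure and the `mix0.3` / `mix0.5` / `mix0.7` columns look like perplexities of two models combined with a mixing weight. A minimal sketch of that computation follows, assuming plain per-word linear interpolation. The page does not state the interpolation method or which model each weight applies to, and the probabilities below are made-up illustrative values, not data from these experiments.

```python
import math


def perplexity(probs):
    """Perplexity of a word sequence given each word's model probability."""
    log_sum = sum(math.log(p) for p in probs)
    return math.exp(-log_sum / len(probs))


def interpolate(p_a, p_b, weight):
    """Per-word linear interpolation: weight * p_a + (1 - weight) * p_b."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(p_a, p_b)]


# Hypothetical per-word probabilities from two models on the same dev text.
p_model_a = [0.02, 0.10, 0.005, 0.03]
p_model_b = [0.01, 0.20, 0.010, 0.02]

for weight in (0.0, 0.3, 0.5, 0.7, 1.0):
    mixed = interpolate(p_model_a, p_model_b, weight)
    print(f"weight={weight:.1f} ppl={perplexity(mixed):.3f}")
```

Because the logarithm is concave, the interpolated perplexity can never exceed the worse of the two individual perplexities. That matches the table, where every mixed column sits below the `mix0` column.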