Difference between revisions of "Jt-chinese"
From cslt Wiki
Latest revision as of 06:58, 1 December 2014 (Monday)
data and model
- train
- size: 62M
- 8k sentences from jt (about dianxin, i.e. telecom)
- dev
- 1000 rows taken from the training data
- dict
- chn_150576.txt (15w, i.e. about 150k entries)
- model
| Parameters | hidden | class | direct | bptt | bptt_block | threads | direct-order | rand_seed | nwords | time | iter |
|---|---|---|---|---|---|---|---|---|---|---|---|
| set1 | 320 | 300 | 2000 | 5 | 20 | 1 | 3 | 1 | 10000 | 31h | 8 |
- ppl
- dev:86-66(ppl)
- learning rate
- 0.1-0.00625
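The page gives only the two endpoints of the learning rate, 0.1 and 0.00625. A minimal sketch, assuming the rate is halved at each reduction (the schedule itself is not stated here, and `halving_schedule` is a hypothetical helper), shows that the endpoints are exactly four halvings apart:

```python
def halving_schedule(start, floor):
    """Return the learning rates obtained by halving `start` until `floor` is reached.

    Assumption: the rate is halved at each reduction step. The source page
    only lists the endpoints, so this schedule is illustrative.
    """
    rates = [start]
    # Small tolerance so a value equal to `floor` is still included.
    while rates[-1] / 2 >= floor * (1 - 1e-9):
        rates.append(rates[-1] / 2)
    return rates


rates = halving_schedule(0.1, 0.00625)
print(rates)           # the full sequence of rates
print(len(rates) - 1)  # number of halvings between the two endpoints
```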
sampling data from rnnlm
- different sizes of sampled data
- socer_ngram: 88.147, rnnlm: 76.59, combine(0.5): 75.35
| size | mix0 | mix0.3 | mix0.5 | mix0.7 | time |
|---|---|---|---|---|---|
| 50M | 105.457 | 86.7 | 87.5 | 89.7 | 0.5h |
| 100M | 96.13 | 86.71 | 87.19 | 88.98 | 1.5h |
| 150M | 103.95 | 86.59 | 86.93 | 88.46 | 2h |
| 200M | 92.99 | 86.54 | 86.79 | 88.16 | 2.5h |
| 250M | 92.44 | 86.55 | 86.77 | 88.07 | 3h |
| 300M | 101.898 | 86.50 | 86.66 | 87.85 | 3.5h |
| 350M | 98.8898 | 86.417 | 86.52 | 87.63 | 4h |
| 500M | 98.21 | 86.17 | 86.119 | 86.99 | 6h |
| 1000M | 87.226 | 85.83 | 85.54 | 86.10 | 10h |
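The `combine(0.5)` figure and the `mix` columns look like results of mixing two language models with a fixed weight. As a rough sketch, assuming the mix is a per-word linear interpolation of probabilities and the reported numbers are perplexities (neither is stated on this page), the computation might look as follows. The function names and the toy probabilities are hypothetical and are not taken from the experiments above.

```python
import math


def interpolate(p_a, p_b, lam):
    """Per-word linear interpolation: lam * p_a + (1 - lam) * p_b."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(p_a, p_b)]


def perplexity(probs):
    """Perplexity = exp of the negative mean log-probability per word."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))


# Made-up per-word probabilities from two models on the same dev text.
p_model_a = [0.02, 0.10, 0.005, 0.03]
p_model_b = [0.01, 0.20, 0.010, 0.02]

# Sweep the interpolation weight, as in the mix0 / mix0.3 / mix0.5 / mix0.7 columns.
for lam in (0.0, 0.3, 0.5, 0.7, 1.0):
    ppl = perplexity(interpolate(p_model_a, p_model_b, lam))
    print(f"lambda={lam:.1f}  ppl={ppl:.3f}")
```

Because the logarithm is concave, the interpolated perplexity never exceeds the worse of the two individual perplexities, which is one reason a small sweep over the weight is a cheap check.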