Gigabye LM
== 1. Very initial: character-based, without any pruning. Size and perplexity ==

Training uses the Gigabyte data except the cna part; perplexity testing is on a subset of the cna data (big52gb applied).
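The perplexity lines below are SRILM <code>ngram -ppl</code> output. A minimal sketch of how such a character LM could be trained and scored; the file names, 4k-character vocabulary file, and Kneser-Ney discounting are assumptions, since the exact options are not recorded here:

<pre>
import subprocess

# Train a character 3-gram on the (assumed) character-segmented training text,
# restricted to an (assumed) 4k-character vocabulary.
subprocess.run(["ngram-count", "-order", "3",
                "-text", "giga.char.txt",        # hypothetical training text
                "-vocab", "vocab.4000",          # hypothetical 4k vocabulary
                "-kndiscount", "-interpolate",   # assumed discounting
                "-lm", "3gram.4000.gz"], check=True)

# Score the held-out cna subset; this prints the
# "0 zeroprobs, logprob= ... ppl= ... ppl1= ..." lines quoted below.
subprocess.run(["ngram", "-order", "3", "-lm", "3gram.4000.gz",
                "-ppl", "cna.subset.txt"], check=True)
</pre>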
+ | |||
+ | '''2gram:''' | ||
+ | |||
+ | 25M 2gram.4000.gz: 0 zeroprobs, logprob= -9.39983e+06 ppl= 161.965 ppl1= 177.141 | ||
+ | |||
+ | |||
+ | '''3gram:''' | ||
− | |||
47M 3gram.500.gz:0 zeroprobs, logprob= -6.34868e+06 ppl= 85.1361 ppl1= 94.2525 | 47M 3gram.500.gz:0 zeroprobs, logprob= -6.34868e+06 ppl= 85.1361 ppl1= 94.2525 | ||
+ | |||
117M 3gram.1000.gz :0 zeroprobs, logprob= -7.43809e+06 ppl= 80.6408 ppl1= 87.7439 | 117M 3gram.1000.gz :0 zeroprobs, logprob= -7.43809e+06 ppl= 80.6408 ppl1= 87.7439 | ||
+ | |||
195M 3gram.2000.gz:0 zeroprobs, logprob= -7.95872e+06 ppl= 79.9875 ppl1= 86.5196 | 195M 3gram.2000.gz:0 zeroprobs, logprob= -7.95872e+06 ppl= 79.9875 ppl1= 86.5196 | ||
+ | |||
221M 3gram.3000.gz:0 zeroprobs, logprob= -8.04799e+06 ppl= 80.2418 ppl1= 86.7277 | 221M 3gram.3000.gz:0 zeroprobs, logprob= -8.04799e+06 ppl= 80.2418 ppl1= 86.7277 | ||
+ | |||
229M 3gram.4000.gz:0 zeroprobs, logprob= -8.15697e+06 ppl= 82.6585 ppl1= 89.3392 | 229M 3gram.4000.gz:0 zeroprobs, logprob= -8.15697e+06 ppl= 82.6585 ppl1= 89.3392 | ||
− | 4gram: | + | |
+ | '''4gram:''' | ||
+ | |||
205M 4gram.500.gz:0 zeroprobs, logprob= -6.25395e+06 ppl= 79.6739 ppl1= 88.0716 | 205M 4gram.500.gz:0 zeroprobs, logprob= -6.25395e+06 ppl= 79.6739 ppl1= 88.0716 | ||
+ | |||
472M 4gram.1000.gz:0 zeroprobs, logprob= -7.21607e+06 ppl= 70.737 ppl1= 76.774 | 472M 4gram.1000.gz:0 zeroprobs, logprob= -7.21607e+06 ppl= 70.737 ppl1= 76.774 | ||
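For reference, SRILM's two perplexities relate to logprob (log base 10) as ppl = 10^(-logprob/(words+sentences)) and ppl1 = 10^(-logprob/words), ignoring OOVs. The token counts in this sketch are back-solved from the 2-gram line above, not taken from the test set itself:

<pre>
logprob = -9.39983e6   # from the 2gram.4000.gz line above
words   = 4.18e6       # approximate, back-solved: -logprob / log10(ppl1)
sents   = 7.4e4        # approximate, back-solved from the ppl/ppl1 gap

ppl  = 10 ** (-logprob / (words + sents))   # ~162, cf. 161.965 reported
ppl1 = 10 ** (-logprob / words)             # ~177, cf. 177.141 reported
print(ppl, ppl1)
</pre>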
== 2. Pruning the 4k 3-gram LM ==

{| class="wikitable"
! Model !! 2gram threshold !! 3gram threshold !! size !! ppl !! fst size
|-
| 1 || 1e-7 || 1e-7 || 30M || 102.796 || 860M
|-
| 2 || 1e-6 || 1e-6 || 5M || 150.96 || 152M
|-
| 3 || 1e-7 || 1e-6 || 11M || 137.467 || 224M
|}
+ | |||
+ | == 3. word-based 3-gram == | ||
+ | |||
+ | tri-gram size: | ||
+ | |||
+ | {| class="wikitable" | ||
+ | |- | ||
+ | | org || th-7 || th-7/6 || th-6 | ||
+ | |- | ||
+ | |10k: 52M || 23M || 8M || 4M | ||
+ | |- | ||
+ | |20k: 57M || 24M || 9M || 4M | ||
+ | |- | ||
+ | |} | ||
+ | |||
+ | final fst size: | ||
+ | |||
+ | {| class="wikitable" | ||
+ | |- | ||
+ | | org || th-7 || th-7/6 || th-6 | ||
+ | |- | ||
+ | |10k: - || 770M || 193M || 135M | ||
+ | |- | ||
+ | |20k: - || - || 217M || 142M | ||
+ | |- | ||
+ | |} | ||
+ | |||
+ | Test is performed on 863 M49, LDA+LLT (tri2b), in terms of character error rate (CER). The NUM part is deleted from the decoding result. The pair after CER represents (1/acweight, t/utt). | ||
+ | |||
+ | {| class="wikitable" | ||
+ | !- !! th-6 !!th-7/6 !!th-7 | ||
+ | |- | ||
+ | |10k ||23.77(13,0.92)|| 22.41(11,0.93)|| 21.96(11,0.93) | ||
+ | |- | ||
+ | |20k ||21.92(13,0.99)|| 20.33(12,0.97)|| 19.38(12,0.96) | ||
+ | |- | ||
+ | |} | ||
+ | |||
+ | Results with LDA+MLLT+MMI | ||
+ | |||
+ | {| class="wikitable" | ||
+ | !- !! th-6 !!th-7/6 !!th-7 | ||
+ | |- | ||
+ | |10k || 22.95(13, 1.0)|| 21.83(13,1.0)|| 21.41(10, 0.98) | ||
+ | |- | ||
+ | |20k || 20.71(11, 1.1)|| 19.26(11, 1.1)|| 18.44(10, 1.1) | ||
+ | |- | ||
+ | |} | ||
+ | |||
+ | |||
+ | Results with LDA+MLLT+bMMI | ||
+ | |||
+ | {| class="wikitable" | ||
+ | !- !! th-6 !!th-7/6 !!th-7 | ||
+ | |- | ||
+ | |10k || 22.68(10,1.0) || 21.46(10,1.0) || 20.96(10,1.0) | ||
+ | |- | ||
+ | |20k || 20.39(12, 1.1) || 18.97(11,1.1)|| 18.23(10,1.1) | ||
+ | |- | ||
+ | |} |