2013-04-19
From cslt Wiki
1. Data sharing
(1) AM/lexicon/LM are shared.
(2) LM count files are still being transferred.
2. DNN progress
(1) 400 hour BN model.
(2) Tencent test result: 70h training data (2 days, 15 machines, 10 threads),
88k LM, general test case:
gmmi-bmmi: 38.7%
dnn-1: 28% (11-frame window, phone-based tree)
dnn-2: 34% (9-frame window, state-based tree)
(3) GPU & CPU merge. Investigate the possibility of merging the GPU and
CPU code. Try to find an easier way. (1 week)
(4) L-1 sparse initial training.
3. Kaldi/HTK merge
(1) HTK2Kaldi: the tool shipped with Kaldi does not work.
(2) Kaldi2HTK: implementation done. Testing?
4. Embedded progress
(1) Large performance (speed) degradation on the embedded platform (1/60).
(2) Planning for sparse DNN.
(3) QA LM training still fails. Mengyuan needs to do more work on this.