Mengyuan Zhao 2015-10-12
- LSTM training: training on the large dataset is nearly done; some conclusions:
- MPE does not give as significant a performance improvement as it does for DNN.
- MPE training diverges easily.
- Training seems to over-fit with the current network config (2*512): WER on the training set is 2% (abs) lower than on the CV set (see the sketch after this list).
- A 4-layer LSTM is better than a 2-layer one. Still testing different network configs on the 120h dataset.
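
A minimal sketch (not part of the original log) of the over-fitting check implied by the training/CV comparison above: compute the absolute WER gap between the two sets and flag it when it is large. The WER values and the 1.5% threshold are illustrative assumptions, not numbers from the experiments.

```python
def wer_gap(tr_wer: float, cv_wer: float) -> float:
    """Absolute WER gap (cv minus tr), in absolute percentage points."""
    return cv_wer - tr_wer

def looks_overfit(tr_wer: float, cv_wer: float, threshold: float = 1.5) -> bool:
    """Heuristic: flag over-fitting when the cv/tr WER gap exceeds `threshold` (abs %)."""
    return wer_gap(tr_wer, cv_wer) >= threshold

if __name__ == "__main__":
    # Hypothetical numbers mirroring the reported ~2% (abs) tr/cv gap for the 2*512 LSTM.
    tr_wer, cv_wer = 8.0, 10.0
    print(f"gap = {wer_gap(tr_wer, cv_wer):.1f}% abs, over-fit: {looks_overfit(tr_wer, cv_wer)}")
```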
- Reproducing the "self-informed nnet structure" from the paper cvss-464.
- Looking into the Kaldi code in detail and modifying it.