Last week:
1. explored the connections between LSTM and GRU (converting an LSTM into a GRU to identify what matters, since GRU performs better than LSTM on WSJ ASR);
2. made several attempts at residual learning on these two gated recurrent networks.
This week:
1.