M2asr-2018-04-20-decision
From cslt Wiki
Progress
- Data resource
- Uyghur: 250h seed speech data ready, 10k sentences for morpheme learning ready (XJU)
- Kazak: 300h seed speech data ready, 5k sentences for morpheme learning ready (XJU)
- Kirgiz: no speech data yet; 500k text sentences crawled. (XJU)
- Tibetan: seed speech data from 42 speakers; lexicon with 50k words; 50M text + 40M new blog data collected (NMU)
- Mongolian: lexicon with 30k words; 50M text collected; text sentences for recording the seed speech dataset are under preparation (NMU)
- Technical progress
- Multilingual decoding is done. Performance is good, and better than that of the single-language systems (THU)
- Zero-resource ASR is in progress: structure & knowledge transfer + learning with unlabelled data (THU)
Problems
- Resource collection
- Seed data for Kirgiz and Mongolian should be collected quickly. This should be done before August 1st.
- Body data should be collected as soon as possible. Shiying will release a recording app and a checking platform for the collection. This should be done before Just 1st.
- Resource centralization
- A key problem is that the resources have not been well managed. We should put all light resources (lexicon, transcription, recipe, tools) on GitHub, and keep heavy resources (speech data, text data) on disk but accessible by URL. All the resources should be indexed from the wiki.
- State-of-the-art recipe
- The research has not been put on a unified baseline. We should set up baseline systems for the 5 languages, so that individual research has a good reference.
- We also need to put the multilingual ASR system on GitHub, so that everyone can start their research from the state of the art.
- Tang Zhiyuan will be responsible for the above task, and Shiying will be the main researcher (to be done before June 1st).
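
The resource centralization plan above (light resources on GitHub, heavy resources on disk behind a URL, everything indexed from the wiki) could be tracked with a small manifest. The following is only a minimal sketch of that idea: the `Resource` record, the `storage_tier` and `wiki_index` helpers, the kind names, and the `example.org` URLs are all hypothetical placeholders, not an agreed format or the real locations.

```python
from dataclasses import dataclass

# Kinds taken from the notes: light resources go to GitHub, heavy ones stay on disk.
LIGHT_KINDS = {"lexicon", "transcription", "recipe", "tools"}
HEAVY_KINDS = {"speech", "text"}


@dataclass(frozen=True)
class Resource:
    language: str   # e.g. "Uyghur"
    kind: str       # one of LIGHT_KINDS or HEAVY_KINDS
    location: str   # placeholder URL, not a real address


def storage_tier(kind: str) -> str:
    """Return where a resource of this kind should live under the plan."""
    if kind in LIGHT_KINDS:
        return "github"
    if kind in HEAVY_KINDS:
        return "disk"
    raise ValueError(f"unknown resource kind: {kind}")


def wiki_index(resources):
    """Build one index line per resource, sorted by language and then kind."""
    ordered = sorted(resources, key=lambda r: (r.language, r.kind))
    return [
        f"* {r.language} / {r.kind} ({storage_tier(r.kind)}): {r.location}"
        for r in ordered
    ]


# Hypothetical entries for illustration only.
EXAMPLE = [
    Resource("Uyghur", "speech", "https://example.org/m2asr/uyghur/speech"),
    Resource("Tibetan", "lexicon", "https://example.org/m2asr/tibetan/lexicon"),
]

if __name__ == "__main__":
    print("\n".join(wiki_index(EXAMPLE)))
```

Keeping the tier rule in one function means a new resource kind has to be classified explicitly before it can appear in the index, which helps avoid unmanaged resources.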