ASR-events-OC16-details

From cslt Wiki
Revision as of 06:32, 3 June 2016 (Friday) by Cslt


OC16 MixASR-CHEN Challenge

The OC16 MixASR-CHEN challenge is part of the special session "mixlingual speech and language processing" at O-COCOSDA 2016. The challenge is a Chinese-English mixed speech recognition task in which the host language is Chinese and English words are occasionally embedded.

Database

The challenge requires three resources:

OC16-MixCHEN80

OC16-MixCHEN80 is a speech database provided by SpeechOcean for this challenge. Its main features are:

  • XX speakers (XX males, XX females)
  • Microphone channel
  • XXX utterances per speaker on average, amounting to 80 hours of speech in total
  • Transcriptions provided

THCHS30

THCHS30 is a Chinese-only speech database provided by CSLT@Tsinghua University. All THCHS30 resources may be used to improve the system, especially the lexicon and the language model (LM). The data is available at:

http://www.openslr.org/18/

CMU English dictionary

To recognize English words, participants may use the CMU English dictionary, version 0.7b:

http://svn.code.sf.net/p/cmusphinx/code/trunk/cmudict/cmudict-0.7b
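As a rough illustration of how the dictionary might be read into a lexicon, the sketch below parses lines in the cmudict plain-text layout: comment lines start with ";;;", each entry is a word followed by its phones, and alternate pronunciations carry a "(1)", "(2)", ... suffix. The function name parse_cmudict and the sample entries are hypothetical, chosen only for this example; this is not part of any official challenge tooling.

```python
from collections import defaultdict


def parse_cmudict(lines):
    """Map each word to a list of pronunciations (each a list of phones)."""
    lexicon = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and ";;;" comment lines.
        if not line or line.startswith(";;;"):
            continue
        word, *phones = line.split()
        # Alternate pronunciations are written as WORD(1), WORD(2), ...
        if word.endswith(")") and "(" in word:
            word = word[: word.rindex("(")]
        lexicon[word].append(phones)
    return dict(lexicon)


# Illustrative sample lines in the cmudict layout (not read from the real file).
sample = [
    ";;; comment line",
    "HELLO  HH AH0 L OW1",
    "HELLO(1)  HH EH0 L OW1",
    "WORLD  W ER1 L D",
]
lexicon = parse_cmudict(sample)
```

To process the real file, the same function could be given an open file handle; the distributed file may need to be opened with a Latin-1 encoding rather than UTF-8, which is worth checking locally.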


Participation rules

Registration

If you are interested in participating in the challenge, or have any questions, comments, or suggestions about it, please send an email to the organizers:

  • Dr. Dong Wang (wangdong99@mails.tsinghua.edu.cn)
  • Mr. Difei Tang (tangdifei@speechocean.com)
  • Ms. Chen Qing (chenqing@speechocean.com)