Revision as of 06:47, 3 June 2016 (Friday)
OC16 MixASR-CHEN Challenge
The OC16 MixASR-CHEN challenge is part of the special session "mixlingual speech and language processing" at O-COCOSDA 2016. The challenge is a Chinese-English mixed speech recognition task, where the host and embedded languages are Chinese and English, respectively.
Database
The challenge requires three resources:
OC16-MixCHEN80
OC16-MixCHEN80 is a speech database provided by SpeechOcean for this challenge. Its main features are:
- XX speakers (XX males, XX females)
- Microphone channel
- XXX utterances per speaker on average, amounting to 80 hours of speech in total
- Transcriptions are provided
THCHS30
THCHS30 is a Chinese-only speech database provided by CSLT@Tsinghua University. All THCHS30 resources may be used to improve the system, especially the lexicon and the language model (LM). The data is available at:
CMU English dictionary
To recognize English words, participants may use the CMU English dictionary, version 0.7b:
http://svn.code.sf.net/p/cmusphinx/code/trunk/cmudict/cmudict-0.7b
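For participants who want to merge the English entries into their own lexicon, the following is a minimal sketch of reading the dictionary format. It assumes the usual cmudict-0.7b layout: comment lines beginning with `;;;`, one entry per line as a word followed by its phones, and alternative pronunciations marked as `WORD(1)`, `WORD(2)`, and so on. The function name `parse_cmudict` is hypothetical and is not part of any challenge tooling.

```python
import re

def parse_cmudict(lines):
    """Parse cmudict-style lines into {word: [list of phones, ...]}."""
    lexicon = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and ';;;' comment lines.
        if not line or line.startswith(";;;"):
            continue
        head, *phones = line.split()
        # Strip the variant marker, e.g. HELLO(1) -> HELLO.
        word = re.sub(r"\(\d+\)$", "", head)
        lexicon.setdefault(word, []).append(phones)
    return lexicon

# Small in-memory sample in the assumed format.
sample = [
    ";;; comment line",
    "HELLO  HH AH0 L OW1",
    "HELLO(1)  HH EH0 L OW1",
    "WORLD  W ER1 L D",
]
lexicon = parse_cmudict(sample)
```

When reading the real file, passing an open file handle as `lines` should work; the file is commonly read with a Latin-1 encoding, though that is worth checking against the copy you download.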
Participation rule
- Participants in the special session OR the challenge can apply for OC16-MixCHEN80 by emailing the organizers (see below).
- A usage agreement for OC16-MixCHEN80 must be signed and returned to the organizers before the data can be downloaded.
- Publications based on OC16-MixCHEN80 should cite the following paper: "Dong Wang, Xuewei Zhang, Qing Cheng, OC16-MixChen80: a Chinese-English Mixlingual database and a DNN baseline"
Challenge procedure
- June 4: data ready for release; registration requests accepted.
- July 15-17: test data released. Participants can respond with their decoding results within 24 hours.
- July 20: participants can obtain their own WER.
- OC16: a summary will be given at the special session.
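The procedure reports results as WER (word error rate). The page does not specify how mixed Chinese-English text is tokenized for scoring, so the sketch below is only one plausible convention, not the organizers' scoring script: each Chinese character counts as one token and each run of Latin letters, digits, or apostrophes counts as one token, compared case-insensitively. The function names `tokenize_mixed`, `edit_distance`, and `wer` are hypothetical.

```python
import re

def tokenize_mixed(text):
    """Split text into single CJK characters and whole English words (assumed convention)."""
    return re.findall(r"[\u4e00-\u9fff]|[A-Za-z0-9']+", text.upper())

def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]

def wer(ref_text, hyp_text):
    """Error rate: edit distance divided by the number of reference tokens."""
    ref = tokenize_mixed(ref_text)
    hyp = tokenize_mixed(hyp_text)
    if not ref:
        raise ValueError("reference contains no tokens")
    return edit_distance(ref, hyp) / len(ref)

# One deleted character and one substituted English word out of six reference tokens.
score = wer("我想要一个 apple", "我想要个 apply")
```

Under this convention the example reference has six tokens and the hypothesis contains two errors, so `score` is 2/6. A different tokenization (for example, segmenting Chinese into words) would give different numbers.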
Registration
If you are interested in participating in the challenge, or if you have any questions, comments, or suggestions about it, please email the organizers:
- Dr. Dong Wang (wangdong99@mails.tsinghua.edu.cn)
- Mr. Difei Tang (tangdifei@speechocean.com)
- Ms. Chen Qing (chenqing@speechocean.com)