Difference between revisions of "AP17:OLR-special session"

From cslt Wiki

Revision as of 00:38, 12 May 2017 (Friday)

Title

Multilingual speech and language processing for minority languages

Organizers

Dong Wang: Tsinghua University (wangdong99@mails.tsinghua.edu.cn)

Dr. Dong Wang received his PhD from the University of Edinburgh and has worked at Oracle, IBM, and Nuance. He is now an assistant professor at the Center for Speech and Language Technologies (CSLT) at Tsinghua University. Dr. Wang's research interests cover speech processing, language processing, and financial processing. He has published more than 80 academic papers in related areas and has received three best paper awards. Dr. Wang plays active roles in the speech research community: he serves as the secretary of the National Conference on Man-Machine Speech Communication (NCMMSC) and as the country representative of mainland China in Oriental COCOSDA. He was the local chair of ChinaSIP 2013, special session co-chair of ISCSLP 14, and plenary talk co-chair of ISCSLP 16. Dr. Wang is now serving as the vice chair of the SLA track of APSIPA.


Guanyu Li: Northwest University for Nationalities (guanyu-li@163.com)

Dr. Guanyu Li received his PhD from the Northwest University for Nationalities, Gansu Province, China. He worked at several ERP software development companies as a development engineer, and is now an associate professor at the Northwest University for Nationalities and the Key Laboratory of National Language Intelligent Processing, Gansu Province. His research interests include speech processing for minority languages in China, especially speech recognition and speech synthesis. In recent years, he has published more than ten papers in related areas.


Mijit Ablimit: Xinjiang University (mijit@xju.edu.cn)

Dr. Mijit Ablimit received his PhD from Kyoto University, Japan. He is now an associate professor at the College of Information Technology and Engineering of Xinjiang University. His research interests cover speech, language, and multilingual information processing for the less widely used languages of China.

Target track

Speech and Language processing

Introduction

Minority-language and multilingual phenomena are important in modern international societies. This special session focuses on minor- and multi-lingual speech and language processing, including but not limited to the following topics:

  • Minor- and Multi-lingual phonetic and phonological analysis
  • Minor- and Multi-lingual speech recognition
  • Minor- and Multi-lingual speaker recognition
  • Minor- and Multi-lingual speech synthesis
  • Minor- and Multi-lingual language understanding
  • Resource construction for minority languages

Potential Papers

Title: Prior-constrained multilingual speech recognition

  • Author: Ying Shi, Zhiyuan Tang, Dong Wang
  • Abstract: Conventional multilingual speech recognition follows either a tandem approach (language identification) or a parallel architecture (parallel decoding). This paper presents a novel prior-constrained approach that conducts the decoding in a multilingual linguistic space, where a prior of the language is used to constrain the decoding frame by frame. Our experiments found that this approach can realize true simultaneous multilingual speech recognition.


Title: Memory-based Uyghur-Chinese Translation

  • Author: Shiyue Zhang, Guli, Mijit Ablimit, Askar Hamdulla
  • Abstract: Neural machine translation (NMT) has achieved strong performance. However, the NMT approach has not yet been effectively applied to minority languages, for example in Uyghur-to-Chinese translation. The main problem is that the limited training data does not support end-to-end neural learning. In this paper, we propose to use a memory structure to assist NMT inference for limited-resource languages. Our experiments demonstrated that this approach is highly efficient compared to the vanilla NMT, and outperforms the conventional statistical machine translation (SMT) approach.

Title: Resource construction for Mongolian

  • Author: Shipeng Xu, Guanyu Li, Hongzhi Yu
  • Abstract: Mongolian is a typical low-resource language. The resource limitation appears in various aspects, from acoustic analysis and phonetic rules to lexicon, speech, and text data. This paper describes our recent progress on Mongolian resource construction supported by the NSFC project.

Title: Tibetan speech database construction

  • Author: Guanyu Li, Hongzhi Yu, Jinghao Yan
  • Abstract: Tibetan is an important low-resource language in China. The syllable structure of Tibetan is similar to that of Chinese, but the composition rules of its orthographic forms are highly complex. Additionally, the lexicon resource is far from standard and rich. This paper describes our recent progress on Tibetan resource construction supported by the NSFC M2ASR project.

Title: A large Kazak speech database and a speech recognition baseline

  • Author: Askar Hamdulla, Ying Shi
  • Abstract: We describe the construction process of a large-scale Kazak speech database. The database contains 150 hours of speech signals, recorded by more than 200 speakers. A speech recognition baseline


Title: Multilingual resource construction for Uyghur, Kazak, Kirghiz languages

  • Author: Mijit Ablimit, Askar Hamdulla, Ying Shi, Dong Wang
  • Abstract: Minority languages, especially spoken languages, are strongly influenced by major languages or mix with each other. A platform of uniform phonetic and morphological processing methods can therefore provide a methodology and extra resources for the less widely used languages. This paper describes multi-language phonetic and morphological tools and the corpus compilation process for some resource-scarce languages.