Revision as of 03:41, 1 December 2017 (Fri)

Project name

Text To Speech

Project members

Dong Wang, Zhiyong Zhang

Introduction

Text To Speech

Sample waves

Synthesis text: 好雨知时节，当春乃发声，随风潜入夜，润物细无声

Mono-speaker TTS

Female[1]

Male[2]

Child[3]

Multi-speaker mix-training

Without speaker-vector

Female & Male[4]

Female & Child[5]

Male & Child[6]

With speaker-vector

At synthesis time, we simply swap in the speaker-vector of the target speaker.

Specific person

Female[7]

Male[8]

Interpolating the speaker-vectors of different speakers

Female & Male at different ratios

(1) 0.0:1.0[9]

(2) 0.1:0.9[10]

(3) 0.2:0.8[11]

(4) 0.3:0.7[12]

(5) 0.4:0.6[13]

(6) 0.5:0.5[14]

(7) 0.6:0.4[15]

(8) 0.7:0.3[16]

(9) 0.8:0.2[17]

(10) 0.9:0.1[18]

(11) 1.0:0.0[19]
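The speaker-vector swap and the ratio sweep listed above can be sketched in a few lines. This is only an illustration, not the project's implementation: the page does not say how the speaker-vector enters the network or how the mix is computed, so the sketch assumes a plain linear combination and assumes the vector is appended to the linguistic input features. It also reads each ratio `a:b` as weight `a` on the first-named speaker and `b` on the second. The names `interpolate`, `ratio_grid`, `network_input`, `FEMALE` and `MALE`, and the 3-dimensional toy vectors, are all hypothetical.

```python
from typing import List, Sequence, Tuple


def interpolate(vec_a: Sequence[float], vec_b: Sequence[float],
                weight_a: float) -> List[float]:
    """Linear mix of two speaker-vectors: weight_a * vec_a + (1 - weight_a) * vec_b."""
    if len(vec_a) != len(vec_b):
        raise ValueError("speaker-vectors must have the same dimension")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    weight_b = 1.0 - weight_a
    return [weight_a * a + weight_b * b for a, b in zip(vec_a, vec_b)]


def ratio_grid(steps: int = 10) -> List[Tuple[float, float]]:
    """Ratio pairs 0.0:1.0, 0.1:0.9, ..., 1.0:0.0.

    Built from integer steps so the two weights of each pair come from
    exact integer counts rather than from repeatedly adding 0.1.
    """
    return [(i / steps, (steps - i) / steps) for i in range(steps + 1)]


def network_input(linguistic_features: Sequence[float],
                  speaker_vector: Sequence[float]) -> List[float]:
    """Assumed conditioning: append the speaker-vector to the input features."""
    return list(linguistic_features) + list(speaker_vector)


# Hypothetical toy speaker-vectors; real ones would come from the trained model.
FEMALE = [1.0, 0.0, 2.0]
MALE = [0.0, 1.0, 4.0]

if __name__ == "__main__":
    # "Specific person": use one speaker's vector unchanged.
    print("female only:", network_input([0.2, 0.7], FEMALE))

    # "Interpolation": sweep the Female:Male ratio as in the sample list.
    for w_female, w_male in ratio_grid():
        mixed = interpolate(FEMALE, MALE, w_female)
        print(f"{w_female:.1f}:{w_male:.1f} -> {mixed}")
```

Under these assumptions the two end points of the sweep reduce to the single-speaker case: a ratio of 1.0:0.0 reproduces the first speaker's vector and 0.0:1.0 the second's.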

Mono-speaker Emotion TTS

Specific emotion

Neutral emotion [20]
Happy emotion [21]
Sorrow emotion [22]
Angry emotion [23]

Interpolated emotion

Angry & neutral at different ratios

(1) 0.0:1.0 [24]
(2) 0.1:0.9 [25]
(3) 0.2:0.8 [26]
(4) 0.3:0.7 [27]
(5) 0.4:0.6 [28]
(6) 0.5:0.5 [29]
(7) 0.6:0.4 [30]
(8) 0.7:0.3 [31]
(9) 0.8:0.2 [32]
(10) 0.9:0.1 [33]
(11) 1.0:0.0 [34]

@@ Line 18 / Line 18 @@
 *Child[http://zhangzy.cslt.org/categories/tts/sample-wav/mimic-wangd-front-end/multi-speakers/huilian/child01.neutral/child01-neutral_5_amdurTanh_acTanh_mlpg1_postfilter1.world.wav01.wav]
-==Multi-speaker mix-training without speaker-vector==
+==Multi-speaker mix-trainingr==
+===Without Speaker-vector===
 *Female & Male[http://zhangzy.cslt.org/categories/tts/sample-wav/mimic-wangd-front-end/multi-speakers/mix/female01-male01/female01-male01_5_amdurTanh_acTanh_mlpg1_postfilter1.world.wav01.wav]
@@ Line 26 / Line 27 @@
-==Multi-speaker mix-training with speaker-vector==
+===With speaker-vector===
 When synthesis, we just replace the speaker-vector for specific person.
 *Specific person===
@@ Line 57 / Line 58 @@
 ::*(11) 1.0:0.0[http://zhangzy.cslt.org/categories/tts/sample-wav/mimic-wangd-front-end/multi-speakers/mix/iterpolation/female01_male01/iterpolation_10_female01_male01_5_amdurTanh_acTanh_mlpg1_postfilter1.world.wav01.wav]
+==Mono-speaker Emotion TTS==
+*Specific emotion
+:* Neutral emotion [http://zhangzy.cslt.org/categories/tts/sample-wav/mimic-wangd-front-end/emotion/roobo.child/x-neutral_5_amdurTanh_acTanh_mlpg1_postfilter1.world.wav01.wav]
+:* Happy emotion [http://zhangzy.cslt.org/categories/tts/sample-wav/mimic-wangd-front-end/emotion/roobo.child/x-happy_5_amdurTanh_acTanh_mlpg1_postfilter1.world.wav01.wav]
+:* Sorrow emotion [http://zhangzy.cslt.org/categories/tts/sample-wav/mimic-wangd-front-end/emotion/roobo.child/x-sorrow_5_amdurTanh_acTanh_mlpg1_postfilter1.world.wav01.wav]
+:* Angry emotion [http://zhangzy.cslt.org/categories/tts/sample-wav/mimic-wangd-front-end/emotion/roobo.child/x-angry_5_amdurTanh_acTanh_mlpg1_postfilter1.world.wav01.wav]
+*Interpolation emotion
+:* Angry & neutral with different ratio
+::*(1) 0.0:1.0 [http://zhangzy.cslt.org/categories/tts/sample-wav/mimic-wangd-front-end/emotion/roobo.child/mix-emotion-angry-neutral_1_0_amdurTanh_acTanh_mlpg1_postfilter1.world.wav01.wav]
+::*(2) 0.1:0.9 [http://zhangzy.cslt.org/categories/tts/sample-wav/mimic-wangd-front-end/emotion/roobo.child/mix-emotion-angry-neutral_1_1_amdurTanh_acTanh_mlpg1_postfilter1.world.wav01.wav]
+::*(3) 0.2:0.8 [http://zhangzy.cslt.org/categories/tts/sample-wav/mimic-wangd-front-end/emotion/roobo.child/mix-emotion-angry-neutral_1_2_amdurTanh_acTanh_mlpg1_postfilter1.world.wav01.wav]
+::*(4) 0.3:0.7 [http://zhangzy.cslt.org/categories/tts/sample-wav/mimic-wangd-front-end/emotion/roobo.child/mix-emotion-angry-neutral_1_3_amdurTanh_acTanh_mlpg1_postfilter1.world.wav01.wav]
+::*(5) 0.4:0.6 [http://zhangzy.cslt.org/categories/tts/sample-wav/mimic-wangd-front-end/emotion/roobo.child/mix-emotion-angry-neutral_1_4_amdurTanh_acTanh_mlpg1_postfilter1.world.wav01.wav]
+::*(6) 0.5:0.5 [http://zhangzy.cslt.org/categories/tts/sample-wav/mimic-wangd-front-end/emotion/roobo.child/mix-emotion-angry-neutral_1_5_amdurTanh_acTanh_mlpg1_postfilter1.world.wav01.wav]
+::*(7) 0.6:0.4 [http://zhangzy.cslt.org/categories/tts/sample-wav/mimic-wangd-front-end/emotion/roobo.child/mix-emotion-angry-neutral_1_6_amdurTanh_acTanh_mlpg1_postfilter1.world.wav01.wav]
+::*(8) 0.7:0.3 [http://zhangzy.cslt.org/categories/tts/sample-wav/mimic-wangd-front-end/emotion/roobo.child/mix-emotion-angry-neutral_1_7_amdurTanh_acTanh_mlpg1_postfilter1.world.wav01.wav]
+::*(9) 0.8:0.2 [http://zhangzy.cslt.org/categories/tts/sample-wav/mimic-wangd-front-end/emotion/roobo.child/mix-emotion-angry-neutral_1_8_amdurTanh_acTanh_mlpg1_postfilter1.world.wav01.wav]
+::*(10) 0.9:0.1 [http://zhangzy.cslt.org/categories/tts/sample-wav/mimic-wangd-front-end/emotion/roobo.child/mix-emotion-angry-neutral_1_9_amdurTanh_acTanh_mlpg1_postfilter1.world.wav01.wav]
+::*(11) 1.0:0.0 [http://zhangzy.cslt.org/categories/tts/sample-wav/mimic-wangd-front-end/emotion/roobo.child/x-angry_1_amdurTanh_acTanh_mlpg1_postfilter1.world.wav01.wav]

Difference between revisions of "TTS-project-synthesis"
