Difference between revisions of "TTS-project-synthesis"
From cslt Wiki
Revision as of 03:00, 1 December 2017 (Friday)
Project name
Text To Speech
Project members
Dong Wang, Zhiyong Zhang
Introduction
xxx
Sample waves
Synthesis text:好雨知时节,当春乃发声,随风潜入夜,润物细无声
Mono-speaker TTS
- Female[1]
- Male[2]
- Child[3]
Multi-speaker mix-training without speaker-vector
- Female & Male[4]
- Female & Child[5]
- Male & Child[6]
Multi-speaker mix-training with speaker-vector
At synthesis time, we simply swap in the speaker vector of the target speaker.
- Specific person
- Female[7]
- Male[8]
- Interpolate the speaker vectors of different speakers
- Female & Male with different ratios
(1) 0.0:1.0[9]
(2) 0.1:0.9[10]
(3) 0.2:0.8[11]
(4) 0.3:0.7[12]
(5) 0.4:0.6[13]
(6) 0.5:0.5[14]
(7) 0.6:0.4[15]
(8) 0.7:0.3[16]
(9) 0.8:0.2[17]
(10) 0.9:0.1[18]
(11) 1.0:0.0[19]
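
The ratio sweep above can be sketched as a linear mix of two speaker vectors. This is only an illustration under stated assumptions, not the project's actual code: the page does not say how the vectors are stored, how many dimensions they have, or which speaker the first weight in each ratio belongs to. The function names, the 4-dimensional example vectors, and the choice to give the first weight to the first argument are all hypothetical.

```python
def interpolate_speaker_vectors(vec_a, vec_b, weight_a):
    """Return weight_a * vec_a + (1 - weight_a) * vec_b, element by element."""
    if len(vec_a) != len(vec_b):
        raise ValueError("speaker vectors must have the same length")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    weight_b = 1.0 - weight_a
    return [weight_a * a + weight_b * b for a, b in zip(vec_a, vec_b)]


def ratio_sweep(vec_a, vec_b, steps=10):
    """Build steps + 1 mixes, with weight_a running from 0.0 to 1.0."""
    mixes = []
    for i in range(steps + 1):
        weight_a = i / steps
        mixed = interpolate_speaker_vectors(vec_a, vec_b, weight_a)
        mixes.append((weight_a, 1.0 - weight_a, mixed))
    return mixes


# Hypothetical speaker vectors; real ones would come from the trained model.
speaker_a = [1.0, 0.0, 0.5, -0.5]
speaker_b = [0.0, 1.0, -0.5, 0.5]

for weight_a, weight_b, mixed in ratio_sweep(speaker_a, speaker_b):
    # Each mixed vector would replace the speaker vector fed to the
    # synthesizer, in the same way a single speaker's vector is swapped in.
    print(f"{weight_a:.1f}:{weight_b:.1f}", mixed)
```

With `steps=10` the sweep yields eleven mixes, matching the eleven samples listed. The two endpoints reduce to the unmodified vectors of the individual speakers.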