CN-CVS
From cslt Wiki
Revision as of 06:24, 25 October 2022
Introduction
- Mandarin Visual Speech is a large-scale Chinese Mandarin audio-visual dataset published by the Center for Speech and Language Technology (CSLT) at Tsinghua University.
Members
- Current: Dong Wang, Chen Chen
Description
- Collect audio and video data from more than 2,500 Mandarin speakers.
- Automatically clip videos through a pipeline that includes shot detection, VAD, face detection, face tracking, and audio-visual synchronization detection.
- Manually annotate speaker identity and check data quality by human review.
- Create a benchmark database for the video-to-speech synthesis task.
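The clipping step above has to reconcile several per-stream detectors: VAD yields speech intervals on the audio track, while the face tracker yields intervals where a face is continuously visible. A minimal sketch of how such intervals might be intersected into candidate clip boundaries (the function name and minimum-length threshold are illustrative assumptions, not the pipeline's actual code):

```python
# Sketch: intersect VAD speech intervals with face-track intervals to get
# candidate clip boundaries. Names and thresholds are illustrative.

def intersect_intervals(speech, faces, min_len=1.0):
    """Return (start, end) spans in seconds where both speech and a
    tracked face are present, keeping spans at least `min_len` long."""
    clips = []
    for s0, s1 in speech:
        for f0, f1 in faces:
            start, end = max(s0, f0), min(s1, f1)
            if end - start >= min_len:
                clips.append((start, end))
    return clips

# Example: two speech segments overlapping one long face track.
speech = [(0.0, 3.5), (5.0, 9.0)]
faces = [(1.0, 8.0)]
print(intersect_intervals(speech, faces))  # [(1.0, 3.5), (5.0, 8.0)]
```

The surviving spans would then be passed to synchronization checking (SyncNet) before being cut into clips.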
Basic Methods
- Environments: PyTorch, OpenCV, FFmpeg
- Shot detection: FFmpeg
- VAD: pydub
- Face detection and tracking: dlib
- Audio-visual synchronization detection: SyncNet model
- Input: JSON files of video information.
- Output: video clips and WAV files, as well as metadata JSON files.
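For the FFmpeg-based shot-detection and clipping steps, FFmpeg's scene-change filter can mark cut points, and `-ss`/`-to` can extract each clip. A minimal sketch that only builds the command lines (the threshold value, filenames, and helper names are assumptions, not the pipeline's actual code, which is in the repository listed under Source Code):

```python
# Sketch: build FFmpeg command lines for scene detection and clip
# extraction. Paths, thresholds, and function names are illustrative.

def scene_detect_cmd(video, threshold=0.3):
    """Command that logs frames whose FFmpeg scene-change score exceeds
    `threshold`; cut times can then be parsed from the showinfo output."""
    return [
        "ffmpeg", "-i", video,
        "-vf", f"select='gt(scene,{threshold})',showinfo",
        "-f", "null", "-",
    ]

def extract_clip_cmd(video, start, end, out):
    """Command that cuts the [start, end] span (seconds) out of `video`
    without re-encoding."""
    return [
        "ffmpeg", "-i", video,
        "-ss", f"{start:.2f}", "-to", f"{end:.2f}",
        "-c", "copy", out,
    ]

print(scene_detect_cmd("talk.mp4"))
print(extract_clip_cmd("talk.mp4", 1.0, 3.5, "clip_000.mp4"))
```

Each command list could be run with `subprocess.run`; stream-copying (`-c copy`) is fast but only cuts at keyframes, so a real pipeline may re-encode for frame-accurate boundaries.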
Reports
Publications
Source Code
- Collection Pipeline: https://github.com/sectum1919/mvs_data_collector
- xTS: TODO
- VCA-GAN: TODO
Download
- Public (recommended)
TODO
- Local (not recommended)
TODO
Future Plans
- Extract text transcription via OCR & ASR & Human check
License
- All the resources contained in the database are free for research institutes and individuals.
- No commercial usage is permitted.