Difference between revisions of "CN-Celeb"

Revision as of 07:55, 31 October 2019 (Thu)

Augment the database to 10,000 people.
Build an LSTM-based model linking SyncNet and Speaker_Diarization that can learn the relationship between them.

All the resources contained in the database are free for research institutes and individuals.
No commercial usage is permitted.

Deng et al., "RetinaFace: Single-stage Dense Face Localisation in the Wild", 2019. [1]
Deng et al., "ArcFace: Additive Angular Margin Loss for Deep Face Recognition", 2018. [2]
Wang et al., "CosFace: Large Margin Cosine Loss for Deep Face Recognition", 2018. [3]
Liu et al., "SphereFace: Deep Hypersphere Embedding for Face Recognition", 2017. [4]
Zhong et al., "GhostVLAD for set-based face recognition", 2018. [5]
Chung et al., "Out of time: automated lip sync in the wild", 2016. [6]
Xie et al., "Utterance-level Aggregation For Speaker Recognition In The Wild", 2019. [7]
Zhang et al., "Fully Supervised Speaker Diarization", 2018. [8]

@@ Line 24: / Line 24: @@
 * Output: well-labelled videos of POIs (Persons of Interest).
-===GitHub of This Project===
+===Source Code===
-[https://github.com/celebrity-audio-collection/videoprocess celebrity-audio-collection]
+* Collection Pipeline: [https://github.com/celebrity-audio-collection/videoprocess celebrity-audio-collection]
+* Baseline Systems:
 ===Reports===