Differences between revisions of "CN-Celeb"

Revision as of 02:30, 1 November 2019 (Friday)

Expand the database to 10,000 people.
Build an LSTM-based model that connects SyncNet and Speaker_Diarization and learns the relationship between them.

All the resources contained in the database are free for research institutes and individuals.
No commercial usage is permitted.

Deng et al., "RetinaFace: Single-stage Dense Face Localisation in the Wild", 2019. [1]
Deng et al., "ArcFace: Additive Angular Margin Loss for Deep Face Recognition", 2018. [2]
Wang et al., "CosFace: Large Margin Cosine Loss for Deep Face Recognition", 2018. [3]
Liu et al., "SphereFace: Deep Hypersphere Embedding for Face Recognition", 2017. [4]
Zhong et al., "GhostVLAD for set-based face recognition", 2018. [5]
Chung et al., "Out of time: automated lip sync in the wild", 2016. [6]
Xie et al., "Utterance-level Aggregation For Speaker Recognition In The Wild", 2019. [7]
Zhang et al., "Fully Supervised Speaker Diarization", 2018. [8]

@@ Line 33: / Line 33: @@
 * Collection Pipeline: [https://github.com/celebrity-audio-collection/videoprocess celebrity-audio-collection]
-* Baseline Systems:
+* Baseline Systems: [https://github.com/kjw11/kaldi/tree/5.4/egs/cn-celeb kaldi-cn-celeb]
 ===Download===