Revision as of 12:06, 29 October 2019 (Tuesday)

Introduction

CN-Celeb is a large-scale dataset of Chinese celebrities collected "in the wild".

Members

Current: Dong Wang, Yunqi Cai, Lantian Li, Yue Fan, Jiawen Kang
Former: Ziya Zhou, Kaicheng Li, Haolin Chen, Sitong Cheng, Pengyuan Zhang

Target

Collect audio data from 1,000 Chinese celebrities.
Automatically clip videos through a pipeline comprising face detection, face recognition, speaker validation, and speaker diarization.
Create a benchmark database for the speaker recognition community.
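
One way to picture the clipping step is as an intersection of the time intervals that each pipeline stage accepts. The following is a minimal sketch under stated assumptions, not the project's actual code: the function name `intersect_intervals`, the interval representation, and all timestamps are hypothetical, and each stage is assumed to emit sorted, non-overlapping (start, end) intervals in seconds.

```python
from typing import List, Tuple

# Hypothetical representation: (start_sec, end_sec)
Interval = Tuple[float, float]


def intersect_intervals(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the overlap of two sorted, non-overlapping interval lists."""
    result: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:  # keep only non-empty overlaps
            result.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


# Illustrative stage outputs for one video (made-up values).
face_visible = [(0.0, 12.0), (20.0, 35.0)]  # target face detected and recognized
lips_in_sync = [(2.0, 10.0), (22.0, 30.0)]  # speaker validation passed
same_speaker = [(0.0, 8.0), (25.0, 40.0)]   # diarization segments of the target

candidate = intersect_intervals(face_visible, lips_in_sync)
clips = intersect_intervals(candidate, same_speaker)
```

With these made-up inputs, `clips` keeps only the spans on which all three stages agree, which is the intuition behind chaining the checks.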

Future Plans

Expand the database to 10,000 people.
Build an LSTM-based model that links SyncNet and speaker diarization and learns the relationship between them.

Basic method

Environments: TensorFlow, PyTorch, Keras, MXNet.
Face detection and tracking based on the RetinaFace and ArcFace models.
Active speaker verification based on the SyncNet model.
Speaker diarization based on the UIS-RNN model.
Double check by speaker recognition based on a VGG model.
Input: pictures and videos of POIs (Persons of Interest).
Output: well-labelled videos of the POIs.
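
The final double check can be thought of as comparing a speaker embedding of each candidate clip against an enrollment embedding of the POI. The sketch below is illustrative only: it assumes embeddings are plain float vectors scored by cosine similarity, and the function names and the 0.7 threshold are hypothetical rather than taken from the project.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def passes_double_check(
    enroll: Sequence[float],
    clip: Sequence[float],
    threshold: float = 0.7,  # hypothetical value; would need tuning on real data
) -> bool:
    """Accept a clip only if its embedding is close enough to the enrollment one."""
    return cosine_similarity(enroll, clip) >= threshold
```

A clip that passed the face and lip-sync stages but fails this check would be discarded, for example when a voice-over plays while the POI's face is on screen.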

GitHub of our project

celebrity-audio-collection: https://github.com/celebrity-audio-collection/videoprocess

Reports

Stage Report v1.0: http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/%E6%96%87%E4%BB%B6:C-STAR.pdf

References

Deng et al., "RetinaFace: Single-stage Dense Face Localisation in the Wild", 2019. https://arxiv.org/pdf/1905.00641.pdf
Deng et al., "ArcFace: Additive Angular Margin Loss for Deep Face Recognition", 2018. https://arxiv.org/abs/1801.07698
Wang et al., "CosFace: Large Margin Cosine Loss for Deep Face Recognition", 2018.
Liu et al., "SphereFace: Deep Hypersphere Embedding for Face Recognition", 2017.
Zhong et al., "GhostVLAD for Set-based Face Recognition", 2018.
Chung et al., "Out of Time: Automated Lip Sync in the Wild", 2016.
Xie et al., "Utterance-level Aggregation for Speaker Recognition in the Wild", 2019.
Zhang et al., "Fully Supervised Speaker Diarization", 2018.

@@ Line 1 / Line 1 @@
-=CN-Celeb=
+=Introduction=
-* A large-scale Chinese celebrities dataset collected `in the wild'.
+* CN-Celeb, a large-scale Chinese celebrities dataset collected `in the wild'.
-* Members：Dong Wang, Yunqi Cai, Lantian Li, Yue Fan, Jiawen Kang
-* Historical Members：Ziya Zhou, Kaicheng Li, Haolin Chen, Sitong Cheng, Pengyuan Zhang
+=Members=
+* Current：Dong Wang, Yunqi Cai, Lantian Li, Yue Fan, Jiawen Kang
+* History：Ziya Zhou, Kaicheng Li, Haolin Chen, Sitong Cheng, Pengyuan Zhang
 ===Target===
@@ Line 9 / Line 12 @@
 * Collect audio data of 1,000 Chinese celebrities.
 * Automatically clip videoes through a pipeline including face detection, face recognition, speaker validation and speaker diarization.
-* Create a database.
+* Create a benchmark database for speaker recognition community.
-===Future Plans===
+===Future Plans===
 * Augment the database to 10,000 people.
 * Build a model between SyncNet and Speaker_Diarization based on LSTM, which can learn the relationship of them.
+===Basic method===
-===Basic method===
+* Environments: Tensorflow, PyTorch, Keras, MxNet
+* Face detection and tracking based on RetinaFace and ArcFace models.
+* Active speaker verification based on SyncNet model.
+* Speaker Diarization based on UIS-RNN model.
+* Double check by speaker recognition based on VGG model.
+* Input: Pictures and videos of POIs (Persons of Interest).
+* Output: well-labelled videos of POIs (Persons of Interest).
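-* Implemented with Tensorflow, PyTorch, Keras, MxNet
+===GitHub of our project===
-* RetinaFace and ArcFace models for face detection and recognition, SyncNet model for speaker recognition, UIS-RNN model for Speaker Diarization
-* Input: videos of the target person and face pictures of the target person
-* Output: time labels of the target person's speech segments in the video
-===Project GitHub address===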
 [https://github.com/celebrity-audio-collection/videoprocess celebrity-audio-collection]
-===Project reports===
+===Reports===
-[http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/%E6%96%87%E4%BB%B6:C-STAR.pdf v1.0 stage report]
+[http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/%E6%96%87%E4%BB%B6:C-STAR.pdf Stage Report v1.0]
-===References===
+===References===
 * Deng et al., "RetinaFace: Single-stage Dense Face Localisation in the Wild", 2019. [https://arxiv.org/pdf/1905.00641.pdf]
 * Deng et al., "ArcFace: Additive Angular Margin Loss for Deep Face Recognition", 2018, [https://arxiv.org/abs/1801.07698]

Difference between revisions of "CN-Celeb"
