Differences between revisions of "2024-09-30"

Latest revision as of 08:53, 7 October 2024 (Mon)

People	This Week	Next Week	Task Tracking (DeadLine)
Dong Wang	AI graph (higher education version)
Lantian Li	AI-Graph handbook v0.1; AI-Graph EN (12/50); Huawei TiDing 3.0 - Model Quantization; BUPT/AI-Radiance trivial things
Ying Shi	Optimized Text-enroll KWS code: added 4 kinds of negative sampling strategies (deletion, substitution, insertion, and shuffle) and verified them to ensure no bugs; found that the new negative sampling increases the difficulty of training, which indicates that depending only on positional embedding is not enough. Reproducing conditional chain overlap ASR (Sequence to Multi-Sequence Learning via Conditional Chain Mapping for Mixture Signals): according to Jiaying's work, the code released with the published paper does not work; writing dominance-based conditional chain overlap ASR myself (in progress)
Zhenghai You	Exploring the role of the speaker encoder in TSE[1]: joint training of the Spk Enc gives a better separation effect, but the EER is poor; pretraining and freezing the Spk Enc gives a good EER, but SI-SDR is poor; further exploring the different impacts of using spk aug on different tasks. The generality of SPK-AUG: refactored DPRNN-TSE results are reliable and have been accelerated from 87 hours to 32 hours
Junming Yuan
Chen Chen
Xiaolou Li	Use MFA on LRS3 to cut it into small segments; use discrete embedding of avhubert in VSP-LLM training (still training); some ideas for aligning video features and the LLM (Dense Connector, CL methods); data collection handover and getting familiar with the process; data collection: 3138 h (need to re-check, DDL: 10.15)
Zehua Liu	Baseline system VSP-LLM; try Qwen2.5-14B[2]
Pengqi Li
Wan Lin	Voxblink1 model training and testing [3]
Tianhao Wang	AudioSep reproduction; problem: LAION CLAP needs 48 kHz audio, so the data needs to be upsampled
Xiaoxue Luo	AI-Graph high school handbook (v0.1)
Zhenyu Zhou	Submit Model Quantization document; review conditional chain code
Junhui Chen	Voxblink1 model training and testing: writing test code for NS in ossi test
Jiaying Wang
Yu Zhang	Fri Report; change SocioDojo Agent from ChatGPT-3.5-Turbo to Llama-3.1-8B (still working)
Wenqiang Du	Check primary school handbook (43/45); release Chinese and Haining KWS models
Yang Wei
Lily	APSIPA workshop Tianjin and prepare Friday's report; prepare for online course; AI radiance's daily work
Turi	Segmented the audio in the dataset into individual words; paper reading
Yue Gu	Almost completed the revisions of my journal paper
Qi Qu	KWS: testing zh48 models on a dataset of Mandarin Chinese with Guangdong accent; recall drops significantly. AED: evaluating a third-party solution for baby crying detection. Misc.: preparing for live talk
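
The four negative sampling strategies named in Ying Shi's entry (deletion, substitution, insertion, and shuffle) can be illustrated with a minimal sketch. This is not the Text-enroll KWS code itself: the function name `make_negative`, the token-list representation of a keyword, and the `vocab` argument are all assumptions made here for illustration only.

```python
import random

STRATEGIES = ("deletion", "substitution", "insertion", "shuffle")


def make_negative(tokens, strategy, vocab, rng):
    """Return a corrupted copy of `tokens` (hypothetical helper, not the lab's code).

    tokens   : list of text units (e.g. syllables or phonemes) for one keyword
    strategy : one of STRATEGIES
    vocab    : pool of units used for substitution and insertion
    rng      : random.Random instance, passed in for reproducibility
    """
    tokens = list(tokens)  # work on a copy so the caller's list is untouched
    if not tokens:
        raise ValueError("tokens must be non-empty")

    if strategy == "deletion":
        if len(tokens) < 2:
            raise ValueError("deletion needs at least two tokens")
        del tokens[rng.randrange(len(tokens))]
    elif strategy == "substitution":
        i = rng.randrange(len(tokens))
        # Only allow replacements that actually change the token.
        choices = [v for v in vocab if v != tokens[i]]
        if not choices:
            raise ValueError("vocab has no alternative token")
        tokens[i] = rng.choice(choices)
    elif strategy == "insertion":
        tokens.insert(rng.randrange(len(tokens) + 1), rng.choice(vocab))
    elif strategy == "shuffle":
        if len(set(tokens)) < 2:
            raise ValueError("shuffle needs at least two distinct tokens")
        original = list(tokens)
        # Re-shuffle until the order really differs from the positive sample.
        while tokens == original:
            rng.shuffle(tokens)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return tokens


if __name__ == "__main__":
    rng = random.Random(0)
    phrase = ["ni", "hao", "xiao", "ming"]            # illustrative keyword
    vocab = ["ni", "hao", "xiao", "ming", "da", "jia", "zao"]
    for name in STRATEGIES:
        print(name, make_negative(phrase, name, vocab, rng))
```

Deletion and insertion change the sequence length, substitution changes one unit's identity, and shuffle changes only the order. Only the last of these isolates ordering, which may relate to the entry's remark about positional embedding.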

@@ Line 6: / Line 6: @@
 |Dong Wang
 ||
-*
+* AI graph (higher education version)
 ||
 *
@@ Line 47: / Line 47: @@
 |Zhenghai You
 ||
-*
+* Exploring the role of the speaker encoder in TSE[https://z1et6d3xtb.feishu.cn/docx/GHF8doRjDo50ihxGUPpcsZgLncb]
+** Joint training of the Spk Enc gives a better separation effect, but the EER is poor
+** Pretraining and freezing the Spk Enc gives a good EER, but SI-SDR is poor
+** Further explore the different impacts of using spk aug on different tasks
+* The generality of SPK-AUG
+** Refactored DPRNN-TSE results are reliable and have been accelerated from 87 hours to 32 hours
 ||
 *
@@ Line 94: / Line 99: @@
 |Zehua Liu
 ||
-*
+*Baseline System VSP-LLM
+*Try Qwen2.5-14B[https://z1et6d3xtb.feishu.cn/docx/JBsidACDVojhCaxFQLbcCVbsnAc?from=from_copylink]
 ||
 *
@@ Line 139: / Line 145: @@
 |Xiaoxue Luo
 ||
-*AI Graph high school Handbook(40/40)
+*AI-Graph High school handbook(v0.1)
 ||
 *
@@ Line 150: / Line 156: @@
 |Zhenyu Zhou
 ||
-*
+* Submit Model Quantization document
+* Review conditional chain code
 ||
 *
@@ Line 161: / Line 168: @@
 |Junhui Chen
 ||
-*
+* Voxblink1 model training and testing
+** Writing test code for NS in ossi test.
 ||
 *
@@ Line 183: / Line 191: @@
 |Yu Zhang
 ||
-*
+* Fri Report
+* Change SocioDojo Agent from ChatGPT-3.5-Turbo to Llama-3.1-8B (still working)
 ||
 *
@@ Line 216: / Line 225: @@
 |Lily
 ||
-*
+* APSIPA workshop Tianjin and Prepare Friday's report
+* Prepare for online course
+* AI radiance's daily work
 ||
 *
@@ Line 226: / Line 237: @@
 |Turi
 ||
-*
+* Segmented the audio in the dataset into individual words.
+* Paper reading
 ||
 *
