Difference between revisions of "2024-11-25"

Revision as of 10:58, 25 November 2024 (Mon)

People	This Week	Next Week	Task Tracking (DeadLine)
Dong Wang	2nd round of check for AI handbook middle school. Deal with pictures in AI handbook (primary & middle). Start to check AI handbook high school. Check AI book for Tianjin medical school.
Lantian Li	Complete my CSTR report. Continue AI-Graph EN Chapter 4. Polish 2025 Daily Sign.
Ying Shi	Design cohort-conditional chain multi-talker ASR with round-RNN. WER results: round-1 32.15%, round-2 69.69%, round-3 92.33%. On a 500-utterance sub-test set, only 28% of the sentences have a recognition order that matches the cosine distance. Prepare for Huawei's interview.
Zhenghai You	Huawei TSE (train models that better fit the scene)[1]
Junming Yuan	Comparable results between Clean-HuBERT, Cocktail-HuBERT, and MT-HuBERT[2]. Bad news: Cocktail-HuBERT > Clean-HuBERT > MT-HuBERT.
Chen Chen
Xiaolou Li	Data processing: 1/4 of CVS3 already cut from the original video, waiting for pre-processing; copying pre-processed GongAn video data from gonganbu. VSR contrastive loss experiment: inspired by paper [3]; main idea: to better align visual features with the LLM input, compute the cosine similarity between the target and video features and take the most similar pair as the positive pair; result: still training. Paper reading.
Zehua Liu	Rebuttal writing. Iterative training and inference: Iter-1 (45.53%) < Iter-2 (45.00%) < Iter-3 (44.85%).
Pengqi Li	Begin writing a paper on the phoneme importance analysis work. Reading a doctoral thesis on speaker explainability[4].
Wan Lin
Tianhao Wang	Experiments on query-embedding conditioning approaches. SDR: FiLM (7.492) > self-attention (6.573).
Zhenyu Zhou	Speaker-identity-based conditional chain proposal[5]. Prepare interim report.
Junhui Chen
Jiaying Wang
Yu Zhang	Huawei AED data augmentation & human-annotated dataset [6]. Finance: paper reading; reproduce a local Llama version of StockAgent [7] (an LLM-based market simulation framework).
Wenqiang Du	Training of new language models (HeNan)[8]. Training of new language models (ChongQing)[9].
Yang Wei	Fix some bugs in keyword sampling in the text-enroll KWS training code. Add spec augmentation for text-enroll KWS training.
Lily
Turi	Paper reading. ICASSP 2025 rebuttal.
Yue Gu	Synthesize about 1 h of data for each target speaker, then use these data to train the adapter module.[10] Writing TASLP paper.
Qi Qu	Finding ideal thresholds and deploying cloud services for KWS models: `zh48_guangdong` and `zh48_haining20`. Located and fixed a bug in FunASR which may lead to segmentation fault. Built service with extended gRPC protocol. Analysis of some AED (cries and slaps) FAs.

@@ Line 80: / Line 80: @@
 |Xiaolou Li
 ||
-*
+* Data process
+** CVS3 1/4 already cut from original video, waiting for pre-process
+** Copying pre-processed GongAn video data from gonganbu
+* VSR Contrastive Loss Exp
+** Inspired by paper [https://arxiv.org/abs/2408.11813]
+** Main idea: To better align visual features with the LLM input, compute the cosine similarity between the target and video features and take the most similar pair as the positive pair.
+** Result: Under training
+* Paper Reading
 ||
 *
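
The "main idea" in the VSR contrastive loss entry (pick the video feature with the highest cosine similarity to the target as the positive pair) can be illustrated with a minimal sketch. The notes do not say what the target and video features are, how negatives are chosen, or which loss is used, so everything below is an assumption for illustration: the function names `cosine`, `pick_positive`, and `info_nce` are hypothetical, the toy 2-D vectors are made up, and the InfoNCE-style loss with a temperature of 0.1 is a common choice rather than the method of the cited paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def pick_positive(target, video_feats):
    """Return the index of the video feature most similar to the target,
    together with all similarities."""
    sims = [cosine(target, f) for f in video_feats]
    best = max(range(len(sims)), key=sims.__getitem__)
    return best, sims

def info_nce(sims, pos_idx, temperature=0.1):
    """Assumed loss form: softmax cross-entropy over the similarities,
    with the selected positive as the label and the rest as negatives."""
    logits = [s / temperature for s in sims]
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[pos_idx]

# Toy data (hypothetical): one target embedding, three video features.
target = [1.0, 0.0]
video_feats = [[0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]

pos_idx, sims = pick_positive(target, video_feats)
loss = info_nce(sims, pos_idx)
```

Because the positive is chosen as the arg-max of the same similarities that feed the loss, the loss starts small and mainly sharpens an existing preference; whether the actual experiment handles this differently is not stated in the notes.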
