Difference between revisions of "2024-10-21"

Latest revision as of 11:01, 21 October 2024 (Monday)

People	This Week	Next Week
Dong Wang	Primary School AI hand book (20-30)
Lantian Li	AI-Graph EN (25/50); complete CSTR intro report (11.18)
Ying Shi	Cohort-Overlap ASR: condition on real decode result, design stop criterion; Cohort-Speech separation: several configs for Dual-path model; group work
Zhenghai You	Weekly report
Junming Yuan	The result of feat-mask/time-mask MT-HuBERT [1]
Xiaolou Li	AVHuBERT unit exp: dc connector (↑0.8% vs. discrete unit), concat feature and embedding (↑2% vs. discrete unit, ↓0.3% vs. baseline); CVS3 quality check (30h in total) [2]; this work was helped by Zehua, Linwan, Tianhao; MLLM system with audio output design
Zehua Liu	Verify VSR data; finish Data Verification Report; ICL work (CER: 47.87% < CER: 51.08%); Time Mask matters [3]
Pengqi Li	Complete the final report of the doctoral innovation project (School); explore the consistency of TAO and LayerCAM results on different models and datasets: conclusion and hypothesis [4]
Wan Lin	Help VSR data verification; experiment in voxblink2 [5]
Tianhao Wang	Adjust the code of AudioSep (CLAP) to support multi-mix and audio-query (in training); some project testing
Xiaoxue Luo	AudioSep reproduction: evaluate the performance of AudioSep; comparative experiment between AudioSep and baseline system (CLIPSep), adjusting the code
Zhenyu Zhou	conditional chain 2-mix results reproduction（sisidr: 10.714 -> 15.6）; model quantization final version submission
Junhui Chen	Experiments for NS; look for speaker detection model with Resnet34 for frame label
Jiaying Wang
Yu Zhang	SocioDojo Llama 3.1 8B investment task: acc return is about 10% below nasdaq 100 index	add more professional information sources, such as WSJ (current is Tweets Trending, which is too entertainment-oriented); control the BUY/SELL amount of Actuator (current investment ratio is too high); reproduce other Multi Agent investment pipelines such as FinAgent or FinRobot
Wenqiang Du	Participated in an AI competition
Yang Wei	Train text enroll KWS model and test with Aibabel dialect data.
Lily
Turi	Whisper finetuning on sagalee: with encoder frozen, whisper-large-v3 (20.5 WER); finetuning LLM: finetuned Qwen2.5-0.5B on conversation dataset translated from English to Oromo
Yue Gu	write the cover letter; design a new speaker adaptation framework
Qi Qu	AED: New CED-based classifiers deployed onto devices, yielding acceptable performance. KWS: Quantization and format conversion of production models for deployment on embedded device w/ NPU. Default quantization mode leads to unacceptable loss of precision. Will try hybrid quantization. Text-enrollment KWS: some dynamic dimensions misinterpreted as constant during export to ONNX.

@@ Line 6: / Line 6: @@
 |Dong Wang
 ||
-*
+* Primary School AI hand book (20-30)
 ||
 *
@@ Line 17: / Line 17: @@
 |Lantian Li
 ||
-*
+* AI-Graph EN (25/50)
+* Complete CSTR intro report (11.18)
 ||
 *
@@ Line 28: / Line 29: @@
 |Ying Shi
 ||
-*
+*  Cohort-Overlap ASR
+** condition on real decode result
+** Design stop criterion
+* Cohort-Speech separation
+** several configs for Dual-path model
+* [https://z1et6d3xtb.feishu.cn/docx/X3C4dFRoKo3PATxvXWUc40LOnye?from=from_copylink group work ]
 ||
 *
@@ Line 39: / Line 45: @@
 |Zhenghai You
 ||
-*
+* Weekly report
 ||
 *
@@ Line 60: / Line 66: @@
 |Xiaolou Li
 ||
-*
+* AVHuBERT unit exp
+** dc connector (↑0.8% than discrete unit)
+** concat feature and embedding (↑2% than discrete unit, ↓0.3% than baseline)
+* CVS3 quality check (30h in total) [https://z1et6d3xtb.feishu.cn/drive/folder/HGHbfyCJRlLYzUdSlEicOEztnYc]
+* This work was helped by Zehua, Linwan, Tianhao
+* MLLM system with audio output design
 ||
 *
@@ Line 71: / Line 82: @@
 |Zehua Liu
 ||
-*
+*Verify VSR data
+*Finish Data Verification Report
+*ICL work(CER: 47.87% < CER: 51.08%)
+*Time Mask matters[https://z1et6d3xtb.feishu.cn/docx/JBsidACDVojhCaxFQLbcCVbsnAc?from=from_copylink]
 ||
 *
@@ Line 82: / Line 96: @@
 |Pengqi Li
 ||
-*
+* Complete the final report of the doctoral innovation project(School)
+* Exploring the Consistency of TAO and LayerCAM Results on different models and datasets.
+** Conclusion and hypothesis[https://z1et6d3xtb.feishu.cn/docx/Elh0d3t2RoEcUOxoIh2cBQG5nCc]
 ||
 *
@@ Line 93: / Line 109: @@
 |Wan Lin
 ||
-*
+* help VSR data verification
+* experiment in voxblink2 [https://z1et6d3xtb.feishu.cn/docx/MxBNdPbLao0tsoxkBVCcUgUoneh?from=from_copylink]
 ||
 *
@@ Line 104: / Line 121: @@
 |Tianhao Wang
 ||
-*
+* adjust the code of AudioSep (CLAP) to support multi-mix and audio-query (in training)
+* some project testing
 ||
 *
@@ Line 117: / Line 135: @@
 *AudioSep reproduction
 **evaluate the performance of AudioSep
-**comparison experiment between AudioSeo and baseline system(CLIPSep)
+**comparative experiment between AudioSep and baseline system(CLIPSep)
 ***adjusting the code
 ||
@@ Line 129: / Line 147: @@
 |Zhenyu Zhou
 ||
-*
+*conditional chain 2-mix results reproduction（sisidr: 10.714 -> 15.6）
+*model quantization final version submission
 ||
 *
@@ Line 140: / Line 159: @@
 |Junhui Chen
 ||
-*
+* Experiments for NS
+* Look for speaker detection model with Resnet34 for frame label
 ||
 *
@@ Line 162: / Line 182: @@
 |Yu Zhang
 ||
-*
+* SocioDojo Llama 3.1 8B investment task
+** acc return is about 10% below nasdaq 100 index
 ||
-*
+* add more professional information sources, such as WSJ (current is Tweets Trending, which is too entertainment-oriented)
+* control the BUY/SELL amount of Actuator (current investment ratio is too high)
+* reproduce other Multi Agent investment pipelines such as FinAgent or FinRobot
 ||
 *
@@ Line 184: / Line 207: @@
 |Yang Wei
 ||
-*
+* Train text enroll KWS model and test with Aibabel dialect data.
 ||
 *
@@ Line 204: / Line 227: @@
 |Turi
 ||
-*
+* Whisper finetuning on sagalee
-||
+** with encoder frozen, whisper-large-v3 (20.5 WER)
+* Finetuning LLM
+** Finetuned Qwen2.5-0.5B on conversation dataset translated from English to Oromo
 *
 ||
@@ Line 212: / Line 237: @@
 |Yue Gu
 ||
-*
+* write the cover letter
+* design a new speaker adaptation framework
 ||
 *
@@ Line 221: / Line 247: @@
 |Qi Qu
 ||
-*
+* AED:
+** New CED-based classifiers deployed onto devices, yielding acceptable performance.
+* KWS:
+** Quantization and format conversion of production models for deployment on embedded device w/ NPU. Default quantization mode leads to unacceptable loss of precision. Will try hybrid quantization.
+** Text-enrollment KWS: some dynamic dimensions misinterpreted as constant during export to ONNX.
 ||
 *
