People |
This Week |
Next Week |
Task Tracking (Deadline)
|
Dong Wang
|
|
|
|
Lantian Li
|
- AI-Graph EN (1-20 finalized)
- Design 2025 Daily Posts
|
|
|
Ying Shi
|
- Revise the code for cohort-overlap ASR [training is in progress]
- Support arbitrary source mixing training (see the mixing sketch below)
- Use the real hypothesis as the condition, evaluated by token error rate
- Design the stop criterion
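A minimal sketch of what arbitrary source mixing could look like: trim N waveforms to a common length, scale each interfering source to a random SNR against the first source, and sum. The function name, SNR range, and normalization below are illustrative assumptions, not the actual training code.

    # Hypothetical sketch of arbitrary N-source mixing for overlap-ASR training.
    import numpy as np

    def mix_sources(sources, snr_range=(-5.0, 5.0), rng=None):
        """Mix an arbitrary number of single-speaker waveforms into one signal.

        sources: list of 1-D float arrays at the same sample rate.
        Each source after the first is rescaled to a random SNR (dB) w.r.t. source 0.
        """
        rng = rng or np.random.default_rng()
        length = min(len(s) for s in sources)        # trim to the shortest source
        mix = np.array(sources[0][:length], dtype=np.float32)
        ref_power = np.mean(mix ** 2) + 1e-8
        for src in sources[1:]:
            s = np.asarray(src[:length], dtype=np.float32)
            snr_db = rng.uniform(*snr_range)
            power = np.mean(s ** 2) + 1e-8
            # gain that places this source snr_db dB below the reference power
            gain = np.sqrt(ref_power / (power * 10 ** (snr_db / 10)))
            mix = mix + gain * s
        return mix / max(1.0, float(np.max(np.abs(mix))))   # avoid clipping

    # usage: mixture = mix_sources([wav_a, wav_b, wav_c])   # wav_* are 1-D arrays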
|
|
|
Zhenghai You
|
- Introduce more hard samples to improve model performance [1]
- SPK-AUG with the same length: there is an improvement, but SI-SDR decreases as the hard-sample rate increases (a reference SI-SDR definition is sketched below)
- Design more hard samples
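Several rows in this report quote SI-SDR; for reference, a generic implementation of the metric is sketched below. This is not the evaluation script behind the numbers reported here.

    # Reference SI-SDR (dB) computation; a generic sketch, not the project's metric code.
    import numpy as np

    def si_sdr(estimate, target, eps=1e-8):
        """Scale-invariant signal-to-distortion ratio in dB."""
        estimate = np.asarray(estimate, dtype=np.float64)
        target = np.asarray(target, dtype=np.float64)
        # remove the mean so the measure ignores DC offset
        estimate = estimate - estimate.mean()
        target = target - target.mean()
        # project the estimate onto the target to get the scaled reference
        alpha = np.dot(estimate, target) / (np.dot(target, target) + eps)
        s_target = alpha * target
        e_noise = estimate - s_target
        return 10 * np.log10((np.sum(s_target ** 2) + eps) / (np.sum(e_noise ** 2) + eps))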
|
|
|
Junming Yuan
|
- Results of time-mask MT-HuBERT [2] (a generic time-mask sketch is given below)
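The exact masking configuration is not given here; as a rough illustration, a SpecAugment-style time mask over a (time, feature) sequence could look like the sketch below. The mask count and maximum width are placeholder defaults, not the MT-HuBERT settings.

    # Generic SpecAugment-style time masking; defaults are placeholders.
    import torch

    def time_mask(features, num_masks=2, max_width=50):
        """Zero out `num_masks` random spans along the time axis of a (T, F) tensor."""
        x = features.clone()
        T = x.shape[0]
        for _ in range(num_masks):
            width = int(torch.randint(1, max_width + 1, (1,)))
            if width >= T:
                continue
            start = int(torch.randint(0, T - width, (1,)))
            x[start:start + width] = 0.0
        return x

    # usage: masked = time_mask(torch.randn(400, 80))   # 400 frames of 80-dim fbank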
|
|
|
Chen Chen
|
|
|
|
Xiaolou Li
|
- VTS with LLM: structure design and baseline code writing [3]
|
|
|
Zehua Liu
|
- Reading papers about in-context learning in ASR
- Training the model with an adaptive time mask
- Try in-context learning with only the previous sentence [4] (see the prompting sketch below)
- Start the VTS project report
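One way to realize in-context learning with only the previous sentence is to feed the previous transcript as a decoding prompt. The sketch below assumes a Whisper-style model through a recent Hugging Face transformers release; the model name and prompting API are assumptions, and the actual experiment may use a different setup.

    # Sketch: condition ASR decoding on the previous sentence as a prompt.
    import torch
    from transformers import WhisperProcessor, WhisperForConditionalGeneration

    processor = WhisperProcessor.from_pretrained("openai/whisper-small")
    model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

    def transcribe_with_context(audio_16k, previous_sentence):
        """audio_16k: 1-D waveform at 16 kHz; previous_sentence: its preceding transcript."""
        inputs = processor(audio_16k, sampling_rate=16000, return_tensors="pt")
        # the previous sentence becomes the decoder prompt (the in-context example)
        prompt_ids = processor.tokenizer.get_prompt_ids(previous_sentence, return_tensors="pt")
        with torch.no_grad():
            ids = model.generate(input_features=inputs.input_features, prompt_ids=prompt_ids)
        return processor.batch_decode(ids, skip_special_tokens=True)[0]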
|
|
|
Pengqi Li
|
- Consistency of TAO and LayerCAM
- Changed TAO from the input to the final conv layer and obtained higher consistency (AISHELL: 0.93 for any model)
|
|
|
Wan Lin
|
- NS: downsampling is not useful [https://z1et6d3xtb.feishu.cn/docx/MxBNdPbLao0tsoxkBVCcUgUoneh?from=from_copylink]
- Share the speaker meeting on Friday
|
|
|
Tianhao Wang
|
- AudioSep (CLAP) 5-mix experiments [https://z1et6d3xtb.feishu.cn/docx/DlR8dZRdEoZIwIxTOFvcQdbGnqg]:
- text-query: SDR=4.978, SI-SDR=1.972
- audio-query: SDR=6.907, SI-SDR=5.058
- These results are with the loudness limitation applied
|
- AudioSep (CLAP) without loudness limitation
- Project work
|
|
Xiaoxue Luo
|
- Comparative experiments between AudioSep and the baseline system (CLIPSep)
- Prepare the report
|
|
|
Zhenyu Zhou
|
- Reproduce 5-mix speech separation results:
- PIT: 2-mix: 16.04; 5-mix: 6.87 (see the PIT loss sketch below)
- Conditional: 5-mix: 5.38 (40 epochs)
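For reference, the PIT number comes from permutation-invariant training: every assignment of estimated to reference sources is scored and the best permutation is kept. The sketch below is a generic PIT loss with negative SI-SDR as the pairwise loss, not the project code; the factorial cost (120 permutations for 5-mix) is one motivation for the conditional variant.

    # Generic permutation-invariant training (PIT) loss sketch.
    from itertools import permutations
    import torch

    def pair_loss(est, ref, eps=1e-8):
        """Negative SI-SDR (dB) between one estimated and one reference source."""
        est = est - est.mean(dim=-1, keepdim=True)
        ref = ref - ref.mean(dim=-1, keepdim=True)
        alpha = (est * ref).sum(-1, keepdim=True) / (ref.pow(2).sum(-1, keepdim=True) + eps)
        s_target = alpha * ref
        noise = est - s_target
        si_sdr = 10 * torch.log10((s_target.pow(2).sum(-1) + eps) / (noise.pow(2).sum(-1) + eps))
        return -si_sdr                               # shape: (batch,)

    def pit_loss(estimates, references):
        """estimates, references: (batch, n_src, time); returns the best-permutation mean loss."""
        n_src = estimates.shape[1]
        candidates = []
        for perm in permutations(range(n_src)):      # n_src! candidate assignments
            losses = [pair_loss(estimates[:, i], references[:, p]) for i, p in enumerate(perm)]
            candidates.append(torch.stack(losses, dim=0).mean(dim=0))
        return torch.stack(candidates, dim=0).min(dim=0).values.mean()

    # usage: loss = pit_loss(torch.randn(4, 5, 16000), torch.randn(4, 5, 16000))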
|
|
|
Junhui Chen
|
- NS: speaker detection (method survey & debugging)
- Got sick
|
|
|
Jiaying Wang
|
|
|
|
Yu Zhang
|
- SocioDojo (still worse than the Nasdaq-100 baseline)
- Changed information sources; judging from the reports generated by the LLM, more of the new information sources are being referenced.
- Prompt the Actuator to consider the current cash ratio before investing (without this, the asset ratio goes up to 100%, which leads to high risk; still running)
- Read some papers about integrating time series into LLMs
|
|
|
Wenqiang Du
|
- Prepare data, code, and environment for Prof. Mijiti
|
|
|
Yang Wei
|
- Trained the text-enroll KWS model with Aibabel training data; it did not work.
|
|
|
Lily
|
|
|
|
Turi
|
- Whisper-large-v3 fine-tuning
- Freezing 20 encoder layers achieved 9.75 WER; vanilla fine-tuning achieved 8.02 WER (see the layer-freezing sketch below)
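A minimal sketch of the layer-freezing setup, assuming the Hugging Face Whisper implementation (the actual fine-tuning script may differ):

    # Freeze the first 20 of whisper-large-v3's 32 encoder layers before fine-tuning.
    from transformers import WhisperForConditionalGeneration

    model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v3")

    for layer in model.model.encoder.layers[:20]:    # keep the last 12 layers trainable
        for param in layer.parameters():
            param.requires_grad = False

    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"trainable parameters: {trainable / 1e6:.1f}M")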
|
|
|
Yue Gu
|
- Seek suggestions from the other authors. Many suggestions conflict, so I'm trying to figure out the reasons and resolve these issues.
|
|
|
Qi Qu
|
- KWS:
- Text-enroll models exported to ONNX (see the export sketch below).
- C/JNI libraries built on top of the ONNX models and ready for on-device testing.
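A generic sketch of the ONNX export plus an onnxruntime parity check. TinyKws is a stand-in model; the real text-enroll KWS architecture, input shape, and file names differ.

    # Export a (stand-in) KWS model to ONNX and verify it with onnxruntime.
    import torch
    import torch.nn as nn
    import onnxruntime as ort

    class TinyKws(nn.Module):
        """Stand-in keyword-spotting model: fbank frames -> keyword logits."""
        def __init__(self, feat_dim=80, hidden=128, num_keywords=10):
            super().__init__()
            self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
            self.head = nn.Linear(hidden, num_keywords)

        def forward(self, feats):                    # feats: (batch, frames, feat_dim)
            out, _ = self.rnn(feats)
            return self.head(out[:, -1])             # logits from the last frame

    model = TinyKws().eval()
    dummy = torch.randn(1, 100, 80)                  # (batch, frames, fbank dim)

    torch.onnx.export(
        model, dummy, "kws.onnx",
        input_names=["features"], output_names=["logits"],
        dynamic_axes={"features": {0: "batch", 1: "frames"}},   # variable-length input
        opset_version=17,
    )

    # onnxruntime output should closely match the PyTorch output
    sess = ort.InferenceSession("kws.onnx", providers=["CPUExecutionProvider"])
    onnx_logits = sess.run(None, {"features": dummy.numpy()})[0]
    torch_logits = model(dummy).detach().numpy()
    print("max abs diff:", abs(onnx_logits - torch_logits).max())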
|
|
|