2024-10-28

Latest revision as of 10:59, 28 October 2024

People | This Week | Next Week | Task Tracking (Deadline)
Dong Wang
  • AI primary book done
Lantian Li
  • AI-Graph EN (1-20 finalized)
  • Design 2025 Daily Posts
Ying Shi
  • Revise the code for cohort-overlap ASR [training in progress]
    • Support training with arbitrary source mixing
    • Use the real hypothesis as the condition, based on token error rate (see the sketch after this list)
    • Design stop criterion
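
A minimal sketch of the token error rate mentioned above, assuming the standard Levenshtein definition; illustrative only, not the project's actual code:

    def token_error_rate(ref: list, hyp: list) -> float:
        """Edit distance (substitutions/insertions/deletions) over len(ref)."""
        d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            d[i][0] = i
        for j in range(len(hyp) + 1):
            d[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
                d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
        return d[len(ref)][len(hyp)] / max(len(ref), 1)

    # Example: one substitution + one deletion over four reference tokens = 0.5
    print(token_error_rate("a b c d".split(), "a x c".split()))
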
Zhenghai You
  • Introduce more hard samples to improve model performance [1]
    • SPK-AUG with the same length: there is an improvement, but SI-SDR decreases as the hard-sample rate increases (SI-SDR sketched after this list)
    • Design more hard samples
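
SI-SDR appears in several rows of this report; a hedged sketch of its standard definition (zero-mean signals, estimate projected onto the reference):

    import numpy as np

    def si_sdr(est: np.ndarray, ref: np.ndarray) -> float:
        """Scale-invariant SDR in dB; est and ref are 1-D waveforms."""
        est, ref = est - est.mean(), ref - ref.mean()
        s_target = (np.dot(est, ref) / np.dot(ref, ref)) * ref  # optimal scaling
        e_noise = est - s_target
        return 10 * np.log10(np.sum(s_target**2) / np.sum(e_noise**2))
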
Junming Yuan
  • Results of time-mask MT-HuBERT [2]
    • Unfortunately, a negative result
Chen Chen
Xiaolou Li
  • VTS with LLM: structure design and baseline code writing [3]
Zehua Liu
  • Reading papers about in-context learning in ASR
  • Training the model with an adaptive time mask (sketched after this list)
  • Try in-context learning with only the previous sentence [4]
  • VTS project report starts
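
A sketch of SpecAugment-style time masking for the adaptive time mask item; the adaptive policy here (mask width proportional to utterance length) is an assumption, not necessarily the exact variant used:

    import torch

    def adaptive_time_mask(feats: torch.Tensor, ratio: float = 0.1,
                           num_masks: int = 2) -> torch.Tensor:
        """feats: (T, F) features; zero out num_masks spans of width up to ratio*T."""
        T = feats.size(0)
        out = feats.clone()
        max_w = max(1, int(T * ratio))  # mask width adapts to sequence length
        for _ in range(num_masks):
            w = int(torch.randint(1, max_w + 1, (1,)))
            t0 = int(torch.randint(0, max(1, T - w + 1), (1,)))
            out[t0:t0 + w] = 0.0
        return out
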
Pengqi Li
  • Consistency of TAO and LayerCAM
    • Change TAO from the input to the final conv layer and obtain more consistency (AISHELL: 0.93 in any model; LayerCAM sketched below)
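
For context, LayerCAM (Jiang et al., 2021) weights activations by their element-wise positive gradients; a minimal sketch, with the TAO comparison itself omitted:

    import torch

    def layer_cam(activations: torch.Tensor, gradients: torch.Tensor) -> torch.Tensor:
        """activations, gradients: (C, H, W) captured at one conv layer."""
        weights = gradients.clamp_min(0)          # element-wise ReLU of gradients
        cam = (weights * activations).sum(dim=0)  # weighted sum over channels
        return cam.clamp_min(0)                   # keep positive evidence only
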
Wan Lin
  • NS: downsampling is not useful [https://z1et6d3xtb.feishu.cn/docx/MxBNdPbLao0tsoxkBVCcUgUoneh?from=from_copylink]
  • Share at the speaker meeting on Friday
Tianhao Wang
  • AudioSep (CLAP) 5-mix exps [https://z1et6d3xtb.feishu.cn/docx/DlR8dZRdEoZIwIxTOFvcQdbGnqg]:
    • text-query: SDR=4.978, SI-SDR=1.972
    • audio-query: SDR=6.907, SI-SDR=5.058
    • These results are with the loudness limitation
  • AudioSep (CLAP) without loudness limitation
  • Project things
Xiaoxue Luo
  • Comparative experiment between AudioSep and the baseline system (CLIPSep)
  • Prepare the report
Zhenyu Zhou
  • Reproduce 5-mix speech separation results (PIT sketched after this list):
    • PIT: 2-mix: 16.04; 5-mix: 6.87
    • Conditional: 5-mix: 5.38 (40 epochs)
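
A hedged sketch of utterance-level permutation-invariant training (the PIT row): evaluate every speaker permutation and keep the best; brute force is fine up to 5 speakers (5! = 120). The SNR-style per-pair loss is illustrative:

    from itertools import permutations
    import torch

    def pit_loss(est: torch.Tensor, ref: torch.Tensor) -> torch.Tensor:
        """est, ref: (num_spk, T) estimated and reference waveforms."""
        best = None
        for perm in permutations(range(est.size(0))):
            loss = torch.stack([
                -10 * torch.log10(ref[p].pow(2).sum()
                                  / (est[i] - ref[p]).pow(2).sum().clamp_min(1e-8))
                for i, p in enumerate(perm)
            ]).mean()
            best = loss if best is None else torch.minimum(best, loss)
        return best
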
Junhui Chen
  • NS: speaker detection (method survey & debugging)
  • Got sick
Jiaying Wang
Yu Zhang
  • SocioDojo (still worse than Nasdaq100 baseline)
    • Change information sources; from the perspective of the reports generated by the LLM, more new information sources will be referenced.
    • Prompt the Actuator to consider the current cash ratio before investing (without this, the asset ratio goes up to 100%, which leads to high risk; still running)
  • Read some papers about integrating time series into LLM
Wenqiang Du
  • Prepare data, code, and environment for Prof. Mijiti
Yang Wei
  • Train a text-enroll KWS model with Aibabel training data. It did not work.
Lily
Turi
  • Whisper-large-v3 finetuning
    • Freezing 20 encoder layers achieved 9.75 WER; vanilla finetuning achieved 8.02 WER (layer freezing sketched after this list)
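
A sketch of the layer-freezing setup, assuming the HuggingFace transformers API (the row does not name the toolkit):

    from transformers import WhisperForConditionalGeneration

    model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v3")
    for layer in model.model.encoder.layers[:20]:  # freeze the bottom 20 of 32
        for p in layer.parameters():
            p.requires_grad = False
    # The remaining encoder layers and the whole decoder are finetuned as usual.
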
Yue Gu
  • Seek suggestions from other authors. Many suggestions conflict, so I'm trying to figure out the reasons and fix these issues.
Qi Qu
  • KWS:
    • Text-enroll models exported to ONNX (export sketched below).
    • C/JNI libs built on top of the ONNX models, ready for on-device testing.
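
A sketch of the export-and-check pipeline implied above; the tiny stand-in network and its input shape are hypothetical placeholders for the real text-enroll KWS model:

    import numpy as np
    import torch
    import onnxruntime as ort

    kws_model = torch.nn.Sequential(  # hypothetical stand-in for the KWS net
        torch.nn.Linear(80, 64), torch.nn.ReLU(), torch.nn.Linear(64, 2))
    torch.onnx.export(kws_model, torch.randn(1, 80), "kws.onnx",
                      input_names=["feats"], output_names=["logits"],
                      dynamic_axes={"feats": {0: "batch"}})

    sess = ort.InferenceSession("kws.onnx")  # same graph the C/JNI libs load
    logits = sess.run(["logits"], {"feats": np.random.randn(4, 80).astype(np.float32)})[0]
    print(logits.shape)  # (4, 2)
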