People |
This Week |
Next Week |
Task Tracking (Deadline)
|
Dong Wang
|
|
|
|
Lantian Li
|
- AI-Graph EN (1-20 finalized)
- Design 2025 Daily Posts
|
|
|
Ying Shi
|
- Revise the code for cohort-overlap ASR [training is in progress]
- Support arbitrary source mixing training (see the mixing sketch below)
- Use the real hypothesis as the condition, evaluated by token error rate
- Design the stop criterion
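A minimal sketch of what arbitrary source mixing could look like: trim N waveforms to a common length, scale each interfering source to a random SNR against the first source, and sum. The function name, SNR range, and normalization below are illustrative assumptions, not the actual training code.

    # Hypothetical sketch of arbitrary N-source mixing for overlap-ASR training.
    import numpy as np

    def mix_sources(sources, snr_range=(-5.0, 5.0), rng=None):
        """Mix an arbitrary number of single-speaker waveforms into one signal.

        sources: list of 1-D float arrays at the same sample rate.
        Each source after the first is rescaled to a random SNR (dB) w.r.t. source 0.
        """
        rng = rng or np.random.default_rng()
        length = min(len(s) for s in sources)        # trim to the shortest source
        mix = np.array(sources[0][:length], dtype=np.float32)
        ref_power = np.mean(mix ** 2) + 1e-8
        for src in sources[1:]:
            s = np.asarray(src[:length], dtype=np.float32)
            snr_db = rng.uniform(*snr_range)
            power = np.mean(s ** 2) + 1e-8
            # gain that places this source snr_db dB below the reference power
            gain = np.sqrt(ref_power / (power * 10 ** (snr_db / 10)))
            mix = mix + gain * s
        return mix / max(1.0, float(np.max(np.abs(mix))))   # avoid clipping

    # usage: mixture = mix_sources([wav_a, wav_b, wav_c])   # wav_* are 1-D arrays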
|
|
|
Zhenghai You
|
- Introduce more hard samples to improve model performance [1]
- SPK-AUG with the same length: there is an improvement, but SI-SDR decreases as the hard-sample rate increases (a reference SI-SDR definition is sketched below)
- Design more hard samples
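Several rows in this report quote SI-SDR; for reference, a generic implementation of the metric is sketched below. This is not the evaluation script behind the numbers reported here.

    # Reference SI-SDR (dB) computation; a generic sketch, not the project's metric code.
    import numpy as np

    def si_sdr(estimate, target, eps=1e-8):
        """Scale-invariant signal-to-distortion ratio in dB."""
        estimate = np.asarray(estimate, dtype=np.float64)
        target = np.asarray(target, dtype=np.float64)
        # remove the mean so the measure ignores DC offset
        estimate = estimate - estimate.mean()
        target = target - target.mean()
        # project the estimate onto the target to get the scaled reference
        alpha = np.dot(estimate, target) / (np.dot(target, target) + eps)
        s_target = alpha * target
        e_noise = estimate - s_target
        return 10 * np.log10((np.sum(s_target ** 2) + eps) / (np.sum(e_noise ** 2) + eps))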
|
|
|
Junming Yuan
|
- Results of time-mask MT-HuBERT [2] (a generic time-mask sketch is given below)
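The exact masking configuration is not given here; as a rough illustration, a SpecAugment-style time mask over a (time, feature) sequence could look like the sketch below. The mask count and maximum width are placeholder defaults, not the MT-HuBERT settings.

    # Generic SpecAugment-style time masking; defaults are placeholders.
    import torch

    def time_mask(features, num_masks=2, max_width=50):
        """Zero out `num_masks` random spans along the time axis of a (T, F) tensor."""
        x = features.clone()
        T = x.shape[0]
        for _ in range(num_masks):
            width = int(torch.randint(1, max_width + 1, (1,)))
            if width >= T:
                continue
            start = int(torch.randint(0, T - width, (1,)))
            x[start:start + width] = 0.0
        return x

    # usage: masked = time_mask(torch.randn(400, 80))   # 400 frames of 80-dim fbank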
|
|
|
Chen Chen
|
|
|
|
Xiaolou Li
|
- VTS with LLM: structure design and baseline code writing [3]
|
|
|
Zehua Liu
|
- Reading papers about in-context learning in ASR
- Training the model with an adaptive time mask
- Try in-context learning with only the previous sentence [4] (see the prompting sketch below)
- Start the VTS project report
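One way to realize in-context learning with only the previous sentence is to feed the previous transcript as a decoding prompt. The sketch below assumes a Whisper-style model through a recent Hugging Face transformers release; the model name and prompting API are assumptions, and the actual experiment may use a different setup.

    # Sketch: condition ASR decoding on the previous sentence as a prompt.
    import torch
    from transformers import WhisperProcessor, WhisperForConditionalGeneration

    processor = WhisperProcessor.from_pretrained("openai/whisper-small")
    model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

    def transcribe_with_context(audio_16k, previous_sentence):
        """audio_16k: 1-D waveform at 16 kHz; previous_sentence: its preceding transcript."""
        inputs = processor(audio_16k, sampling_rate=16000, return_tensors="pt")
        # the previous sentence becomes the decoder prompt (the in-context example)
        prompt_ids = processor.tokenizer.get_prompt_ids(previous_sentence, return_tensors="pt")
        with torch.no_grad():
            ids = model.generate(input_features=inputs.input_features, prompt_ids=prompt_ids)
        return processor.batch_decode(ids, skip_special_tokens=True)[0]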
|
|
|
Pengqi Li
|
- Consistency of TAO and LayerCAM
- Changed TAO from the input to the final conv layer and obtained higher consistency (AISHELL: 0.93 for any model)
|
|
|
Wan Lin
|
- NS: downsampling is not useful [https://z1et6d3xtb.feishu.cn/docx/MxBNdPbLao0tsoxkBVCcUgUoneh?from=from_copylink]
- Share the speaker meeting on Friday
|
|
|
Tianhao Wang
|
- AudioSep (CLAP) 5-mix experiments [https://z1et6d3xtb.feishu.cn/docx/DlR8dZRdEoZIwIxTOFvcQdbGnqg]:
- text-query: SDR=4.978, SI-SDR=1.972
- audio-query: SDR=6.907, SI-SDR=5.058
- These results are with the loudness limitation applied
|
- AudioSep (CLAP) without loudness limitation
- Project work
|
|
Xiaoxue Luo
|
- Comparative experiments between AudioSep and the baseline system (CLIPSep)
- Prepare the report
|
|
|
Zhenyu Zhou
|
- Reproduce 5-mix speech separation results:
- PIT: 2-mix: 16.04; 5-mix: 6.87 (see the PIT loss sketch below)
- Conditional: 5-mix: 5.38 (40 epochs)
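For reference, the PIT number comes from permutation-invariant training: every assignment of estimated to reference sources is scored and the best permutation is kept. The sketch below is a generic PIT loss with negative SI-SDR as the pairwise loss, not the project code; the factorial cost (120 permutations for 5-mix) is one motivation for the conditional variant.

    # Generic permutation-invariant training (PIT) loss sketch.
    from itertools import permutations
    import torch

    def pair_loss(est, ref, eps=1e-8):
        """Negative SI-SDR (dB) between one estimated and one reference source."""
        est = est - est.mean(dim=-1, keepdim=True)
        ref = ref - ref.mean(dim=-1, keepdim=True)
        alpha = (est * ref).sum(-1, keepdim=True) / (ref.pow(2).sum(-1, keepdim=True) + eps)
        s_target = alpha * ref
        noise = est - s_target
        si_sdr = 10 * torch.log10((s_target.pow(2).sum(-1) + eps) / (noise.pow(2).sum(-1) + eps))
        return -si_sdr                               # shape: (batch,)

    def pit_loss(estimates, references):
        """estimates, references: (batch, n_src, time); returns the best-permutation mean loss."""
        n_src = estimates.shape[1]
        candidates = []
        for perm in permutations(range(n_src)):      # n_src! candidate assignments
            losses = [pair_loss(estimates[:, i], references[:, p]) for i, p in enumerate(perm)]
            candidates.append(torch.stack(losses, dim=0).mean(dim=0))
        return torch.stack(candidates, dim=0).min(dim=0).values.mean()

    # usage: loss = pit_loss(torch.randn(4, 5, 16000), torch.randn(4, 5, 16000))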
|
|
|
Junhui Chen
|
- NS: speaker detection (method survey & debugging)
- Got sick
|
|
|
Jiaying Wang
|
|
|
|
Yu Zhang
|
- SocioDojo (still worse than the Nasdaq-100 baseline)
- Changed information sources; judging from the reports generated by the LLM, more of the new information sources are being referenced.
- Prompt the Actuator to consider the current cash ratio before investing (without this, the asset ratio goes up to 100%, which leads to high risk; still running)
- Read some papers about integrating time series into LLMs
|
|
|
Wenqiang Du
|
- Prepare data, code, and environment for Prof. Mijiti
|
|
|
Yang Wei
|
- Trained the text-enroll KWS model with Aibabel training data; it did not work.
|
|
|
Lily
|
|
|
|
Turi
|
- Whisper-large-v3 fine-tuning
- Freezing 20 encoder layers achieved 9.75 WER; vanilla fine-tuning achieved 8.02 WER (see the layer-freezing sketch below)
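A minimal sketch of the layer-freezing setup, assuming the Hugging Face Whisper implementation (the actual fine-tuning script may differ):

    # Freeze the first 20 of whisper-large-v3's 32 encoder layers before fine-tuning.
    from transformers import WhisperForConditionalGeneration

    model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v3")

    for layer in model.model.encoder.layers[:20]:    # keep the last 12 layers trainable
        for param in layer.parameters():
            param.requires_grad = False

    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"trainable parameters: {trainable / 1e6:.1f}M")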
|
|
|
Yue Gu
|
- Seek suggestions from the other authors. Many suggestions conflict, so I'm trying to figure out the reasons and resolve these issues.
|
|
|
Qi Qu
|
- KWS:
- Text-enroll models exported to ONNX (see the export sketch below).
- C/JNI libraries built on top of the ONNX models and ready for on-device testing.
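A generic sketch of the ONNX export plus an onnxruntime parity check. TinyKws is a stand-in model; the real text-enroll KWS architecture, input shape, and file names differ.

    # Export a (stand-in) KWS model to ONNX and verify it with onnxruntime.
    import torch
    import torch.nn as nn
    import onnxruntime as ort

    class TinyKws(nn.Module):
        """Stand-in keyword-spotting model: fbank frames -> keyword logits."""
        def __init__(self, feat_dim=80, hidden=128, num_keywords=10):
            super().__init__()
            self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
            self.head = nn.Linear(hidden, num_keywords)

        def forward(self, feats):                    # feats: (batch, frames, feat_dim)
            out, _ = self.rnn(feats)
            return self.head(out[:, -1])             # logits from the last frame

    model = TinyKws().eval()
    dummy = torch.randn(1, 100, 80)                  # (batch, frames, fbank dim)

    torch.onnx.export(
        model, dummy, "kws.onnx",
        input_names=["features"], output_names=["logits"],
        dynamic_axes={"features": {0: "batch", 1: "frames"}},   # variable-length input
        opset_version=17,
    )

    # onnxruntime output should closely match the PyTorch output
    sess = ort.InferenceSession("kws.onnx", providers=["CPUExecutionProvider"])
    onnx_logits = sess.run(None, {"features": dummy.numpy()})[0]
    torch_logits = model(dummy).detach().numpy()
    print("max abs diff:", abs(onnx_logits - torch_logits).max())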
|
|
|