People |
This Week |
Next Week |
Task Tracking (Deadline)
|
Dong Wang
|
- Primary school AI handbook (20-30)
|
|
|
Lantian Li
|
- AI-Graph EN (25/50)
- Complete CSTR intro report (11.18)
|
|
|
Ying Shi
|
- Cohort-Overlap ASR
- condition on real decoding results
- Design stop criterion
- Cohort-Speech separation
- several configs for Dual-path model
- group work
|
|
|
Zhenghai You
|
|
|
|
Junming Yuan
|
- Results of feat-mask/time-mask MT-HuBERT [1]
|
|
|
Xiaolou Li
|
- AVHuBERT unit exp
- dc connector (↑0.8% vs. discrete unit)
- concat feature and embedding (↑2% vs. discrete unit, ↓0.3% vs. baseline)
- CVS3 quality check (30h in total) [2]
- This work was helped by Zehua, Linwan, and Tianhao
- MLLM system with audio output design
|
|
|
Zehua Liu
|
- Verify VSR data
- Finish Data Verification Report
- ICL work (CER: 47.87% < CER: 51.08%)
- Time mask matters [3]
|
|
|
Pengqi Li
|
- Complete the final report of the doctoral innovation project (School)
- Explore the consistency of TAO and LayerCAM results across different models and datasets
- Conclusion and hypothesis [4]
|
|
|
Wan Lin
|
- Help with VSR data verification
- Experiments on VoxBlink2 [5]
|
|
|
Tianhao Wang
|
- adjust the code of AudioSep (CLAP) to support multi-mix and audio-query (in training)
- some project testing
|
|
|
Xiaoxue Luo
|
- AudioSep reproduction
- evaluate the performance of AudioSep
- comparative experiment between AudioSep and the baseline system (CLIPSep)
|
|
|
Zhenyu Zhou
|
- conditional chain 2-mix results reproduction (SI-SDR: 10.714 -> 15.6)
- model quantization final version submission
|
|
|
Junhui Chen
|
- Experiments for NS
- Look for a speaker detection model with ResNet34 for frame labels
|
|
|
Jiaying Wang
|
|
|
|
Yu Zhang
|
- SocioDojo Llama 3.1 8B investment task
- acc return is about 10% below the Nasdaq-100 index
|
- add more professional information sources, such as WSJ (the current source is Tweets Trending, which is too entertainment-oriented)
- control the BUY/SELL amount of the Actuator (the current investment ratio is too high)
- reproduce other multi-agent investment pipelines, such as FinAgent or FinRobot
|
|
Wenqiang Du
|
- Participated in an AI competition
|
|
|
Yang Wei
|
- Train a text-enrollment KWS model and test it with Aibabel dialect data.
|
|
|
Lily
|
|
|
|
Turi
|
- Whisper finetuning on sagalee
- with encoder frozen, whisper-large-v3 (20.5 WER)
- Finetuning LLM
- Finetuned Qwen2.5-0.5B on conversation dataset translated from English to Oromo
|
|
Yue Gu
|
- write the cover letter
- design a new speaker adaptation framework
|
|
|
Qi Qu
|
- AED:
- New CED-based classifiers deployed onto devices, yielding acceptable performance.
- KWS:
- Quantization and format conversion of production models for deployment on an embedded device with an NPU. The default quantization mode leads to an unacceptable loss of precision. Will try hybrid quantization.
- Text-enrollment KWS: some dynamic dimensions are misinterpreted as constant during export to ONNX.
|
|
|