Difference between revisions of "2024-10-21"
From cslt Wiki
Duwenqiang (talk | contribs) |
|||
(16 intermediate revisions by 13 users not shown) | |||
Line 6: | Line 6: | ||
|Dong Wang | |Dong Wang | ||
|| | || | ||
− | * | + | * Primary School AI handbook (20-30) |
|| | || | ||
* | * | ||
Line 29: | Line 29: | ||
|Ying Shi | |Ying Shi | ||
|| | || | ||
− | * | + | * Cohort-Overlap ASR |
+ | ** condition on real decode result | ||
+ | ** Design stop criterion | ||
+ | * Cohort-Speech separation | ||
+ | ** several configs for Dual-path model | ||
+ | * [https://z1et6d3xtb.feishu.cn/docx/X3C4dFRoKo3PATxvXWUc40LOnye?from=from_copylink group work ] | ||
|| | || | ||
* | * | ||
Line 40: | Line 45: | ||
|Zhenghai You | |Zhenghai You | ||
|| | || | ||
− | * | + | * Weekly report |
|| | || | ||
* | * | ||
Line 61: | Line 66: | ||
|Xiaolou Li | |Xiaolou Li | ||
|| | || | ||
− | * | + | * AVHuBERT unit exp |
+ | ** dc connector (↑0.8% than discrete unit) | ||
+ | ** concat feature and embedding (↑2% than discrete unit, ↓0.3% than baseline) | ||
+ | * CVS3 quality check (30 h in total) [https://z1et6d3xtb.feishu.cn/drive/folder/HGHbfyCJRlLYzUdSlEicOEztnYc] | ||
+ | * This work was helped by Zehua, Linwan, Tianhao | ||
+ | * MLLM system with audio output design | ||
|| | || | ||
* | * | ||
Line 72: | Line 82: | ||
|Zehua Liu | |Zehua Liu | ||
|| | || | ||
− | * | + | *Verify VSR data |
+ | *Finish Data Verification Report | ||
+ | *ICL work (CER: 47.87% < 51.08%) | ||
+ | *Time Mask matters [https://z1et6d3xtb.feishu.cn/docx/JBsidACDVojhCaxFQLbcCVbsnAc?from=from_copylink] | ||
|| | || | ||
* | * | ||
Line 96: | Line 109: | ||
|Wan Lin | |Wan Lin | ||
|| | || | ||
− | * | + | * help with VSR data verification |
+ | * experiments on voxblink2 [https://z1et6d3xtb.feishu.cn/docx/MxBNdPbLao0tsoxkBVCcUgUoneh?from=from_copylink] | ||
|| | || | ||
* | * | ||
Line 107: | Line 121: | ||
|Tianhao Wang | |Tianhao Wang | ||
|| | || | ||
− | * | + | * adjust the code of AudioSep (CLAP) to support multi-mix and audio-query (in training) |
+ | * some project testing | ||
|| | || | ||
* | * | ||
Line 120: | Line 135: | ||
*AudioSep reproduction | *AudioSep reproduction | ||
**evaluate the performance of AudioSep | **evaluate the performance of AudioSep | ||
− | ** | + | **comparative experiment between AudioSep and the baseline system (CLIPSep) |
***adjusting the code | ***adjusting the code | ||
|| | || | ||
Line 132: | Line 147: | ||
|Zhenyu Zhou | |Zhenyu Zhou | ||
|| | || | ||
− | * | + | *conditional chain 2-mix results reproduction (SI-SDR: 10.714 -> 15.6) |
+ | *model quantization final version submission | ||
|| | || | ||
* | * | ||
Line 143: | Line 159: | ||
|Junhui Chen | |Junhui Chen | ||
|| | || | ||
− | * | + | * Experiments for NS |
+ | * Look for a speaker detection model with ResNet34 for frame labels | ||
|| | || | ||
* | * | ||
Line 165: | Line 182: | ||
|Yu Zhang | |Yu Zhang | ||
|| | || | ||
− | * | + | * SocioDojo Llama 3.1 8B investment task |
+ | ** acc return is about 10% below the Nasdaq-100 index | ||
|| | || | ||
− | * | + | * add more professional information sources, such as WSJ (the current source is Tweets Trending, which is too entertainment-oriented) |
+ | * control the BUY/SELL amount of the Actuator (the current investment ratio is too high) | ||
+ | * reproduce other multi-agent investment pipelines such as FinAgent or FinRobot | ||
|| | || | ||
* | * | ||
Line 187: | Line 207: | ||
|Yang Wei | |Yang Wei | ||
|| | || | ||
− | * | + | * Train a text-enrollment KWS model and test it with Aibabel dialect data. |
|| | || | ||
* | * | ||
Line 207: | Line 227: | ||
|Turi | |Turi | ||
|| | || | ||
− | * | + | * Whisper finetuning on sagalee |
− | + | ** with encoder frozen, whisper-large-v3 (20.5 WER) | |
+ | * Finetuning LLM | ||
+ | ** Finetuned Qwen2.5-0.5B on conversation dataset translated from English to Oromo | ||
* | * | ||
|| | || | ||
Line 215: | Line 237: | ||
|Yue Gu | |Yue Gu | ||
|| | || | ||
− | * | + | * write the cover letter |
+ | * design a new speaker adaptation framework | ||
|| | || | ||
* | * | ||
Line 224: | Line 247: | ||
|Qi Qu | |Qi Qu | ||
|| | || | ||
− | * | + | * AED: |
+ | ** New CED-based classifiers deployed onto devices, yielding acceptable performance. | ||
+ | * KWS: | ||
+ | ** Quantization and format conversion of production models for deployment on an embedded device with an NPU. The default quantization mode leads to an unacceptable loss of precision. Will try hybrid quantization. | ||
+ | ** Text-enrollment KWS: some dynamic dimensions are misinterpreted as constant during export to ONNX. | ||
|| | || | ||
* | * |
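
Zehua Liu's entry above compares two character error rates (47.87% against 51.08%). As a rough illustration of how such a figure is usually obtained, the sketch below computes CER from a Levenshtein edit distance and the relative reduction between two rates. It assumes that 51.08% is the baseline and 47.87% is the ICL result; the function names are hypothetical and are not taken from the group's code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate in percent: edit distance over reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline, new):
    """Relative reduction of an error rate, in percent of the baseline."""
    return 100.0 * (baseline - new) / baseline


# Assumption: 51.08% is the baseline CER and 47.87% is the ICL result.
print(round(relative_reduction(51.08, 47.87), 2))  # about 6.28 (% relative)
```

Under that assumption, the absolute gain of 3.21 points corresponds to a relative CER reduction of roughly 6.3%.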
Latest revision as of 11:01, 21 October 2024 (Monday)
People | This Week | Next Week | Task Tracking (DeadLine) |
---|---|---|---|
Dong Wang | | | |
Lantian Li | | | |
Ying Shi | | | |
Zhenghai You | | | |
Junming Yuan | | | |
Xiaolou Li | | | |
Zehua Liu | | | |
Pengqi Li | | | |
Wan Lin | | | |
Tianhao Wang | | | |
Xiaoxue Luo | | | |
Zhenyu Zhou | | | |
Junhui Chen | | | |
Jiaying Wang | | | |
Yu Zhang | | | |
Wenqiang Du | | | |
Yang Wei | | | |
Lily | | | |
Turi | | | |
Yue Gu | | | |
Qi Qu | | | |
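
Zhenyu Zhou's entry in the diff above reports a separation score moving from 10.714 to 15.6. Assuming "sisidr" there means SI-SDR measured in dB (the entry does not state the unit), the following is a minimal pure-Python sketch of the standard scale-invariant SDR definition. The function name and the toy signals are illustrative only and do not come from the group's code.

```python
import math


def si_sdr(estimate, reference):
    """Scale-invariant SDR in dB for two equal-length 1-D signals."""
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")
    ref_energy = sum(r * r for r in reference)
    if ref_energy == 0:
        raise ValueError("reference must not be all zeros")
    # Project the estimate onto the reference to get the scaled target.
    alpha = sum(e * r for e, r in zip(estimate, reference)) / ref_energy
    target = [alpha * r for r in reference]
    noise = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise)
    if noise_energy == 0:
        return math.inf   # perfect reconstruction up to scale
    if target_energy == 0:
        return -math.inf  # estimate is orthogonal to the reference
    return 10.0 * math.log10(target_energy / noise_energy)


# Toy signals (hypothetical): a clean reference and a slightly noisy estimate.
reference = [1.0, 0.0, -1.0, 0.0]
estimate = [1.0, 0.1, -1.0, -0.1]
print(si_sdr(estimate, reference))
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate leaves the score unchanged. If the reported numbers are in dB, the change from 10.714 to 15.6 is a gain of about 4.9 dB.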