Difference between revisions of "2024-10-14"
From cslt Wiki
(13 intermediate revisions by 8 users not shown)
Line 6:
 |Dong Wang
 ||
− *
+ * AI handbook high-education version, experiment booklet
+ * Check AI primary school handbook (1-20)
 ||
 *
Line 17 → Line 18:
 |Lantian Li
 ||
− *
+ * AI-Graph EN (20/50)
+ * Prepare CSTR intro report
 ||
 *
Line 28 → Line 30:
 |Ying Shi
 ||
− *
+ * Finished text-enroll keyword spotting code & documentation; delivered to Wei & Du
+ * Cohort overlap ASR code v0.0
+ ** code finished; training done
+ * Cohort speech separation code v0.0
+ ** code finished; training in progress
+ * [https://z1et6d3xtb.feishu.cn/docx/OHjsdgVmhoXUGpxvh5tcaBN4nAh?from=from_copylink here]
 ||
 *
Line 39 → Line 46:
 |Zhenghai You
 ||
− *
+ * Exploring the role of the speaker encoder in TSE and the generality of SPK-AUG [https://z1et6d3xtb.feishu.cn/docx/GHF8doRjDo50ihxGUPpcsZgLncb?from=space_home_recent&pre_pathname=%2Fdrive%2Fhome%2F&previous_navigation_time=1728902573829]
 ||
 *
Line 75 → Line 82:
 |Xiaolou Li
 ||
− *
+ * AV-HuBERT discrete unit training (WER ↓1.5-3%)
+ ** rethink how to demonstrate the advantages or disadvantages of discrete units
+ * Dense connector experiments (in training)
+ * Double-check the existing 3000h of data in CVS2
+ * Paper reading (discrete units, VTS)
 ||
− *
+ * Design an experiment to explain the performance of discrete units
+ * Finish the data double-check
+ * Try to build a simple VTS system based on our VSR system
 ||
 *
Line 86 → Line 99:
 |Zehua Liu
 ||
− *
+ * AV-HuBERT (frozen) as encoder performs very poorly (CER: 80%) [https://z1et6d3xtb.feishu.cn/docx/JBsidACDVojhCaxFQLbcCVbsnAc?from=from_copylink]
+ ** may improve after fine-tuning, but still poor
+ * Qwen-14B performs better (47%) than Qwen-7B (50%)
+ * Finished in-context learning code; training in progress
+ ** results expected soon
 ||
− *
+ * Verify collected data with Xiaolou
+ * Finish the VTS data acceptance report
 ||
 *
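For reference, error rates like the CER figures above are conventionally computed as Levenshtein edit distance divided by reference length. A minimal self-contained sketch of that convention (not the group's actual scoring script, which is not shown here):

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between reference and hypothesis, via 1-row DP."""
    dp = list(range(len(hyp) + 1))  # distances for the empty-reference prefix
    for i in range(1, len(ref) + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, len(hyp) + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                            # deletion
                        dp[j - 1] + 1,                        # insertion
                        prev + (ref[i - 1] != hyp[j - 1]))    # substitution
            prev = cur
    return dp[-1]

def cer(ref: str, hyp: str) -> float:
    """Character error rate: edits / reference length."""
    return edit_distance(ref, hyp) / len(ref)

print(cer("abcd", "abxd"))  # one substitution over four characters -> 0.25
```

The same function gives WER when applied to lists of words instead of character strings.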
Line 109 → Line 127:
 |Wan Lin
 ||
− *
+ * NS
+ ** poster
+ ** data preparation and processing
+ ** adjust the training code
 ||
 *
Line 120 → Line 141:
 |Tianhao Wang
 ||
− *
+ * CLIPSep experiments for 2-mix and 5-mix [https://z1et6d3xtb.feishu.cn/docx/DnJgdwtNhotEpIxH7zfcksETnte]
+ ** 2-mix (whole VGGSound, 300 classes): SDR-mix = -1.1748, SDR-separate = 5.0145
+ ** 5-mix (50 classes of VGGSound): SDR-mix = -11.4529, SDR-separate = -0.4764
 ||
 *
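The SDR numbers above are signal-to-distortion ratios in dB. A minimal sketch of the plain (non-scale-invariant) definition, 10·log10(‖s‖² / ‖s − ŝ‖²); the experiments may use a BSS-eval or scale-invariant variant, so this is illustrative only:

```python
import math

def sdr(reference: list[float], estimate: list[float]) -> float:
    """Signal-to-distortion ratio in dB: 10*log10(||s||^2 / ||s - s_hat||^2)."""
    signal_power = sum(x * x for x in reference)
    noise_power = sum((x - y) ** 2 for x, y in zip(reference, estimate))
    return 10 * math.log10(signal_power / noise_power)

# An estimate at half the reference amplitude: power ratio 2/0.5 = 4 -> ~6.02 dB.
print(round(sdr([1.0, 0.0, -1.0, 0.0], [0.5, 0.0, -0.5, 0.0]), 2))  # 6.02
```

Higher is better; the negative "SDR-mix" values above mean the unprocessed mixture is worse than the reference, and "SDR-separate" measures the improvement after separation.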
Line 144 → Line 167:
 |Zhenyu Zhou
 ||
− *
+ * Model quantization, version 2
+ * Multi-talker mixed-data preparation
 ||
 *
Line 155 → Line 179:
 |Junhui Chen
 ||
− *
+ * Prepare vb2 data
+ ** Too many utterances for training (out of memory); thinking of a smart way to partition them.
 ||
 *
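One simple way to partition an oversized utterance list is fixed-size sharding, so each training job only loads one shard at a time. A minimal sketch with hypothetical names (the actual vb2 pipeline is not specified above):

```python
def make_shards(utterances: list, shard_size: int) -> list[list]:
    """Split a list of utterance IDs into consecutive shards of at most shard_size."""
    return [utterances[i:i + shard_size]
            for i in range(0, len(utterances), shard_size)]

# Toy example: 10 utterance IDs split into shards of at most 4.
utts = [f"utt{i:04d}" for i in range(10)]
shards = make_shards(utts, 4)
print([len(s) for s in shards])  # [4, 4, 2]
```

Smarter schemes (e.g. balancing shards by speaker or total duration) keep the same interface: each shard is a self-contained list that fits in memory.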
Latest revision as of 11:02, 14 October 2024 (Monday)
| People | This Week | Next Week | Task Tracking (Deadline) |
|---|---|---|---|
| Dong Wang | | | |
| Lantian Li | | | |
| Ying Shi | | | |
| Zhenghai You | | | |
| Junming Yuan | | | |
| Chen Chen | | | |
| Xiaolou Li | | | |
| Zehua Liu | | | |
| Pengqi Li | | | |
| Wan Lin | | | |
| Tianhao Wang | | | |
| Xiaoxue Luo | | | |
| Zhenyu Zhou | | | |
| Junhui Chen | | | |
| Jiaying Wang | | | |
| Yu Zhang | | | |
| Wenqiang Du | | | |
| Yang Wei | | | |
| Lily | | | |
| Turi | | | |
| Yue Gu | | | |
| Qi Qu | | | |