Difference between revisions of “2024-11-11”
From cslt Wiki
(16 intermediate revisions by 13 users not shown) | |||
Line 17: | Line 17: | ||
|Lantian Li | |Lantian Li | ||
|| | || | ||
− | * | + | * Complete all the scripts for the 2025 AI calendar |
+ | * AI-Graph EN (32/50) | ||
|| | || | ||
* | * | ||
Line 39: | Line 40: | ||
|Zhenghai You | |Zhenghai You | ||
|| | || | ||
− | * | + | * Huawei project with IRA-TSE [https://z1et6d3xtb.feishu.cn/docx/R05DdrPVqoSzQYxNlhicedxenkd] |
|| | || | ||
* | * | ||
Line 50: | Line 51: | ||
|| | || | ||
* Re-checked some details from the Cocktail HuBERT paper and prepared the code. | * Re-checked some details from the Cocktail HuBERT paper and prepared the code. | ||
+ | ** Pseudo-label preparation finished. | ||
* paper reading | * paper reading | ||
|| | || | ||
Line 61: | Line 63: | ||
|Xiaolou Li | |Xiaolou Li | ||
|| | || | ||
− | * | + | * Finish VTS documents with Zehua |
+ | * Process the CVS3 data | ||
+ | * Inherit the AV-HuBERT training code and debug it | ||
|| | || | ||
* | * | ||
Line 72: | Line 76: | ||
|Zehua Liu | |Zehua Liu | ||
|| | || | ||
− | * | + | *Finish 2 VTS documents with Xiaolou |
+ | **Financial Document | ||
+ | **Technical Document | ||
+ | *Paper reading last Friday | ||
|| | || | ||
* | * | ||
Line 83: | Line 90: | ||
|Pengqi Li | |Pengqi Li | ||
|| | || | ||
− | * | + | * Analyze the distribution of phoneme importance (PID) in the TIMIT dataset based on more SOTA models (TDNN: 4.4%, ECAPA: 2.8%). |
+ | ** Conclusions still need further analysis in conjunction with other databases. [https://z1et6d3xtb.feishu.cn/docx/VtlIdFxdRodp8Nx8oQjcVLC4nCd] | ||
|| | || | ||
* | * | ||
Line 94: | Line 102: | ||
|Wan Lin | |Wan Lin | ||
|| | || | ||
− | * | + | * NS: detection |
+ | ** clean: 1.479% EER vs. 1.239% EER | ||
+ | ** multi: in training | ||
|| | || | ||
* | * | ||
Line 104: | Line 114: | ||
|- | |- | ||
|Tianhao Wang | |Tianhao Wang | ||
+ | || | ||
+ | * Ablation study of some new approaches for sound separation [https://z1et6d3xtb.feishu.cn/docx/NLlsdyLtuoptjYxjcX0cwlVbnXc] | ||
|| | || | ||
* | * | ||
+ | || | ||
+ | * | ||
+ | |- | ||
+ | |||
+ | |||
+ | |- | ||
+ | |Xiaoxue Luo | ||
+ | || | ||
+ | * Paper reading to investigate some new approaches for sound separation | ||
+ | * Retrain AudioSep with a DPRNN block (AudioSep-DP) | ||
|| | || | ||
* | * | ||
Line 128: | Line 150: | ||
|Junhui Chen | |Junhui Chen | ||
|| | || | ||
− | * | + | * VAD frame-level detection loss |
+ | ** Loss decreases faster in the early stages of training | ||
+ | * Change test encoder: from ResNet34 to Transformer encoder (coding...) | ||
|| | || | ||
* | * | ||
Line 163: | Line 187: | ||
|Wenqiang Du | |Wenqiang Du | ||
|| | || | ||
− | * Training of New | + | * Training of new language models (Cantonese) |
* Prepare the PPT for the competition | * Prepare the PPT for the competition | ||
|| | || | ||
Line 175: | Line 199: | ||
|Yang Wei | |Yang Wei | ||
|| | || | ||
− | * | + | * Train text-enroll KWS model with 7000h data |
|| | || | ||
* | * | ||
Line 195: | Line 219: | ||
|Turi | |Turi | ||
|| | || | ||
− | * | + | * KWS data preparation and checking some implementations |
* Paper reading about KWS | * Paper reading about KWS | ||
− | |||
|| | || | ||
* | * | ||
Line 216: | Line 239: | ||
|Qi Qu | |Qi Qu | ||
|| | || | ||
− | * | + | * KWS: |
+ | ** Yi (Liangshan, Sichuan) test dataset annotated and finalized. Optimal thresholds for predefined scenes. Cloud model service deployed. | ||
+ | ** Quantization for NPU with more calibration data (6k): mean_loss=1.3e-4, max_loss=6.2e-2. | ||
+ | ** NPU demo: feature extraction + model inference. | ||
+ | ** Text-enroll method: Android demo benchmark. | ||
|| | || | ||
* | * |
Latest revision as of 11:05, 11 November 2024 (Monday)
People | This Week | Next Week | Task Tracking (DeadLine) |
---|---|---|---|
Dong Wang | | | |
Lantian Li | | | |
Ying Shi | | | |
Zhenghai You | | | |
Junming Yuan | | | |
Xiaolou Li | | | |
Zehua Liu | | | |
Pengqi Li | | | |
Wan Lin | | | |
Tianhao Wang | | | |
Xiaoxue Luo | | | |
Zhenyu Zhou | | | |
Junhui Chen | | | |
Jiaying Wang | | | |
Yu Zhang | | | |
Wenqiang Du | | | |
Yang Wei | | | |
Lily | | | |
Turi | | | |
Yue Gu | | | |
Qi Qu | | | |