Difference between revisions of "2024-11-25"
From cslt Wiki
(25 intermediate revisions by 16 users not shown)
Line 6: | Line 6:
|Dong Wang
||
− *
+ * Second round of checks on the AI handbook (middle school).
+ * Deal with pictures in the AI handbooks (primary & middle school).
+ * Start checking the AI handbook (high school).
+ * Check the AI book for the Tianjin medical school.
||
*
Line 30: | Line 33:
|Ying Shi
||
− *
+ * Design a cohort-conditional chain multi-talker ASR with a round-RNN (a minimal sketch follows this list).
+ ** WER results: round-1 32.15%, round-2 69.69%, round-3 92.33%.
+ ** On a 500-utterance sub-test set, only 28% of the sentences have a recognition order that matches the cosine distance.
+ * Prepare for Huawei's interview.
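A minimal sketch of the conditional-chain idea above, assuming a CTC-style head, an LSTMCell as the round-RNN, and a mean-pooled round summary (module names, dimensions, and the conditioning scheme are illustrative assumptions, not the actual model):
<pre>
# Sketch: a shared encoder embeds the mixture; an LSTMCell ("round-RNN")
# carries state across decoding rounds, so round t is conditioned on what
# was emitted in rounds 1..t-1; each round emits one speaker's transcript.
import torch
import torch.nn as nn

class CondChainASR(nn.Module):
    def __init__(self, feat_dim=80, hid=256, vocab=5000, max_rounds=3):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hid, num_layers=2,
                               batch_first=True, bidirectional=True)
        self.round_rnn = nn.LSTMCell(2 * hid, 2 * hid)   # cross-round state
        self.ctc_head = nn.Linear(2 * hid, vocab)        # per-round CTC logits
        self.max_rounds = max_rounds

    def forward(self, mixture_feats):
        # mixture_feats: (batch, time, feat_dim)
        enc, _ = self.encoder(mixture_feats)             # (B, T, 2*hid)
        B, T, D = enc.shape
        h = enc.new_zeros(B, D)
        c = enc.new_zeros(B, D)
        outputs = []
        for _ in range(self.max_rounds):
            # summarize the previous round and update the round-RNN state
            h, c = self.round_rnn(enc.mean(dim=1), (h, c))
            # condition frame-level features on the current round state
            cond = enc + h.unsqueeze(1)
            outputs.append(self.ctc_head(cond))          # one speaker per round
        return outputs                                   # list of (B, T, vocab)
</pre>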
||
*
Line 41: | Line 47:
|Zhenghai You
||
− *
+ * Huawei TSE (train models that better fit the target scene) [https://z1et6d3xtb.feishu.cn/docx/AArOdQEQPoFcshxD5OfcB9SLnFg]
||
*
Line 51: | Line 57:
|Junming Yuan
||
− *
+ * Comparable results between Clean-HuBERT, Cocktail-HuBERT, and MT-HuBERT [https://z1et6d3xtb.feishu.cn/docx/YhJadT52mokvPQxfV3qcmtlwnkb]
+ ** Bad news: Cocktail-HuBERT > Clean-HuBERT > MT-HuBERT.
||
*
Line 73: | Line 80:
|Xiaolou Li
||
− *
+ * Data processing
+ ** CVS3: 1/4 already cut from the original videos, waiting for pre-processing.
+ ** Copying pre-processed GongAn video data from gonganbu.
+ * VSR contrastive-loss experiment
+ ** Inspired by the paper [https://arxiv.org/abs/2408.11813].
+ ** Main idea: to better align visual features with the LLM input space, compute the cosine similarity between target and video features and take the highest-similarity pair as the positive pair (a minimal sketch follows this list).
+ ** Result: still in training.
+ * Paper reading
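A minimal sketch of the positive-pair selection described above, assuming an InfoNCE-style objective over frame features (shapes, names, and the temperature are illustrative assumptions, not the actual experiment code):
<pre>
# For each target (text) embedding, the video frame feature with the highest
# cosine similarity is treated as the positive; the other frames in the clip
# act as negatives.
import torch
import torch.nn.functional as F

def contrastive_align_loss(video_feats, target_feats, temperature=0.07):
    """video_feats: (num_frames, dim); target_feats: (num_tokens, dim)."""
    v = F.normalize(video_feats, dim=-1)
    t = F.normalize(target_feats, dim=-1)
    sim = t @ v.T                      # (num_tokens, num_frames) cosine similarities
    pos_idx = sim.argmax(dim=-1)       # highest-similarity frame = positive pair
    logits = sim / temperature
    return F.cross_entropy(logits, pos_idx)

loss = contrastive_align_loss(torch.randn(50, 256), torch.randn(10, 256))
</pre>
In the actual experiment this term would presumably be added to the main VSR training loss; it is shown here in isolation.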
||
*
Line 84: | Line 98:
|Zehua Liu
||
− *
+ * Rebuttal writing
+ * Iterative training and inference
+ ** Iter-1 (45.53%) → Iter-2 (45.00%) → Iter-3 (44.85%)
||
*
Line 95: | Line 111:
|Pengqi Li
||
− *
+ * Began writing the paper on the phoneme-importance analysis work.
+ * Reading a doctoral thesis on speaker explainability [https://theses.hal.science/tel-04634215v1/file/These_BEN_AMOR.pdf].
||
*
Line 106: | Line 124:
|Wan Lin
||
− *
+ * NS: all-transformer architecture
+ ** 6k speakers: EER 2.6%
+ ** 20k speakers: EER 2.3%
+ ** 20k speakers + multi-enroll: EER 1.9%
||
*
Line 116: | Line 137:
|-
|Tianhao Wang
+ ||
+ * Experiments on the query-embedding conditioning approach (a minimal FiLM sketch follows this list):
+ ** SDR: FiLM (7.492) > self-attention (6.573)
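A minimal sketch of FiLM conditioning on a query embedding, assuming the query predicts per-channel scale and shift applied to the mixture features (dimensions and placement in the separator are illustrative assumptions):
<pre>
# FiLM: the query embedding produces gamma (scale) and beta (shift) that
# modulate the mixture representation channel-wise.
import torch
import torch.nn as nn

class FiLMCondition(nn.Module):
    def __init__(self, query_dim=256, feat_dim=512):
        super().__init__()
        self.to_gamma_beta = nn.Linear(query_dim, 2 * feat_dim)

    def forward(self, mix_feats, query_emb):
        # mix_feats: (batch, time, feat_dim); query_emb: (batch, query_dim)
        gamma, beta = self.to_gamma_beta(query_emb).chunk(2, dim=-1)
        return gamma.unsqueeze(1) * mix_feats + beta.unsqueeze(1)

# Example: condition a (2, 100, 512) mixture representation on a 256-d query.
film = FiLMCondition()
out = film(torch.randn(2, 100, 512), torch.randn(2, 256))
</pre>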
||
*
+ ||
+ *
+ |-

+ |-
+ |Xiaoxue Luo
+ ||
+ * Training of the USS (CED + AudioSep) model
+ ** Adjusting the audio format to meet the model's requirements (in training); a minimal resampling sketch follows this list.
+ * Production of the 2025 Daily Sign (March)
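A minimal sketch of the audio-format adjustment, assuming 16 kHz mono 16-bit PCM as the illustrative target format and torchaudio for I/O (the actual requirements of the model may differ):
<pre>
# Load, downmix to mono, resample, and save as 16-bit PCM WAV.
import torchaudio
import torchaudio.functional as AF

def to_model_format(in_path, out_path, target_sr=16000):
    wav, sr = torchaudio.load(in_path)            # (channels, samples)
    wav = wav.mean(dim=0, keepdim=True)           # downmix to mono
    if sr != target_sr:
        wav = AF.resample(wav, orig_freq=sr, new_freq=target_sr)
    torchaudio.save(out_path, wav, target_sr,
                    encoding="PCM_S", bits_per_sample=16)
</pre>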
||
*
Line 128: | Line 163:
|Zhenyu Zhou
||
− *
+ * Speaker-identity-based conditional chain proposal [https://z1et6d3xtb.feishu.cn/docx/MzZ8d3cDWokCzCx0MmDcRJDFnke]
+ * Prepare the interim report
||
*
Line 139: | Line 175:
|Junhui Chen
||
− *
+ * Paper reading (an ICCIP keynote paper and some others)
+ * NS
+ ** Some tests on the transformer feature extractor
||
*
Line 161: | Line 199:
|Yu Zhang
||
− *
+ * Huawei AED
+ ** Data augmentation & a human-annotated dataset [https://z1et6d3xtb.feishu.cn/wiki/AO2CwQC4gioaq6k1SkkcARBAn2f]
+ * Finance
+ ** Paper reading; reproducing a local Llama version of StockAgent [https://github.com/MingyuJ666/Stockagent] (an LLM-based market simulation framework)
||
*
Line 172: | Line 213:
|Wenqiang Du
||
− *
+ * Training of the new language model (Henan) [https://z1et6d3xtb.feishu.cn/docx/R7uIdGnwBo69bqxXki8cn50cnfh?from=from_copylink]
+ * Training of the new language model (Chongqing) [https://z1et6d3xtb.feishu.cn/docx/FioOdh8Uqo83oCxAJRzcGXcRnae?from=from_copylink]
||
*
Line 183: | Line 226:
|Yang Wei
||
− *
+ * Fixed some bugs in keyword sampling in the text-enroll KWS training code.
+ * Added spec augmentation to text-enroll KWS training (a minimal sketch follows this list).
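A minimal SpecAugment-style masking sketch, assuming plain (time, freq) log-mel tensors; the mask widths and interface are illustrative assumptions, not the project's training code:
<pre>
# Randomly zero a band of frequency bins and a span of time frames,
# as commonly done for KWS/ASR data augmentation.
import torch

def spec_augment(spec, max_freq_mask=8, max_time_mask=20):
    """spec: (time, freq) log-mel features; returns an augmented copy."""
    spec = spec.clone()
    t, f = spec.shape
    # frequency mask
    f_width = torch.randint(0, max_freq_mask + 1, (1,)).item()
    f_start = torch.randint(0, max(1, f - f_width), (1,)).item()
    spec[:, f_start:f_start + f_width] = 0.0
    # time mask
    t_width = torch.randint(0, max_time_mask + 1, (1,)).item()
    t_start = torch.randint(0, max(1, t - t_width), (1,)).item()
    spec[t_start:t_start + t_width, :] = 0.0
    return spec

augmented = spec_augment(torch.randn(100, 80))
</pre>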
||
*
Line 203: | Line 247:
|Turi
||
− *
+ * Paper reading
+ * ICASSP 2025 rebuttal
||
*
Line 211: | Line 256:
|Yue Gu
||
− *
+ * Synthesized about 1 h of data for each target speaker, then used these data to train the adapter module (a minimal sketch follows this list). [https://z1et6d3xtb.feishu.cn/wiki/VPZfwx53ei2zkgkSvPtcCiDSnVh?from=from_copylink]
+ * Writing the TASLP paper
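A minimal residual bottleneck-adapter sketch, assuming the adapter sits inside a frozen backbone and only its parameters are trained on the synthesized per-speaker data (the design and dimensions are illustrative assumptions, not the actual module):
<pre>
# A small residual bottleneck adapter: down-project, non-linearity, up-project,
# added back onto the frozen backbone's representation.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, dim=512, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.ReLU()

    def forward(self, x):
        # residual connection keeps the frozen backbone's representation intact
        return x + self.up(self.act(self.down(x)))

# Only adapter parameters would be optimized, e.g.
# optimizer = torch.optim.Adam(adapter.parameters(), lr=1e-4)
adapter = Adapter()
out = adapter(torch.randn(2, 100, 512))
</pre>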
||
*
Line 220: | Line 266:
|Qi Qu
||
− *
+ * Finding ideal thresholds and deploying cloud services for the KWS models `zh48_guangdong` and `zh48_haining20` (a minimal threshold-sweep sketch follows this list).
+ * Located and fixed a bug in FunASR that could lead to a segmentation fault; built the service with an extended gRPC protocol.
+ * Analysis of some AED false alarms (cries and slaps).
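A minimal sketch of threshold selection for a KWS model, assuming held-out keyword and background scores and a false-alarm budget (the scoring interface and the budget value are illustrative assumptions, not the deployed tooling):
<pre>
# Sweep candidate thresholds and keep the lowest one whose false-alarm rate
# on background scores stays within the budget (maximizing keyword recall).
def pick_threshold(keyword_scores, background_scores, max_fa_rate=0.01):
    candidates = sorted(set(keyword_scores) | set(background_scores))
    for th in candidates:
        fa = sum(s >= th for s in background_scores) / max(1, len(background_scores))
        if fa <= max_fa_rate:
            return th
    return None

# Example with toy scores.
th = pick_threshold([0.9, 0.8, 0.75], [0.2, 0.4, 0.85, 0.1], max_fa_rate=0.25)
print(th)
</pre>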
||
*
Latest revision as of 11:04, 25 November 2024 (Mon)
People | This Week | Next Week | Task Tracking (Deadline)
---|---|---|---
Dong Wang | | |
Lantian Li | | |
Ying Shi | | |
Zhenghai You | | |
Junming Yuan | | |
Chen Chen | | |
Xiaolou Li | | |
Zehua Liu | | |
Pengqi Li | | |
Wan Lin | | |
Tianhao Wang | | |
Xiaoxue Luo | | |
Zhenyu Zhou | | |
Junhui Chen | | |
Jiaying Wang | | |
Yu Zhang | | |
Wenqiang Du | | |
Yang Wei | | |
Lily | | |
Turi | | |
Yue Gu | | |
Qi Qu | | |