Difference between revisions of "2025-03-03"

From cslt Wiki
 
(25 intermediate revisions by 18 users not shown)
Line 6: Line 6:
 
|Dong Wang
 
|Dong Wang
 
||
 
||
*  
+
* Three slides for AIGE to gov and enterprise.
 
+
 
||
 
||
 
*
 
*
Line 18: Line 17:
 
|Lantian Li
 
|Lantian Li
 
||
 
||
*  
+
* Proofreading of the high-school book (done)
 
||
 
||
 
*
 
*
Line 29: Line 28:
 
|Ying Shi
 
|Ying Shi
 
||
 
||
*  
+
* Prepare Ascend Server environment
 +
* Training Conditional Chain overlap ASR model with Hierarchical-Transformer [https://z1et6d3xtb.feishu.cn/docx/UPxGdBl4Zo6JapxHmsXcI14Pngg?from=from_copylink here]
 
||
 
||
 
*  
 
*  
Line 41: Line 41:
 
|Zhenghai You
 
|Zhenghai You
 
||
 
||
*  
+
* Training TSE model with content enrollment (for Huawei & CSSC (中船) projects)
 +
* Reading papers about refiner
 
||
 
||
 
*
 
*
Line 54: Line 55:
 
* Double check the related experimental code.
 
* Double check the related experimental code.
 
** MT-HuBERT (in progress) & Cocktail-HuBERT need re-pretraining.
 
** MT-HuBERT (in progress) & Cocktail-HuBERT need re-pretraining.
** The results of other baseline on [https://z1et6d3xtb.feishu.cn/docx/VThUd30RPoTBR4xOiKYc4gQTnsb here]
+
** The results of other baseline in [https://z1et6d3xtb.feishu.cn/docx/VThUd30RPoTBR4xOiKYc4gQTnsb here]
 
||
 
||
 
*
 
*
Line 65: Line 66:
 
|Xiaolou Li
 
|Xiaolou Li
 
||
 
||
*  
+
* VSR training (1500h) cnvsrc-single valid 300 CER: 36.14% (not converged)
 +
* Finish pre-processing 4000h data
 +
* Get ASR transcripts for the 4000h data
 +
* Writing NSFC document
 
||
 
||
 
*  
 
*  
Line 76: Line 80:
 
|Zehua Liu
 
|Zehua Liu
 
||
 
||
*
+
*Paper reading and sharing last Friday
 +
*Writing Vision Language Model code
 +
*Writing NSFC document
 
||
 
||
 
*
 
*
Line 87: Line 93:
 
|Pengqi Li
 
|Pengqi Li
 
||
 
||
*  
+
* Prepare the AI course for Tsinghua University Junior High School.
 +
* Using t-SNE to visualize the factorized content vector.
 +
** Next step: color each point by speaker-information importance (important or not).
 
||
 
||
 
*
 
*
Line 98: Line 106:
 
|Wan Lin
 
|Wan Lin
 
||
 
||
*  
+
* Tried some adjustments for clean performance (no improvement)
 +
* Add supplementary experiments for other tests
 
||
 
||
 
*
 
*
Line 109: Line 118:
 
|Tianhao Wang
 
|Tianhao Wang
 
||
 
||
*  
+
* sound separation: 2-mix and 3-mix model training
 +
* weekly report
 
||
 
||
*
+
* subset data training
 
||
 
||
 
*
 
*
Line 120: Line 130:
 
|Xiaoxue Luo
 
|Xiaoxue Luo
 
||
 
||
*  
+
* Generated multi-mix audio data and ran some test experiments.
 +
* read papers
 
||
 
||
 
*
 
*
Line 131: Line 142:
 
|Zhenyu Zhou
 
|Zhenyu Zhou
 
||
 
||
*
+
* finish graduation thesis
 
||
 
||
 
*
 
*
Line 142: Line 153:
 
|Junhui Chen
 
|Junhui Chen
 
||
 
||
*
+
* Reproducing speaker diarization method for NS (debugging...)
 +
* read paper
 
||
 
||
 
*
 
*
Line 153: Line 165:
 
|Jiaying Wang
 
|Jiaying Wang
 
||
 
||
*
+
* Debug the CTC loss part [https://z1et6d3xtb.feishu.cn/docx/TUHldiaoQoYBqux7JEhcaCXenzh]
 
||
 
||
 
*
 
*
Line 164: Line 176:
 
|Yu Zhang
 
|Yu Zhang
 
||
 
||
*  
+
* AED:
 +
** Split the AED model into two smaller models that detect the human voice in noisy and in clean environments separately.
 +
** Trying smaller model (under 200K)
 +
* Multi Agent Investment
 +
** Tried index-enhancement trading; no obvious excess return
 
||
 
||
*
+
* Try portfolio investment on some selected large companies
 +
* Add a debate topic on the logical consistency of investment decisions.
 
||
 
||
 
*
 
*
Line 175: Line 192:
 
|Wenqiang Du
 
|Wenqiang Du
 
||
 
||
*  
+
* Primary handbook's PPT (24/44)
 +
* Continue to check Primary and middle handbook (completed this week)
 +
* Speech cloning sample for the company
 +
 
 
||
 
||
 
*
 
*
Line 186: Line 206:
 
|Yang Wei
 
|Yang Wei
 
||
 
||
*  
+
* Tuning text-enroll KWS model for dialect data with a linear layer (recall: 65% -> 85% -> 94%)
 
||
 
||
 
*
 
*
Line 196: Line 216:
 
|Turi
 
|Turi
 
||
 
||
*  
+
* Thesis writing
 +
* Result with LM [https://z1et6d3xtb.feishu.cn/docx/JvDsd8zR4oMwnyxQEQdckpMjn7m?from=from_copylink]
 
||
 
||
 
*  
 
*  
Line 204: Line 225:
 
|Yue Gu
 
|Yue Gu
 
||
 
||
*  
+
* Finished some experiments, but nothing improved.
 +
* Finished a proposal; I will present it soon.
 
||
 
||
 
*
 
*
Line 213: Line 235:
 
|Qi Qu
 
|Qi Qu
 
||
 
||
*  
+
* Applying pre-prod eval routine on text-enroll KWS models: the ideal thresholds for each keyword vary significantly. [https://b30lttjm7l.feishu.cn/docx/BepsdxzYloNlLNxHgGncSUXVnee?from=from_copylink]
 
||
 
||
 
*
 
*

Latest revision as of 11:00, 3 March 2025 (Monday)

People | This Week | Next Week | Task Tracking (Deadline)
Dong Wang
  • Three slides for AIGE to gov and enterprise.
Lantian Li
  • Proofreading of the high-school book (done)
Ying Shi
  • Prepare Ascend Server environment
  • Training Conditional Chain overlap ASR model with Hierarchical-Transformer (details: https://z1et6d3xtb.feishu.cn/docx/UPxGdBl4Zo6JapxHmsXcI14Pngg?from=from_copylink)
Zhenghai You
  • Training TSE model with content enrollment (for Huawei & CSSC (中船) projects)
  • Reading papers about refiner
Junming Yuan
  • Finish MPC-HuBERT pretrain.
  • Double check the related experimental code.
    • MT-HuBERT (in progress) & Cocktail-HuBERT need re-pretraining.
    • Results of the other baselines: https://z1et6d3xtb.feishu.cn/docx/VThUd30RPoTBR4xOiKYc4gQTnsb
Xiaolou Li
  • VSR training (1500h) cnvsrc-single valid 300 CER: 36.14% (not converged)
  • Finish pre-processing 4000h data
  • Get ASR transcripts for the 4000h data
  • Writing NSFC document
Zehua Liu
  • Paper reading and sharing last Friday
  • Writing Vision Language Model code
  • Writing NSFC document
Pengqi Li
  • Prepare the AI course for Tsinghua University Junior High School.
  • Using t-SNE to visualize the factorized content vector.
    • Next step: color each point by speaker-information importance (important or not).
Wan Lin
  • Tried some adjustments for clean performance (no improvement)
  • Add supplementary experiments for other tests
Tianhao Wang
  • sound separation: 2-mix and 3-mix model training
  • weekly report
  • Next week: subset data training
Xiaoxue Luo
  • Generated multi-mix audio data and ran some test experiments.
  • read papers
Zhenyu Zhou
  • finish graduation thesis
Junhui Chen
  • Reproducing speaker diarization method for NS (debugging...)
  • read paper
Jiaying Wang
  • Debug the CTC loss part (https://z1et6d3xtb.feishu.cn/docx/TUHldiaoQoYBqux7JEhcaCXenzh)
Yu Zhang
  • AED:
    • Split the AED model into two smaller models that detect the human voice in noisy and in clean environments separately.
    • Trying smaller model (under 200K)
  • Multi Agent Investment
    • Tried index-enhancement trading; no obvious excess return
  • Next week: try portfolio investment on some selected large companies
  • Next week: add a debate topic on the logical consistency of investment decisions.
Wenqiang Du
  • Primary handbook's PPT (24/44)
  • Continue to check Primary and middle handbook (completed this week)
  • Speech cloning sample for the company
Yang Wei
  • Tuning text-enroll KWS model for dialect data with a linear layer (recall: 65% -> 85% -> 94%)
Turi
  • Thesis writing
  • Result with LM (https://z1et6d3xtb.feishu.cn/docx/JvDsd8zR4oMwnyxQEQdckpMjn7m?from=from_copylink)
Yue Gu
  • Finished some experiments, but nothing improved.
  • Finished a proposal; I will present it soon.
Qi Qu
  • Applying pre-prod eval routine on text-enroll KWS models: the ideal thresholds for each keyword vary significantly. (https://b30lttjm7l.feishu.cn/docx/BepsdxzYloNlLNxHgGncSUXVnee?from=from_copylink)