| People |
This Week |
Next Week |
Task Tracking (Deadline)
|
| Dong Wang
|
- Double-check the English version of the high-school handbook.
- Check the English version of the middle-school handbook.
|
|
|
| Lantian Li
|
- Final review of my MLA book (6/10)
- MoE daily work
|
|
|
| Wenqiang Du
|
- Completed recording the high school AI courses (14/14)
- Created PPT for the high school AI courses (3/14)
|
|
|
| Yang Wei
|
- Tested audio-visual speech separation + ASR (offline script) on 200 2-mix speech videos; CER: 20% (see the sketch below this row).
- Chinese mispronunciation detection experiment, using Chinese HuBERT features (precision: 0.18, recall: 0.72).
|
|
|
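A minimal sketch of how the CER above can be computed from reference/hypothesis transcript pairs; this is a generic character-level Levenshtein implementation, not the actual offline test script, and all names are illustrative.

    def edit_distance(ref: str, hyp: str) -> int:
        """Character-level Levenshtein distance via dynamic programming."""
        m, n = len(ref), len(hyp)
        dp = list(range(n + 1))  # dp[j] holds the previous row's distances
        for i in range(1, m + 1):
            prev, dp[0] = dp[0], i
            for j in range(1, n + 1):
                cur = dp[j]
                dp[j] = min(dp[j] + 1,      # deletion
                            dp[j - 1] + 1,  # insertion
                            prev + (ref[i - 1] != hyp[j - 1]))  # substitution
                prev = cur
        return dp[n]

    def cer(refs: list[str], hyps: list[str]) -> float:
        """CER = total character edits / total reference characters."""
        errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
        chars = sum(len(r) for r in refs)
        return errors / max(chars, 1)

    print(cer(["今天天气很好"], ["今天天汽很好"]))  # 1 substitution / 6 chars ≈ 0.167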
| Ying Shi
|
- Thesis
- Some work on the HUAWEI project
|
|
|
| Yue Gu
|
- Revised the PhD thesis structure
- Seminar: in an anonymous vote of 20 people, 75% chose free discussion.
|
|
|
| Lily
|
- Checked the English version of the middle-school handbook
- Checked the English version of the high-school handbook
- Organized course material production (separate volumes for primary, middle, and high school)
|
|
|
| Pengqi Li
|
- Drafting the method section of the paper.
- Identified bugs in the reproduction code. Re-ran experiments and confirmed that conclusions remain consistent.
- Assisting with the revision of the middle school handbook.
|
|
|
| Junming Yuan
|
- Aug-MT-HuBERT:
  - Based on last week's best configuration, continued pre-training for 300K steps.
    - No improvement was observed on clean-speech tasks.
  - Inspection showed that the pre-training had inherited the low learning rate left by the previous model after 1.6M steps (see the sketch below this row).
    - After increasing the lr and retraining for 200K steps, there was still no improvement on clean-speech tasks.
    - At 200K steps: PR (PER): 8.14, ASR (WER): 8.93
- SS Adaptation:
  - Following the SA-WavLM strategy, further evaluated MT-HuBERT under low-resource settings (10% and 1% of the training data).
    - 10% data: Cocktail (11.29) > MT-HuBERT (11.07) > WavLM (10.81)
    - 1% data: Cocktail (8.56) > MT-HuBERT (8.43) > WavLM (8.11)
- Draft paper writing (EN version almost done)
|
|
|
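A hedged sketch of the lr-inheritance pitfall noted above, in a generic PyTorch loop (the actual MT-HuBERT training stack is not shown here): loading an optimizer state dict silently restores the saved learning rate, so a continued run must reset it explicitly. All names and values are illustrative.

    import torch

    model = torch.nn.Linear(768, 768)                    # stand-in for the real model
    opt = torch.optim.Adam(model.parameters(), lr=1e-6)  # old run decayed to a low lr
    torch.save({"model": model.state_dict(),
                "optimizer": opt.state_dict()}, "checkpoint_last.pt")

    # Continued pre-training: even though the new optimizer is built with the
    # intended lr, load_state_dict() overwrites it with the saved (low) one.
    NEW_LR = 5e-4  # assumed target lr; the real value depends on the schedule
    opt2 = torch.optim.Adam(model.parameters(), lr=NEW_LR)
    ckpt = torch.load("checkpoint_last.pt", map_location="cpu")
    model.load_state_dict(ckpt["model"])
    opt2.load_state_dict(ckpt["optimizer"])
    print(opt2.param_groups[0]["lr"])  # 1e-06: the low lr was inherited

    for group in opt2.param_groups:  # explicit reset to the intended lr
        group["lr"] = NEW_LR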
| Yu Zhang
|
- GPU Util: [1]
- Finished final exam
- Writing code to analyze LLM Swarm metrics, mainly looking at how ECS/PKS correlate with the optimized edge probability (see the sketch below this row).
|
|
|
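A minimal sketch of the kind of correlation analysis described above; the arrays stand in for per-edge ECS/PKS values and optimized edge probabilities, and are placeholders here rather than actual LLM Swarm outputs.

    import numpy as np
    from scipy.stats import pearsonr, spearmanr

    rng = np.random.default_rng(0)
    edge_prob = rng.uniform(0.0, 1.0, size=200)       # optimized edge probability
    ecs = 0.8 * edge_prob + rng.normal(0, 0.1, 200)   # placeholder metric values
    pks = -0.3 * edge_prob + rng.normal(0, 0.1, 200)

    for name, metric in [("ECS", ecs), ("PKS", pks)]:
        r, p = pearsonr(metric, edge_prob)      # linear correlation
        rho, _ = spearmanr(metric, edge_prob)   # rank (monotonic) correlation
        print(f"{name}: Pearson r={r:.3f} (p={p:.2g}), Spearman rho={rho:.3f}")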
| Junhui Chen
|
- Finished final exam
- Debugging code for the swarm minicrossword test
|
|
|
| Xiaolou Li
|
|
|
|
| Jiaying Wang
|
- Speaker model training (130/300): reconstructed the data; the training data is now aligned with the separation data
- recall@k at epoch 130: 2-mix recall@2 = 0.9799, 3-mix recall@3 = 0.9101, 4-mix recall@4 = 0.8247 (see the sketch below this row)
|
|
|
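A hedged sketch of one common recall@k definition that would fit the numbers above: score each N-mix utterance against all enrolled speakers and count how many of the N true speakers appear in the top k. Shapes and names are illustrative, not the actual evaluation code.

    import numpy as np

    def recall_at_k(scores: np.ndarray, targets: list[set], k: int) -> float:
        """scores: (num_mixtures, num_speakers) similarity matrix;
        targets[i]: set of true speaker indices for mixture i."""
        hits = total = 0
        for row, true_spks in zip(scores, targets):
            topk = set(np.argsort(row)[::-1][:k])  # indices of the k best scores
            hits += len(topk & true_spks)
            total += len(true_spks)
        return hits / total

    rng = np.random.default_rng(0)
    scores = rng.normal(size=(4, 10))  # 4 mixtures scored against 10 speakers
    print(recall_at_k(scores, [{0, 3}, {1, 2}, {4, 5}, {6, 7}], k=2))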
| Tianhao Wang
|
|
|
|
| Xiaoxue Luo
|
- 2-5 mix multi-head separation model for the Huawei project
- Wrote code for the multi-speaker and multi-sound-event separation task; completed data preparation and feature extraction
- Adjusted the model structure to three heads (speech, music, and others); the model is still training, and the current val SI-SDR is 11.38 (see the sketch below this row)
|
|
|
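For reference, a minimal sketch of the standard scale-invariant SDR (SI-SDR) definition behind the val SI-SDR number above; this is a generic implementation on toy signals, not the project's training code.

    import numpy as np

    def si_sdr(est: np.ndarray, ref: np.ndarray, eps: float = 1e-8) -> float:
        """SI-SDR in dB: project the estimate onto the (zero-mean) reference,
        then compare the target component against the residual."""
        est = est - est.mean()
        ref = ref - ref.mean()
        target = (np.dot(est, ref) / (np.dot(ref, ref) + eps)) * ref
        noise = est - target
        return 10 * np.log10((np.dot(target, target) + eps) /
                             (np.dot(noise, noise) + eps))

    t = np.linspace(0, 1, 16000)
    ref = np.sin(2 * np.pi * 440 * t)  # toy reference signal
    est = ref + 0.1 * np.random.default_rng(0).normal(size=ref.size)
    print(f"SI-SDR: {si_sdr(est, ref):.2f} dB")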
| Bochao Hu
|
- Finished final exam
- Debugged and trained the VSR E2E model; still in training
- Recent results: [2]
|
|
|
| Hongcheng Zhang
|
- Finished final exam
- Debugged the asu-llm code and trained the audio-caption task on WavCaps (1/20 epochs)
|
|
|