People |
This Week |
Next Week |
Task Tracking (Deadline)
|
Dong Wang
|
- 2nd round of check for AI handbook middle school.
- Deal with pictures in AI handbook (primary & middle).
- Start to check AI handbook high school.
- Check AI book for Tianjin medical school.
|
|
|
Lantian Li
|
- Complete my CSTR Report
- Go on AI-Graph EN Chapter 4
- Polish 2025 Daily Sign
|
|
|
Ying Shi
|
- Design cohort-conditional chain multi-talker ASR with round-RNN
- WER results: round-1 32.15%, round-2 69.69%, round-3 92.33%
- On a 500-utterance sub-test set, only 28% of the sentences have a recognition order that matches the cosine distance.
- Prepare for Huawei's interview.
|
|
|
Zhenghai You
|
- Huawei TSE (train models that better fit the scene) [1]
|
|
|
Junming Yuan
|
- Comparison of results between Clean-HuBERT, Cocktail-HuBERT, and MT-HuBERT [2]
- Bad news: Cocktail-HuBERT > Clean-HuBERT > MT-HuBERT
|
|
|
Chen Chen
|
|
|
|
Xiaolou Li
|
- Data processing
- CVS3: 1/4 already cut from the original video, awaiting pre-processing
- Copying pre-processed GongAn video data from gonganbu
- VSR Contrastive Loss Exp
- Inspired by paper [3]
- Main idea: to better align visual features with the LLM input, compute the cosine similarity between the target and the video features, and take the highest-scoring pair as the positive pair.
- Result: Under training
- Paper Reading
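The main idea of the VSR contrastive loss experiment above can be sketched as follows. This is only an illustrative sketch, not the experiment's actual code: the function names, the InfoNCE-style loss form, and the `temperature` value are assumptions; the report only states that the highest cosine similarity defines the positive pair.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def max_sim_contrastive_loss(targets, video_feats, temperature=0.1):
    """Hypothetical helper: for each target embedding, treat the most
    similar video feature as the positive and all others as negatives.

    Returns (mean loss, list of chosen positive indices).
    The InfoNCE-style form and the temperature are assumptions.
    """
    losses, positives = [], []
    for t in targets:
        sims = [cosine(t, v) for v in video_feats]
        # The video feature with the largest cosine similarity is the positive.
        pos = max(range(len(sims)), key=sims.__getitem__)
        logits = [s / temperature for s in sims]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        losses.append(log_denom - logits[pos])
        positives.append(pos)
    return sum(losses) / len(losses), positives


if __name__ == "__main__":
    toy_targets = [[1.0, 0.0], [0.0, 1.0]]
    toy_video = [[0.9, 0.1], [0.1, 0.9], [-1.0, 0.0]]
    loss, pos = max_sim_contrastive_loss(toy_targets, toy_video)
    print(pos, loss)
```

Because the positive is picked by the current similarities rather than by a fixed alignment, the selection can change as the features are trained.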
|
|
|
Zehua Liu
|
- Rebuttal writing
- Iterative training and inference
- Iter-1(45.53%) < Iter-2(45.00%) < Iter-3(44.85%)
|
|
|
Pengqi Li
|
- Began writing a paper on the phoneme importance analysis work.
- Reading a doctoral thesis on speaker explainability [4].
|
|
|
Wan Lin
|
- NS: all transformer
- 6k spk: EER 2.6%
- 20k spk: EER 2.3%
- 20k spk+multi-enroll: EER 1.9%
|
|
|
Tianhao Wang
|
- Experiments about query embedding conditional approach:
- SDR: FiLM (7.492) > self-attention (6.573)
|
|
|
Xiaoxue Luo
|
- Training of the USS (CED+AudioSep) model
- Adjust the audio format to meet the needs of the model (in training)
- Production of 2025 Daily Sign (March)
|
|
|
Zhenyu Zhou
|
- Speaker identity based conditional chain proposal[5]
- Prepare Interim Report
|
|
|
Junhui Chen
|
- Read papers (ICCIP keynote speech paper and some others)
- NS
- Rebuttal
- Some tests about transformer feature extractor
|
|
|
Jiaying Wang
|
|
|
|
Yu Zhang
|
- Huawei AED
- data aug & human annotated dataset [6]
- Finance
- Paper reading; reproduce a local Llama version of StockAgent [7] (an LLM-based market simulation framework)
|
|
|
Wenqiang Du
|
- Training of new language models (HeNan) [8]
- Training of new language models (ChongQing) [9]
|
|
|
Yang Wei
|
- Fix some bugs in keyword sampling in the text-enroll KWS training code.
- Add spec augmentation for text-enroll KWS training.
|
|
|
Lily
|
|
|
|
Turi
|
- Paper reading
- ICASSP 2025 rebuttal
|
|
|
Yue Gu
|
- Synthesized about 1 h of data for each target speaker, then used these data to train the adapter module. [10]
- Writing TASLP paper
|
|
|
Qi Qu
|
- Finding ideal thresholds and deploying cloud services for KWS models: `zh48_guangdong` and `zh48_haining20`.
- Located and fixed a bug in FunASR that could cause a segmentation fault. Built the service with an extended gRPC protocol.
- Analysis of some AED (cries and slaps) FAs.
|
|
|