Differences between revisions of "2024-02-05"

Latest revision as of 05:23, 7 February 2024 (Wednesday)

People	This Week	Next Week	Task Tracking (Deadline)
Dong Wang	Continue the NeuralMag paper and refine the complexity theory; design an AI course for primary school.
Lantian Li
Ying Shi	INTERSPEECH paper (keyword-attributed overlapping ASR): SOTA model training (done); SOT model training (done); test (in progress). Cohort overlapping ASR: one fix cohort, 2-mix recognizes ONE, WER 8.90%; one fix cohort, 2-mix recognizes TWO, WER 9.30%; one fix cohort, 3-mix recognizes THREE, WER 37.83%, with number-of-speakers prior applied, WER 30.58%.
Zhenghai You
Junming Yuan
Chen Chen	DeepFake (by xiaolou, zehua): SyncNet- and WER-based experiments on noisy audio/video input; noise does not seem to be the reason these methods failed. VTS: fine-tune a HuBERT with a HiFiGAN for the "audio feature to speech" system (both single-speaker and multi-speaker work); train a VTS (ResNet Conformer encoder) for the "video to audio feature" system (works to some degree for a single speaker); try training a multi-speaker video-to-audio-feature system; try jointly training the video encoder and HiFiGAN.
Xiaolou Li
Zehua Liu
Pengqi Li
Wan Lin	Summarize possible architectures; coding & practice.
Tianhao Wang
Zhenyu Zhou
Junhui Chen
Jiaying Wang
Yu Zhang
Wenqiang Du	Aibabel: update CN KWS model. Diting: supplementary testing; write test report.
Yang Wei	Prepare data backup for corpus disk.
Lily

@@ Line 29: / Line 29: @@
 |Ying Shi
 ||
-*
+*    INTERSPEECH Paper: Keyword attributed Overlapping ASR
+:        SOTA model training (done)
+:        SOT model training (done)
+:        test (in progress)
+*    Cohort Overlapping ASR
+:        one fix cohort: 2-mix recognizes ONE WER 8.90%
+:        one fix cohort: 2-mix recognizes TWO WER 9.30%
+:        one fix cohort: 3-mix recognizes THREE WER 37.83%; with number-of-speakers prior applied, WER 30.58%
 ||
 *
@@ Line 61: / Line 70: @@
 |Chen Chen
 ||
-*
+*    DeepFake
+:        by xiaolou,zehua
+:        SyncNet- and WER-based experiments on noisy audio/video input
+:        noise does not seem to be the reason these methods failed
+*    VTS
+:        Fine-tune a HuBERT with a HiFiGAN for the "audio feature to speech" system (both single-speaker and multi-speaker work)
+:        Train a VTS (ResNet Conformer encoder) for the "video to audio feature" system (works to some degree for a single speaker)
+:        Try training a multi-speaker video-to-audio-feature system
+:        Try jointly training the video encoder and HiFiGAN
 ||
 *
@@ Line 105: / Line 125: @@
 |Wan Lin
 ||
-*
+* Summarize possible architectures
+* Coding & practice
 ||
 *
@@ Line 171: / Line 192: @@
 |Wenqiang Du
 ||
-*
+* Aibabel
+:update CN KWS model
+* Diting
+:Supplementary testing
+:write test report
 ||
 *
@@ Line 182: / Line 207: @@
 |Yang Wei
 ||
-*
+* Prepare data backup for corpus disk.
 ||
 *

