2024-08-19

From cslt Wiki
Revision as of 08:47, 19 August 2024 (Monday) by Yuanjunming

People | This Week | Next Week | Task Tracking (Deadline)
Dong Wang
Lantian Li
Ying Shi
Zhenghai You
Junming Yuan
  • Verified two parameters in the HuBERT pretraining config file whose settings were ambiguous with respect to the original paper. [1]
    • Confirmed that in the second iteration of pretraining, features should be extracted from the 6th transformer layer, not the 9th.
      • At 175k steps, the 6th-layer result is 71.55/9.39 and the 9th-layer result is 37.31/16.72.
    • Largely confirmed the setting of the 'untie_final_proj' parameter for both pretraining iterations.
Chen Chen
Xiaolou Li
Zehua Liu
Pengqi Li
Wan Lin
Tianhao Wang
Zhenyu Zhou
Junhui Chen
Jiaying Wang
Yu Zhang
Wenqiang Du
Yang Wei
Lily
Turi
Yue Gu
Qi Qu