Difference between revisions of "2026-04-13"

From cslt Wiki
[1] https://z1et6d3xtb.feishu.cn/wiki/XX4NwX3tJiBDcgkMi0hcFUtInHh

Revision as of 10:32, 13 April 2026 (Mon)

People | This Week | Next Week | Task Tracking (Deadline)
Dong Wang
Lantian Li
  • FgW daily work
  • MLA book (3/4)
Wenqiang Du
Yang Wei
Ying Shi
  • Revise my thesis
Yue Gu
  • Write my PhD thesis
Lily
Pengqi Li
  • Paper Draft Completion & Revision Plan
Junming Yuan
  • Preparing the materials for attending ICASSP
  • ZH paper draft (needs refinement)
Yu Zhang
  • GPU Util: [1]
  • Chain level experiments:
    • With the Metric Reward, the weights of correct edges converge faster than when training with pure reinforcement learning alone.
    • The worse the state at the point where the Metric Reward is introduced (i.e., the lower the weights of critical edges), the larger the gap relative to training without the Metric Reward.
Junhui Chen
  • Conducting additional experiments to strengthen the robustness of the conclusions:
    • Introduce a new baseline (AgentPrune).
    • Add experiments on a new dataset (GSM8K).
    • Reproduce the results on other LLM base models.
  • Paper writing
Jiaying Wang
Bochao Hu
Hongcheng Zhang
Weiman Sun
Ge Gao
  • Reproduce SpatialNet for speech separation
Shuailong Li
  • Read some papers
    • USE (SepFormer, BSRNN, and TDN)