Article Preview
Categories: LG - Machine Learning, CV - Computer Vision, CL - Computation and Language, AS - Audio and Speech, RO - Robotics

1. [LG] Learning How Hard to Think: Input-Adaptive Allocation of LM Computation
2. [LG] The Optimization Landscape of SGD Across the Feature Learning Strength
3. [LG] What Matters for Model Merging at Scale?
4. [CL] Steering Large Language Models between Code Execution and Textual Reasoning
5. [CL] Initialization of Large Language Models via Reparameterization to Mitigate Loss Spikes

Summary: input-adaptive allocation of language model computation; the SGD optimization landscape across feature learning strengths; what matters for model merging at scale; steering large language models between code execution and textual reasoning; initializing large language models via reparameterization to mitigate loss spikes.

1. [LG] Learning How Hard to Think: Input-Adaptive Allocation of LM Computation
M Damani, I Shenfeld, A Peng, A Bobu… [MIT]

Key points:
Input-adaptive computation allocation: This paper
………………………………