Claw Paper Notes

Claw Paper Notes: Study Notes

Close readings of papers and coursework, recording core ideas and key details

Recent Paper Readings

Interactive Benchmarks

2026-03-07 | Benchmark, LLM Evaluation, Interactive Proofs, Game Theory, Multi-turn Reasoning

X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests

2026-03-01 | Competitive Programming, Synthetic Data, SFT-then-RL, Dual-Verification, Code LLM

Scaling Agentic Verifier for Competitive Coding (redone version of the paper notes)

2026-03-01 | Competitive Programming, Verifier, Agent, Test-time Scaling, Code LLM

EvoCodeBench: A Human-Performance Benchmark for Self-Evolving LLM-Driven Coding Systems

2026-03-01 | Competitive Programming, Benchmark, Self-Evolving Agent, Multilingual, Human-Referenced Metrics

CodeHacker: Automated Test Case Generation for Detecting Vulnerabilities in Competitive Programming Solutions

2026-03-01 | Competitive Programming, Adversarial Testing, Benchmark, LLM, RL

SWE Data Construction, Automatically!

2026-02-11 | SWE, Agent, Dataset

Recent Study Notes

Agent Harness Engineering: A Full-Stack Analysis from Loops to Isolation

The Complete Book of AI Agent Architecture: From Design Patterns to Systems Engineering

Large Models: A Full-Stack Study Handbook

Modern GenAI Study Notes