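Appendix C References and Further Reading
This appendix collects the core literature cited throughout the book, its knowledge sources, and recommended further reading.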
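I. Core Knowledge Sources for This Book
1. Hello-Agents
- Project URL: open-source project on GitHub
- Content: one of the most comprehensive systematic AI Agent tutorials in the Chinese-language community; its 16 chapters range from the history of agents to frontier applications
- Cited in chapters: 1-6, 8, 10-13, 19-21, 24-26, 29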
2. Agentic Design Patterns
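- Author: Antonio Gulli
- Publication: 2025, 424 pages
- Content: a systematic taxonomy of 21 agentic design patterns with implementation guidance
- Cited in chapters: 2-3, 5-10, 15, 19-25, 27-28, 30
- Online resource: Amazon preprint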
3. Practical Guide to Context Engineering
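- Project URL: open-source project on GitHub
- Content: a practical guide to context engineering, covering engineering modules such as RAG techniques, context management, and session storage
- Cited in chapters: 2, 8, 13-18, 23-24, 29-30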
4. Context Engineering 101 (ce101)
- Content: an account of the paradigm shift from Prompt Engineering to Context Engineering
- Cited in chapters: 1, 3-4, 6, 11, 16, 19
5. docs collection
- Self-evolving agents paper: arXiv:2512.13564v2
- Agentic models and memory mechanisms: research talk, November 2025
- AI memory mechanisms survey: Survey on AI Memory
- Cited in chapters: 4, 11-12, 28
II. Classic Papers
LLM Fundamentals
- Vaswani, A., et al. (2017). "Attention Is All You Need." NeurIPS 2017. [arXiv:1706.03762] The foundational paper for the Transformer architecture.
- Kaplan, J., et al. (2020). "Scaling Laws for Neural Language Models." arXiv preprint. [arXiv:2001.08361] Scaling laws for neural language models.
- Wei, J., et al. (2022). "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models." NeurIPS 2022. [arXiv:2201.11903] Pioneering work on chain-of-thought prompting.
- Hoffmann, J., et al. (2022). "Training Compute-Optimal Large Language Models." NeurIPS 2022. [arXiv:2203.15556] Chinchilla scaling laws; compute-optimal training.
- Ouyang, L., et al. (2022). "Training Language Models to Follow Instructions with Human Feedback." NeurIPS 2022. [arXiv:2203.02155] InstructGPT; the RLHF alignment paradigm.
- Brown, T.B., et al. (2020). "Language Models are Few-Shot Learners." NeurIPS 2020. [arXiv:2005.14165] GPT-3; the few-shot learning ability of large models.
Agent Fundamentals
- Yao, S., et al. (2023). "ReAct: Synergizing Reasoning and Acting in Language Models." ICLR 2023. [arXiv:2210.03629] Introduces the ReAct paradigm, unifying reasoning and acting.
- Shinn, N., et al. (2023). "Reflexion: Language Agents with Verbal Reinforcement Learning." NeurIPS 2023. [arXiv:2303.11366] Self-reflective agents reinforced through verbal feedback.
- Yao, S., et al. (2023). "Tree of Thoughts: Deliberate Problem Solving with Large Language Models." NeurIPS 2023. [arXiv:2305.10601] The Tree of Thoughts reasoning framework.
- Zhou, A., et al. (2023). "Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models." arXiv preprint. [arXiv:2310.04406] LATS; unifies reasoning, acting, and planning.
- Khattab, O., et al. (2023). "DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines." arXiv preprint. [arXiv:2310.03714] A compilation framework that automatically optimizes LLM pipelines.
- Wang, L., et al. (2024). "A Survey on Large Language Model based Autonomous Agents." Frontiers of Computer Science. [arXiv:2308.11432] Survey of LLM-based autonomous agents.
- Xi, Z., et al. (2023). "The Rise and Potential of Large Language Model Based Agents: A Survey." arXiv preprint. [arXiv:2309.07864] Survey of the rise and potential of LLM agents.
- Wang, L., et al. (2023). "Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models." ACL 2023. [arXiv:2305.04091] Plan-and-solve prompting to improve zero-shot reasoning.
Planning and Reasoning
- Shen, Y., et al. (2023). "HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face." NeurIPS 2023. [arXiv:2303.17580] The LLM as a task-scheduling controller.
- Liu, B., et al. (2023). "LLM+P: Empowering Large Language Models with Optimal Planning Proficiency." arXiv preprint. [arXiv:2304.11477] Integrates classical planners with LLMs.
- Besta, M., et al. (2023). "Graph of Thoughts: Solving Elaborate Problems with Large Language Models." arXiv preprint. [arXiv:2308.09687] The Graph of Thoughts reasoning framework.
- Sumers, T.R., et al. (2023). "Cognitive Architectures for Language Agents." TMLR 2024. [arXiv:2309.02427] CoALA, a cognitive architecture for language agents.
Core Systems
- Park, J.S., et al. (2023). "Generative Agents: Interactive Simulacra of Human Behavior." UIST 2023. [arXiv:2304.03442] Memory and behavior simulation in generative agents.
- Wang, G., et al. (2023). "Voyager: An Open-Ended Embodied Agent with Large Language Models." NeurIPS 2023. [arXiv:2305.16291] A lifelong-learning agent in Minecraft.
- Yang, J., et al. (2024). "SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering." NeurIPS 2024. [arXiv:2405.15793] Agent-computer interface design for software engineering.
- Hong, S., et al. (2023). "MetaGPT: Meta Programming for Multi-Agent Collaborative Framework." ICLR 2024. [arXiv:2308.00352] A meta-programming framework for multi-agent collaboration.
- Li, G., et al. (2023). "CAMEL: Communicative Agents for 'Mind' Exploration of Large Language Model Society." NeurIPS 2023. [arXiv:2303.17760] A role-playing communication framework for multiple agents.
- Hu, S., et al. (2024). "Automated Design of Agentic Systems." ICLR 2025. [arXiv:2408.08435] ADAS; automated design of agent systems.
- Qian, C., et al. (2023). "ChatDev: Communicative Agents for Software Development." ACL 2024. [arXiv:2307.07924] A virtual software company driven by multiple agents.
- Wang, X., et al. (2024). "OpenHands: An Open Platform for AI Software Developers as Generalist Agents." ICLR 2025. [arXiv:2407.16741] An open-source platform for AI software development agents.
- Qiao, B., et al. (2023). "TaskWeaver: A Code-First Agent Framework." arXiv preprint. [arXiv:2311.17541] Microsoft's code-first agent framework.
- arXiv:2512.13564. "Self-Evolving Agents." Theoretical framework and lifecycle model for self-evolving agents.
Agent Training
- Schick, T., et al. (2023). "Toolformer: Language Models Can Teach Themselves to Use Tools." NeurIPS 2023. [arXiv:2302.04761] Self-supervised learning of tool use.
- Chen, Z., et al. (2024). "Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models." ACL 2024 Findings. [arXiv:2403.12881] Data design for efficient fine-tuning of agent abilities.
- Zeng, A., et al. (2023). "AgentTuning: Enabling Generalized Agent Abilities for LLMs." ACL 2024 Findings. [arXiv:2310.12823] Hybrid instruction tuning to elicit generalized agent abilities.
- Chen, B., et al. (2023). "FireAct: Toward Language Agent Fine-tuning." arXiv preprint. [arXiv:2310.05915] Fine-tuning on agent trajectories drawn from multiple tasks and methods.
- Wang, Z., et al. (2024). "Agent Workflow Memory." arXiv preprint. [arXiv:2409.07429] Induces reusable workflows from experience.
- Patil, S.G., et al. (2023). "Gorilla: Large Language Model Connected with Massive APIs." arXiv preprint. [arXiv:2305.15334] Retriever-aware training to improve API call accuracy.
- Qin, Y., et al. (2023). "ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs." ICLR 2024. [arXiv:2307.16789] A framework and dataset for large-scale API tool use.
Agent Evaluation
- Jimenez, C.E., et al. (2024). "SWE-bench: Can Language Models Resolve Real-World GitHub Issues?" ICLR 2024. [arXiv:2310.06770] A benchmark for resolving real GitHub issues.
- Liu, X., et al. (2023). "AgentBench: Evaluating LLMs as Agents." ICLR 2024. [arXiv:2308.03688] Multi-dimensional evaluation of agent abilities across 8 environments.
- Mialon, G., et al. (2023). "GAIA: A Benchmark for General AI Assistants." ICLR 2024. [arXiv:2311.12983] A comprehensive benchmark for general AI assistants.
- Zhou, S., et al. (2023). "WebArena: A Realistic Web Environment for Building Autonomous Agents." ICLR 2024. [arXiv:2307.13854] Agent evaluation in realistic web environments.
- Xie, T., et al. (2024). "OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments." NeurIPS 2024. [arXiv:2404.07972] A cross-OS benchmark for multimodal agents.
- Yao, S., et al. (2024). "tau-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains." arXiv preprint. [arXiv:2406.12045] A benchmark for three-way tool-agent-user interaction.
- Xu, Q., et al. (2023). "ToolBench: On the Tool Manipulation Capability of Open-source Large Language Models." EMNLP 2024. [arXiv:2305.16504] Evaluation of tool manipulation by open-source LLMs.
- Wang, X., et al. (2023). "MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback." ICLR 2024. [arXiv:2309.10691] Evaluation of multi-turn tool interaction and feedback.
- Ma, C., et al. (2024). "AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents." NeurIPS 2024 Oral. [arXiv:2401.13178] A multi-dimensional analytical evaluation board for agents.
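Memory Systems
- Packer, C., et al. (2023). "MemGPT: Towards LLMs as Operating Systems." arXiv preprint. [arXiv:2310.08560] Virtual memory management for effectively unlimited context.
- Zhang, Z., et al. (2024). "A Survey on the Memory Mechanism of Large Language Model based Agents." arXiv preprint. Survey of agent memory mechanisms.
- Sumers, T.R., et al. (2023). "Cognitive Architectures for Language Agents." (see "Planning and Reasoning") Proposes the CoALA cognitive architecture, which includes memory modules.
- Park, J.S., et al. (2023). "Generative Agents." (see "Core Systems") Introduces a three-layer memory architecture of observation, reflection, and planning.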
Retrieval-Augmented Generation (RAG)
- Lewis, P., et al. (2020). "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks." NeurIPS 2020. [arXiv:2005.11401] The foundational RAG paper.
- Edge, D., et al. (2024). "From Local to Global: A Graph RAG Approach to Query-Focused Summarization." arXiv preprint. [arXiv:2404.16130] Graph-structured RAG that supports global query-focused summarization.
- Sarthi, P., et al. (2024). "RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval." ICLR 2024. [arXiv:2401.18059] Hierarchical retrieval organized as a recursive summary tree.
- Asai, A., et al. (2023). "Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection." ICLR 2024. [arXiv:2310.11511] Adaptive retrieval with self-reflective generation.
- Muennighoff, N., et al. (2023). "MTEB: Massive Text Embedding Benchmark." EACL 2023. [arXiv:2210.07316] A large-scale text embedding benchmark.
- Gao, Y., et al. (2024). "Retrieval-Augmented Generation for Large Language Models: A Survey." arXiv preprint. [arXiv:2312.10997] Survey of RAG techniques.
- Shi, W., et al. (2023). "REPLUG: Retrieval-Augmented Black-Box Language Models." NAACL 2024. [arXiv:2301.12652] Plug-and-play retrieval augmentation for black-box LLMs.
Context Engineering
- Hsieh, C.P., et al. (2024). "RULER: What's the Real Context Size of Your Long-Context Language Models?" COLM 2024. [arXiv:2404.06654] Evaluates the real capability of long-context models.
- Jiang, H., et al. (2023). "LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models." EMNLP 2023. [arXiv:2310.05736] Prompt compression to accelerate LLM inference.
- Liu, N.F., et al. (2023). "Lost in the Middle: How Language Models Use Long Contexts." TACL 2024. [arXiv:2307.03172] Shows that information in the middle of a long context is underused.
- Kamradt, G. (2023). "Needle in a Haystack: Pressure Testing LLMs." A landmark test of long-text retrieval ability.
- Anthropic. (2024). "Contextual Retrieval." Anthropic Blog. Contextual embeddings plus BM25; reduces retrieval failures by 49%.
- Anthropic. (2024). "Building effective agents." Anthropic's official guide to building agents.
- OpenAI. (2025). "A practical guide to building agents." OpenAI's official practical guide to building agents.
Safety and Alignment
- Greshake, K., et al. (2023). "Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection." AISec 2023. [arXiv:2302.12173] Systematic analysis of indirect prompt injection attacks.
- Bai, Y., et al. (2022). "Constitutional AI: Harmlessness from AI Feedback." arXiv preprint. [arXiv:2212.08073] Constitutional AI; an alignment method based on AI feedback.
- Perez, E., et al. (2022). "Red Teaming Language Models with Language Models." EMNLP 2022. [arXiv:2202.03286] Automated red teaming using language models.
- Anil, C., et al. (2024). "Many-shot Jailbreaking." NeurIPS 2024. Many-shot jailbreak attacks that exploit long contexts.
- OWASP. (2025). "OWASP Top 10 for LLM Applications." owasp.org/www-project-top-10-for-large-language-model-applications The ten most critical security risks for LLM applications.
- Zhan, Q., et al. (2024). "InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated LLM Agents." ACL 2024 Findings. [arXiv:2403.02691] Benchmark of indirect prompt injection in tool-integrated agents.
Multi-Agent Systems
- Wu, Q., et al. (2023). "AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation." COLM 2024. [arXiv:2308.08155] A multi-agent conversation framework.
- Du, Y., et al. (2023). "Improving Factuality and Reasoning in Language Models through Multiagent Debate." ICML 2024. [arXiv:2305.14325] Multi-agent debate to improve factuality and reasoning.
- Chen, W., et al. (2023). "AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors." ICLR 2024. [arXiv:2308.10848] A platform for multi-agent collaboration and emergent behaviors.
- Qian, C., et al. (2023). "ChatDev." (see "Core Systems") Multi-agent software development.
- Li, G., et al. (2023). "CAMEL." (see "Core Systems") Role-playing communication in an agent society.
Embodied & Multimodal Agents
- Driess, D., et al. (2023). "PaLM-E: An Embodied Multimodal Language Model." ICML 2023. [arXiv:2303.03378] A 562B-parameter embodied multimodal language model.
- Brohan, A., et al. (2023). "RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control." CoRL 2023. [arXiv:2307.15818] Vision-language-action models that transfer web knowledge to robot control.
- Hong, W., et al. (2023). "CogAgent: A Visual Language Model for GUI Agents." CVPR 2024. [arXiv:2312.08914] An 18B-parameter visual language model for GUI understanding.
- Cheng, K., et al. (2024). "SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents." ACL 2024. [arXiv:2401.10935] A screenshot-only agent built on GUI visual grounding.
- Reed, S., et al. (2022). "A Generalist Agent." TMLR 2022. [arXiv:2205.06175] Gato; a single general-purpose multimodal agent model.
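Coding Agents
- Yang, J., et al. (2024). "SWE-agent." (see "Core Systems") Agent-computer interfaces for automated software engineering.
- Jimenez, C.E., et al. (2024). "SWE-bench." (see "Agent Evaluation") An evaluation benchmark for coding agents.
- Wang, X., et al. (2024). "OpenHands." (see "Core Systems") An open-source platform for AI developer agents.
- Cognition Labs. (2024). "Devin: The First AI Software Engineer." Blog post. Billed as the first fully autonomous AI software engineer.
- Qiao, B., et al. (2023). "TaskWeaver." (see "Core Systems") A code-first agent framework.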
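GUI Agents
- Hong, W., et al. (2023). "CogAgent." (see "Embodied & Multimodal Agents") A visual language model for GUIs.
- Zhou, S., et al. (2023). "WebArena." (see "Agent Evaluation") An agent benchmark in realistic web environments.
- Xie, T., et al. (2024). "OSWorld." (see "Agent Evaluation") A cross-operating-system agent benchmark.
- Cheng, K., et al. (2024). "SeeClick." (see "Embodied & Multimodal Agents") GUI grounding pre-training.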
Agent Distillation and Small Models
- AgentDistill. (2025). "AgentDistill: Training-Free Agent Distillation with Generalizable MCP Boxes." [arXiv:2506.14728] Training-free knowledge distillation via MCP Boxes.
- Wang, Z., et al. (2025). "Distilling LLM Agent into Small Models with Retrieval and Code Tools." [arXiv:2505.17612] Distills agent abilities into small models.
- AgentTrek. (2024). "AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials." [arXiv:2412.09605] Synthesizes agent training trajectories from web tutorials.
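III. Protocol Specifications
| Protocol | Originator | Specification URL | Purpose |
|---|---|---|---|
| MCP | Anthropic | https://modelcontextprotocol.io | Standardized model-tool interaction |
| MCP Spec | Anthropic | https://spec.modelcontextprotocol.io | Formal MCP protocol specification |
| A2A | Google | https://google.github.io/A2A | Agent-to-agent discovery and collaboration |
| ANP | Community-driven | Open standard | Decentralized agent network |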
IV. Official Framework Documentation
| Framework | Official Documentation |
|---|---|
| LangChain | https://python.langchain.com/docs/ |
| LangGraph | https://langchain-ai.github.io/langgraph/ |
| LlamaIndex | https://docs.llamaindex.ai/ |
| CrewAI | https://docs.crewai.com/ |
| AutoGen | https://microsoft.github.io/autogen/ |
| Semantic Kernel | https://learn.microsoft.com/semantic-kernel/ |
| Dify | https://docs.dify.ai/ |
| OpenAI Agents SDK | https://openai.github.io/openai-agents-python/ |
| Google ADK | https://google.github.io/adk-docs/ |
| Strands Agents (AWS) | https://strandsagents.com/ |
V. Recommended Further Reading
Books
- Gulli, A. (2025). Agentic Design Patterns: A Hands-On Guide to Building Intelligent Systems. A systematic design-pattern reference.
- Russell, S. & Norvig, P. (2020). Artificial Intelligence: A Modern Approach (4th ed.). The classic AI textbook; theoretical foundations of agents.
- Bostrom, N. (2014). Superintelligence: Paths, Dangers, Strategies. A classic work on AI safety and alignment.
Technical Blogs and Reports
- Anthropic Research Blog. Continuously updated research on Claude and agents.
- Google DeepMind Blog. Research on Gemini and agent systems.
- Chip Huyen. Building LLM Applications for Production. Engineering practice for production-grade LLM applications.
- Simon Willison's Weblog. An in-depth blog on LLM tool use and agent engineering.
- Lilian Weng. "LLM Powered Autonomous Agents." (2023) A classic blog post on agent system architecture.
- Andrej Karpathy. "The Unreasonable Effectiveness of Recurrent Neural Networks" and his later LLM-related posts.
Online Courses
- Andrew Ng. AI Agentic Design Patterns with AutoGen. DeepLearning.AI course.
- Andrew Ng. Building Agentic RAG with LlamaIndex. DeepLearning.AI course.
- LangChain Academy. Official LangGraph tutorial series.
- Hugging Face. Building AI Agents. Short course series.
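Community Resources
- Awesome Agents (GitHub). A curated list of open-source AI agent projects.
- r/LocalLLaMA (Reddit). Community discussion of local LLMs and agents.
- Awesome-LLM-Agent (GitHub). A collection of LLM agent papers and projects.
- SWE-bench Leaderboard. Leaderboard of coding agent performance.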