DeepSeek thinking mode says reasoning_content must be passed back? A troubleshooting guide for Hermes Agent tool calls and fallback

When you connect Hermes Agent to a DeepSeek thinking model, a very typical HTTP 400 error is:

The 'reasoning_content' in the thinking mode must be passed back to the API.

Or, with newer API shapes, you may see:

The content[].thinking in the thinking mode must be passed back to the API.

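This kind of error is easily misdiagnosed as "the DeepSeek model is unavailable", "the API key is wrong", or "the context is too long". But judging from the related fix discussion in Hermes Agent, it actually points to a narrower compatibility problem: the reasoning / thinking fields returned by a thinking model must be preserved or padded when later requests replay the message history.

This article is based on a real DeepSeek issue in NousResearch/hermes-agent that has been closed and marked completed, and lays out a way to troubleshoot it. The focus is on helping you tell apart when to upgrade Hermes, when to check provider detection, and when the problem lies in fallback / tool-call replay, rather than blindly swapping keys or models.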

The conclusion first: history messages in thinking mode are not ordinary assistant messages

In ordinary OpenAI-compatible chat, an assistant history message usually only needs:

{
  "role": "assistant",
  "content": "..."
}

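But DeepSeek thinking models involve extra reasoning fields, such as reasoning_content or content[].thinking. If the server requires these fields to be passed back in later requests and the client drops them when saving or replaying history, the result is a 400.

So the essence of this error is not that the model "cannot answer", but:

> The client did not fully replay the previous assistant turn's reasoning fields as the thinking-mode conversation protocol requires.

In an agent system this problem is more complex than in an ordinary chat app, because an agent has:

tool-call messages;
history message replay;
context compression;
fallback models;
cron / background tasks;
the Anthropic adapter or other format conversions;
custom base_url / provider aliases.

If any one of these paths drops the reasoning field, some call may blow up with HTTP 400.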
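To make the failure concrete, here is a minimal sketch of how a history store can lose the field. It assumes history entries are plain dicts in chat-completions format; both function names are hypothetical and are not taken from Hermes.

```python
from typing import Any


def lossy_history_entry(response_message: dict[str, Any]) -> dict[str, Any]:
    """What a naive client does: keep only role and content."""
    return {
        "role": response_message["role"],
        "content": response_message.get("content"),
    }


def faithful_history_entry(response_message: dict[str, Any]) -> dict[str, Any]:
    """Keep every key the server sent, including reasoning_content."""
    return dict(response_message)


# Illustrative response message; the values are made up for this example.
response_message = {
    "role": "assistant",
    "content": "The answer is 4.",
    "reasoning_content": "2 + 2 = 4",
}

print("reasoning_content" in lossy_history_entry(response_message))     # False
print("reasoning_content" in faithful_history_entry(response_message))  # True
```

The lossy version looks harmless in ordinary chat, which is why the bug tends to surface only once a thinking model is involved.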

Typical error scenarios

The typical configuration in the issue report is the DeepSeek provider plus a thinking-enabled model, for example:

provider: deepseek
model: deepseek-v4-flash

After sending an ordinary message, Hermes reports:

API call failed (attempt 1/3): BadRequestError [HTTP 400]
Provider: deepseek Model: deepseek-v4-flash
Error: HTTP 400: The 'reasoning_content' in the thinking mode must be passed back to the API.

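Later comments also mention several scenarios that are harder to track down:

it triggers only when DeepSeek is the fallback model;
it works without tool calls and triggers as soon as tool use happens;
the model name on a custom endpoint looks like external-deepseek-v4-pro rather than a built-in provider name;
the context compression or cron path may also replay history messages.

This shows that a fix cannot cover only "main chat + built-in DeepSeek provider"; it also has to cover every path that replays assistant history.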

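The official fix approach: not just passing the field once, but a safety net on every replay path

The fix notes the maintainers gave when closing the issue are clear: the DeepSeek reasoning_content thinking-mode replay path needs end-to-end coverage.

Key points include:

1. use _needs_deepseek_tool_reasoning() for provider detection;
2. detection covers more than provider == "deepseek";
3. a model name containing deepseek should also be recognized;
4. an api.deepseek.com base URL should also be recognized;
5. DeepSeek tool-call turns pin reasoning_content="" at creation time;
6. on replay, if there is no explicit reasoning, the field is still padded with an empty string;
7. both callsites, chat completions and the Anthropic adapter, use the same helper;
8. the cron path goes through AIAgent.run_conversation(), so in theory it should reuse the same path.

The point of this fix is a "unified replay helper", not an ad hoc patch at one particular error site.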

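Why does an empty string matter?

Many people are puzzled when they see reasoning_content="": if there is no reasoning content, why send an empty string at all?

The reason is that for some thinking-mode APIs, whether the field exists is itself part of the protocol.

In other words:

{
  "role": "assistant",
  "content": "result"
}

and:

{
  "role": "assistant",
  "content": "result",
  "reasoning_content": ""
}

may not be equivalent.

If the server requires that "assistant history messages in thinking mode must carry the thinking field back", then even when a turn has no explicit reasoning text you still need to supply the empty field, so the server does not conclude that you lost protocol state.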
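One way an empty field gets lost in practice is a truthiness check during copying. The sketch below is an illustration under the assumption that messages are plain dicts; the function names are hypothetical.

```python
import json


def copy_if_truthy(src: dict) -> dict:
    out = {"role": src["role"], "content": src["content"]}
    if src.get("reasoning_content"):  # bug: "" is falsy, so the key is dropped
        out["reasoning_content"] = src["reasoning_content"]
    return out


def copy_if_present(src: dict) -> dict:
    out = {"role": src["role"], "content": src["content"]}
    if "reasoning_content" in src:  # checks key presence, not truthiness
        out["reasoning_content"] = src["reasoning_content"]
    return out


message = {"role": "assistant", "content": "result", "reasoning_content": ""}

print(json.dumps(copy_if_truthy(message)))
# {"role": "assistant", "content": "result"}
print(json.dumps(copy_if_present(message)))
# {"role": "assistant", "content": "result", "reasoning_content": ""}
```

The two serialized payloads differ only by the presence of the key, which is exactly the difference the server may care about.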

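Troubleshooting checklist: when you hit a reasoning_content 400, check these 7 items first

1. Confirm the Hermes version and commit

If you are on an older Hermes, first upgrade to a version that includes the DeepSeek reasoning replay fix.

Key fixes mentioned in the official discussion include:

93a2d6b3: adds a reasoning_content echo for DeepSeek tool-call messages;
d58b305ad: does DeepSeek provider detection through a helper;
later commits related to replay pad / ordering / cross-provider isolation.

If you can still reproduce it, attach the following when filing an issue: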

git rev-parse HEAD

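along with the model name, provider, base_url, and error log.

2. Check whether DeepSeek is used as a fallback

One commenter reported that the error does not reproduce with DeepSeek as the primary model but does reproduce with DeepSeek as the fallback model.

This means you need to check, once fallback is active:

whether the provider becomes DeepSeek;
whether the model name contains DeepSeek;
whether base_url points to DeepSeek or a DeepSeek-compatible endpoint;
whether the replay helper is also enabled on the fallback path.

Many agent bugs hide in "the main path was fixed, the fallback path was not".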
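The checks above boil down to evaluating detection against the route that is actually in use for the current attempt, rather than caching a decision made for the primary model. A minimal sketch follows; Route, looks_like_deepseek, and replay_flag_for_attempt are hypothetical names, and the example endpoints are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    provider: str
    model: str
    base_url: str


def looks_like_deepseek(route: Route) -> bool:
    """Any one signal is enough: provider name, model name, or endpoint host."""
    return (
        route.provider.lower() == "deepseek"
        or "deepseek" in route.model.lower()
        or "api.deepseek.com" in route.base_url.lower()
    )


def replay_flag_for_attempt(primary: Route, fallback: Route, using_fallback: bool) -> bool:
    """Decide from the route used for this attempt, not from the primary."""
    active = fallback if using_fallback else primary
    return looks_like_deepseek(active)


# Placeholder routes for illustration only.
primary = Route("example-provider", "example-primary-model", "https://primary.example.invalid/v1")
fallback = Route("custom", "external-deepseek-v4-pro", "https://relay.example.invalid/v1")

print(replay_flag_for_attempt(primary, fallback, using_fallback=False))  # False
print(replay_flag_for_attempt(primary, fallback, using_fallback=True))   # True
```

If the flag were computed once at startup from the primary route, the second call would wrongly return False, which matches the "only fails as fallback" symptom.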

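3. Check whether it happens after a tool call

One user reported that everything works without tools, and as soon as a tool is used Hermes reports: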

The content[].thinking in the thinking mode must be passed back to the API.

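This is an important clue. Tool calls introduce more complex structures into the message history:

the assistant tool-call message;
the tool result message;
the assistant final response;
format conversion at replay time.

If the reasoning field was padded only on ordinary text-only assistant messages and the tool-call assistant message was missed, the error still occurs.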
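When debugging this, it can help to scan the outgoing history and see which kind of assistant turn lacks the field. The following is a rough diagnostic sketch, assuming chat-completions-style dicts with an optional tool_calls list; the function name and the get_weather tool are made up for the example.

```python
def find_assistant_turns_missing_reasoning(messages: list[dict]) -> list[tuple[int, str]]:
    """Return (index, kind) for each assistant message without reasoning_content."""
    missing = []
    for index, message in enumerate(messages):
        if message.get("role") != "assistant":
            continue
        if "reasoning_content" in message:
            continue
        kind = "tool-call" if message.get("tool_calls") else "text-only"
        missing.append((index, kind))
    return missing


history = [
    {"role": "user", "content": "What is the weather?"},
    {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {
                "id": "call_1",
                "type": "function",
                "function": {"name": "get_weather", "arguments": "{}"},
            }
        ],
    },
    {"role": "tool", "tool_call_id": "call_1", "content": "sunny"},
    {"role": "assistant", "content": "It is sunny.", "reasoning_content": ""},
]

print(find_assistant_turns_missing_reasoning(history))  # [(1, 'tool-call')]
```

Here the final text-only turn is padded while the tool-call turn is not, which is the pattern described above.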

4. Check the model name on custom endpoints

If you use a custom relay or private endpoint, the model name may not be the standard:

deepseek-v4-pro

but rather:

external-deepseek-v4-pro
my-company/deepseek-v4

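In that case provider detection cannot look only at provider == deepseek; it also has to look at the model name and base_url.

If detection fails to recognize the model, Hermes may not enable the DeepSeek reasoning replay safety net.

5. Check whether the error field is reasoning_content or content[].thinking

Different DeepSeek API versions or compatibility layers may return different field names: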

reasoning_content
content[].thinking

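Both point to the same class of problem: incomplete replay of thinking-mode history messages.

Do not misjudge them as two unrelated problems just because the field names differ.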
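If your own client classifies errors for logging or retries, both wordings can be folded into one category. This is a small sketch that matches only the two messages quoted in this article; the function name is hypothetical and the pattern may need extending for other compatibility layers.

```python
import re

# Matches either field name followed by the "must be passed back" wording.
_THINKING_REPLAY_ERROR = re.compile(
    r"(reasoning_content|content\[\]\.thinking).*must be passed back",
    re.IGNORECASE,
)


def is_thinking_replay_error(error_text: str) -> bool:
    """True when the error text looks like a thinking-mode replay failure."""
    return _THINKING_REPLAY_ERROR.search(error_text) is not None


print(is_thinking_replay_error(
    "HTTP 400: The 'reasoning_content' in the thinking mode must be passed back to the API."
))  # True
print(is_thinking_replay_error(
    "The content[].thinking in the thinking mode must be passed back to the API."
))  # True
print(is_thinking_replay_error("HTTP 401: invalid API key"))  # False
```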

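6. Check whether context compression was involved

If context compression, summary, memory flush, or a cron background task appears before the error, the message history may have been reorganized.

In that case, confirm that the assistant messages after compression still keep, or are padded with, the thinking field. Otherwise the main chat works and the next turn after compression may fail.

7. Check the full provider / model / base_url in the logs

When troubleshooting, do not paste only the final 400 line. At a minimum, confirm: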

provider = ?
model = ?
base_url = ?
fallback active = ?
tool use = ?
context compression = ?

This information determines whether Hermes should enable the DeepSeek-specific replay logic.
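If you collect these details programmatically, a small helper keeps reports consistent. This is only a sketch: the function and field names are hypothetical, and the values shown are examples rather than real log output.

```python
import json


def build_diagnostic_snapshot(
    provider: str,
    model: str,
    base_url: str,
    *,
    is_fallback: bool,
    used_tools: bool,
    context_compressed: bool,
) -> str:
    """Render the six checklist items as JSON for pasting into an issue."""
    snapshot = {
        "provider": provider,
        "model": model,
        "base_url": base_url,  # endpoint URL only; never include the API key
        "is_fallback": is_fallback,
        "used_tools": used_tools,
        "context_compressed": context_compressed,
    }
    return json.dumps(snapshot, indent=2)


print(build_diagnostic_snapshot(
    "deepseek",
    "deepseek-v4-flash",
    "https://api.deepseek.com",
    is_fallback=True,
    used_tools=True,
    context_compressed=False,
))
```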

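What does this have to do with OpenAI-compatible API relays?

OpenAI-compatible API aggregation services such as the DeepAI API relay are suited to:

a single Base URL for multiple models;
centralized API key management;
unified access for tools such as Cherry Studio, Cline, Dify, and Open WebUI;
model routing, billing, and availability management;
less configuration scattered across vendors.

But this kind of "reasoning_content must be passed back" problem has a precondition: the client must handle the model protocol correctly.

In other words, if Hermes or another agent drops the thinking field when replaying history, switching to any relay will not necessarily fix it automatically. A relay can unify the entry point, but it cannot guess the historical reasoning fields that the client deleted.

The correct layering is:

1. first make the client fully support DeepSeek thinking-mode replay;
2. then use the DeepAI API relay to unify the OpenAI-compatible Base URL, key, and model configuration;
3. if you use custom model names, make sure the client's detection recognizes them as DeepSeek thinking models.

Guidance framed this way is accurate and does not pass off a model protocol bug as "switching API relays will fix it".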

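Temporary workarounds

If you urgently need to restore service, you can try:

upgrade Hermes to the latest main / latest release;
for now, do not put a DeepSeek thinking model in the fallback slot;
temporarily turn off thinking mode or switch to a non-thinking model;
avoid triggering complex tool use under a DeepSeek thinking model;
test with the standard deepseek provider / model name to rule out custom-endpoint detection problems;
clear old sessions so the history does not already contain assistant messages with the field missing;
open errors.log and confirm whether the replay path is still dropping the field.

These are workarounds, not the final fix. In the long run every replay callsite still has to pad the thinking field consistently.

Fix suggestions for developers

If you maintain your own agent or OpenAI-compatible client, you can borrow Hermes's approach:

1. Centralize provider/model detection

Do not scatter this everywhere: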

if provider == "deepseek":
    ...

A centralized helper is more robust:

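def needs_deepseek_reasoning(provider, model, base_url):
    # Treat missing values as empty strings so .lower() never fails on None.
    provider = (provider or "").lower()
    model = (model or "").lower()
    base_url = (base_url or "").lower()
    # Any one signal is enough: provider name, model name, or endpoint host.
    return (
        provider == "deepseek"
        or "deepseek" in model
        or "api.deepseek.com" in base_url
    )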

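2. Pad the field when the assistant message is created

Do not wait until just before the API call to patch it. Once a message enters the conversation history, it should stay protocol-complete.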
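A minimal sketch of creation-time padding, assuming messages are plain dicts and that the caller already knows whether the active model needs the field. The constructor name and its needs_reasoning parameter are hypothetical, not Hermes's API.

```python
from typing import Any, Optional


def new_assistant_message(
    content: str,
    *,
    tool_calls: Optional[list[dict[str, Any]]] = None,
    reasoning_content: Optional[str] = None,
    needs_reasoning: bool = False,
) -> dict[str, Any]:
    """Build an assistant history entry that is protocol-complete from the start."""
    message: dict[str, Any] = {"role": "assistant", "content": content}
    if tool_calls:
        message["tool_calls"] = tool_calls
    if reasoning_content is not None:
        # Keep whatever the server returned, even an empty string.
        message["reasoning_content"] = reasoning_content
    elif needs_reasoning:
        # No reasoning text came back; still record the empty field.
        message["reasoning_content"] = ""
    return message


print(new_assistant_message("done", needs_reasoning=True))
# {'role': 'assistant', 'content': 'done', 'reasoning_content': ''}
```

Routing every assistant turn, tool-call or text-only, through one constructor means there is a single place to get this right.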

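3. Pad again at replay time

Historical data may come from older versions, compression results, external imports, or caches, so padding once more before replay is necessary.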
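A replay-time safety net could look roughly like this. It is a sketch under the assumption that an empty string is an acceptable pad value, as the fix notes quoted earlier describe; the function name is hypothetical.

```python
import copy


def pad_reasoning_for_replay(messages: list[dict], needs_reasoning: bool) -> list[dict]:
    """Return a copy of the history with reasoning_content padded where absent.

    The stored history is left untouched, so switching to a provider that
    does not use the field later is not affected by this call.
    """
    padded = copy.deepcopy(messages)
    if not needs_reasoning:
        return padded
    for message in padded:
        # Covers both a missing key and an explicit None from older data.
        if message.get("role") == "assistant" and message.get("reasoning_content") is None:
            message["reasoning_content"] = ""
    return padded


old_history = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "hello"},
    {"role": "assistant", "content": "kept", "reasoning_content": "some reasoning"},
]

for entry in pad_reasoning_for_replay(old_history, needs_reasoning=True):
    print(entry)
```

Existing reasoning text is preserved, user messages are not modified, and only assistant turns lacking the field receive the empty pad.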

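4. Cover both tool-call and text-only assistant messages

Fixing only ordinary text messages is not enough. Tool-call messages are more likely to lose the field.

5. Cover fallback, cron, summary, the Anthropic adapter, and similar paths

The complexity of an agent lies not in the main path but in all the auxiliary paths that "do not look like chat".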

FAQ

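What does "reasoning_content must be passed back" mean?

It means that in the assistant history messages of DeepSeek thinking mode, the server expects to see the reasoning / thinking field returned in the previous turn. The client must not drop it in the next request.

Why do I get this error on my very first question?

Sometimes agent initialization, tool calls, system message preparation, or replay of an existing session means the "first visible question" is not the first message at the API level. The session history may also already contain messages with the field missing.

Why does it work without tools and fail as soon as a tool is used?

Tool calls produce an assistant tool-call message and a tool result message. Many clients handle only text-only assistant messages and miss the reasoning field on the tool-call turn.

What if the error appears only when DeepSeek is the fallback?

Check whether the DeepSeek reasoning replay helper is still enabled once fallback is active. A fixed primary-model path does not mean the fallback path is fixed.

Can the DeepAI API relay solve this problem?

The DeepAI API relay is suited to unifying the Base URL, key, and model management of OpenAI-compatible APIs, but if the client drops the thinking field when replaying history, the relay cannot restore it out of thin air. Fix the client's protocol handling first, then connect through the unified relay.

Summary

When DeepSeek thinking mode reports: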

reasoning_content must be passed back
content[].thinking must be passed back

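the core issue is not that "the model is broken", but:

> The thinking / reasoning fields in assistant history messages were not fully passed back in later requests.

For a tool-using agent like Hermes Agent, the fix must cover main chat, tool calls, fallback, cron, compression, and the various adapters. Padding the field in only one place easily turns into "ordinary chat works, but it breaks again as soon as a tool is used".

If you use DeepSeek thinking models through the DeepAI API relay, OpenRouter, a self-hosted New API, or any other OpenAI-compatible endpoint, it is worth checking along the same lines: a unified entry point matters, but the model protocol fields must not be dropped either.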