Memory and Context Management in Agents in Practice

tunsuy

Published 2026-04-09 11:07:34

Included in the column: 有文化的技术人

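This article analyzes how the trpc-agent-go framework manages session context, covering its core designs: what gets injected into the LLM prompt, the Memory system, Session Summary, and cross-Session information sharing.

1. What Makes Up the Session Context

1.1 What Gets Injected into the LLM Prompt

When a user request is sent to the LLM, ContentRequestProcessor injects several kinds of information into the prompt. The overall composition is: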

┌─────────────────────────────────────────────────────────────────┐
│                    LLM Request Messages                         │
├─────────────────────────────────────────────────────────────────┤
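│  1. System Prompt                                               │
│     - The agent's role definition and behavioral guidance       │
│                                                                 │
│  2. Preloaded Memory ← loaded from MemoryService by UserID      │
│     - User profile, preferences, historical information         │
│                                                                 │
│  3. Session Summary ← taken from the current Session            │
│     - Compressed summary of the earlier conversation            │
│                                                                 │
│  4. History Messages ← taken from Session.Events                │
│     - Recent turns not yet covered by the summary               │
│                                                                 │
│  5. Current User Message                                        │
│     - The user input for this request                           │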
└─────────────────────────────────────────────────────────────────┘

1.2 Code Flow of Message Injection

The injection order is defined in the ProcessRequest method in internal/flow/processor/content.go:

// content.go:229-339 (abridged excerpt: summaryMsg, summaryUpdatedAt, messages and
// needToAddInvocationMessage are set by code omitted here)
func (p *ContentRequestProcessor) ProcessRequest(
    ctx context.Context,
    invocation *agent.Invocation,
    req *model.Request,
    ch chan<- *event.Event,
) {
    // 1. Inject InjectedContextMessages (manually injected context)
    p.injectInjectedContextMessages(invocation, req)

    // 2. Preload Memory right after the System Prompt
    if p.PreloadMemory != 0 && invocation.MemoryService != nil {
        if memMsg := p.getPreloadMemoryMessage(ctx, invocation); memMsg != nil {
            // Insert after the last system message
            systemMsgIndex := findLastSystemMessageIndex(req.Messages)
            req.Messages = append(req.Messages[:systemMsgIndex+1],
                append([]model.Message{*memMsg}, req.Messages[systemMsgIndex+1:]...)...)
        }
    }

    // 3. Inject the Session Summary (if enabled)
    if summaryMsg != nil {
        systemMsgIndex := findLastSystemMessageIndex(req.Messages)
        req.Messages = append(req.Messages[:systemMsgIndex+1],
            append([]model.Message{*summaryMsg}, req.Messages[systemMsgIndex+1:]...)...)
    }

    // 4. Append the history messages
    messages = p.getIncrementMessages(invocation, summaryUpdatedAt)
    req.Messages = append(req.Messages, messages...)

    // 5. Append the current user message (if the history does not already contain it)
    if invocation.Message.Content != "" && needToAddInvocationMessage {
        req.Messages = append(req.Messages, invocation.Message)
    }
}
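
The "insert after the last system message" step can be tried in isolation. The sketch below is not the framework's code: Message is a simplified stand-in for model.Message, and lastSystemIndex and insertAfterLastSystem are hypothetical helpers. It also assumes, for illustration only, that both the memory and the summary are carried as system messages.

```go
package main

import "fmt"

// Message is a simplified stand-in for model.Message (assumption).
type Message struct {
	Role    string
	Content string
}

// lastSystemIndex returns the index of the last system message, or -1 if none.
func lastSystemIndex(msgs []Message) int {
	idx := -1
	for i, m := range msgs {
		if m.Role == "system" {
			idx = i
		}
	}
	return idx
}

// insertAfterLastSystem returns a new slice with msg placed right after the
// last system message, or at the front when there is no system message.
func insertAfterLastSystem(msgs []Message, msg Message) []Message {
	pos := lastSystemIndex(msgs) + 1
	out := make([]Message, 0, len(msgs)+1)
	out = append(out, msgs[:pos]...)
	out = append(out, msg)
	return append(out, msgs[pos:]...)
}

func main() {
	msgs := []Message{
		{Role: "system", Content: "You are a helpful assistant."},
		{Role: "user", Content: "hello"},
	}
	// Memory first, then summary, mirroring steps 2 and 3 above.
	msgs = insertAfterLastSystem(msgs, Message{Role: "system", Content: "## User Memories ..."})
	msgs = insertAfterLastSystem(msgs, Message{Role: "system", Content: "summary ..."})
	for i, m := range msgs {
		fmt.Printf("%d %s: %s\n", i, m.Role, m.Content)
	}
}
```

Because each insertion targets the position after the last system message at that moment, the resulting order is system prompt, memory, summary, then the user message, which matches the layout in 1.1.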

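1.3 Configuration Options

ContentRequestProcessor offers several options that control how the context is composed:

Option	Default	Description
PreloadMemory	0 (disabled)	Number of memories to preload: 0 = disabled, -1 = all, N > 0 = the most recent N
AddSessionSummary	false	Whether to add the session summary
MaxHistoryRuns	0 (unlimited)	Maximum number of history messages (only takes effect when Summary is disabled)
TimelineFilterMode	"all"	Timeline filter: all/request/invocation
BranchFilterMode	"prefix"	Branch filter: prefix/all/exact

2. Preloaded Memory

2.1 What Is Preloaded Memory?

Preloaded Memory is the user's long-term memory, read from MemoryService and injected into the LLM prompt. These memories are stored by UserID rather than SessionID, so they can be shared across Sessions.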

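2.2 Where Memories Come From

Memories come from two sources:

1. Manual addition: added directly through the Memory tools or the API

// The user says "remember that I like coffee"
// The Agent calls the memory_add tool
memService.AddMemory(ctx, userKey, "User likes coffee", []string{"preference", "beverage"})

2. Automatic extraction: extracted from the conversation by a MemoryExtractor

// After the session ends, the framework analyzes the conversation automatically
// and extracts valuable user information
extractor.Extract(ctx, messages, existingMemories)

2.3 How Memory Is Stored

Memory is stored under an AppName + UserID key (note: there is no SessionID):

// memory/memory.go:119-123
type UserKey struct {
    AppName string  // application name
    UserID  string  // user ID (no SessionID)
}

This means:

All Sessions of user "user-123" share the same Memory
├── Session A → extracts a memory: "User is learning Python"
├── Session B → loads the memory automatically: "User is learning Python" ← shared across Sessions!
└── Session C → keeps accumulating memories...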
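
To make the keying concrete, here is a minimal sketch of a store indexed by UserKey. memoryStore and its methods are hypothetical and only illustrate the idea; the real MemoryService has its own interface and storage backends.

```go
package main

import "fmt"

// UserKey mirrors the struct shown above: no SessionID field.
type UserKey struct {
	AppName string
	UserID  string
}

// memoryStore is a hypothetical in-memory stand-in for MemoryService.
type memoryStore struct {
	byUser map[UserKey][]string
}

func newMemoryStore() *memoryStore {
	return &memoryStore{byUser: make(map[UserKey][]string)}
}

// Add appends a memory under the given user key.
func (s *memoryStore) Add(key UserKey, memory string) {
	s.byUser[key] = append(s.byUser[key], memory)
}

// Read returns every memory stored under the given user key.
func (s *memoryStore) Read(key UserKey) []string {
	return s.byUser[key]
}

func main() {
	store := newMemoryStore()
	key := UserKey{AppName: "demo-app", UserID: "user-123"}

	// "Session A" writes a memory.
	store.Add(key, "User is learning Python")

	// "Session B" reads with the same key; no session ID is involved.
	fmt.Println(store.Read(key))
}
```

Since UserKey is a comparable struct, it works directly as a map key, and any Session that builds the same AppName/UserID pair sees the same entries.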

2.4 The Memory Entry Structure

// memory/memory.go:86-100
type Entry struct {
    ID        string    `json:"id"`         // unique identifier
    AppName   string    `json:"app_name"`   // application name
    Memory    *Memory   `json:"memory"`     // memory content
    UserID    string    `json:"user_id"`    // user ID
    CreatedAt time.Time `json:"created_at"` // creation time
    UpdatedAt time.Time `json:"updated_at"` // update time
}

type Memory struct {
    Memory      string     `json:"memory"`                 // memory content
    Topics      []string   `json:"topics,omitempty"`       // topic tags
    LastUpdated *time.Time `json:"last_updated,omitempty"` // last update time
}

2.5 Injection Format of Preloaded Memory

When PreloadMemory is enabled, memories are formatted as a System Message and injected:

// content.go:1067-1079
func formatMemoryContent(memories []*memory.Entry) string {
    var sb strings.Builder
    sb.WriteString("## User Memories\n\n")
    sb.WriteString("The following are memories about the user:\n\n")
    for _, mem := range memories {
        if mem == nil || mem.Memory == nil {
            continue
        }
        fmt.Fprintf(&sb, "ID: %s\n", mem.ID)
        fmt.Fprintf(&sb, "Memory: %s\n\n", mem.Memory.Memory)
    }
    return sb.String()
}

The final injected format:

## User Memories

The following are memories about the user:

ID: mem-001
Memory: User is learning Go language

ID: mem-002
Memory: User works as a backend developer

ID: mem-003
Memory: User prefers concise explanations

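2.6 Configuring How Many Memories to Preload

// PreloadMemory settings:
// 0 = preloading disabled (default)
// -1 = load all memories (warning: may consume a large number of tokens)
// N > 0 = load the most recent N memories

agent.WithContentRequestOptions(
    processor.WithPreloadMemory(10),  // preload the 10 most recent memories
),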
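
The 0 / -1 / N convention is easy to get wrong at the edges, so here is a small sketch of how such a limit could be applied. selectPreload is a hypothetical helper, not the framework's function, and it assumes the input is ordered from oldest to newest and that any negative value means "all".

```go
package main

import "fmt"

// selectPreload applies the 0 / -1 / N convention to a list of memories that
// is assumed to be ordered oldest to newest.
func selectPreload(memories []string, limit int) []string {
	switch {
	case limit == 0:
		return nil // preloading disabled
	case limit < 0 || limit >= len(memories):
		return memories // load everything
	default:
		return memories[len(memories)-limit:] // the most recent N
	}
}

func main() {
	memories := []string{"learning Go", "backend developer", "prefers concise answers"}
	fmt.Println(selectPreload(memories, 0))
	fmt.Println(selectPreload(memories, -1))
	fmt.Println(selectPreload(memories, 2))
}
```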

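3. Automatic Memory Extraction

3.1 Overall Flow

Session ends → check whether extraction should trigger (Checker) → build the extraction request → the LLM analyzes the conversation → tools are called to write Memory

3.2 Trigger Conditions (Checker)

Extraction does not run every time; configured Checkers decide when it does: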

// extractor/checker.go:37
type Checker func(ctx *ExtractionContext) bool

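Built-in Checker:

Triggers once the message count exceeds n (CheckMessageThreshold in the configuration example below)

The ExtractionContext structure:

// extractor/checker.go:21-31
type ExtractionContext struct {
    UserKey       memory.UserKey   // user identifier
    Messages      []model.Message  // the actual message content
    LastExtractAt *time.Time       // time of the last extraction
}

Configuration example:

extractor.NewExtractor(model,
    extractor.WithChecker(extractor.CheckMessageThreshold(5)),     // more than 5 messages
    extractor.WithChecker(extractor.CheckTimeInterval(time.Hour)), // and more than 1 hour since the last extraction
)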

3.3 Extraction Rules (the Core: the Prompt)

What gets extracted is determined entirely by the Prompt (extractor/memory.go:220-256):

You are a Memory Manager for an AI Assistant.
Your task is to analyze the conversation and manage user memories.

<instructions>
1. Analyze the conversation to identify any new or updated information about the user
2. Check if this information is already captured in existing memories
3. Determine if any memories need to be added, updated, or deleted
4. You can call multiple tools in parallel
5. Use the available tools to make the necessary changes
</instructions>

<guidelines>
- Create memories in the third person, e.g., "User enjoys hiking on weekends."
- Keep each memory focused on a single piece of information
- Use update when information changes
- Only use delete when the user explicitly asks to forget something
- Do not create memories for:
  - Transient requests or questions
  - Information already captured
  - Generic conversation that doesn't reveal personal information
</guidelines>

<memory_types>
Capture meaningful personal information such as:
- Personal details: name, age, location, occupation
- Preferences: likes, dislikes, favorites
- Interests and hobbies
- Goals and aspirations
- Important relationships
- Significant life events
- Opinions and beliefs
- Work and education background
</memory_types>

3.4 The Extraction Process in Detail

┌─────────────────────────────────────────────────────────────────┐
│  Step 1: Build the request                                      │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │ System: [Memory Manager Prompt]                             │ │
│  │ <existing_memories>                                         │ │
│  │ - [mem-001] User is learning Go language                   │ │
│  │ - [mem-002] User works as a backend developer              │ │
│  │ </existing_memories>                                        │ │
│  │                                                             │ │
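│  │ User: I've been studying microservice architecture and     │ │
│  │       plan to refactor my project with trpc                │ │
│  │ Assistant: trpc is a good choice...                        │ │
│  │ User: Also, I'm off to Hangzhou on business next month     │ │
│  │ Assistant: Hangzhou is a great place...                    │ │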
│  └────────────────────────────────────────────────────────────┘ │
│  Tools: [memory_add, memory_update, memory_delete, memory_clear] │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│  Step 2: The LLM analyzes and returns tool calls                │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │ Tool Call 1: memory_add                                     │ │
│  │   {                                                         │ │
│  │     "memory": "User is studying microservice architecture   │ │
│  │                and plans to refactor project using trpc",   │ │
│  │     "topics": ["microservices", "trpc", "architecture"]     │ │
│  │   }                                                         │ │
│  │                                                             │ │
│  │ Tool Call 2: memory_add                                     │ │
│  │   {                                                         │ │
│  │     "memory": "User has a business trip to Hangzhou next    │ │
│  │                month",                                      │ │
│  │     "topics": ["travel", "hangzhou"]                        │ │
│  │   }                                                         │ │
│  └────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│  Step 3: Parse tool calls and write to MemoryService            │
│  ├── Operation{Type: Add, Memory: "...", Topics: [...]}        │
│  └── Operation{Type: Add, Memory: "...", Topics: [...]}        │
└─────────────────────────────────────────────────────────────────┘
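
Step 3 can be sketched as a small parser that turns tool calls into operations. The ToolCall and Operation types, the opTypes table, and parseToolCalls below are hypothetical; only the tool names and the memory, topics, and memory_id argument names come from this article.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolCall is a simplified stand-in for a tool call returned by the LLM.
type ToolCall struct {
	Name      string
	Arguments string // raw JSON arguments
}

// Operation is a simplified stand-in for a parsed memory operation.
type Operation struct {
	Type     string
	MemoryID string
	Memory   string
	Topics   []string
}

// toolArgs covers the argument fields used in this sketch.
type toolArgs struct {
	MemoryID string   `json:"memory_id"`
	Memory   string   `json:"memory"`
	Topics   []string `json:"topics"`
}

// opTypes maps tool names to operation types.
var opTypes = map[string]string{
	"memory_add":    "add",
	"memory_update": "update",
	"memory_delete": "delete",
	"memory_clear":  "clear",
}

// parseToolCalls converts tool calls into operations, skipping unknown tools
// and failing on malformed JSON arguments.
func parseToolCalls(calls []ToolCall) ([]Operation, error) {
	ops := make([]Operation, 0, len(calls))
	for _, c := range calls {
		opType, ok := opTypes[c.Name]
		if !ok {
			continue
		}
		var args toolArgs
		if c.Arguments != "" {
			if err := json.Unmarshal([]byte(c.Arguments), &args); err != nil {
				return nil, fmt.Errorf("parse %s arguments: %w", c.Name, err)
			}
		}
		ops = append(ops, Operation{Type: opType, MemoryID: args.MemoryID, Memory: args.Memory, Topics: args.Topics})
	}
	return ops, nil
}

func main() {
	ops, err := parseToolCalls([]ToolCall{
		{Name: "memory_add", Arguments: `{"memory":"User has a business trip to Hangzhou next month","topics":["travel","hangzhou"]}`},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", ops)
}
```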

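3.5 Supported Operation Types

Operation	Tool name	When it is triggered
Add	memory_add	New user information is found
Update	memory_update	User information has changed (requires memory_id)
Delete	memory_delete	The user explicitly asks to forget something
Clear	memory_clear	The user asks to clear all memories

3.6 Deduplication Against Existing Memories

Extraction receives existing []*memory.Entry (the existing memories), and the LLM will:

Check whether the information duplicates an existing memory
Decide whether to add a new memory or update an existing one

// extractor/memory.go:192-206
func (e *memoryExtractor) buildSystemPrompt(existing []*memory.Entry) string {
    // ... append the existing memories to the prompt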
    sb.WriteString("\n<existing_memories>\n")
    for _, entry := range existing {
        fmt.Fprintf(&sb, "- [%s] %s\n", entry.ID, entry.Memory.Memory)
    }
    sb.WriteString("</existing_memories>\n")
}

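3.7 Customizing the Extraction Rules

The extraction logic can be customized with WithPrompt:

customPrompt := `You are a memory manager focused on technical learning.
Extract only the following kinds of information:
- The tech stack the user is learning
- The user's project experience
- Technical difficulties the user has run into
Do not record:
- Personal life information
- Transient questions
`

extractor.NewExtractor(model,
    extractor.WithPrompt(customPrompt),
)

4. Session Summary

4.1 What Is a Session Summary?

A Session Summary is a compressed summary of the conversation history, stored on the Session object. When the history grows too long, the summary can stand in for older messages and reduce token consumption.

4.2 Where the Summary Is Stored

Summaries are stored in the Session.Summaries field, grouped by filterKey: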

// session/session.go:44-56
type Session struct {
    ID        string                 `json:"id"`
    AppName   string                 `json:"appName"`
    UserID    string                 `json:"userID"`
    Events    []event.Event          `json:"events"`
    // Summaries stores summaries by filterKey; the key is the event filter key
    Summaries map[string]*Summary    `json:"summaries,omitempty"`
    UpdatedAt time.Time              `json:"updatedAt"`
    CreatedAt time.Time              `json:"createdAt"`
}

// session/summary.go:48-58
type SessionSummary struct {
    ID        string         `json:"id"`
    Summary   string         `json:"summary"`
    CreatedAt time.Time      `json:"created_at"`
    Metadata  map[string]any `json:"metadata"`
}

4.3 When a Summary Is Generated (Trigger Conditions)

Summary generation is decided by the SessionSummarizer's Checkers:

// summary/checker.go:22
type Checker func(sess *session.Session) bool

Built-in Checker:

Triggers once the number of events in the Session exceeds n (CheckEventThreshold in the configuration example below)

Configuration example:

summarizer := summary.NewSummarizer(model,
    summary.WithChecks(
        summary.CheckEventThreshold(30),      // more than 30 events
        summary.CheckTokenThreshold(10000),   // or more than 10,000 tokens
    ),
    summary.WithMaxSummaryWords(200),         // summary of at most 200 words
)

4.4 The Summary Generation Process

// summary/summarizer.go:194-258
func (s *sessionSummarizer) Summarize(ctx context.Context, sess *session.Session) (string, error) {
    // 1. Select the events to summarize (skipping the most recent N is configurable)
    eventsToSummarize := s.filterEventsForSummary(sess.Events)

    // 2. Extract the conversation text
    conversationText := s.extractConversationText(eventsToSummarize)

    // 3. Call the LLM to generate the summary
    summaryText, err := s.generateSummary(ctx, conversationText)
    if err != nil {
        return "", err
    }

    // 4. Record the timestamp of the last event included in the summary
    s.recordLastIncludedTimestamp(sess, eventsToSummarize)

    return summaryText, nil
}

4.5 Injection Format of the Summary

When AddSessionSummary is enabled, the summary is formatted and injected:

// content.go:429-439
func (p *ContentRequestProcessor) formatSummary(summary string) string {
    return fmt.Sprintf("Here is a brief summary of your previous interactions:\n\n"+
        "<summary_of_previous_interactions>\n%s\n</summary_of_previous_interactions>\n\n"+
        "Note: this information is from previous interactions and may be outdated. "+
        "You should ALWAYS prefer information from this conversation over the past summary.\n", summary)
}

The final injected format:

Here is a brief summary of your previous interactions:

<summary_of_previous_interactions>
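The user previously asked about Go's concurrency model and discussed how to use goroutines and channels.
The user is very interested in performance optimization, with particular attention to memory allocation.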
</summary_of_previous_interactions>

Note: this information is from previous interactions and may be outdated.
You should ALWAYS prefer information from this conversation over the past summary.

4.6 How the Summary Relates to History Messages

When AddSessionSummary is enabled, history messages are handled differently:

┌─────────────────────────────────────────────────────────────────┐
│                     Session.Events timeline                     │
│                                                                 │
│  [old events]────────[Summary UpdatedAt]────────[new events]    │
│       │                       │                      │          │
│       │                       │                      │          │
│       ▼                       ▼                      ▼          │
│  already summarized    summary cutoff         not summarized    │
│  (left out of context)                      (added to context)  │
└─────────────────────────────────────────────────────────────────┘

Code logic:

// content.go:443-513 (simplified)
func (p *ContentRequestProcessor) getIncrementMessages(inv *agent.Invocation, since time.Time) []model.Message {
    // since is the Summary's UpdatedAt; a zero value means there is no summary yet
    isZeroTime := since.IsZero()
    var messages []model.Message
    for _, evt := range inv.Session.Events {
        // Skip events at or before the Summary's UpdatedAt
        if !isZeroTime && !evt.Timestamp.After(since) {
            continue
        }
        // ... convert evt into messages and append them
    }
    return messages
}

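5. History Messages

5.1 Where History Messages Are Stored

History messages are stored in the Session.Events field:

// session/session.go:49
Events   []event.Event          `json:"events"`

Each Event represents one step of an interaction (for example a user message or an assistant reply):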

// event/event.go
type Event struct {
    ID             string          `json:"id"`
    InvocationID   string          `json:"invocation_id"`
    Author         string          `json:"author"`           // "user" or the agent name
    Response       *model.Response `json:"response,omitempty"`
    RequestID      string          `json:"request_id"`       // shared by the events of one request
    FilterKey      string          `json:"filter_key"`       // used for branch filtering
    Timestamp      time.Time       `json:"timestamp"`
    IsPartial      bool            `json:"is_partial"`       // whether this is an intermediate streaming chunk
}

5.2 How Many History Messages per Question-and-Answer?

A question-and-answer exchange does not map to a single history message. One complete exchange can produce several Events:

User asks: "What's the weather like today?"
│
├── Event 1: User Message
│   └── Response: {Choices: [{Message: "What's the weather like today?"}]}
│
├── Event 2: Tool Call (if any)
│   └── Response: {Choices: [{Message: {ToolCalls: [get_weather]}}]}
│
├── Event 3: Tool Result
│   └── Response: {Choices: [{Message: {ToolID: "...", Content: "Sunny, 25°C"}}]}
│
└── Event 4: Assistant Response
    └── Response: {Choices: [{Message: "It is sunny today, 25°C..."}]}

5.3 How History Messages Are Filtered

History messages are filtered according to the configuration:

1. TimelineFilterMode (timeline filtering)

// content.go:41-49
const (
    TimelineFilterAll               = "all"        // all history
    TimelineFilterCurrentRequest    = "request"    // only the current request
    TimelineFilterCurrentInvocation = "invocation" // only the current invocation
)

2. BranchFilterMode (branch filtering)

// content.go:33-39
const (
    BranchFilterModePrefix = "prefix"  // prefix match
    BranchFilterModeAll    = "all"     // include everything
    BranchFilterModeExact  = "exact"   // exact match
)

3. MaxHistoryRuns (maximum history size)

// content.go:505-511
if !p.AddSessionSummary && p.MaxHistoryRuns > 0 &&
    len(messages) > p.MaxHistoryRuns {
    startIdx := len(messages) - p.MaxHistoryRuns
    messages = messages[startIdx:]  // drop the older messages
}
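
The interaction between the two options is worth seeing in isolation. The sketch below restates the condition above as a standalone function; tailHistory is a hypothetical name and plain strings stand in for messages.

```go
package main

import "fmt"

// tailHistory mirrors the truncation condition shown above: the cap applies
// only when summaries are off, the cap is positive, and the history exceeds it.
func tailHistory(messages []string, addSessionSummary bool, maxHistoryRuns int) []string {
	if addSessionSummary || maxHistoryRuns <= 0 || len(messages) <= maxHistoryRuns {
		return messages
	}
	return messages[len(messages)-maxHistoryRuns:]
}

func main() {
	history := []string{"m1", "m2", "m3", "m4", "m5"}
	fmt.Println(tailHistory(history, false, 2)) // summaries off: cap applies
	fmt.Println(tailHistory(history, true, 2))  // summaries on: cap is ignored
	fmt.Println(tailHistory(history, false, 0)) // 0 means unlimited
}
```

In other words, setting MaxHistoryRuns together with AddSessionSummary(true) does not shorten the history; with summaries on, the summary cutoff decides what is kept.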

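5.4 Converting and Merging History Messages

History messages go through some processing when Events are converted into Messages:

1. Replies from other Agents are converted into user context

// If the message comes from another Agent or another branch, convert it into a user message
if p.isOtherAgentReply(inv.AgentName, inv.Branch, &ev) {
    ev = p.convertForeignEvent(&ev)  // converted to the "For context: ..." format
}

2. Reasoning Content handling

// Handles chain-of-thought content from models such as DeepSeek
// By default, reasoning_content from earlier requests is discarded
msg = p.processReasoningContent(msg, evt.RequestID, currentRequestID)

3. Merging consecutive user messages

// Merge consecutive "For context:" messages to avoid repeating the prefix
messages = p.mergeUserMessages(messages)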
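
As an illustration of the merging idea only, here is one way consecutive "For context:" user messages could be folded together. mergeContextMessages and the Message type are hypothetical, and the framework's mergeUserMessages may use different rules and separators.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a simplified stand-in for model.Message (assumption).
type Message struct {
	Role    string
	Content string
}

const contextPrefix = "For context:"

// mergeContextMessages folds each run of consecutive user messages that start
// with the context prefix into one message, keeping the prefix only once.
func mergeContextMessages(msgs []Message) []Message {
	var out []Message
	for _, m := range msgs {
		last := len(out) - 1
		if last >= 0 && m.Role == "user" && out[last].Role == "user" &&
			strings.HasPrefix(out[last].Content, contextPrefix) &&
			strings.HasPrefix(m.Content, contextPrefix) {
			rest := strings.TrimSpace(strings.TrimPrefix(m.Content, contextPrefix))
			out[last].Content += "\n" + rest
			continue
		}
		out = append(out, m)
	}
	return out
}

func main() {
	msgs := []Message{
		{Role: "user", Content: "For context: [planner] said: step 1 done"},
		{Role: "user", Content: "For context: [searcher] said: found 3 results"},
		{Role: "assistant", Content: "Continuing with step 2."},
	}
	for _, m := range mergeContextMessages(msgs) {
		fmt.Printf("%s: %q\n", m.Role, m.Content)
	}
}
```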

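6. Cross-Session Information Sharing

6.1 Default: Sessions Are Isolated from Each Other

A request in the current Session does not directly include history messages from other Sessions. Each Session is independent:

Session A (sessionID: "chat-001")
└── Events: [user asked what Python is, LLM replied...]

Session B (sessionID: "chat-002")  ← current session
└── Events: [user asked how to write a loop in Python, LLM replied...]
   (content from Session A is not included automatically)

6.2 The Cross-Session Sharing Mechanism: Memory

Memory is trpc-agent-go's core answer to cross-Session information sharing.

Memory is stored by AppName + UserID rather than by SessionID:

All Sessions of user "user-123" share the same Memory
├── Session A → extracts a memory: "User is learning Python"
├── Session B → loads the memory automatically: "User is learning Python" ← shared across Sessions!
└── Session C → keeps accumulating memories...

6.3 Ways to Link Related Topics Across Sessions

Approach	Description	When to use
Memory (recommended)	Extract a user profile from conversations and persist it across Sessions	Long-term user preferences, learning progress, work context
Same SessionID	Reuse the old Session instead of creating a new one	You know for certain the conversation continues the same topic
Manually passed context	The business layer injects the previous Session's summary into the request	Precise control over cross-Session information

6.4 Diagram of Cross-Session Memory Sharing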

┌─────────────────────────────────────────────────────────────────┐
│                     Session A (ended)                           │
│  Dialog: "I'm learning Go and plan to move to backend work"     │
│  └─→ Extracted Memory: "User is learning Go; goal: backend dev" │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼ stored in MemoryService (keyed by AppName+UserID)
                              │
┌─────────────────────────────────────────────────────────────────┐
│                     Session B (new session)                     │
│  User: "How do I handle concurrency?"                           │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │ System Prompt:                                              │ │
│  │ ...                                                         │ │
│  │ <memories_about_user>                                       │ │
│  │ - User is learning Go, aiming for backend dev ← Session A  │ │
│  │ </memories_about_user>                                      │ │
│  │ ...                                                         │ │
│  └────────────────────────────────────────────────────────────┘ │
│  LLM: "In Go, use goroutines and channels for concurrency..."   │
└─────────────────────────────────────────────────────────────────┘

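6.5 Best Practices When Two Sessions Cover Related Topics

Scenario: the user discussed project requirements in Session A and continues with implementation details in Session B.

Option 1: enable automatic Memory extraction

agent.WithMemoryService(memService),
agent.WithContentRequestOptions(
    processor.WithPreloadMemory(-1),  // preload all memories
),
agent.WithMemoryExtractor(extractor), // extract memories automatically

Option 2: reuse the same Session

// Do not create a new SessionID; keep using the old one
sess, _ := sessService.GetOrCreate(ctx, appName, userID, "same-session-id")

Option 3: inject context from the business layer

// Look up the previous Session's summary manually and pass it in as context
prevSummary, _ := sessService.GetSessionSummaryText(ctx, prevSession)
agent.Run(ctx, &event.Event{
    Request: &model.Request{
        Messages: []model.Message{
            {Role: "system", Content: "Summary of the previous conversation: " + prevSummary},
            {Role: "user", Content: currentUserInput},
        },
    },
}, invocation)

7. Topic Switching Within a Single Session

7.1 Current Behavior: Everything Goes into the Context

By default, a Session's history messages and summary all go into the context: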

┌─────────────────────────────────────────────────────────────────┐
│  User asks: "What's the weather like today?"                    │
│  ↓                                                              │
│  System Prompt                                                  │
│  + Session Summary (if any)                                     │
│  + History Messages:                                            │
│    - [10 earlier turns discussing Go]  ← possibly irrelevant    │
│    - [user asks about the weather]                              │
│  ↓                                                              │
│  Sent to the LLM                                                │
└─────────────────────────────────────────────────────────────────┘

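7.2 How Well LLMs Resist Distraction

Modern LLMs are in fact fairly good at judging which parts of the context are relevant; this is one of their core capabilities.

7.2.1 The Attention Mechanism Naturally Supports Relevance Weighting

The Self-Attention mechanism in the Transformer architecture computes relevance weights between the current token and every token in the context. Conceptually:

User asks: "What's the weather like today?"

Context:
├── [earlier discussion of Go concurrency]         → low attention weight → little influence
├── [earlier mention that the user is in Hangzhou] → high attention weight → "Hangzhou weather"
└── [current question: weather]                    → highest attention weight

The model tends to give relevant information more weight and irrelevant information less, without being told to.

7.2.2 Instruction Following

Modern instruction-tuned models (GPT-4, Claude, Qwen, and others) will:

Identify the intent of the user's current question
Answer primarily on the basis of the current question
Avoid being led astray by irrelevant context (in most cases)

7.2.3 Behavior in Practice

Scenario	Model behavior	Notes
Topic jumps but is logically related	✅ Handled well	e.g., moving from "Go concurrency" to "Go memory management"
Topic is completely unrelated	✅ Usually handled correctly	e.g., moving from "Go concurrency" to "today's weather"
Context contains misleading information	⚠️ May be affected	e.g., the context says "the user likes Python" but the current question is about Go
Context is too long	⚠️ May miss information	The "Lost in the Middle" problem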

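7.3 If Models Resist Distraction, Why Truncate or Summarize at All?

Even though LLMs resist distraction fairly well, the framework still provides MaxHistoryRuns, Summary, and similar mechanisms, for the following reasons:

7.3.1 Token Cost

Even when irrelevant history does not hurt answer quality, it still wastes tokens:

Input tokens are billed by volume
10 irrelevant turns ≈ a few thousand tokens ≈ real money

7.3.2 Response Latency

The longer the context, the slower the inference:

Attention computation has O(n²) complexity
Long context = slower time to first token

7.3.3 The "Lost in the Middle" Problem

Research shows that when the context is very long, models pay less attention to information in the middle:

[information at the start]  ← high attention
[information in the middle] ← low attention (may be ignored)
[information at the end]    ← high attention

If important information happens to sit in the middle, it may be missed.

7.3.4 A Safety Net for Edge Cases

Models handle most situations correctly, but some edge cases can still go wrong:

The context contains a strong "hint" that conflicts with the current question
The user's question is vaguely worded, and the model may "guess" the intent from the history

7.4 Conclusion: When Topic Switching Deserves Attention

In most scenarios there is no need to worry about topic switching; modern LLMs handle it correctly.

The framework provides truncation and summarization mainly to:

Save cost (token fees)
Reduce latency (faster responses)
Provide a safety net (for extreme cases)

If your application meets the following conditions, doing nothing special is also fine:

Few conversation turns (< 20)
Not cost-sensitive
No strict latency requirements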

7.5 Mitigations the Framework Provides

Option 1: `MaxHistoryRuns` truncation

Keep only the most recent N messages:

llmagent.WithMaxHistoryRuns(5),  // keep only the 5 most recent messages

// content.go:505-511
if !p.AddSessionSummary && p.MaxHistoryRuns > 0 &&
    len(messages) > p.MaxHistoryRuns {
    startIdx := len(messages) - p.MaxHistoryRuns
    messages = messages[startIdx:]  // drop the older messages
}

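Option 2: compress with Summary

With AddSessionSummary turned on, old messages are compressed into a summary:

llmagent.WithAddSessionSummary(true),

The history is split into two parts:

Already summarized: injected as the Summary (short)
Not yet summarized: injected as full messages (the most recent ones)

Option 3: start a new Session

When the topic is completely unrelated, creating a new Session is the cleanest approach:

// New topic → new Session
newSessionID := "weather-chat-001"
sess := session.NewSession(appName, userID, newSessionID)

7.6 The Framework Has No Built-in Smart Topic Switching

trpc-agent-go does not automatically detect topic switches and filter out irrelevant history. That requires one of:

Business-layer implementation: detect the topic change → create a new Session
Custom Prompt: tell the LLM to "focus on the latest question and ignore irrelevant history"
RAG approach: use vector retrieval to load only the relevant history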
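
For the business-layer route, a deliberately naive sketch of topic-change detection is shown below. It compares word overlap (Jaccard similarity) between recent text and the new question. tokenSet, jaccard, topicChanged, and the threshold value are all hypothetical, and whitespace tokenization does not work for languages such as Chinese, so a real system would more likely use embeddings or a classifier.

```go
package main

import (
	"fmt"
	"strings"
)

// tokenSet lowercases s, splits it on whitespace, strips common punctuation,
// and returns the set of remaining words.
func tokenSet(s string) map[string]struct{} {
	set := make(map[string]struct{})
	for _, w := range strings.Fields(strings.ToLower(s)) {
		if w = strings.Trim(w, ".,!?;:\"'"); w != "" {
			set[w] = struct{}{}
		}
	}
	return set
}

// jaccard returns |a ∩ b| / |a ∪ b|, treating two empty sets as identical.
func jaccard(a, b map[string]struct{}) float64 {
	if len(a) == 0 && len(b) == 0 {
		return 1
	}
	inter := 0
	for w := range a {
		if _, ok := b[w]; ok {
			inter++
		}
	}
	return float64(inter) / float64(len(a)+len(b)-inter)
}

// topicChanged reports whether the new question shares too few words with the
// recent conversation text, given a similarity threshold.
func topicChanged(recent, current string, threshold float64) bool {
	return jaccard(tokenSet(recent), tokenSet(current)) < threshold
}

func main() {
	recent := "how do goroutines and channels work in go"
	for _, q := range []string{"are go channels buffered by default", "what is the weather today"} {
		fmt.Printf("%q -> start new session: %v\n", q, topicChanged(recent, q, 0.1))
	}
}
```

When topicChanged returns true, the caller would generate a fresh session ID for the request instead of reusing the current one.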

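7.7 Assessing the Practical Impact

Scenario	Degree of interference	Recommendation
Related topic with a jump	Low	The LLM usually handles it correctly
Completely unrelated topic	Medium	Limit with MaxHistoryRuns
Switch to a sensitive topic	High	Start a new Session
Token-sensitive	High	Use Summary, or truncation when Summary is disabled

7.8 Recommended Configuration

ag := llmagent.New(
    "my-agent",
    // Enable summaries to compress old history
    llmagent.WithAddSessionSummary(true),
    // Cap the number of recent messages; per the check in content.go shown in 5.3,
    // this only takes effect when AddSessionSummary is false
    llmagent.WithMaxHistoryRuns(10),
)

Alternatively, add guidance to the System Prompt:

When answering questions, prioritize the user's current question.
If the current question is unrelated to previous conversations,
focus only on the current question and ignore previous context.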

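8. Summary and Best Practices

8.1 Context Composition Cheat Sheet

Component	Stored in	Cross-Session	Option
System Prompt	Code configuration	-	Agent definition
Preloaded Memory	MemoryService	✅ by UserID	PreloadMemory
Session Summary	Session.Summaries	❌ by SessionID	AddSessionSummary
History Messages	Session.Events	❌ by SessionID	MaxHistoryRuns
Current Message	Invocation.Message	-	-

8.2 Memory vs Summary vs History

Property	Memory	Summary	History
Scope	User level (cross-Session)	Session level	Session level
Content type	User profile/preferences	Conversation summary	Raw conversation
How it is produced	LLM extraction/manual addition	LLM compression	Recorded automatically
Token cost	Low (distilled)	Medium (compressed)	High (complete)
Information completeness	Low (generalized)	Medium (summarized)	High (complete)
Best suited for	Long-term user profile	Compressing long conversations	Short-term context

8.3 Typical Configuration Scenarios

Scenario 1: short conversations, where complete context matters

agent.WithContentRequestOptions(
    processor.WithPreloadMemory(10),       // preload the 10 most recent memories
    processor.WithAddSessionSummary(false), // no summary
    processor.WithMaxHistoryRuns(0),        // no limit on history
),

Scenario 2: long conversations, where saving tokens matters

agent.WithContentRequestOptions(
    processor.WithPreloadMemory(5),        // preload 5 memories
    processor.WithAddSessionSummary(true), // enable the summary
),
agent.WithSummarizer(summarizer),          // configure the summarizer

Scenario 3: multiple Sessions, where cross-session continuity matters

agent.WithMemoryService(memService),       // enable the Memory service
agent.WithMemoryExtractor(extractor),      // enable automatic extraction
agent.WithContentRequestOptions(
    processor.WithPreloadMemory(-1),       // preload all memories
),

8.4 Design Takeaways

Session isolation is a design choice: it keeps unrelated history from interfering with the current conversation
Memory is the official mechanism for sharing information across Sessions, stored per user
Summary is a token optimization that compresses old history
Topic switching has to be handled in the business layer; the framework does not detect it automatically
Choose configuration options to fit the scenario; no single configuration fits all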

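This article is part of the Tencent Cloud self-media syndication program and is shared from a WeChat official account.

Originally published: 2026-03-05. If there is any infringement, please contact cloudcommunity@tencent.com for removal.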

This article is shared from the WeChat official account 有文化的技术人.
