RAG Implementation Approaches in Real-World Agent Applications

tunsuy

Published 2026-04-09 10:10:09

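1. RAG Fundamentals

1.1 What Is RAG?

RAG (Retrieval-Augmented Generation) is an architecture that combines external knowledge retrieval with large language model (LLM) generation: relevant documents are retrieved at query time and handed to the model as context.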

┌──────────────────────────────────────────────────────────────────────────────┐
│                          RAG Core Idea                                       │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   Problem: LLM knowledge is frozen at training time; no live or private data │
│                                                                              │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │  Plain LLM                                                           │   │
│   │                                                                     │   │
│   │  User asks: "What is the company's latest leave policy?"            │   │
│   │                     │                                               │   │
│   │                     ▼                                               │   │
│   │            LLM (doesn't know)                                        │   │
│   │                     │                                               │   │
│   │                     ▼                                               │   │
│   │  "Sorry, I don't have that information."                             │   │
│   │                                                                     │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │  With RAG                                                            │   │
│   │                                                                     │   │
│   │  User asks: "What is the company's latest leave policy?"            │   │
│   │                     │                                               │   │
│   │                     ▼                                               │   │
│   │          ┌─────────────────────┐                                   │   │
│   │          │  Vector KB search   │ ← holds company docs, policies      │   │
│   │          └─────────────────────┘                                   │   │
│   │                     │                                               │   │
│   │                     ▼ relevant chunks retrieved                      │   │
│   │          ┌─────────────────────┐                                   │   │
│   │          │  LLM + results      │                                     │   │
│   │          └─────────────────────┘                                   │   │
│   │                     │                                               │   │
│   │                     ▼                                               │   │
│   │  "Per the latest 2024 company policy, annual leave..."               │   │
│   │                                                                     │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

1.2 RAG vs Fine-tuning

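Dimension	RAG	Fine-tuning
Knowledge updates	Near real time; edit the documents and re-sync	Requires retraining the model
Cost	Low (storage and retrieval only)	High (GPU training required)
Explainability	High (answers can be traced to sources)	Low (knowledge is baked into the parameters)
Hallucination	Grounded in real documents, so fewer hallucinations	May fabricate content
Typical use cases	Factual Q&A, knowledge bases	Style transfer, task-specific behavior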

1.3 The RAG Flow End to End

┌──────────────────────────────────────────────────────────────────────────────┐
│                          RAG End-to-End Flow                                 │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   [Offline: build the knowledge base]                                        │
│                                                                              │
│   Sources ────► Parsing ────► Chunking ────► Embedding ────► Vector store    │
│   (PDF/MD/...)  (get text)    (split)        (vectorize)     (VectorDB)      │
│                                                                              │
│   ─────────────────────────────────────────────────────────────────────────  │
│                                                                              │
│   [Online: retrieve and generate]                                            │
│                                                                              │
│   User query                                                                 │
│       │                                                                      │
│       ▼                                                                      │
│   Query rewrite (optional) ──► "latest policy" → "2024 company leave policy" │
│       │                                                                      │
│       ▼                                                                      │
│   Query embedding ───────► turn the query text into a vector                 │
│       │                                                                      │
│       ▼                                                                      │
│   Vector search ─────────► find similar chunks in the vector store          │
│       │                                                                      │
│       ▼                                                                      │
│   Reranking (optional) ──► re-score the hits with a more precise model      │
│       │                                                                      │
│       ▼                                                                      │
│   Prompt assembly ───────► combine retrieved chunks with the user question  │
│       │                                                                      │
│       ▼                                                                      │
│   LLM generates the answer                                                   │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

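2. RAG Architecture in trpc-agent-go

This section walks through how the trpc-agent-go framework implements RAG end to end: knowledge base construction, synchronization, the retrieval flow, and session integration.

2.1 Overall Architecture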

┌──────────────────────────────────────────────────────────────────────────────┐
│                    trpc-agent-go RAG Architecture                            │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                        Knowledge (knowledge base)                    │   │
│   │                                                                     │   │
│   │   Core interface: Search(ctx, SearchRequest) → SearchResponse        │   │
│   │                                                                     │   │
│   └───────────────────────────────┬─────────────────────────────────────┘   │
│                                   │                                         │
│          ┌────────────────────────┼────────────────────────┐                │
│          │                        │                        │                │
│          ▼                        ▼                        ▼                │
│   ┌─────────────┐         ┌─────────────┐         ┌─────────────┐          │
│   │   Source    │         │  Embedder   │         │ VectorStore │          │
│   │(data source)│         │ (embedding) │         │ (vector DB) │          │
│   ├─────────────┤         ├─────────────┤         ├─────────────┤          │
│   │ • File      │         │ • OpenAI    │         │ • InMemory  │          │
│   │ • Directory │         │ • Ollama    │         │ • pgvector  │          │
│   │ • URL       │         │ • HuggingFace│        │ • Milvus    │          │
│   │ • Auto      │         │ • Gemini    │         │ • Qdrant    │          │
│   └─────────────┘         └─────────────┘         │ • ES        │          │
│                                                   │ • TCVector  │          │
│                                                   └─────────────┘          │
│                                                                              │
│          ┌────────────────────────┼────────────────────────┐                │
│          │                        │                        │                │
│          ▼                        ▼                        ▼                │
│   ┌─────────────┐         ┌─────────────┐         ┌─────────────┐          │
│   │  Chunking   │         │  Retriever  │         │  Reranker   │          │
│   │ (splitting) │         │ (retrieval) │         │ (reranking) │          │
│   ├─────────────┤         ├─────────────┤         ├─────────────┤          │
│   │ • Fixed     │         │ • Default   │         │ • TopK      │          │
│   │ • Recursive │         │             │         │ • Cohere    │          │
│   │ • Markdown  │         │             │         │ • Infinity  │          │
│   │ • JSON      │         │             │         │             │          │
│   └─────────────┘         └─────────────┘         └─────────────┘          │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

2.2 Building the Knowledge Base

2.2.1 Build Flow

┌──────────────────────────────────────────────────────────────────────────────┐
│                          Knowledge Base Build Flow                           │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   Step 1: configure components                                               │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                                                                     │   │
│   │   kb := knowledge.New(                                              │   │
│   │       knowledge.WithEmbedder(embedder),     // embedding model       │   │
│   │       knowledge.WithVectorStore(store),     // vector store          │   │
│   │       knowledge.WithSources(sources...),    // data sources          │   │
│   │       knowledge.WithReranker(reranker),     // reranker (optional)   │   │
│   │       knowledge.WithEnableSourceSync(true), // incr. sync (optional) │   │
│   │   )                                                                 │   │
│   │                                                                     │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
│   Step 2: load data                                                          │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                                                                     │   │
│   │   err := kb.Load(ctx)                                               │   │
│   │                                                                     │   │
│   │   Internal flow:                                                     │   │
│   │   Source.ReadDocuments() → Chunking → Embedder → VectorStore.Add() │   │
│   │                                                                     │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
│   Step 3: search                                                             │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                                                                     │   │
│   │   resp, err := kb.Search(ctx, &knowledge.SearchRequest{             │   │
│   │       Query:      "leave policy",                                    │   │
│   │       MaxResults: 5,                                                │   │
│   │       MinScore:   0.7,                                              │   │
│   │   })                                                                │   │
│   │                                                                     │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

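2.2.2 Data Sources (Source)

The framework supports several source types:

Source type	Description	Typical use
FileSource	A single file	Loading a specific document
DirSource	All files in a directory	Bulk-loading documents
URLSource	Web page content	Fetching online documentation
AutoSource	Detects the source type automatically	Inputs whose type is not known up front

// Source interface definition
type Source interface {
    ReadDocuments(ctx context.Context) ([]*document.Document, error)
    Name() string                    // source name (used for incremental sync)
    Type() string                    // source type
    GetMetadata() map[string]any     // metadata (used when generating document IDs)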
}

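2.2.3 Text Chunking (Chunking)

The chunking strategy determines how a document is split into pieces suitable for retrieval:

Strategy	Description	Typical use
FixedChunking	Fixed-size chunks	General purpose
RecursiveChunking	Recursive splitting along a hierarchy of separators	Structured text
MarkdownChunking	Splits along Markdown heading levels	Markdown documents
JSONChunking	Splits along the JSON structure	JSON data

// Chunking interface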
type Chunking interface {
    Chunk(doc *document.Document) ([]*document.Document, error)
}
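
As an illustration of the simplest strategy, the sketch below splits text into fixed-size chunks with an overlap between neighbors. It shows the general technique only; chunkFixed and its size/overlap parameters are assumptions made for this example and do not describe how the framework's FixedChunking is implemented.

```go
package main

import "fmt"

// chunkFixed splits text into chunks of at most size runes. Consecutive
// chunks share overlap runes so a sentence cut at a boundary still appears
// whole in one of them. It works on runes, so multi-byte text is not cut
// in the middle of a character.
func chunkFixed(text string, size, overlap int) ([]string, error) {
    if size <= 0 || overlap < 0 || overlap >= size {
        return nil, fmt.Errorf("need size > 0 and 0 <= overlap < size, got size=%d overlap=%d", size, overlap)
    }
    runes := []rune(text)
    var chunks []string
    step := size - overlap
    for start := 0; start < len(runes); start += step {
        end := start + size
        if end > len(runes) {
            end = len(runes)
        }
        chunks = append(chunks, string(runes[start:end]))
        if end == len(runes) {
            break // the final chunk reached the end of the text
        }
    }
    return chunks, nil
}

func main() {
    chunks, err := chunkFixed("abcdefghij", 4, 1)
    if err != nil {
        fmt.Println("chunking failed:", err)
        return
    }
    fmt.Println(chunks) // [abcd defg ghij]
}
```

Larger overlaps improve recall at chunk boundaries but increase the number of vectors to embed and store.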

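2.2.4 Vector Storage (VectorStore)

The framework supports several vector databases:

VectorStore	Characteristics	Typical use
InMemory	In-memory; data is lost on restart	Development and testing
pgvector	PostgreSQL extension	Existing PostgreSQL infrastructure
Milvus	Dedicated vector database, high performance	Large-scale production
Qdrant	Lightweight vector database	Small to medium deployments
Elasticsearch	Supports hybrid retrieval (vector + keyword)	Workloads that also need full-text search
TCVector	Tencent Cloud vector database	Tencent Cloud environments

// Core VectorStore interface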
type VectorStore interface {
    Add(ctx context.Context, doc *document.Document, embedding []float64) error
    Get(ctx context.Context, id string) (*document.Document, error)
    Update(ctx context.Context, doc *document.Document, embedding []float64) error
    Delete(ctx context.Context, id string) error
    Search(ctx context.Context, query *SearchQuery) ([]*SearchResult, error)
    DeleteByFilter(ctx context.Context, opts ...DeleteOption) error
    Count(ctx context.Context) (int, error)
    GetMetadata(ctx context.Context, opts ...GetMetadataOption) (map[string]DocumentMetadata, error)
}

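2.3 Keeping the Knowledge Base in Sync

Synchronization is one of the hard problems in a RAG system: how do you handle added, modified, and deleted documents efficiently without reprocessing what has not changed?

2.3.1 Problem Analysis: Why Is Synchronization Hard?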

┌──────────────────────────────────────────────────────────────────────────────┐
│                    Core Challenges of Knowledge Base Sync                    │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   Scenario:                                                                  │
│   • Sources: 100 Markdown documents                                          │
│   • Vector store: 500 vectors already stored (5 chunks per document)         │
│   • Changes: 2 documents modified, 1 deleted, 3 added                        │
│                                                                              │
│   ─────────────────────────────────────────────────────────────────────────  │
│                                                                              │
│   Problems with the conventional approaches:                                 │
│                                                                              │
│   ┌────────────────────────────────────────────────────────────────────┐    │
│   │  Option A: full rebuild                                             │    │
│   │                                                                    │    │
│   │  Every sync: clear the store → reload all documents → re-embed all  │    │
│   │                                                                    │    │
│   │  Problems:                                                          │    │
│   │  • ❌ Wastes compute (97 unchanged documents are reprocessed)       │    │
│   │  • ❌ High Embedding API cost                                       │    │
│   │  • ❌ Knowledge base is unavailable during the sync                 │    │
│   └────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
│   ┌────────────────────────────────────────────────────────────────────┐    │
│   │  Option B: compare file modification times                          │    │
│   │                                                                    │    │
│   │  Compare each file's mtime and process only files that changed      │    │
│   │                                                                    │    │
│   │  Problems:                                                          │    │
│   │  • ❌ URL sources have no mtime                                     │    │
│   │  • ❌ mtime can be unreliable when syncing across systems           │    │
│   │  • ❌ False positives: a touched file with unchanged content        │    │
│   └────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
│   The framework's approach: incremental sync based on content hashes         │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘
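
The last weakness of Option B is easy to reproduce. The sketch below writes a temporary file, bumps its modification time the way touch would, and compares the two change signals. It uses only the Go standard library; fileState, snapshot and touchDemo are names made up for this example.

```go
package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
    "os"
    "path/filepath"
    "time"
)

// fileState captures the two change signals compared above.
type fileState struct {
    ModTime time.Time
    Hash    string
}

// snapshot reads a file's modification time and the SHA-256 of its content.
func snapshot(path string) (fileState, error) {
    info, err := os.Stat(path)
    if err != nil {
        return fileState{}, err
    }
    data, err := os.ReadFile(path)
    if err != nil {
        return fileState{}, err
    }
    sum := sha256.Sum256(data)
    return fileState{ModTime: info.ModTime(), Hash: hex.EncodeToString(sum[:])}, nil
}

// touchDemo reports whether mtime and content hash changed after a "touch".
func touchDemo() (mtimeChanged, hashChanged bool, err error) {
    dir, err := os.MkdirTemp("", "sync-demo")
    if err != nil {
        return false, false, err
    }
    defer os.RemoveAll(dir)

    path := filepath.Join(dir, "doc.md")
    if err := os.WriteFile(path, []byte("# API Guide\n"), 0o644); err != nil {
        return false, false, err
    }
    before, err := snapshot(path)
    if err != nil {
        return false, false, err
    }
    // Simulate `touch`: move mtime forward without changing the content.
    later := before.ModTime.Add(time.Hour)
    if err := os.Chtimes(path, later, later); err != nil {
        return false, false, err
    }
    after, err := snapshot(path)
    if err != nil {
        return false, false, err
    }
    return !after.ModTime.Equal(before.ModTime), after.Hash != before.Hash, nil
}

func main() {
    mtimeChanged, hashChanged, err := touchDemo()
    if err != nil {
        fmt.Println("demo failed:", err)
        return
    }
    fmt.Println("mtime changed:", mtimeChanged)
    fmt.Println("hash changed: ", hashChanged)
}
```

An mtime-based sync would re-embed this file; a content-hash-based sync would skip it.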

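2.3.2 Core Mechanism: Content-Hash Document IDs

The core idea: derive each document's unique ID from its content, so the ID changes when the content changes and stays the same when it does not.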

┌──────────────────────────────────────────────────────────────────────────────┐
│                    Document ID Generation                                    │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   DocumentID = SHA256(                                                       │
│       SourceName  +     // source name (same file name, different source)    │
│       URI         +     // document path/URL (different files, same source)  │
│       Content     +     // document content (key: new content, new ID)       │
│       ChunkIndex  +     // chunk index (different chunks, same document)     │
│       SourceMetadata    // source metadata (user-defined filter fields)      │
│   )                                                                          │
│                                                                              │
│   ─────────────────────────────────────────────────────────────────────────  │
│                                                                              │
│   Example:                                                                   │
│                                                                              │
│   Document A: api-guide.md, chunk 0, content = "# API Guide\nHow to call..." │
│   → ID = sha256("docs" + "api-guide.md" + "# API Guide..." + "0" + "{}")     │
│   → ID = "a1b2c3d4e5f6..."                                                   │
│                                                                              │
│   ─────────────────────────────────────────────────────────────────────────  │
│                                                                              │
│   api-guide.md is edited; content is now "# API Guide\nNew call style..."    │
│   → ID = sha256("docs" + "api-guide.md" + "# API Guide\nNew..." + "0" + "{}")│
│   → ID = "f6e5d4c3b2a1..."   ← the ID changed                                │
│                                                                              │
│   ─────────────────────────────────────────────────────────────────────────  │
│                                                                              │
│   This makes change detection simple:                                        │
│   • New ID not in the vector store → new or changed content; process it      │
│   • New ID already in the vector store → content unchanged; skip it          │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

Implementation:

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
    "sort"
    "strconv"
)

// generateDocumentID derives a unique document ID from the chunk's source,
// location, content, and index.
func generateDocumentID(sourceName, uri, content string, chunkIndex int, sourceMetadata map[string]any) string {
    hasher := sha256.New()

    hasher.Write([]byte(sourceName))
    hasher.Write([]byte(":"))
    hasher.Write([]byte(uri))
    hasher.Write([]byte(":"))
    hasher.Write([]byte(content))                      // key point: the content is part of the hash
    hasher.Write([]byte(":"))
    hasher.Write([]byte(strconv.Itoa(chunkIndex)))

    // Sort the metadata keys before hashing so that identical metadata
    // always produces the same hash, regardless of map iteration order.
    if len(sourceMetadata) > 0 {
        keys := make([]string, 0, len(sourceMetadata))
        for k := range sourceMetadata {
            keys = append(keys, k)
        }
        sort.Strings(keys)
        for _, k := range keys {
            hasher.Write([]byte(k))
            hasher.Write([]byte(fmt.Sprintf("%v", sourceMetadata[k])))
        }
    }

    return hex.EncodeToString(hasher.Sum(nil))
}
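
One property of plain concatenation is worth knowing about: without unambiguous field boundaries, different inputs can feed the hasher the same bytes. In the function above, metadata {"a": "bc"} and {"ab": "c"} both contribute "abc", and a source name containing ":" can blur into the URI. Whether this matters depends on the inputs. A common remedy, shown below as a hypothetical variant rather than the framework's code, is to prefix every field with its length; writeField and docID are names invented for this sketch.

```go
package main

import (
    "crypto/sha256"
    "encoding/binary"
    "encoding/hex"
    "fmt"
    "hash"
    "sort"
    "strconv"
)

// writeField hashes an 8-byte big-endian length followed by the field bytes,
// so field boundaries cannot be confused with field content.
func writeField(h hash.Hash, s string) {
    var n [8]byte
    binary.BigEndian.PutUint64(n[:], uint64(len(s)))
    h.Write(n[:])
    h.Write([]byte(s))
}

// docID is a length-prefixed variant of the content-hash ID shown above.
func docID(sourceName, uri, content string, chunkIndex int, meta map[string]any) string {
    h := sha256.New()
    for _, field := range []string{sourceName, uri, content, strconv.Itoa(chunkIndex)} {
        writeField(h, field)
    }
    // Sorted keys keep the hash independent of map iteration order.
    keys := make([]string, 0, len(meta))
    for k := range meta {
        keys = append(keys, k)
    }
    sort.Strings(keys)
    for _, k := range keys {
        writeField(h, k)
        writeField(h, fmt.Sprintf("%v", meta[k]))
    }
    return hex.EncodeToString(h.Sum(nil))
}

func main() {
    oldID := docID("docs", "api-guide.md", "# API Guide\nold content", 0, nil)
    newID := docID("docs", "api-guide.md", "# API Guide\nnew content", 0, nil)
    fmt.Println("content change gives a new ID:", oldID != newID)

    m1 := docID("docs", "a.md", "x", 0, map[string]any{"a": "bc"})
    m2 := docID("docs", "a.md", "x", 0, map[string]any{"ab": "c"})
    fmt.Println("ambiguous metadata stays distinct:", m1 != m2)
}
```

Switching the ID scheme changes every ID, so an existing store would be fully re-embedded on the next sync.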

2.3.3 The Incremental Sync Flow in Detail

Enable incremental sync:

kb := knowledge.New(
    knowledge.WithEnableSourceSync(true),  // the key option
    knowledge.WithVectorStore(vectorStore),
    knowledge.WithEmbedder(embedder),
)

What happens when Load() is called:

┌──────────────────────────────────────────────────────────────────────────────┐
│                       Load() Incremental Sync Flow in Detail                 │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│ Step 1: index the documents already in the vector store                      │
│ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ │
│                                                                              │
│   refreshAllDocInfo()                                                        │
│   │                                                                          │
│   │   Fetch metadata for every document already in the VectorStore          │
│   │   ┌─────────────────────────────────────────────────────────────────┐   │
│   │   │  vectorStore.GetMetadata(ctx)                                   │   │
│   │   │  returns: { "a1b2c3...": {URI, SourceName, ChunkIndex}, ... }    │   │
│   │   └─────────────────────────────────────────────────────────────────┘   │
│   │                                                                          │
│   └──► Build three cache maps:                                               │
│        • cacheMetaInfo:   {doc ID → doc info}            // by ID            │
│        • cacheURIInfo:    {URI → [doc info...]}          // by file path     │
│        • cacheSourceInfo: {source name → [doc info...]}  // by source        │
│                                                                              │
│ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ │
│                                                                              │
│ Step 2: walk the sources and process each document                           │
│ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ │
│                                                                              │
│   for each chunk in each source:                                             │
│   │                                                                          │
│   │   1. Generate the document ID (content hash)                             │
│   │      docID = generateDocumentID(sourceName, uri, content, chunkIdx, meta)│
│   │                                                                          │
│   │   2. Call shouldProcessDocument(doc) to decide whether to process it     │
│   │      ┌─────────────────────────────────────────────────────────────┐    │
│   │      │                                                             │    │
│   │      │   Check 1: is docID already in processedDocIDs?              │    │
│   │      │           → yes: skip (already handled in this sync)         │    │
│   │      │                                                             │    │
│   │      │   Check 2: is docID already in processingDocIDs?             │    │
│   │      │           → yes: skip (another goroutine is handling it)     │    │
│   │      │                                                             │    │
│   │      │   Check 3: does this URI exist in the vector store?          │    │
│   │      │           → no: new file, process it ✅                      │    │
│   │      │                                                             │    │
│   │      │   Check 4: does the stored ID for this URI match the new ID? │    │
│   │      │           → yes: content unchanged, skip ⏭️                  │    │
│   │      │           → no: content changed, process it ✅               │    │
│   │      │                                                             │    │
│   │      └─────────────────────────────────────────────────────────────┘    │
│   │                                                                          │
│   │   3. If it needs processing:                                             │
│   │      → generate the embedding                                            │
│   │      → store it in the VectorStore (replaces old chunks of the same URI) │
│   │      → processedDocIDs.Store(docID)  // mark as processed                │
│   │                                                                          │
│   └──► Processing complete                                                   │
│                                                                              │
│ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ │
│                                                                              │
│ Step 3: remove orphan documents (handles deletions)                          │
│ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ │
│                                                                              │
│   cleanupOrphanDocuments()                                                   │
│   │                                                                          │
│   │   "Orphan" = in the vector store but not processed during this sync      │
│   │                                                                          │
│   │   How orphans arise:                                                     │
│   │   • Source file deleted → its chunks are never visited                   │
│   │   • File modified → chunks with the old IDs are never marked processed   │
│   │                                                                          │
│   │   Logic:                                                                 │
│   │   ┌─────────────────────────────────────────────────────────────────┐   │
│   │   │  toDelete = []                                                  │   │
│   │   │  for docID in cacheMetaInfo:          // every doc in the store  │   │
│   │   │      if docID not in processedDocIDs: // missed by this sync     │   │
│   │   │          toDelete.append(docID)                                 │   │
│   │   │                                                                 │   │
│   │   │  vectorStore.DeleteByFilter(toDelete)  // batch delete           │   │
│   │   └─────────────────────────────────────────────────────────────────┘   │
│   │                                                                          │
│   └──► Cleanup complete                                                      │
│                                                                              │
│ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ │
│                                                                              │
│ Step 4: refresh the caches                                                   │
│ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ │
│                                                                              │
│   refreshAllDocInfo()                                                        │
│   → Re-read metadata from the VectorStore so the caches reflect the new state│
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘
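
Stripped of concurrency and caching, steps 2 and 3 reduce to a set difference between the IDs in the store and the IDs seen in this pass. The sketch below is a simplified model, not the framework's code: because the ID already encodes the URI, checks 3 and 4 collapse into a single ID lookup, and planSync and syncPlan are hypothetical names. It also assumes that unchanged chunks are recorded as seen during the pass, so that cleanup does not remove them.

```go
package main

import (
    "fmt"
    "sort"
)

// syncPlan lists what one sync pass would do with each document ID.
type syncPlan struct {
    Embed  []string // new or changed chunks: embed and store
    Skip   []string // unchanged chunks: nothing to do
    Delete []string // orphans: in the store but not seen in this pass
}

// planSync compares the IDs already in the store with the IDs generated from
// the current sources. Assumption: skipped (unchanged) IDs also count as seen.
func planSync(stored map[string]bool, current []string) syncPlan {
    seen := make(map[string]bool, len(current))
    var plan syncPlan
    for _, id := range current {
        if seen[id] {
            continue // already handled in this pass (check 1)
        }
        seen[id] = true
        if stored[id] {
            plan.Skip = append(plan.Skip, id)
        } else {
            plan.Embed = append(plan.Embed, id)
        }
    }
    for id := range stored {
        if !seen[id] {
            plan.Delete = append(plan.Delete, id)
        }
    }
    sort.Strings(plan.Delete) // map iteration order is random; sort for stable output
    return plan
}

func main() {
    stored := map[string]bool{"a1b2c3": true, "p1q2r3": true, "k0k0k0": true}
    current := []string{"f6e5d4", "x1y2z3", "k0k0k0"}
    plan := planSync(stored, current)
    fmt.Println("embed: ", plan.Embed)
    fmt.Println("skip:  ", plan.Skip)
    fmt.Println("delete:", plan.Delete)
}
```

Here a1b2c3 is the old version of an edited file, p1q2r3 belongs to a deleted file, and k0k0k0 is untouched.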

2.3.4 Handling the Three Kinds of Change

┌──────────────────────────────────────────────────────────────────────────────┐
│                    Scenario 1: Document Content Updated                      │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   Original document: api-guide.md                                            │
│   Original content: "# API Guide\nold content..."                            │
│   Original ID: a1b2c3...                                                     │
│                                                                              │
│   Modified content: "# API Guide\nnew content..."                            │
│   New ID: f6e5d4... (content changed, so the ID changed)                     │
│                                                                              │
│   Sync flow:                                                                 │
│   1. refreshAllDocInfo() → cacheURIInfo["api-guide.md"] = [{ID: a1b2c3...}] │
│   2. Reach api-guide.md while iterating; new ID = f6e5d4...                  │
│   3. shouldProcessDocument() checks:                                         │
│      - URI "api-guide.md" is in cacheURIInfo ✓                               │
│      - but new ID f6e5d4... ≠ old ID a1b2c3...                               │
│      - → returns true: process it                                            │
│   4. Generate the embedding and store it (new ID)                            │
│   5. processedDocIDs.Store("f6e5d4...")                                      │
│   6. cleanupOrphanDocuments():                                               │
│      - a1b2c3... is in cacheMetaInfo ✓                                       │
│      - a1b2c3... is not in processedDocIDs ✓                                 │
│      - → delete a1b2c3... (the old version)                                  │
│                                                                              │
│   Result: the old vector is deleted and the new one is added                 │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

┌──────────────────────────────────────────────────────────────────────────────┐
│                    Scenario 2: Document Added                                │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   New file: new-feature.md                                                   │
│                                                                              │
│   Sync flow:                                                                 │
│   1. refreshAllDocInfo() → cacheURIInfo has no "new-feature.md"              │
│   2. Reach new-feature.md while iterating; ID = x1y2z3...                    │
│   3. shouldProcessDocument() checks:                                         │
│      - URI "new-feature.md" is not in cacheURIInfo                           │
│      - → returns true: process it (new file)                                 │
│   4. Generate the embedding and store it                                     │
│   5. processedDocIDs.Store("x1y2z3...")                                      │
│                                                                              │
│   Result: the new vector is added                                            │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

┌──────────────────────────────────────────────────────────────────────────────┐
│                    Scenario 3: Document Deleted                              │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   Deleted file: old-doc.md (removed from the file system by the user)        │
│   In the vector store: 3 chunks of old-doc.md with IDs p1q2r3, s4t5u6, v7w8x9│
│                                                                              │
│   Sync flow:                                                                 │
│   1. refreshAllDocInfo() → cacheMetaInfo contains {p1q2r3, s4t5u6, v7w8x9}   │
│   2. Iterate the sources: old-doc.md is gone, so it is never visited         │
│   3. processedDocIDs will not contain p1q2r3, s4t5u6, v7w8x9                 │
│   4. cleanupOrphanDocuments():                                               │
│      - p1q2r3: in cacheMetaInfo ✓, not in processedDocIDs ✓ → delete         │
│      - s4t5u6: in cacheMetaInfo ✓, not in processedDocIDs ✓ → delete         │
│      - v7w8x9: in cacheMetaInfo ✓, not in processedDocIDs ✓ → delete         │
│                                                                              │
│   Result: every vector of the deleted document is cleaned up                 │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘
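
The orphan check in scenario 3 is a set difference: IDs known to the vector store minus IDs seen in the current pass. The sketch below illustrates that idea only. The function name `findOrphanIDs` and the plain slice/map types are assumptions made for this example, not the framework's actual `cleanupOrphanDocuments()` implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// findOrphanIDs returns the IDs that exist in the vector store (stored) but
// were not visited during the current sync pass (processed).
// Hypothetical helper for illustration only.
func findOrphanIDs(stored []string, processed map[string]bool) []string {
	var orphans []string
	for _, id := range stored {
		if !processed[id] {
			orphans = append(orphans, id)
		}
	}
	sort.Strings(orphans) // deterministic order for logging and tests
	return orphans
}

func main() {
	// Three chunks belong to the deleted old-doc.md; a1b2c3 belongs to a
	// document that still exists and was re-visited in this pass.
	stored := []string{"p1q2r3", "s4t5u6", "v7w8x9", "a1b2c3"}
	processed := map[string]bool{"a1b2c3": true}
	fmt.Println(findOrphanIDs(stored, processed))
}
```

The returned IDs are the ones a cleanup step would then delete from the vector store.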

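2.3.5 When sync is triggered: the framework does not sync automatically

Important: the framework has no built-in scheduled sync. The Load() method must be called explicitly by the user.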

┌──────────────────────────────────────────────────────────────────────────────┐
│                    When sync is triggered                                    │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   Design philosophy:                                                         │
│   • The framework provides the sync capability but imposes no sync policy    │
│   • When and how often to sync is up to the user and the business scenario   │
│   • This keeps the design flexible across different use cases                │
│                                                                              │
│   ─────────────────────────────────────────────────────────────────────────  │
│                                                                              │
│   Common ways to trigger a sync:                                             │
│                                                                              │
│   ┌────────────────────────────────────────────────────────────────────┐    │
│   │  Option 1: sync at service startup (most common)                   │    │
│   │                                                                    │    │
│   │  func main() {                                                     │    │
│   │      kb := knowledge.New(...)                                      │    │
│   │      if err := kb.Load(ctx); err != nil {  // load at startup      │    │
│   │          log.Fatalf("load knowledge failed: %v", err)              │    │
│   │      }                                                             │    │
│   │      // start the service...                                       │    │
│   │  }                                                                 │    │
│   │                                                                    │    │
│   │  Best for: rarely changing docs; updates can wait for a restart    │    │
│   └────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
│   ┌────────────────────────────────────────────────────────────────────┐    │
│   │  Option 2: scheduled sync (recommended)                            │    │
│   │                                                                    │    │
│   │  // run a background goroutine that syncs periodically             │    │
│   │  go func() {                                                       │    │
│   │      ticker := time.NewTicker(1 * time.Hour) // sync every hour    │    │
│   │      defer ticker.Stop()                                           │    │
│   │      for range ticker.C {                                          │    │
│   │          if err := kb.Load(ctx); err != nil {                      │    │
│   │              log.Errorf("sync failed: %v", err)                    │    │
│   │          }                                                         │    │
│   │      }                                                             │    │
│   │  }()                                                               │    │
│   │                                                                    │    │
│   │  Best for: periodic refreshes; real-time updates are not needed    │    │
│   └────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
│   ┌────────────────────────────────────────────────────────────────────┐    │
│   │  Option 3: event-driven sync                                       │    │
│   │                                                                    │    │
│   │  // watch file-change events (for example with fsnotify)           │    │
│   │  for event := range watcher.Events {                               │    │
│   │      if event.Op&fsnotify.Write == fsnotify.Write {                │    │
│   │          kb.Load(ctx)  // sync when a file changes                 │    │
│   │      }                                                             │    │
│   │  }                                                                 │    │
│   │                                                                    │    │
│   │  // or receive sync signals through a message queue                │    │
│   │  for msg := range syncChannel {                                    │    │
│   │      kb.Load(ctx)                                                  │    │
│   │  }                                                                 │    │
│   │                                                                    │    │
│   │  Best for: near-real-time updates that are latency-sensitive       │    │
│   └────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
│   ┌────────────────────────────────────────────────────────────────────┐    │
│   │  Option 4: sync triggered through an API                           │    │
│   │                                                                    │    │
│   │  // expose an admin endpoint for triggering sync manually          │    │
│   │  http.HandleFunc("/admin/sync",                                    │    │
│   │      func(w http.ResponseWriter, r *http.Request) {                │    │
│   │          if err := kb.Load(ctx); err != nil {                      │    │
│   │              http.Error(w, err.Error(), 500)                       │    │
│   │              return                                                │    │
│   │          }                                                         │    │
│   │          w.Write([]byte("sync completed"))                         │    │
│   │      })                                                            │    │
│   │                                                                    │    │
│   │  Best for: operators who need manual control over sync timing      │    │
│   └────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘
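
When several of these triggers are combined (say, an hourly ticker plus an admin endpoint), two syncs can start at the same time. The article does not say whether Load() is safe to call concurrently, so the sketch below assumes it is not and skips a trigger that arrives while another sync is running. `syncGuard` and `TrySync` are hypothetical names introduced for this example; `sync.Mutex.TryLock` requires Go 1.18 or later.

```go
package main

import (
	"fmt"
	"sync"
)

// syncGuard wraps a sync function so that overlapping triggers are skipped
// instead of running concurrently. Hypothetical helper, not a framework type.
type syncGuard struct {
	mu   sync.Mutex
	load func() error // e.g. func() error { return kb.Load(ctx) }
}

// TrySync runs load unless another sync is already in flight.
// The boolean reports whether load actually ran.
func (g *syncGuard) TrySync() (bool, error) {
	if !g.mu.TryLock() {
		return false, nil // another sync is running; skip this trigger
	}
	defer g.mu.Unlock()
	return true, g.load()
}

func main() {
	var g *syncGuard
	g = &syncGuard{load: func() error {
		// Simulate a second trigger arriving while the first sync runs.
		ran, _ := g.TrySync()
		fmt.Println("nested trigger ran:", ran)
		return nil
	}}
	ran, err := g.TrySync()
	fmt.Println("outer trigger ran:", ran, "err:", err)
}
```

Skipping is only one possible policy; queuing exactly one follow-up sync is another reasonable choice when no change event may be lost.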

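2.4 The knowledge retrieval flow in detail

Retrieval is the core of a RAG system: it determines how a user query is turned into the most relevant documents. This section walks through the complete retrieval flow.

2.4.1 Retrieval flow overview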

┌──────────────────────────────────────────────────────────────────────────────┐
│                    The Search() retrieval flow                               │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   Call: kb.Search(ctx, &SearchRequest{Query: "how to request annual leave"}) │
│                                                                              │
│   ┌────────────────────────────────────────────────────────────────────┐    │
│   │                                                                    │    │
│   │  Step 1: Query enhancement                                         │    │
│   │  ────────────────────────────────────────────────────────────────  │    │
│   │  "how to request annual leave" →                                   │    │
│   │    "2024 company employee annual leave request process,            │    │
│   │     leave policy, leave request system"                            │    │
│   │                                                                    │    │
│   │  Purpose: broaden the query's semantics to improve recall          │    │
│   │  Component: QueryEnhancer (optional; default: no-op Passthrough)   │    │
│   │                                                                    │    │
│   └────────────────────────────────────────────────────────────────────┘    │
│                          │                                                   │
│                          ▼                                                   │
│   ┌────────────────────────────────────────────────────────────────────┐    │
│   │                                                                    │    │
│   │  Step 2: Query embedding                                           │    │
│   │  ────────────────────────────────────────────────────────────────  │    │
│   │  "2024 company employee annual..." → [0.12, -0.34, 0.56, ...]      │    │
│   │                                                                    │    │
│   │  Purpose: turn the text into a vector for similarity computation   │    │
│   │  Component: Embedder (OpenAI, Ollama, HuggingFace, etc.)           │    │
│   │                                                                    │    │
│   └────────────────────────────────────────────────────────────────────┘    │
│                          │                                                   │
│                          ▼                                                   │
│   ┌────────────────────────────────────────────────────────────────────┐    │
│   │                                                                    │    │
│   │  Step 3: Vector search                                             │    │
│   │  ────────────────────────────────────────────────────────────────  │    │
│   │  Compute cosine similarity in the vector store and return the      │    │
│   │  Top-K most similar documents                                      │    │
│   │                                                                    │    │
│   │  Supported search modes:                                           │    │
│   │  • SearchModeVector  - pure vector search (default)                │    │
│   │  • SearchModeKeyword - pure keyword search                         │    │
│   │  • SearchModeHybrid  - hybrid search (vector + keyword)            │    │
│   │  • SearchModeFilter  - filter only (no similarity computation)     │    │
│   │                                                                    │    │
│   │  Component: VectorStore (Milvus, pgvector, Qdrant, ES, etc.)       │    │
│   │                                                                    │    │
│   └────────────────────────────────────────────────────────────────────┘    │
│                          │                                                   │
│                          ▼                                                   │
│   ┌────────────────────────────────────────────────────────────────────┐    │
│   │                                                                    │    │
│   │  Step 4: Reranking                                                 │    │
│   │  ────────────────────────────────────────────────────────────────  │    │
│   │  Re-order the Top-K results with a finer-grained model             │    │
│   │                                                                    │    │
│   │  Purpose: improve precision, since vector recall does not always   │    │
│   │  rank the most relevant document first                             │    │
│   │  Component: Reranker (TopK, Cohere, Infinity, etc.; optional)      │    │
│   │                                                                    │    │
│   └────────────────────────────────────────────────────────────────────┘    │
│                          │                                                   │
│                          ▼                                                   │
│   ┌────────────────────────────────────────────────────────────────────┐    │
│   │                                                                    │    │
│   │  Step 5: Return the results                                        │    │
│   │  ────────────────────────────────────────────────────────────────  │    │
│   │  Return a SearchResult with the most relevant documents and their  │    │
│   │  similarity scores                                                 │    │
│   │                                                                    │    │
│   └────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

2.4.2 Core implementation

Retriever.Retrieve(), the retriever's core logic:

// Retrieve runs the full RAG retrieval flow.
func (dr *DefaultRetriever) Retrieve(ctx context.Context, q *Query) (*Result, error) {
    // Step 1: query enhancement (if a QueryEnhancer is configured)
    finalQuery := q.Text
    if dr.queryEnhancer != nil {
        queryReq := &query.Request{
            Query:     q.Text,
            History:   q.History,    // conversation history can provide context
            UserID:    q.UserID,
            SessionID: q.SessionID,
        }
        enhanced, err := dr.queryEnhancer.EnhanceQuery(ctx, queryReq)
        if err != nil {
            return nil, err
        }
        finalQuery = enhanced.Enhanced  // use the enhanced query
    }

    // Step 2: query embedding
    var embedding []float64
    if dr.embedder != nil && finalQuery != "" {
        var err error
        embedding, err = dr.embedder.GetEmbedding(ctx, finalQuery)
        if err != nil {
            return nil, err
        }
    }

    // Step 3: vector search
    searchResults, err := dr.vectorStore.Search(ctx, &vectorstore.SearchQuery{
        Query:      finalQuery,       // query text after enhancement (used by hybrid search)
        Vector:     embedding,        // query vector
        Limit:      q.Limit,          // maximum number of results
        MinScore:   q.MinScore,       // minimum similarity threshold
        Filter:     convertFilter(q.Filter),  // filter conditions
        SearchMode: q.SearchMode,     // search mode
    })
    if err != nil {
        return nil, err
    }

    // Step 4: reranking (if a Reranker is configured)
    rerankerResults := convertToRerankerFormat(searchResults)
    if dr.reranker != nil {
        rerankerResults, err = dr.reranker.Rerank(ctx, &reranker.Query{
            Text:       q.Text,       // original query
            FinalQuery: finalQuery,   // enhanced query
            History:    q.History,
        }, rerankerResults)
        if err != nil {
            return nil, err
        }
    }

    // Step 5: return the results
    return &Result{
        Documents: convertToFinalFormat(rerankerResults),
    }, nil
}
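
Step 3 is delegated to the vector store, so the similarity math never appears in the retriever itself. As a rough illustration of what a pure vector search does with Limit and MinScore, here is a brute-force sketch. The names `cosine`, `topK`, and `scored` are made up for this example; production stores such as Milvus or Qdrant use approximate indexes rather than a full scan.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// scored pairs a document ID with its similarity to the query.
type scored struct {
	ID    string
	Score float64
}

// cosine returns the cosine similarity of a and b. It returns 0 when the
// lengths differ or either vector has zero norm.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// topK scores every document, drops those below minScore, and returns at
// most limit results by descending score (limit <= 0 means no limit).
func topK(query []float64, docs map[string][]float64, limit int, minScore float64) []scored {
	var out []scored
	for id, vec := range docs {
		if s := cosine(query, vec); s >= minScore {
			out = append(out, scored{ID: id, Score: s})
		}
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Score != out[j].Score {
			return out[i].Score > out[j].Score
		}
		return out[i].ID < out[j].ID // stable tie-break
	})
	if limit > 0 && len(out) > limit {
		out = out[:limit]
	}
	return out
}

func main() {
	docs := map[string][]float64{
		"leave-policy":  {1, 0},
		"api-reference": {0, 1},
		"hr-guide":      {1, 1},
	}
	for _, r := range topK([]float64{1, 0}, docs, 2, 0.5) {
		fmt.Printf("%s %.3f\n", r.ID, r.Score)
	}
}
```

With these toy vectors, "api-reference" is orthogonal to the query and falls below the 0.5 threshold, so only the other two documents are returned.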

2.4.3 Search modes in detail

The framework supports four search modes, each suited to different scenarios:

┌──────────────────────────────────────────────────────────────────────────────┐
│                    Comparison of the four search modes                       │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   SearchModeVector (vector search)                                           │
│   ┌────────────────────────────────────────────────────────────────────┐    │
│   │  Principle: cosine similarity between query and document vectors   │    │
│   │  Pros: strong semantics; "annual leave" matches "time-off request" │    │
│   │  Cons: insensitive to exact keywords                               │    │
│   │  Best for: most semantic search scenarios                          │    │
│   └────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
│   SearchModeKeyword (keyword search)                                         │
│   ┌────────────────────────────────────────────────────────────────────┐    │
│   │  Principle: traditional keyword matching based on BM25 or similar  │    │
│   │  Pros: sensitive to exact keywords; "API-V2.1" matches exactly     │    │
│   │  Cons: weak semantics; "annual leave" does not match "time off"    │    │
│   │  Best for: technical docs, code search, other exact-match needs    │    │
│   │  Requires: a store with full-text indexing (e.g. ES, or pgvector   │    │
│   │  with TSVector enabled)                                            │    │
│   └────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
│   SearchModeHybrid (hybrid search)                                           │
│   ┌────────────────────────────────────────────────────────────────────┐    │
│   │  Principle: run vector and keyword search together, then fuse the  │    │
│   │  two result sets                                                   │    │
│   │  Pros: balances semantic understanding and keyword precision       │    │
│   │  Cons: higher compute cost                                         │    │
│   │  Best for: scenarios that need both recall and precision           │    │
│   │  Requires: a store supporting both vector and full-text search     │    │
│   └────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
│   SearchModeFilter (filter search)                                           │
│   ┌────────────────────────────────────────────────────────────────────┐    │
│   │  Principle: metadata filtering only, no similarity computation     │    │
│   │  Pros: fastest of the four modes                                   │    │
│   │  Cons: results cannot be ranked by relevance                       │    │
│   │  Best for: lookups by known document ID or by tag                  │    │
│   └────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

Usage examples:

// Vector search (default)
result, err := kb.Search(ctx, &knowledge.SearchRequest{
    Query:      "how to request annual leave",
    MaxResults: 5,
    SearchMode: vectorstore.SearchModeVector,
})

// Hybrid search (result and err are reused, so assign with = rather than :=)
result, err = kb.Search(ctx, &knowledge.SearchRequest{
    Query:      "API-V2.1 usage example",
    MaxResults: 5,
    SearchMode: vectorstore.SearchModeHybrid,
})

// Search with filter conditions
result, err = kb.Search(ctx, &knowledge.SearchRequest{
    Query:      "deployment process",
    MaxResults: 5,
    SearchFilter: &knowledge.SearchFilter{
        Metadata: map[string]any{
            "doc_type": "operations manual",
            "version":  "v2",
        },
    },
})
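
Hybrid mode fuses a vector ranking with a keyword ranking, but the article does not say which fusion method each vector store uses. One widely used option is Reciprocal Rank Fusion (RRF), sketched below as an illustration rather than as the framework's actual behavior. The function name `rrfFuse` and the constant k = 60 (a conventional default) are assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// rrfFuse merges several ranked ID lists with Reciprocal Rank Fusion:
// score(d) = sum over lists of 1 / (k + rank(d)), where rank starts at 1.
// Documents missing from a list simply contribute nothing for that list.
func rrfFuse(k float64, rankings ...[]string) []string {
	scores := make(map[string]float64)
	for _, ranking := range rankings {
		for i, id := range ranking {
			scores[id] += 1.0 / (k + float64(i+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool {
		if scores[ids[i]] != scores[ids[j]] {
			return scores[ids[i]] > scores[ids[j]]
		}
		return ids[i] < ids[j] // stable tie-break
	})
	return ids
}

func main() {
	vectorHits := []string{"A", "B", "C"}  // ranked by cosine similarity
	keywordHits := []string{"B", "D", "A"} // ranked by a keyword score such as BM25
	fmt.Println(rrfFuse(60, vectorHits, keywordHits))
}
```

RRF works on ranks only, which sidesteps the problem that cosine similarities and BM25 scores live on different scales.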

2.4.4 Query enhancer

A query enhancer rewrites the user's original query before retrieval in order to improve recall.

┌──────────────────────────────────────────────────────────────────────────────┐
│                    What query enhancement does                               │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   Problem: user queries are often short or colloquial, which hurts recall    │
│                                                                              │
│   Example:                                                                   │
│   ┌────────────────────────────────────────────────────────────────────┐    │
│   │                                                                    │    │
│   │   User query: "how do I take annual leave"                         │    │
│   │                                                                    │    │
│   │   Possible problems with direct retrieval:                         │    │
│   │   • Documents use formal wording such as "leave request" or        │    │
│   │     "time-off procedure"                                           │    │
│   │   • Vector similarity may be too low, so recall suffers            │    │
│   │                                                                    │    │
│   │   Enhanced query: "2024 company employee annual leave time-off     │    │
│   │   request process system rules"                                    │    │
│   │                                                                    │    │
│   │   Effect:                                                          │    │
│   │   • Adds synonyms and related terms                                │    │
│   │   • The query vector covers a wider semantic space                 │    │
│   │   • Recall can improve noticeably                                  │    │
│   │                                                                    │    │
│   └────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

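Query enhancers available with the framework:

Enhancer              Description                                        Use case
PassthroughEnhancer   Returns the original query unchanged               Default; simple scenarios
LLM-based Enhancer    Expands the query with an LLM (you implement it)   High-quality requirements

Custom query enhancer example: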

// Implement the Enhancer interface
type LLMQueryEnhancer struct {
    llm model.ChatModel
}

func (e *LLMQueryEnhancer) EnhanceQuery(ctx context.Context, req *query.Request) (*query.Enhanced, error) {
    // Build a prompt that asks the LLM to expand the query
    prompt := fmt.Sprintf(`Expand the following user query into a more complete search query by adding synonyms and related terms:
User query: %s
Expanded query:`, req.Query)
    
    resp, err := e.llm.Generate(ctx, prompt)
    if err != nil {
        return nil, err
    }
    
    return &query.Enhanced{
        Enhanced: resp.Content,
        Keywords: extractKeywords(resp.Content),
    }, nil
}

// Use the custom enhancer
kb := knowledge.New(
    knowledge.WithQueryEnhancer(&LLMQueryEnhancer{llm: myLLM}),
    // ... other options
)

2.4.5 Reranker

The Top-K results of vector search are not necessarily ordered by true relevance; a reranker re-orders them with a finer-grained model.

┌──────────────────────────────────────────────────────────────────────────────┐
│                    What a reranker does                                      │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   Top-10 results after vector search (sorted by vector similarity):          │
│   ┌────────────────────────────────────────────────────────────────────┐    │
│   │  1. benefits-policy-overview.md similarity: 0.89                   │    │
│   │  2. 2024-leave-management.md    similarity: 0.87  ← most relevant  │    │
│   │  3. company-regulations.md      similarity: 0.85                   │    │
│   │  4. hr-system-guide.md          similarity: 0.82                   │    │
│   │  5. leave-request-process.md    similarity: 0.80  ← also relevant  │    │
│   │  ...                                                                │    │
│   └────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
│   Results after reranking (re-ordered by the finer-grained model):           │
│   ┌────────────────────────────────────────────────────────────────────┐    │
│   │  1. 2024-leave-management.md    score: 0.95  ← moved to first      │    │
│   │  2. leave-request-process.md    score: 0.92  ← moved to second     │    │
│   │  3. benefits-policy-overview.md score: 0.78                        │    │
│   │  4. hr-system-guide.md          score: 0.65                        │    │
│   │  5. company-regulations.md      score: 0.52                        │    │
│   │  ...                                                                │    │
│   └────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
│   How it works:                                                              │
│   • Vector search is the "recall" stage and aims for coverage                │
│   • The reranker is the "fine-ranking" stage and aims for precision          │
│   • Rerankers typically use a Cross-Encoder that scores the query and        │
│     document together as one input                                           │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

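Rerankers built into the framework:

Reranker   Description                                  Characteristics
TopK       Truncates to the Top-K without re-ordering   Default; no extra overhead
Cohere     Uses the Cohere Rerank API                   High precision; requires an API key
Infinity   Uses an Infinity rerank service              Can be self-hosted

Reranker configuration example: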

import "trpc.group/trpc-go/trpc-agent-go/knowledge/reranker/cohere"

// Use the Cohere reranker
cohereReranker := cohere.New(
    cohere.WithAPIKey("your-api-key"),
    cohere.WithModel("rerank-multilingual-v3.0"),
    cohere.WithTopN(5),  // return the top 5 after reranking
)

kb := knowledge.New(
    knowledge.WithReranker(cohereReranker),
    // ... other options
)
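
To make the recall-then-rerank split concrete, the sketch below re-scores candidates with a pluggable scoring function and keeps the top N. It is a toy: `termOverlap` merely counts shared words and stands in for a real Cross-Encoder or a hosted rerank API. All names here (`candidate`, `rerank`, `termOverlap`) are hypothetical and unrelated to the framework's Reranker interface.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// candidate is a document returned by the recall stage.
type candidate struct {
	Title string
	Score float64
}

// rerank re-scores every candidate with scoreFn(query, title), sorts by the
// new score in descending order, and keeps the first topN (topN <= 0 keeps all).
func rerank(query string, cands []candidate, topN int, scoreFn func(query, doc string) float64) []candidate {
	out := make([]candidate, len(cands))
	for i, c := range cands {
		out[i] = candidate{Title: c.Title, Score: scoreFn(query, c.Title)}
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	if topN > 0 && len(out) > topN {
		out = out[:topN]
	}
	return out
}

// termOverlap is a toy stand-in for a cross-encoder: the fraction of query
// terms that occur in the document text.
func termOverlap(query, doc string) float64 {
	terms := strings.Fields(strings.ToLower(query))
	if len(terms) == 0 {
		return 0
	}
	lowerDoc := strings.ToLower(doc)
	hits := 0
	for _, t := range terms {
		if strings.Contains(lowerDoc, t) {
			hits++
		}
	}
	return float64(hits) / float64(len(terms))
}

func main() {
	recalled := []candidate{
		{"Employee benefits overview", 0.89},
		{"2024 leave management policy", 0.87},
		{"Leave request process", 0.80},
	}
	for _, c := range rerank("leave policy", recalled, 2, termOverlap) {
		fmt.Printf("%s %.2f\n", c.Title, c.Score)
	}
}
```

Because the scorer sees the query and the document together, it can promote a document that ranked lower in the recall stage, which is the effect the diagram above describes.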

2.5 Integrating the knowledge base with conversations

2.5.1 Integration options at a glance

┌──────────────────────────────────────────────────────────────────────────────┐
│                    Ways to integrate a knowledge base with an Agent          │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   Option 1: integrate as a Tool (recommended)                                │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                                                                     │   │
│   │   KB ──► Tool ──► registered on Agent ──► LLM decides when to call  │   │
│   │                                                                     │   │
│   │   Pros:                                                             │   │
│   │   • The LLM decides on its own whether retrieval is needed          │   │
│   │   • Supports multiple knowledge bases; the LLM picks the right one  │   │
│   │   • Managed uniformly alongside other Tools                         │   │
│   │                                                                     │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
│   Option 2: automatic context injection                                      │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                                                                     │   │
│   │   Query ──► retrieve ──► inject into System Prompt ──► LLM answers  │   │
│   │                                                                     │   │
│   │   Pros:                                                             │   │
│   │   • Retrieval runs on every turn, so nothing is missed              │   │
│   │   Best for: a well-scoped knowledge base that every question needs  │   │
│   │                                                                     │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

2.5.2 Option 1: integrating as a Tool

// Create the knowledge search Tool
searchTool := tool.NewKnowledgeSearchTool(
    kb,  // knowledge base instance
    tool.WithToolName("search_company_docs"),
    tool.WithToolDescription("Search internal company documents, including policies, standards, and technical docs"),
    tool.WithMaxResults(5),
    tool.WithMinScore(0.7),
)

// Register it with the Agent
agent := llmagent.New(
    llmagent.WithModel(model),
    llmagent.WithTools(searchTool),  // register the knowledge search Tool
)

Tool definition structs:

// KnowledgeSearchRequest is the Tool's input.
type KnowledgeSearchRequest struct {
    Query string `json:"query" jsonschema:"description=Search query text"`
}

// KnowledgeSearchResponse is the Tool's output.
type KnowledgeSearchResponse struct {
    Documents []*DocumentResult `json:"documents"`
    Message   string            `json:"message,omitempty"`
}

The Tool definition as the LLM sees it:

{
  "name": "search_company_docs",
  "description": "Search internal company documents, including policies, standards, and technical docs",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "Search query text"
      }
    },
    "required": ["query"]
  }
}

2.5.3 Managing multiple knowledge bases

┌──────────────────────────────────────────────────────────────────────────────┐
│                    Multiple knowledge bases                                  │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   Scenario: several knowledge bases exist; the LLM must pick the right one   │
│                                                                              │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                                                                     │   │
│   │   Knowledge base 1: company policy documents                        │   │
│   │   Knowledge base 2: technical API documentation                     │   │
│   │   Knowledge base 3: product user manuals                            │   │
│   │                                                                     │   │
│   │   Each knowledge base is wrapped as its own Tool:                   │   │
│   │   • search_policy_docs   - "company policies and HR documents"      │   │
│   │   • search_api_docs      - "technical API docs and dev standards"   │   │
│   │   • search_product_docs  - "product user manuals and FAQ"           │   │
│   │                                                                     │   │
│   │   The LLM chooses based on the user's question:                     │   │
│   │   • "How do I request annual leave?" → search_policy_docs           │   │
│   │   • "What are the getUserInfo parameters?" → search_api_docs        │   │
│   │   • "How do I reset my password?" → search_product_docs             │   │
│   │                                                                     │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

// Multi-knowledge-base configuration example
policyKB := knowledge.New(/* policy document options */)
apiKB := knowledge.New(/* API document options */)
productKB := knowledge.New(/* product manual options */)

// Create one Tool per knowledge base
policyTool := tool.NewKnowledgeSearchTool(policyKB,
    tool.WithToolName("search_policy_docs"),
    tool.WithToolDescription("Search company policies, HR documents, and administrative rules"),
)

apiTool := tool.NewKnowledgeSearchTool(apiKB,
    tool.WithToolName("search_api_docs"),
    tool.WithToolDescription("Search technical API docs, development standards, and architecture designs"),
)

productTool := tool.NewKnowledgeSearchTool(productKB,
    tool.WithToolName("search_product_docs"),
    tool.WithToolDescription("Search product user manuals, how-to guides, and FAQs"),
)

// Register all of them with the Agent
agent := llmagent.New(
    llmagent.WithModel(model),
    llmagent.WithTools(policyTool, apiTool, productTool),
)

2.5.4 Dynamic filtering: AgenticFilterSearchTool

The framework also provides a more advanced AgenticFilterSearchTool that lets the LLM specify filter conditions dynamically:

// Create a search Tool that supports dynamic filtering
filterTool := tool.NewAgenticFilterSearchTool(
    kb,
    tool.WithToolName("search_docs"),
    tool.WithToolDescription("Search documents, with optional filtering by type, date, and so on"),
    tool.WithAgenticFilterableFields([]tool.FilterableField{
        {Name: "doc_type", Description: "Document type: policy/api/product"},
        {Name: "create_date", Description: "Creation date, format YYYY-MM-DD"},
    }),
)

The Tool definition as the LLM sees it:

{
  "name": "search_docs",
  "description": "Search documents, with optional filtering by type, date, and so on",
  "parameters": {
    "type": "object",
    "properties": {
      "query": { "type": "string", "description": "Search query text" },
      "filters": {
        "type": "object",
        "properties": {
          "doc_type": { "type": "string", "description": "Document type: policy/api/product" },
          "create_date": { "type": "string", "description": "Creation date, format YYYY-MM-DD" }
        }
      }
    }
  }
}

Example LLM call:

{
  "name": "search_docs",
  "arguments": {
    "query": "leave request process",
    "filters": {
      "doc_type": "policy",
      "create_date": "2024-01-01"
    }
  }
}

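2.5.5 Runtime filtering

Besides filters chosen by the LLM, filter conditions can also be injected when the Agent runs:

// Inject filter conditions at call time
resp, err := agent.Run(ctx, "What is the leave request process?",
    agent.WithKnowledgeFilter(map[string]any{
        "department": "engineering",  // search only the engineering department's documents
    }),
)

Precedence and merging of filter conditions: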

┌──────────────────────────────────────────────────────────────────────────────┐
│                    How filter conditions are merged                          │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   Final filter = static filter AND dynamic filter AND runtime filter         │
│                                                                              │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                                                                     │   │
│   │   1. Tool static filter (WithFilter)                                │   │
│   │      → Fixed when the Tool is created                               │   │
│   │      → Example: only search documents with status=active            │   │
│   │                                                                     │   │
│   │   2. Tool dynamic filter (AgenticFilterSearchTool)                  │   │
│   │      → Specified by the LLM when it calls the Tool                  │   │
│   │      → Example: doc_type=policy                                     │   │
│   │                                                                     │   │
│   │   3. Runtime filter (WithKnowledgeFilter)                           │   │
│   │      → Injected at Agent.Run() time                                 │   │
│   │      → Example: department=engineering                              │   │
│   │                                                                     │   │
│   │   The three are combined with AND                                   │   │
│   │                                                                     │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘
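
For plain equality filters, AND-merging the three layers amounts to taking the union of their key/value pairs. The sketch below shows that, plus one edge case the article does not cover: two layers demanding different values for the same key. Here that case is reported as unsatisfiable, which is an assumption of this example; the framework may resolve it differently (for instance by letting one layer win). `mergeFilters` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"reflect"
)

// mergeFilters AND-combines equality filters from several layers.
// It returns ok=false when two layers require different values for the
// same key, because no document can satisfy both conditions.
func mergeFilters(layers ...map[string]any) (merged map[string]any, ok bool) {
	merged = make(map[string]any)
	for _, layer := range layers {
		for key, want := range layer {
			if have, exists := merged[key]; exists && !reflect.DeepEqual(have, want) {
				return nil, false
			}
			merged[key] = want
		}
	}
	return merged, true
}

func main() {
	static := map[string]any{"status": "active"}           // fixed at Tool creation
	dynamic := map[string]any{"doc_type": "policy"}        // chosen by the LLM
	runtime := map[string]any{"department": "engineering"} // injected at run time

	merged, ok := mergeFilters(static, dynamic, runtime)
	fmt.Println(merged, ok)

	_, ok = mergeFilters(dynamic, map[string]any{"doc_type": "api"})
	fmt.Println("conflicting layers satisfiable:", ok)
}
```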

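2.6 Best practices

2.6.1 Choosing a chunking strategy

Document type     Recommended strategy   Suggested parameters
General text      RecursiveChunking      chunk_size=500, overlap=50
Markdown          MarkdownChunking       split by heading level
Source code       RecursiveChunking      chunk_size=1000, split by function
JSON/structured   JSONChunking           split by object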
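
To show how chunk_size and overlap interact, here is a minimal fixed-size sliding-window chunker. It is deliberately simpler than the framework's RecursiveChunking, which also looks for natural split points, and it measures size in runes, which is an assumption: the table does not state the unit. `chunkRunes` is a hypothetical name.

```go
package main

import "fmt"

// chunkRunes splits text into windows of at most size runes in which each
// window starts (size - overlap) runes after the previous one, so that
// consecutive windows share overlap runes.
// It returns nil for invalid parameters.
func chunkRunes(text string, size, overlap int) []string {
	if size <= 0 || overlap < 0 || overlap >= size {
		return nil
	}
	runes := []rune(text) // rune-based so multi-byte characters are not split
	step := size - overlap
	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break // the last window reached the end of the text
		}
	}
	return chunks
}

func main() {
	// Small numbers make the overlap visible; the table above suggests
	// chunk_size=500 and overlap=50 for general text.
	fmt.Printf("%q\n", chunkRunes("abcdefghij", 4, 1))
}
```

The overlap means a sentence cut at a chunk boundary still appears intact in at least one chunk, at the cost of storing and embedding some text twice.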

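2.6.2 Choosing a vector store

Scenario                  Recommended store   Reason
Development and testing   InMemory            Starts quickly, no dependencies
Small-scale production    Qdrant              Lightweight and easy to deploy
Large-scale production    Milvus              High performance, distributed
Existing PostgreSQL       pgvector            Reuses existing infrastructure
Full-text search needed   Elasticsearch       Hybrid retrieval capability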

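2.6.3 Knowledge base design recommendations

Split knowledge bases by domain: keeping documents from different domains in separate knowledge bases improves retrieval precision.
Set MinScore sensibly: tune the threshold to the business requirement so that irrelevant results are not returned.
Enable incremental sync: recommended in production to avoid recomputing embeddings for unchanged documents.
Use a Reranker: a second ranking pass over the retrieved results improves accuracy.
Add metadata: rich document metadata makes precise filtering possible.

3. Summary

Configure Source + Embedder + VectorStore, then call Load()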

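This article is part of the Tencent Cloud self-media syndication program and was shared from a WeChat official account.

Originally published: 2026-02-14. In case of infringement, contact cloudcommunity@tencent.com for removal.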


This article was shared from the WeChat official account 有文化的技术人.
