M1-02 Vector & Retrieval Module #16


wangdl · 2026-05-22T21:03:29+08:00

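Goal

Design the vector storage and retrieval module for the 知习 backend. The module gives the knowledge base and the RAG system full access to the Qdrant vector database: collection management, vector upsert and delete, semantic retrieval, rerank, and citation context assembly.

This issue covers module architecture design only; it does not implement code.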

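Background

知习's knowledge-base Q&A (RAG) and candidate knowledge point generation both depend on vector retrieval. Material uploaded by users is parsed, sliced, and embedded by the Ingestion module, and the result must then be written to the Qdrant vector database. When RAG Chat handles a query, it retrieves relevant chunks from Qdrant, reranks them, and assembles them into LLM context.

The Vector & Retrieval module is the only entry point to the vector database in the whole system. MySQL is the authoritative business store and Qdrant is an index. This principle must be followed.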

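Module Responsibilities

This module is responsible for:
- Qdrant collection creation and management (decide on the collection strategy: one collection per knowledge base, or one per type)
- payload index configuration (filtering by fields such as knowledge base ID, document ID, and user ID)
- vector upsert (batch writes of embedding vectors)
- vector delete (single or batch, cleaned up by source ID)
- semantic retrieval (embedding similarity search)
- rerank (reorder retrieval results to improve precision)
- citation context assembly (reconstruct citation context from retrieval results, including source information)
- retrieval result caching strategy design (handle with care)

This module is not responsible for:
- Embedding generation (handled by AI Gateway or a dedicated embedding service)
- Document parsing and slicing (handled by the Ingestion Module)
- RAG conversation logic (handled by the RAG Chat Module)
- Qdrant operations (handled by Server Monitor and the Backup Module)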
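
The payload index item above implies that every search is scoped by fields such as user ID, knowledge base ID, and document ID. A minimal sketch of turning such a scope into a Qdrant filter is shown here. The `must` / `key` / `match.value` / `match.any` shape follows Qdrant's filter format; the payload field names (`user_id`, `kb_id`, `doc_id`), the `SearchScope` type, and the `buildScopeFilter` function are assumptions for illustration and are not decided by this issue.

```typescript
// Hypothetical scope passed in by callers; field names are assumptions.
interface SearchScope {
  userId: string;
  knowledgeBaseId: string;
  documentIds?: string[];
}

// Subset of Qdrant's filter conditions used in this sketch.
type Condition =
  | { key: string; match: { value: string } }
  | { key: string; match: { any: string[] } };

interface QdrantFilter {
  must: Condition[];
}

// Build a filter that always restricts by user and knowledge base,
// and optionally narrows to a set of documents.
function buildScopeFilter(scope: SearchScope): QdrantFilter {
  const must: Condition[] = [
    { key: "user_id", match: { value: scope.userId } },
    { key: "kb_id", match: { value: scope.knowledgeBaseId } },
  ];
  if (scope.documentIds && scope.documentIds.length > 0) {
    must.push({ key: "doc_id", match: { any: scope.documentIds } });
  }
  return { must };
}
```

If a single shared collection is chosen, each field used this way would likely need its own payload index so that filtered searches stay efficient.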

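Candidate Data Objects

VectorPoint (a vector point in Qdrant, corresponding to a KnowledgeChunk or MaterialSlice in MySQL)
VectorIndexJob (a vector indexing job)
RetrievalResult (a retrieval result, with relevance score and source)
RerankResult (a result after reranking)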
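
To make the relationship between RetrievalResult and the MySQL source rows concrete, here is a minimal sketch of citation context assembly. All field names, the `ChunkRecord` and `Citation` types, and the `assembleCitations` function are hypothetical. The one design point it illustrates comes from the issue itself: MySQL is authoritative, so a hit whose source row no longer exists is dropped rather than cited.

```typescript
// Illustrative shapes only; none of these fields are fixed by the issue.
interface RetrievalResult {
  pointId: string;
  chunkId: string; // ID of the KnowledgeChunk / MaterialSlice row in MySQL
  score: number;
}

interface ChunkRecord {
  chunkId: string;
  documentId: string;
  documentTitle: string;
  text: string;
}

interface Citation {
  index: number; // 1-based marker the LLM prompt can refer to
  chunkId: string;
  documentId: string;
  documentTitle: string;
  text: string;
  score: number;
}

// Join retrieval hits with source rows already loaded from MySQL.
// Hits without a matching row are treated as stale vectors and skipped.
function assembleCitations(
  results: RetrievalResult[],
  chunks: Map<string, ChunkRecord>,
): Citation[] {
  const citations: Citation[] = [];
  for (const r of results) {
    const chunk = chunks.get(r.chunkId);
    if (!chunk) continue;
    citations.push({
      index: citations.length + 1,
      chunkId: chunk.chunkId,
      documentId: chunk.documentId,
      documentTitle: chunk.documentTitle,
      text: chunk.text,
      score: r.score,
    });
  }
  return citations;
}
```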

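Infrastructure Dependency Assessment

MySQL: No (vector data is stored in Qdrant, but source information in MySQL is needed to reconstruct citations)
Redis: To be decided (retrieval result caching)
BullMQ: Yes (asynchronous vector write jobs)
Qdrant: Yes (core storage dependency)
AI Gateway: Yes (embedding generation goes through AI Gateway)
COS: No
Config: Yes (Qdrant connection settings, retrieval parameters)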

API Design

Internal Provider (called by other modules):
- VectorService.upsert(chunks, embeddings): batch-write vectors
- VectorService.delete(sourceIds): delete vectors by source
- VectorService.search(queryEmbedding, filters, options): semantic retrieval
- VectorService.rerank(results, query): rerank results

AAPI:
- View collection status
- Vector count statistics
- Trigger an index rebuild
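
The four VectorService methods listed above could be typed roughly as follows. This is a sketch under assumptions: the parameter and result shapes (`Chunk`, `Payload`, `SearchOptions`, `SearchHit`) are invented here, and `InMemoryVectorService` is a hypothetical test double that ranks by cosine similarity in process memory. It does not talk to Qdrant, and its `rerank` is a placeholder that returns the input unchanged.

```typescript
type Payload = Record<string, string>;

interface Chunk { id: string; sourceId: string; payload: Payload }
interface SearchOptions { topK: number }
interface SearchHit { chunkId: string; sourceId: string; score: number }

interface VectorService {
  upsert(chunks: Chunk[], embeddings: number[][]): Promise<void>;
  delete(sourceIds: string[]): Promise<number>; // number of points removed
  search(queryEmbedding: number[], filters: Payload, options: SearchOptions): Promise<SearchHit[]>;
  rerank(results: SearchHit[], query: string): Promise<SearchHit[]>;
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    na += x * x;
    nb += y * y;
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Hypothetical in-memory stand-in, useful for unit tests of callers.
class InMemoryVectorService implements VectorService {
  private points = new Map<string, { chunk: Chunk; vector: number[] }>();

  async upsert(chunks: Chunk[], embeddings: number[][]): Promise<void> {
    if (chunks.length !== embeddings.length) {
      throw new Error("chunks and embeddings must have the same length");
    }
    chunks.forEach((chunk, i) => {
      this.points.set(chunk.id, { chunk, vector: embeddings[i] ?? [] });
    });
  }

  async delete(sourceIds: string[]): Promise<number> {
    const doomed = new Set(sourceIds);
    let removed = 0;
    this.points.forEach((p, id) => {
      if (doomed.has(p.chunk.sourceId)) {
        this.points.delete(id);
        removed++;
      }
    });
    return removed;
  }

  async search(queryEmbedding: number[], filters: Payload, options: SearchOptions): Promise<SearchHit[]> {
    const hits: SearchHit[] = [];
    this.points.forEach(({ chunk, vector }) => {
      // Keep only points whose payload equals every filter value.
      const matches = Object.keys(filters).every((k) => chunk.payload[k] === filters[k]);
      if (matches) {
        hits.push({ chunkId: chunk.id, sourceId: chunk.sourceId, score: cosine(queryEmbedding, vector) });
      }
    });
    return hits.sort((a, b) => b.score - a.score).slice(0, options.topK);
  }

  async rerank(results: SearchHit[], _query: string): Promise<SearchHit[]> {
    return results; // placeholder: a real implementation would call a rerank model
  }
}
```

Keeping the interface free of Qdrant SDK types is one way to support the rule that business modules never call the SDK directly.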

Domain Event Design

VectorsUpserted: vector write completed
VectorsDeleted: vector deletion completed
IndexRebuildStarted/Completed: index rebuild started or completed
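
One possible way to model these events is a discriminated union, so that consumers get exhaustive type checking when a new event is added. The event names come from the list above; every payload field (`sourceId`, `sourceIds`, `pointCount`, `collection`, `occurredAt`) and the `describeEvent` helper are assumptions made for illustration.

```typescript
// Event names are from the issue; payload fields are hypothetical.
type VectorDomainEvent =
  | { type: "VectorsUpserted"; sourceId: string; pointCount: number; occurredAt: string }
  | { type: "VectorsDeleted"; sourceIds: string[]; pointCount: number; occurredAt: string }
  | { type: "IndexRebuildStarted"; collection: string; occurredAt: string }
  | { type: "IndexRebuildCompleted"; collection: string; pointCount: number; occurredAt: string };

// Example consumer: produce a one-line summary, e.g. for an admin log view.
// The switch is exhaustive, so adding a new event type is a compile error here.
function describeEvent(e: VectorDomainEvent): string {
  switch (e.type) {
    case "VectorsUpserted":
      return `upserted ${e.pointCount} points for source ${e.sourceId}`;
    case "VectorsDeleted":
      return `deleted ${e.pointCount} points for ${e.sourceIds.length} sources`;
    case "IndexRebuildStarted":
      return `index rebuild started for ${e.collection}`;
    case "IndexRebuildCompleted":
      return `index rebuild completed for ${e.collection} (${e.pointCount} points)`;
  }
}
```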

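Admin View Design

Qdrant status page:
- Collection list (name, vector count, dimension, status)
- Index status (whether a rebuild is in progress)

Retrieval debugging page (shared with Knowledge Ops):
- Enter query text and inspect the retrieval results
- Inspect the raw citation context data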

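Delivery Checklist

Route ownership: Internal Provider + AAPI
Prisma migration needed: No (Qdrant is not MySQL)
MySQL needed: No (but source/chunk information must be read from it)
Redis needed: To be decided (retrieval caching)
BullMQ needed: Yes (asynchronous writes)
Qdrant needed: Yes
AI Gateway needed: Yes (to obtain embeddings)
Content Safety needed: No (retrieval itself does not involve content safety, but upstream callers should run checks)
Cost recording needed: Yes (embedding call cost)
AuditLog needed: No
Domain Event needed: Yes
Admin view needed: Yes
E2E/integration tests needed: Yes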

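Acceptance Criteria

Qdrant collection strategy design (per knowledge base or global, with the reasoning behind the choice)
VectorService interface design (upsert/delete/search/rerank)
payload index design (covering the business filter fields)
citation context assembly approach
End-to-end path working: embedding → retrieval → rerank → citation
Admin status view design
Integration tests covering write, retrieval, and delete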

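Prohibited

Business modules must not call the Qdrant SDK directly (all access goes through Vector Service)
Qdrant must not be treated as the authoritative business database (MySQL is authoritative, Qdrant is an index)
Vector retrieval must not run without a topK upper bound
RAG query caching must not go live without careful design (it may return stale or wrong results)
Retrieval results must not be returned without source citations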
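
The topK rule above can be enforced in one place inside Vector Service, so that no caller can request an unbounded result set. A minimal sketch follows. The `resolveTopK` function and the values `DEFAULT_TOP_K = 10` and `MAX_TOP_K = 50` are assumptions; in practice the limits would presumably come from Config alongside the other retrieval parameters.

```typescript
// Both limits are placeholder values, not decided by the issue.
const DEFAULT_TOP_K = 10;
const MAX_TOP_K = 50;

// Normalize a caller-supplied topK:
// - missing or non-finite values fall back to the default
// - fractional values are rounded down
// - the result is always within [1, MAX_TOP_K]
function resolveTopK(requested?: number): number {
  if (requested === undefined || !isFinite(requested)) return DEFAULT_TOP_K;
  const n = Math.floor(requested);
  if (n < 1) return 1;
  return Math.min(n, MAX_TOP_K);
}
```

Clamping silently, as here, keeps callers working; rejecting out-of-range values with an error would be the stricter alternative.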

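Not Recommended at This Stage

Hybrid search (combined sparse and dense retrieval)
Multiple vector spaces and cross-modal retrieval
Vector compression and quantization
A/B testing framework for retrieval results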


wangdl added this to the M1: AI / RAG Runtime and Retrieval Foundation (P0~P1) milestone 2026-05-22 21:03:29 +08:00


Reference: wangdl/api-server#16