Phase 1-2: design doc + database (ReadingEvent/MaterialReadingProgress/TemporaryReadingMaterial/LearningSession extension/DailyLearningActivity extension/LearningRecord)
Phase 3: batch upload + validation and deduplication + ReadingEventProcessorService
Phase 4: 4-table aggregation pipeline (LearningSession/MaterialReadingProgress/DailyLearningActivity/LearningRecord)
Phase 5: query endpoints (progress/continue/summary/trend/heatmap/history/reprocess)
Phase 6: permission checks + interrupted-session cleanup + API docs
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
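Learning Information Collection: Overall Design
1. Overview
The M8 milestone implements a closed loop for collecting learning-behavior information, from the iOS client (via the Rust document runtime) to the API server.
Data flow
iOS App → Rust zx_document_core (ReadingEventV2)
→ iOS adapter layer (adds readingTargetType/platform/appVersion/timezone)
→ POST /reading/events (batch upload)
→ ReadingEventProcessorService (validation/deduplication/aggregation)
→ LearningSession / MaterialReadingProgress / DailyLearningActivity / LearningRecord
→ Query endpoints (progress/continue learning/summary/trend/heatmap/history)
2. readingTargetType
The Rust side does not store readingTargetType; the iOS adapter layer adds it at upload time.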
| readingTargetType | materialId maps to | knowledgeBaseId |
|---|---|---|
| knowledge_source | KnowledgeSource.id | KnowledgeSource.knowledgeBaseId |
| temporary_file | TemporaryReadingMaterial.id | null (may be filled in later) |
Fields added by iOS at upload time
// When the iOS adapter layer builds the upload request:
const item = {
  eventId: rustEvent.eventId,
  clientSessionId: rustEvent.clientSessionId,
  materialId: rustEvent.materialId,
  eventType: rustEvent.eventType,
  position: rustEvent.position,
  activeSecondsDelta: rustEvent.activeSecondsDelta,
  clientTimestampMs: rustEvent.timestampMs,
  sequence: rustEvent.sequence,
  // Fields added by iOS:
  readingTargetType: resolveTargetType(rustEvent.materialId), // 'knowledge_source' | 'temporary_file'
  platform: 'ios',
  appVersion: getAppVersion(),
  clientTimezoneOffsetMinutes: getTimezoneOffset(),
};
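Because the upload endpoint takes batches and section 5 lists a batch limit of 100 (BATCH_LIMIT_EXCEEDED), the adapter probably needs to split a backlog of items before posting. The following is a minimal sketch of that step, not the actual adapter code; `chunkForUpload` and `BATCH_LIMIT` are hypothetical names introduced for illustration.

```typescript
// Hypothetical constant; the value 100 comes from BATCH_LIMIT_EXCEEDED in section 5.
const BATCH_LIMIT = 100;

// Split queued items into batches that each stay within the limit.
// The item type is generic so the sketch does not depend on the exact payload shape.
function chunkForUpload<T>(items: readonly T[], limit: number = BATCH_LIMIT): T[][] {
  if (!Number.isInteger(limit) || limit <= 0) {
    throw new RangeError("limit must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += limit) {
    batches.push(items.slice(i, i + limit));
  }
  return batches;
}
```

Each resulting batch would then be sent as one POST /reading/events request; since the server deduplicates on eventId, retrying a failed batch should be safe.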
3. Entity mapping
3.1 New tables
ReadingEvent (raw event log)
model ReadingEvent {
id String @id @default(cuid())
userId String
eventId String
clientSessionId String
readingTargetType String @db.VarChar(32)
materialId String
knowledgeBaseId String?
eventType String @db.VarChar(32)
position Json?
activeSecondsDelta Int @default(0)
clientTimestampMs BigInt
clientTimezoneOffsetMinutes Int?
sequence Int
platform String? @db.VarChar(16)
appVersion String? @db.VarChar(32)
status String @default("pending") @db.VarChar(32)
errorCode String? @db.VarChar(32)
warningCodes Json?
serverReceivedAt DateTime @default(now())
processedAt DateTime?
createdAt DateTime @default(now())
user User @relation(fields: [userId], references: [id])
@@unique([userId, eventId])
@@index([userId, clientSessionId])
@@index([userId, readingTargetType, materialId, clientTimestampMs])
@@index([status, createdAt])
@@index([userId, createdAt])
}
MaterialReadingProgress (per-material reading progress)
model MaterialReadingProgress {
id String @id @default(cuid())
userId String
materialId String // associated materialId
readingTargetType String @db.VarChar(32)
knowledgeBaseId String? // looked up from KnowledgeSource
lastClientSessionId String?
lastPosition Json? // camelCase ReadingPosition
lastProgress Float? // normalized progress value, 0 to 1
totalActiveSeconds Int @default(0) // cumulative active reading seconds
sessionCount Int @default(0) // number of reading sessions
status String @default("not_started") @db.VarChar(32)
firstOpenedAt DateTime?
lastOpenedAt DateTime?
lastReadAt DateTime?
isMarkedRead Boolean @default(false)
markedReadAt DateTime?
createdAt DateTime @default(now())
updatedAt DateTime @updatedAt
user User @relation(fields: [userId], references: [id])
@@unique([userId, materialId])
@@index([userId])
@@index([knowledgeBaseId])
@@index([status])
}
TemporaryReadingMaterial (temporary reading material)
model TemporaryReadingMaterial {
id String @id @default(cuid())
userId String
title String? @db.VarChar(255)
originalFilename String? @db.VarChar(255)
mimeType String? @db.VarChar(100)
sizeBytes BigInt @default(0)
storageKey String? @db.VarChar(500)
sourceStatus String @default("active") @db.VarChar(32)
expiresAt DateTime?
deletedAt DateTime?
createdAt DateTime @default(now())
updatedAt DateTime @updatedAt
user User @relation(fields: [userId], references: [id])
@@index([userId])
@@index([expiresAt])
}
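3.2 Extensions to existing tables
LearningSession (added fields)
Added on top of the existing LearningSession:
model LearningSession {
// ... existing fields ...
// Fields added in M8:
clientSessionId String? // Rust client_session_id (links the uploaded events)
materialId String? // materialId of the material being read
readingTargetType String? @db.VarChar(32)
totalActiveSeconds Int @default(0) // cumulative active seconds reported by Rust
lastPosition Json? // last reading position
lastEventAt DateTime? // time of the last event
}
The existing mode field is kept; the new readingTargetType does not conflict with it. durationSeconds stays compatible: totalActiveSeconds (from the Rust tracker) takes precedence, and the legacy logic is kept when there is no Rust data.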
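The compatibility rule for durationSeconds can be sketched as a small selector. This is only an illustration under one assumption that the document does not state: a session is treated as "having Rust data" when clientSessionId is set. `SessionDurations` and `effectiveDurationSeconds` are hypothetical names.

```typescript
// Hypothetical subset of LearningSession used for this sketch.
interface SessionDurations {
  clientSessionId: string | null; // set only for sessions fed by Rust events (assumption)
  totalActiveSeconds: number;     // M8 field, defaults to 0
  durationSeconds: number | null; // legacy field maintained by the old logic
}

// Prefer the Rust tracker's total; fall back to the legacy duration otherwise.
function effectiveDurationSeconds(session: SessionDurations): number {
  if (session.clientSessionId !== null) {
    return session.totalActiveSeconds;
  }
  return session.durationSeconds ?? 0;
}
```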
DailyLearningActivity (added fields)
model DailyLearningActivity {
// ... existing fields (durationSeconds, sessionsCount, activeRecallCount, reviewCount, aiAnalysisCount, completedLoopCount, activityLevel) ...
// Fields added in M8:
readingSeconds Int @default(0) // reading time for the day (seconds)
materialsReadCount Int @default(0) // number of materials read that day
markedReadCount Int @default(0) // number of materials marked as read that day
}
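DailyLearningActivity counters are kept per day, so each event has to be assigned to a calendar date. One plausible way is to bucket by the client's local date using clientTimestampMs and clientTimezoneOffsetMinutes. The sketch below assumes the offset is expressed in minutes east of UTC (UTC+8 is 480); the document does not state the sign convention, and JavaScript's `Date.prototype.getTimezoneOffset` uses the opposite sign. `localActivityDate` is a hypothetical helper, and it takes the timestamp as a number although the column is a BigInt.

```typescript
// Return the client's local calendar date as YYYY-MM-DD.
// Assumption: offsetMinutes is minutes east of UTC; null falls back to UTC.
function localActivityDate(clientTimestampMs: number, offsetMinutes: number | null): string {
  const shiftedMs = clientTimestampMs + (offsetMinutes ?? 0) * 60_000;
  // toISOString always formats in UTC, so shifting first yields the local date.
  return new Date(shiftedMs).toISOString().slice(0, 10);
}
```

With this convention, an event at 23:30 UTC on 1 January from a UTC+8 client lands in the 2 January bucket.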
3.3 Reused existing tables
LearningRecord (no schema change needed)
New recordType values:
- reading: reading record (new)
- read_completed: reading completed (new)
Additional fields in the metadata JSON:
{
"materialId": "...",
"readingTargetType": "knowledge_source",
"knowledgeBaseId": "...",
"totalActiveSeconds": 120,
"lastPosition": {...}
}
4. Core aggregation pipeline
POST /reading/events (batch upload)
│
▼
ReadingEventProcessorService.processBatch(events)
│
├─ 1. Idempotent deduplication (unique on userId + eventId)
├─ 2. Validation (activeSecondsDelta >= 0; values above 300 are capped at 300)
├─ 3. Write to the ReadingEvent table (status=pending→processed)
│
├─ 4. Aggregate → LearningSession
│ - Look up an existing session by clientSessionId
│ - Found: update lastPosition / totalActiveSeconds / lastEventAt
│ - Not found (MaterialOpened): create a new LearningSession
│ - MaterialClosed: end the session (status=ended)
│
├─ 5. Aggregate → MaterialReadingProgress
│ - UPSERT (userId, materialId)
│ - Accumulate totalActiveSeconds / sessionCount
│ - Update lastPosition / lastProgress
│ - Update timestamps: firstOpenedAt / lastReadAt / markedReadAt
│
├─ 6. Aggregate → DailyLearningActivity
│ - UPSERT (userId, activityDate)
│ - Accumulate readingSeconds / materialsReadCount
│
└─ 7. Write a LearningRecord (on MarkedAsRead / MaterialClosed / first open)
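Step 5 of the pipeline above can be sketched with an in-memory map standing in for the MaterialReadingProgress table. This is an illustration rather than the service's implementation: it assumes the event type strings are exactly "MaterialOpened" and so on, that sessionCount grows by one per MaterialOpened, and that lastReadAt follows the newest client timestamp seen. `ProgressRow`, `AcceptedEvent` and `applyToProgress` are hypothetical names.

```typescript
// Hypothetical in-memory row mirroring a few MaterialReadingProgress columns.
interface ProgressRow {
  userId: string;
  materialId: string;
  totalActiveSeconds: number;
  sessionCount: number;
  lastPosition: unknown;
  firstOpenedAtMs: number | null;
  lastReadAtMs: number | null;
}

// Hypothetical shape of an event that already passed dedup and validation.
interface AcceptedEvent {
  userId: string;
  materialId: string;
  eventType: string;
  position?: unknown;
  activeSecondsDelta: number;
  clientTimestampMs: number;
}

// Upsert keyed on (userId, materialId), then fold the event into the row.
function applyToProgress(store: Map<string, ProgressRow>, ev: AcceptedEvent): ProgressRow {
  const key = `${ev.userId}:${ev.materialId}`;
  const row: ProgressRow = store.get(key) ?? {
    userId: ev.userId,
    materialId: ev.materialId,
    totalActiveSeconds: 0,
    sessionCount: 0,
    lastPosition: null,
    firstOpenedAtMs: null,
    lastReadAtMs: null,
  };
  row.totalActiveSeconds += ev.activeSecondsDelta;
  if (ev.eventType === "MaterialOpened") {
    row.sessionCount += 1; // assumption: one session per MaterialOpened
    if (row.firstOpenedAtMs === null) row.firstOpenedAtMs = ev.clientTimestampMs;
  }
  // An event without a position (e.g. MaterialClosed) keeps the stored one.
  if (ev.position !== undefined && ev.position !== null) row.lastPosition = ev.position;
  // Out-of-order events are accepted but never move lastReadAt backwards.
  if (row.lastReadAtMs === null || ev.clientTimestampMs > row.lastReadAtMs) {
    row.lastReadAtMs = ev.clientTimestampMs;
  }
  store.set(key, row);
  return row;
}
```

In the real service this would presumably be a database upsert on the (userId, materialId) unique key instead of a map lookup.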
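Aggregation timing
Synchronous aggregation (completed while the request is being handled):
- ReadingEvent is written as soon as validation passes
- LearningSession / MaterialReadingProgress / DailyLearningActivity are aggregated immediately
- No worker/queue is used for now
Special-case handling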
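| Scenario | Handling |
|---|---|
| Duplicate eventId | status=duplicate; aggregation is skipped |
| activeSecondsDelta < 0 | status=failed, errorCode=INVALID_ACTIVE_SECONDS |
| activeSecondsDelta > 300 | Capped at 300 (a single tick may not exceed 5 minutes) |
| activeSecondsDelta = 0 | Valid (MaterialOpened/PositionChanged/MarkedAsRead) |
| MaterialClosed without position | The existing position is not overwritten |
| Out-of-order event (timestamp goes backwards) | Not rejected; processed normally (tolerates client clock drift) |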
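The deduplication and activeSecondsDelta rows of the table can be sketched as a single classification function. This is a simplified illustration, not the service code: a Set stands in for the (userId, eventId) unique index, the error code used is INVALID_ACTIVE_SECONDS from section 5, and `classifyEvent` and `Outcome` are hypothetical names.

```typescript
type Outcome =
  | { status: "duplicate" }
  | { status: "failed"; errorCode: string }
  | { status: "processed"; activeSecondsDelta: number; warningCodes: string[] };

const MAX_DELTA_SECONDS = 300; // a single tick may not exceed 5 minutes

interface IncomingEvent {
  userId: string;
  eventId: string;
  activeSecondsDelta: number;
}

// `seen` stands in for the (userId, eventId) unique constraint on ReadingEvent.
function classifyEvent(seen: Set<string>, ev: IncomingEvent): Outcome {
  const key = `${ev.userId}:${ev.eventId}`;
  if (seen.has(key)) return { status: "duplicate" }; // aggregation is skipped
  seen.add(key);
  if (ev.activeSecondsDelta < 0) {
    return { status: "failed", errorCode: "INVALID_ACTIVE_SECONDS" };
  }
  if (ev.activeSecondsDelta > MAX_DELTA_SECONDS) {
    // Accepted, but capped and flagged with a warning.
    return {
      status: "processed",
      activeSecondsDelta: MAX_DELTA_SECONDS,
      warningCodes: ["ACTIVE_SECONDS_CAPPED"],
    };
  }
  return { status: "processed", activeSecondsDelta: ev.activeSecondsDelta, warningCodes: [] };
}
```

Only events classified as processed would go on to the aggregation steps, and they would carry the capped delta rather than the reported one.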
5. Error codes and warning codes
Error codes (the event is rejected, status=failed)
| Code | Meaning |
|---|---|
| MATERIAL_NOT_FOUND | The knowledge_source does not exist |
| TEMPORARY_MATERIAL_NOT_FOUND | The temporary_file does not exist |
| MATERIAL_ACCESS_DENIED | The material does not belong to the current user |
| TEMPORARY_MATERIAL_EXPIRED | The temporary file has expired |
| INVALID_TARGET_TYPE | Unknown readingTargetType |
| INVALID_EVENT_TYPE | Unknown eventType |
| INVALID_TIMESTAMP | Malformed timestamp |
| INVALID_POSITION | Malformed position JSON |
| INVALID_ACTIVE_SECONDS | activeSecondsDelta < 0 |
| BATCH_LIMIT_EXCEEDED | Batch size limit (100) exceeded |
| MISSING_CLIENT_SESSION | clientSessionId is missing |
| MISSING_MATERIAL_ID | materialId is missing |
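Warning codes (the event is accepted but flagged)
| Code | Meaning |
|---|---|
| ACTIVE_SECONDS_CAPPED | delta > 300; capped |
| CLIENT_TIMESTAMP_SKEWED | Clock skew > 5 min |
| POSITION_IGNORED | A position is present but not valid for the eventType |
| DUPLICATE_EVENT | Idempotent replay |
| OUT_OF_ORDER_EVENT | Out-of-order event |
| SOURCE_DELETED | The source material has been deleted |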
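The two timing-related warnings could be derived as in the sketch below. It rests on assumptions the document does not spell out: skew is measured between clientTimestampMs and the server's receive time, and "out of order" means a timestamp earlier than the latest one already seen for the same session. `timingWarnings` is a hypothetical helper.

```typescript
const MAX_SKEW_MS = 5 * 60 * 1000; // 5 minutes, from the CLIENT_TIMESTAMP_SKEWED row

// Return the timing warning codes that apply to one event.
// lastSeenClientTimestampMs is null when the session has no earlier events.
function timingWarnings(
  clientTimestampMs: number,
  serverReceivedAtMs: number,
  lastSeenClientTimestampMs: number | null,
): string[] {
  const warnings: string[] = [];
  if (Math.abs(serverReceivedAtMs - clientTimestampMs) > MAX_SKEW_MS) {
    warnings.push("CLIENT_TIMESTAMP_SKEWED");
  }
  if (lastSeenClientTimestampMs !== null && clientTimestampMs < lastSeenClientTimestampMs) {
    warnings.push("OUT_OF_ORDER_EVENT");
  }
  return warnings;
}
```

Comparing against the receive time would also flag events that were queued offline and uploaded later, so a real implementation may need a different reference point.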
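6. Permission checks
Upload endpoint
- readingTargetType=knowledge_source: verify that the KnowledgeSource exists and belongs to the current user
- readingTargetType=temporary_file: verify that the TemporaryReadingMaterial exists and belongs to the current user
- Unknown materialId: record a warning and still accept the event (to avoid losing data)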
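The ownership checks for the upload endpoint might look like the following sketch, which uses in-memory maps in place of database lookups. It assumes both material kinds expose an owning userId directly (KnowledgeSource ownership may in fact go through its knowledge base), and it only reports which code from section 5 applies; whether a missing material rejects the event or is downgraded to a warning is left to the caller. `OwnedMaterial`, `MaterialLookup` and `checkMaterialAccess` are hypothetical names.

```typescript
// Hypothetical minimal view of a material row.
interface OwnedMaterial {
  userId: string;
  expiresAt?: Date | null; // only meaningful for temporary materials
}

// Hypothetical stand-in for the KnowledgeSource / TemporaryReadingMaterial tables.
interface MaterialLookup {
  knowledgeSources: Map<string, OwnedMaterial>;
  temporaryMaterials: Map<string, OwnedMaterial>;
}

// Returns null when access is allowed, otherwise a code from section 5.
function checkMaterialAccess(
  lookup: MaterialLookup,
  userId: string,
  readingTargetType: string,
  materialId: string,
  now: Date,
): string | null {
  if (readingTargetType === "knowledge_source") {
    const source = lookup.knowledgeSources.get(materialId);
    if (!source) return "MATERIAL_NOT_FOUND";
    return source.userId === userId ? null : "MATERIAL_ACCESS_DENIED";
  }
  if (readingTargetType === "temporary_file") {
    const material = lookup.temporaryMaterials.get(materialId);
    if (!material) return "TEMPORARY_MATERIAL_NOT_FOUND";
    if (material.userId !== userId) return "MATERIAL_ACCESS_DENIED";
    if (material.expiresAt && material.expiresAt.getTime() <= now.getTime()) {
      return "TEMPORARY_MATERIAL_EXPIRED";
    }
    return null;
  }
  return "INVALID_TARGET_TYPE";
}
```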
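Query endpoints
- GET /reading/progress/:materialId: verifies the user's permission
- GET /reading/continue-learning: returns only the current user's materials
- All query endpoints obtain userId through the JWT guard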
7. Endpoint list
| Method | Path | Description |
|---|---|---|
| POST | /reading/events | Batch-upload reading events |
| GET | /reading/progress/:materialId | Get the reading progress of a single material |
| GET | /reading/continue-learning | "Continue learning" list for the home page |
| GET | /reading/summary | Learning summary |
| GET | /reading/trend | Trend (raw data only) |
| GET | /reading/heatmap | Heatmap data |
| GET | /reading/history | Learning history |
| POST | /reading/events/replay | Event replay / repair |
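8. Acceptance checklist
- docs/learning-info-design.md exists
- readingTargetType definition: knowledge_source / temporary_file
- materialId mapping: → KnowledgeSource.id / TemporaryReadingMaterial.id
- Permission check method: JWT guard + userId + resource ownership check
- Field mapping from Rust ReadingEventV2 to API ReadingEvent
- Core aggregation pipeline: ReadingEvent → LearningSession → MaterialReadingProgress → DailyLearningActivity → LearningRecord
- Error code definitions: 12 error codes (plus 6 warning codes)
- Synchronous aggregation strategy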