# Learning Information Collection: Overall Design

## 1. Overview

The M8 milestone implements the end-to-end collection loop for learning-behavior data, from the iOS client (via the Rust document runtime) to the API server.

### Data flow

```
iOS App → Rust zx_document_core (ReadingEventV2)
  → iOS adapter layer (adds readingTargetType/platform/appVersion/timezone)
  → POST /reading/events (batch upload)
  → ReadingEventProcessorService (validation / deduplication / aggregation)
  → LearningSession / MaterialReadingProgress / DailyLearningActivity / LearningRecord
  → Query endpoints (progress / continue learning / summary / trend / heatmap / history)
```

## 2. readingTargetType

The Rust side does not store `readingTargetType`; the iOS adapter layer adds it at upload time.

| readingTargetType | materialId maps to | knowledgeBaseId |
|---|---|---|
| `knowledge_source` | `KnowledgeSource.id` | `KnowledgeSource.knowledgeBaseId` |
| `temporary_file` | `TemporaryReadingMaterial.id` | `null` (may be filled in later) |
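
The mapping in the table can be sketched as a small lookup. This is a minimal sketch, not the service's actual code: `KnowledgeSourceRow`, `resolveKnowledgeBaseId`, and the in-memory `Map` standing in for the `KnowledgeSource` table are assumptions made for illustration.

```typescript
type ReadingTargetType = "knowledge_source" | "temporary_file";

// Hypothetical row shape: only the two columns the mapping table mentions.
interface KnowledgeSourceRow {
  id: string;
  knowledgeBaseId: string;
}

// Resolves knowledgeBaseId for an event, following the mapping table:
// knowledge_source -> KnowledgeSource.knowledgeBaseId, temporary_file -> null.
function resolveKnowledgeBaseId(
  targetType: ReadingTargetType,
  materialId: string,
  knowledgeSources: Map<string, KnowledgeSourceRow>, // stand-in for the real table
): string | null {
  if (targetType === "temporary_file") {
    return null;
  }
  const source = knowledgeSources.get(materialId);
  return source ? source.knowledgeBaseId : null;
}
```

Returning `null` for a missing `knowledge_source` row is a choice made only for this sketch; how the real service reports a missing source is covered by the error codes in section 5.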

### Fields added by iOS at upload time

```typescript
// When building the upload request, the iOS adapter layer does the following:
const item = {
  eventId: rustEvent.eventId,
  clientSessionId: rustEvent.clientSessionId,
  materialId: rustEvent.materialId,
  eventType: rustEvent.eventType,
  position: rustEvent.position,
  activeSecondsDelta: rustEvent.activeSecondsDelta,
  clientTimestampMs: rustEvent.timestampMs,
  sequence: rustEvent.sequence,
  // Fields added by iOS:
  readingTargetType: resolveTargetType(rustEvent.materialId), // 'knowledge_source' | 'temporary_file'
  platform: 'ios',
  appVersion: getAppVersion(),
  clientTimezoneOffsetMinutes: getTimezoneOffset(),
};
```

## 3. Entity mapping

### 3.1 New tables

#### ReadingEvent (raw event log)

```prisma
model ReadingEvent {
  id                          String    @id @default(cuid())
  userId                      String
  eventId                     String
  clientSessionId             String
  readingTargetType           String    @db.VarChar(32)
  materialId                  String
  knowledgeBaseId             String?
  eventType                   String    @db.VarChar(32)
  position                    Json?
  activeSecondsDelta          Int       @default(0)
  clientTimestampMs           BigInt
  clientTimezoneOffsetMinutes Int?
  sequence                    Int
  platform                    String?   @db.VarChar(16)
  appVersion                  String?   @db.VarChar(32)
  status                      String    @default("pending") @db.VarChar(32)
  errorCode                   String?   @db.VarChar(32)
  warningCodes                Json?
  serverReceivedAt            DateTime  @default(now())
  processedAt                 DateTime?
  createdAt                   DateTime  @default(now())

  user User @relation(fields: [userId], references: [id])

  @@unique([userId, eventId])
  @@index([userId, clientSessionId])
  @@index([userId, readingTargetType, materialId, clientTimestampMs])
  @@index([status, createdAt])
  @@index([userId, createdAt])
}
```
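
The `@@unique([userId, eventId])` constraint is what makes replayed uploads idempotent. A minimal in-memory sketch of that idea follows; `splitDuplicates`, `dedupKey`, and the `Set` standing in for the unique index are hypothetical names, and a real implementation would rely on the database constraint instead.

```typescript
interface IncomingEvent {
  eventId: string;
}

// Builds a collision-free key for the (userId, eventId) pair.
function dedupKey(userId: string, eventId: string): string {
  return JSON.stringify([userId, eventId]);
}

// Splits a batch into first-time events and replays.
// `seenKeys` is an in-memory stand-in for the unique index on ReadingEvent.
function splitDuplicates<T extends IncomingEvent>(
  userId: string,
  batch: T[],
  seenKeys: Set<string>,
): { fresh: T[]; duplicates: T[] } {
  const fresh: T[] = [];
  const duplicates: T[] = [];
  for (const ev of batch) {
    const key = dedupKey(userId, ev.eventId);
    if (seenKeys.has(key)) {
      duplicates.push(ev);
    } else {
      seenKeys.add(key);
      fresh.push(ev);
    }
  }
  return { fresh, duplicates };
}
```

Because the key includes `userId`, the same `eventId` from two different users is not treated as a duplicate, which matches the composite unique constraint.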

#### MaterialReadingProgress (per-material reading progress)

```prisma
model MaterialReadingProgress {
  id                  String    @id @default(cuid())
  userId              String
  materialId          String    // associated materialId
  readingTargetType   String    @db.VarChar(32)
  knowledgeBaseId     String?   // looked up from KnowledgeSource
  lastClientSessionId String?
  lastPosition        Json?     // camelCase ReadingPosition
  lastProgress        Float?    // normalized progress, 0 to 1
  totalActiveSeconds  Int       @default(0) // cumulative active reading seconds
  sessionCount        Int       @default(0) // number of reading sessions
  status              String    @default("not_started") @db.VarChar(32)
  firstOpenedAt       DateTime?
  lastOpenedAt        DateTime?
  lastReadAt          DateTime?
  isMarkedRead        Boolean   @default(false)
  markedReadAt        DateTime?
  createdAt           DateTime  @default(now())
  updatedAt           DateTime  @updatedAt

  user User @relation(fields: [userId], references: [id])

  @@unique([userId, materialId])
  @@index([userId])
  @@index([knowledgeBaseId])
  @@index([status])
}
```
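
To make the accumulation on this table concrete, here is a minimal sketch of what an upsert on `(userId, materialId)` might leave behind after one event. It works on plain objects instead of Prisma, `applyToProgress` is a hypothetical helper, and both the event-type strings and the rule that each `MaterialOpened` counts as one session are assumptions.

```typescript
interface ProgressRow {
  userId: string;
  materialId: string;
  totalActiveSeconds: number;
  sessionCount: number;
  lastPosition: unknown; // null until a position has been reported
  isMarkedRead: boolean;
}

interface ProgressEvent {
  eventType: string; // assumed wire values: "MaterialOpened", "MarkedAsRead", ...
  activeSecondsDelta: number;
  position?: unknown;
}

// Returns the row that an upsert on (userId, materialId) would leave behind.
function applyToProgress(
  existing: ProgressRow | undefined,
  userId: string,
  materialId: string,
  ev: ProgressEvent,
): ProgressRow {
  const base: ProgressRow = existing ?? {
    userId,
    materialId,
    totalActiveSeconds: 0,
    sessionCount: 0,
    lastPosition: null,
    isMarkedRead: false,
  };
  return {
    ...base,
    totalActiveSeconds: base.totalActiveSeconds + ev.activeSecondsDelta,
    // Assumption: each MaterialOpened event counts as one reading session.
    sessionCount: base.sessionCount + (ev.eventType === "MaterialOpened" ? 1 : 0),
    // Events without a position leave the stored position untouched.
    lastPosition: ev.position !== undefined ? ev.position : base.lastPosition,
    isMarkedRead: base.isMarkedRead || ev.eventType === "MarkedAsRead",
  };
}
```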

#### TemporaryReadingMaterial (temporary reading material)

```prisma
model TemporaryReadingMaterial {
  id               String    @id @default(cuid())
  userId           String
  title            String?   @db.VarChar(255)
  originalFilename String?   @db.VarChar(255)
  mimeType         String?   @db.VarChar(100)
  sizeBytes        BigInt    @default(0)
  storageKey       String?   @db.VarChar(500)
  sourceStatus     String    @default("active") @db.VarChar(32)
  expiresAt        DateTime?
  deletedAt        DateTime?
  createdAt        DateTime  @default(now())
  updatedAt        DateTime  @updatedAt

  user User @relation(fields: [userId], references: [id])

  @@index([userId])
  @@index([expiresAt])
}
```
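
The `userId` and `expiresAt` columns are what a `temporary_file` event would be checked against. The following is a sketch under assumptions: `checkTemporaryMaterial` is a hypothetical helper, the error codes are the ones listed in section 5, and treating `expiresAt <= now` as expired is a guess at the boundary.

```typescript
type TemporaryMaterialError =
  | "TEMPORARY_MATERIAL_NOT_FOUND"
  | "MATERIAL_ACCESS_DENIED"
  | "TEMPORARY_MATERIAL_EXPIRED";

// Subset of TemporaryReadingMaterial columns used by the check.
interface TemporaryMaterialRow {
  userId: string;
  expiresAt: Date | null;
}

// Returns an error code, or null when events may be reported against the material.
function checkTemporaryMaterial(
  row: TemporaryMaterialRow | undefined,
  currentUserId: string,
  now: Date,
): TemporaryMaterialError | null {
  if (!row) {
    return "TEMPORARY_MATERIAL_NOT_FOUND";
  }
  if (row.userId !== currentUserId) {
    return "MATERIAL_ACCESS_DENIED";
  }
  // Assumption: a material whose expiresAt is exactly `now` already counts as expired.
  if (row.expiresAt !== null && row.expiresAt.getTime() <= now.getTime()) {
    return "TEMPORARY_MATERIAL_EXPIRED";
  }
  return null;
}
```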

### 3.2 Extensions to existing tables

#### LearningSession (new fields)

Added on top of the existing `LearningSession`:

```prisma
model LearningSession {
  // ... existing fields ...

  // New in M8:
  clientSessionId    String?   // Rust client_session_id (links the session to reported events)
  materialId         String?   // materialId of the material being read
  readingTargetType  String?   @db.VarChar(32)
  totalActiveSeconds Int       @default(0) // cumulative active seconds reported by Rust
  lastPosition       Json?     // last reading position
  lastEventAt        DateTime? // time of the last event
}
```

> The existing `mode` field is kept; the new `readingTargetType` does not conflict with it. `durationSeconds` stays compatible: `totalActiveSeconds` (from the Rust tracker) takes precedence, and the legacy logic is kept when no Rust data is available.
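
One way to read the `durationSeconds` compatibility rule is sketched below. How "no Rust data" is detected is not specified in this document; this sketch assumes a non-null `clientSessionId` marks a session fed by Rust events, and `effectiveDurationSeconds` is a hypothetical helper name.

```typescript
interface SessionDurationFields {
  clientSessionId: string | null; // assumed to be set only for sessions fed by Rust events
  totalActiveSeconds: number;
  durationSeconds: number; // value produced by the legacy logic
}

// Picks the duration to report for a session.
function effectiveDurationSeconds(session: SessionDurationFields): number {
  // Assumption: a non-null clientSessionId means Rust tracker data exists.
  const hasRustData = session.clientSessionId !== null;
  return hasRustData ? session.totalActiveSeconds : session.durationSeconds;
}
```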

#### DailyLearningActivity (new fields)

```prisma
model DailyLearningActivity {
  // ... existing fields (durationSeconds, sessionsCount, activeRecallCount, reviewCount, aiAnalysisCount, completedLoopCount, activityLevel) ...

  // New in M8:
  readingSeconds     Int @default(0) // reading time for the day, in seconds
  materialsReadCount Int @default(0) // number of materials read that day
  markedReadCount    Int @default(0) // number of materials marked as read that day
}
```
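
Daily counters need a rule for which calendar day an event belongs to. The sketch below derives a local date from `clientTimestampMs` and `clientTimezoneOffsetMinutes`. It rests on two assumptions that this document does not state: the offset is minutes east of UTC (UTC+8 is `480`), and a missing offset falls back to the UTC day. `activityDateFor` is a hypothetical name, and the timestamp is taken as a `number` here although the schema stores it as `BigInt`.

```typescript
// Returns the local calendar day (YYYY-MM-DD) that an event falls on.
function activityDateFor(
  clientTimestampMs: number,
  clientTimezoneOffsetMinutes: number | null,
): string {
  // Assumption: the offset is minutes east of UTC; null falls back to UTC.
  const offsetMs = (clientTimezoneOffsetMinutes ?? 0) * 60 * 1000;
  // Shift the instant by the offset, then read the date part of the ISO string.
  return new Date(clientTimestampMs + offsetMs).toISOString().slice(0, 10);
}
```

With these assumptions, an event at 23:30 UTC on 1 January reported with an offset of `480` is counted on 2 January, the reader's local day.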

### 3.3 Reused existing tables

#### LearningRecord (no schema change)

New `recordType` values:

- `reading`: reading record (new)
- `read_completed`: reading completed (new)

`metadata` JSON extension fields:

```json
{
  "materialId": "...",
  "readingTargetType": "knowledge_source",
  "knowledgeBaseId": "...",
  "totalActiveSeconds": 120,
  "lastPosition": {...}
}
```
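
The metadata shape and the choice of `recordType` could be typed as follows. This is a sketch under assumptions: section 4 says a record is written on MarkedAsRead, MaterialClosed, or first open, but the mapping of those triggers to `reading` versus `read_completed` shown here is a guess, and `recordTypeFor` and the event-type strings are hypothetical.

```typescript
type ReadingRecordType = "reading" | "read_completed";

// Mirrors the metadata JSON fields listed above.
interface ReadingRecordMetadata {
  materialId: string;
  readingTargetType: "knowledge_source" | "temporary_file";
  knowledgeBaseId: string | null;
  totalActiveSeconds: number;
  lastPosition: unknown;
}

// Decides whether an event produces a LearningRecord, and of which type.
function recordTypeFor(eventType: string, isFirstOpen: boolean): ReadingRecordType | null {
  // Assumption: marking as read is what completes a reading.
  if (eventType === "MarkedAsRead") return "read_completed";
  if (eventType === "MaterialClosed") return "reading";
  if (eventType === "MaterialOpened" && isFirstOpen) return "reading";
  return null; // all other events write no record
}
```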

## 4. Core aggregation pipeline

```
POST /reading/events (batch upload)
  │
  ▼
ReadingEventProcessorService.processBatch(events)
  │
  ├─ 1. Idempotent deduplication (unique on userId + eventId)
  ├─ 2. Validation (activeSecondsDelta >= 0; values above 300 are capped to 300)
  ├─ 3. Write to the ReadingEvent table (status=pending→processed)
  │
  ├─ 4. Aggregate → LearningSession
  │     - Look up an existing session by clientSessionId
  │     - Found: update lastPosition / totalActiveSeconds / lastEventAt
  │     - Not found (MaterialOpened): create a new LearningSession
  │     - MaterialClosed: end the session (status=ended)
  │
  ├─ 5. Aggregate → MaterialReadingProgress
  │     - UPSERT (userId, materialId)
  │     - Accumulate totalActiveSeconds / sessionCount
  │     - Update lastPosition / lastProgress
  │     - Update timestamps: firstOpenedAt / lastReadAt / markedReadAt
  │
  ├─ 6. Aggregate → DailyLearningActivity
  │     - UPSERT (userId, activityDate)
  │     - Accumulate readingSeconds / materialsReadCount
  │
  └─ 7. Write LearningRecord (on MarkedAsRead / MaterialClosed / first open)
```
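
Step 4 of the pipeline can be sketched against an in-memory session table. This is not the `ReadingEventProcessorService` code: `aggregateIntoSession` and the row shapes are hypothetical, the event-type strings are assumed wire values, and two behaviours are assumptions made for the sketch (events for an unknown session other than `MaterialOpened` are skipped, and `lastEventAt` never moves backwards).

```typescript
interface SessionRow {
  clientSessionId: string;
  materialId: string;
  totalActiveSeconds: number;
  lastPosition: unknown;
  lastEventAtMs: number;
  ended: boolean; // stands in for status=ended
}

interface SessionEvent {
  clientSessionId: string;
  materialId: string;
  eventType: string; // assumed wire values: "MaterialOpened", "MaterialClosed", ...
  activeSecondsDelta: number;
  clientTimestampMs: number;
  position?: unknown;
}

// Applies one event to the session table and returns the affected row,
// or null when there is no session to update.
function aggregateIntoSession(
  sessions: Map<string, SessionRow>,
  ev: SessionEvent,
): SessionRow | null {
  let row = sessions.get(ev.clientSessionId);
  if (!row) {
    // Assumption: only MaterialOpened may create a session.
    if (ev.eventType !== "MaterialOpened") return null;
    row = {
      clientSessionId: ev.clientSessionId,
      materialId: ev.materialId,
      totalActiveSeconds: 0,
      lastPosition: null,
      lastEventAtMs: ev.clientTimestampMs,
      ended: false,
    };
    sessions.set(ev.clientSessionId, row);
  }
  row.totalActiveSeconds += ev.activeSecondsDelta;
  // An event without a position leaves the stored position untouched.
  if (ev.position !== undefined) row.lastPosition = ev.position;
  // Assumption: out-of-order events are applied but do not move lastEventAt back.
  row.lastEventAtMs = Math.max(row.lastEventAtMs, ev.clientTimestampMs);
  if (ev.eventType === "MaterialClosed") row.ended = true;
  return row;
}
```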

### Aggregation timing

**Synchronous aggregation** (completed within request handling):

- Events are written to ReadingEvent as soon as validation passes
- They are aggregated immediately into LearningSession / MaterialReadingProgress / DailyLearningActivity
- No worker or queue is used for now

### Special cases

| Scenario | Handling |
|------|------|
| Duplicate eventId | status=duplicate; aggregation is skipped |
| activeSecondsDelta < 0 | status=failed, errorCode=INVALID_ACTIVE_SECONDS |
| activeSecondsDelta > 300 | Capped to 300 (a single tick never exceeds 5 minutes) |
| activeSecondsDelta = 0 | Valid (MaterialOpened/PositionChanged/MarkedAsRead) |
| MaterialClosed without position | The existing position is not overwritten |
| Out-of-order event (timestamp goes backwards) | Not rejected; processed normally (tolerates client clock drift) |
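
The `activeSecondsDelta` rows of the table can be sketched as one pure function. `checkActiveSecondsDelta` and `DeltaCheck` are hypothetical names; the codes `INVALID_ACTIVE_SECONDS` and `ACTIVE_SECONDS_CAPPED` are taken from the tables in section 5. Rejecting `NaN` along with negative values is an extra assumption.

```typescript
const MAX_DELTA_SECONDS = 300; // a single tick never exceeds 5 minutes

interface DeltaCheck {
  accepted: boolean;
  activeSecondsDelta: number; // value to aggregate (0 when rejected)
  errorCode: string | null;
  warningCodes: string[];
}

function checkActiveSecondsDelta(delta: number): DeltaCheck {
  // `!(delta >= 0)` rejects negative values and NaN alike.
  if (!(delta >= 0)) {
    return { accepted: false, activeSecondsDelta: 0, errorCode: "INVALID_ACTIVE_SECONDS", warningCodes: [] };
  }
  if (delta > MAX_DELTA_SECONDS) {
    // Oversized deltas are accepted but capped and flagged.
    return {
      accepted: true,
      activeSecondsDelta: MAX_DELTA_SECONDS,
      errorCode: null,
      warningCodes: ["ACTIVE_SECONDS_CAPPED"],
    };
  }
  return { accepted: true, activeSecondsDelta: delta, errorCode: null, warningCodes: [] };
}
```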

## 5. Error codes and warning codes

### Error codes (event rejected, status=failed)

| Code | Meaning |
|----|------|
| `MATERIAL_NOT_FOUND` | The knowledge_source does not exist |
| `TEMPORARY_MATERIAL_NOT_FOUND` | The temporary_file does not exist |
| `MATERIAL_ACCESS_DENIED` | The material does not belong to the current user |
| `TEMPORARY_MATERIAL_EXPIRED` | The temporary file has expired |
| `INVALID_TARGET_TYPE` | Unknown readingTargetType |
| `INVALID_EVENT_TYPE` | Unknown eventType |
| `INVALID_TIMESTAMP` | Malformed timestamp |
| `INVALID_POSITION` | Malformed position JSON |
| `INVALID_ACTIVE_SECONDS` | activeSecondsDelta < 0 |
| `BATCH_LIMIT_EXCEEDED` | Batch size limit (100) exceeded |
| `MISSING_CLIENT_SESSION` | clientSessionId is missing |
| `MISSING_MATERIAL_ID` | materialId is missing |
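
A few of these codes depend only on the shape of the request, so they can be checked before any database lookup. The sketch below covers that subset; `checkBatchSize`, `checkItemShape`, and `RawEventItem` are hypothetical names, and the order in which the item checks run is an assumption.

```typescript
const MAX_BATCH_SIZE = 100;
const KNOWN_TARGET_TYPES = ["knowledge_source", "temporary_file"];

// Untrusted item as received from the client; every field may be absent.
interface RawEventItem {
  clientSessionId?: string;
  materialId?: string;
  readingTargetType?: string;
}

// Request-level check: applies to the batch as a whole.
function checkBatchSize(items: RawEventItem[]): "BATCH_LIMIT_EXCEEDED" | null {
  return items.length > MAX_BATCH_SIZE ? "BATCH_LIMIT_EXCEEDED" : null;
}

// Item-level check: returns the first matching error code, or null.
function checkItemShape(item: RawEventItem): string | null {
  if (!item.clientSessionId) return "MISSING_CLIENT_SESSION";
  if (!item.materialId) return "MISSING_MATERIAL_ID";
  if (!item.readingTargetType || KNOWN_TARGET_TYPES.indexOf(item.readingTargetType) === -1) {
    return "INVALID_TARGET_TYPE";
  }
  return null;
}
```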

### Warning codes (event accepted but flagged)

| Code | Meaning |
|----|------|
| `ACTIVE_SECONDS_CAPPED` | delta > 300; value was capped |
| `CLIENT_TIMESTAMP_SKEWED` | Clock skew > 5 min |
| `POSITION_IGNORED` | position is present but not valid for this eventType |
| `DUPLICATE_EVENT` | Idempotent replay |
| `OUT_OF_ORDER_EVENT` | Out-of-order event |
| `SOURCE_DELETED` | The source material has been deleted |
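
The two timing-related warnings can be sketched as follows. What the skew is measured against is not stated in this document; this sketch assumes it is the gap between `clientTimestampMs` and the server receive time, and that "out of order" means a client timestamp earlier than the previous event's. `timingWarnings` is a hypothetical helper.

```typescript
const MAX_SKEW_MS = 5 * 60 * 1000; // 5 minutes

interface TimedEvent {
  clientTimestampMs: number;
}

// Collects timing warnings for one event; the event is accepted either way.
function timingWarnings(
  ev: TimedEvent,
  serverReceivedAtMs: number,
  previousClientTimestampMs: number | null, // null for the first event seen
): string[] {
  const warnings: string[] = [];
  // Assumption: skew is the absolute gap to the server receive time.
  if (Math.abs(serverReceivedAtMs - ev.clientTimestampMs) > MAX_SKEW_MS) {
    warnings.push("CLIENT_TIMESTAMP_SKEWED");
  }
  // Assumption: out of order means the client timestamp went backwards.
  if (previousClientTimestampMs !== null && ev.clientTimestampMs < previousClientTimestampMs) {
    warnings.push("OUT_OF_ORDER_EVENT");
  }
  return warnings;
}
```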

## 6. Permission checks

### Reporting endpoint

- `readingTargetType=knowledge_source`: verify that the `KnowledgeSource` exists and belongs to the current user
- `readingTargetType=temporary_file`: verify that the `TemporaryReadingMaterial` exists and belongs to the current user
- Unknown materialId: record a warning and still accept the event (to avoid losing data)

### Query endpoints

- `GET /reading/progress/:materialId`: verify the user's permission
- `GET /reading/continue-learning`: return the current user's materials
- All query endpoints obtain the userId through the JWT guard

## 7. Endpoint list

| Method | Path | Description |
|------|------|------|
| POST | `/reading/events` | Batch upload of reading events |
| GET | `/reading/progress/:materialId` | Reading progress for a single material |
| GET | `/reading/continue-learning` | Continue learning (home page) |
| GET | `/reading/summary` | Learning summary |
| GET | `/reading/trend` | Trend (raw data only) |
| GET | `/reading/heatmap` | Heatmap data |
| GET | `/reading/history` | Learning history |
| POST | `/reading/events/replay` | Event replay / repair |
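
Since `POST /reading/events` rejects batches over the limit of 100, a client with a larger backlog has to split it before uploading. A minimal sketch of that split follows; `chunkEvents` is a hypothetical client-side helper, not part of the API.

```typescript
// Splits a backlog into batches no larger than `batchSize`, preserving order.
function chunkEvents<T>(events: T[], batchSize: number = 100): T[][] {
  if (batchSize < 1 || Math.floor(batchSize) !== batchSize) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < events.length; start += batchSize) {
    batches.push(events.slice(start, start + batchSize));
  }
  return batches;
}
```

Each resulting batch can then be sent as one request; because deduplication is keyed on `eventId`, retrying a batch after a failed upload should not double-count events.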

## 8. Acceptance checklist

- [x] `docs/learning-info-design.md` exists
- [x] readingTargetType defined: knowledge_source / temporary_file
- [x] materialId mapping: → KnowledgeSource.id / TemporaryReadingMaterial.id
- [x] Permission check approach: JWT guard + userId + resource ownership check
- [x] Field mapping from Rust ReadingEventV2 to API ReadingEvent
- [x] Core aggregation pipeline: ReadingEvent → LearningSession → MaterialReadingProgress → DailyLearningActivity → LearningRecord
- [x] Error codes defined: 12 (plus 6 warning codes)
- [x] Synchronous aggregation strategy