# Learning Information Collection: Overall Design

## 1. Overview

The M8 milestone closes the loop for collecting learning-behavior data, from the iOS client (via the Rust document runtime) to the API server.

### Data flow

```
iOS App → Rust zx_document_core (ReadingEventV2)
  → iOS adapter layer (adds readingTargetType/platform/appVersion/timezone)
  → POST /reading/events (batch upload)
  → ReadingEventProcessorService (validation / deduplication / aggregation)
  → LearningSession / MaterialReadingProgress / DailyLearningActivity / LearningRecord
  → Query endpoints (progress / continue learning / summary / trend / heatmap / history)
```

## 2. readingTargetType

The Rust side does not store `readingTargetType`; the iOS adapter layer adds it at upload time.

| readingTargetType | materialId maps to | knowledgeBaseId |
|---|---|---|
| `knowledge_source` | `KnowledgeSource.id` | `KnowledgeSource.knowledgeBaseId` |
| `temporary_file` | `TemporaryReadingMaterial.id` | `null` (may be filled in later) |

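The mapping table can be read as a lookup rule. The sketch below is one way to express it, assuming in-memory maps in place of the real `KnowledgeSource` and `TemporaryReadingMaterial` tables; `resolveTarget` and the `*Ref` interfaces are hypothetical names introduced for illustration only.

```typescript
// Hypothetical minimal shapes; the real models carry more fields.
type ReadingTargetType = "knowledge_source" | "temporary_file";

interface KnowledgeSourceRef {
  id: string;
  knowledgeBaseId: string;
}

interface TemporaryReadingMaterialRef {
  id: string;
}

interface ResolvedTarget {
  readingTargetType: ReadingTargetType;
  materialId: string;
  knowledgeBaseId: string | null;
}

// Returns undefined when the material cannot be found for the given type.
function resolveTarget(
  readingTargetType: ReadingTargetType,
  materialId: string,
  knowledgeSources: Map<string, KnowledgeSourceRef>,
  temporaryMaterials: Map<string, TemporaryReadingMaterialRef>,
): ResolvedTarget | undefined {
  if (readingTargetType === "knowledge_source") {
    const source = knowledgeSources.get(materialId);
    if (!source) return undefined;
    // knowledge_source: knowledgeBaseId comes from the KnowledgeSource row.
    return { readingTargetType, materialId: source.id, knowledgeBaseId: source.knowledgeBaseId };
  }
  const temp = temporaryMaterials.get(materialId);
  if (!temp) return undefined;
  // temporary_file: knowledgeBaseId is null for now.
  return { readingTargetType, materialId: temp.id, knowledgeBaseId: null };
}
```
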
### Fields added by iOS at upload time

```typescript
// When building the upload request, the iOS adapter layer does the following:
const item = {
  eventId: rustEvent.eventId,
  clientSessionId: rustEvent.clientSessionId,
  materialId: rustEvent.materialId,
  eventType: rustEvent.eventType,
  position: rustEvent.position,
  activeSecondsDelta: rustEvent.activeSecondsDelta,
  clientTimestampMs: rustEvent.timestampMs,
  sequence: rustEvent.sequence,
  // Fields added by iOS:
  readingTargetType: resolveTargetType(rustEvent.materialId), // 'knowledge_source' | 'temporary_file'
  platform: 'ios',
  appVersion: getAppVersion(),
  clientTimezoneOffsetMinutes: getTimezoneOffset(),
};
```

## 3. Entity mapping

### 3.1 New tables

#### ReadingEvent (raw event log)

```prisma
model ReadingEvent {
  id                          String    @id @default(cuid())
  userId                      String
  eventId                     String
  clientSessionId             String
  readingTargetType           String    @db.VarChar(32)
  materialId                  String
  knowledgeBaseId             String?
  eventType                   String    @db.VarChar(32)
  position                    Json?
  activeSecondsDelta          Int       @default(0)
  clientTimestampMs           BigInt
  clientTimezoneOffsetMinutes Int?
  sequence                    Int
  platform                    String?   @db.VarChar(16)
  appVersion                  String?   @db.VarChar(32)
  status                      String    @default("pending") @db.VarChar(32)
  errorCode                   String?   @db.VarChar(32)
  warningCodes                Json?
  serverReceivedAt            DateTime  @default(now())
  processedAt                 DateTime?
  createdAt                   DateTime  @default(now())

  user User @relation(fields: [userId], references: [id])

  @@unique([userId, eventId])
  @@index([userId, clientSessionId])
  @@index([userId, readingTargetType, materialId, clientTimestampMs])
  @@index([status, createdAt])
  @@index([userId, createdAt])
}
```

#### MaterialReadingProgress (material reading progress)

```prisma
model MaterialReadingProgress {
  id                  String    @id @default(cuid())
  userId              String
  materialId          String    // materialId of the associated material
  readingTargetType   String    @db.VarChar(32)
  knowledgeBaseId     String?   // looked up from KnowledgeSource
  lastClientSessionId String?
  lastPosition        Json?     // camelCase ReadingPosition
  lastProgress        Float?    // normalized progress, 0 to 1
  totalActiveSeconds  Int       @default(0) // cumulative active reading seconds
  sessionCount        Int       @default(0) // number of reading sessions
  status              String    @default("not_started") @db.VarChar(32)
  firstOpenedAt       DateTime?
  lastOpenedAt        DateTime?
  lastReadAt          DateTime?
  isMarkedRead        Boolean   @default(false)
  markedReadAt        DateTime?
  createdAt           DateTime  @default(now())
  updatedAt           DateTime  @updatedAt

  user User @relation(fields: [userId], references: [id])

  @@unique([userId, materialId])
  @@index([userId])
  @@index([knowledgeBaseId])
  @@index([status])
}
```

#### TemporaryReadingMaterial (temporary reading material)

```prisma
model TemporaryReadingMaterial {
  id               String    @id @default(cuid())
  userId           String
  title            String?   @db.VarChar(255)
  originalFilename String?   @db.VarChar(255)
  mimeType         String?   @db.VarChar(100)
  sizeBytes        BigInt    @default(0)
  storageKey       String?   @db.VarChar(500)
  sourceStatus     String    @default("active") @db.VarChar(32)
  expiresAt        DateTime?
  deletedAt        DateTime?
  createdAt        DateTime  @default(now())
  updatedAt        DateTime  @updatedAt

  user User @relation(fields: [userId], references: [id])

  @@index([userId])
  @@index([expiresAt])
}
```

### 3.2 Extended existing tables

#### LearningSession (extended fields)

The following fields are added to the existing `LearningSession` model:

```prisma
model LearningSession {
  // ... existing fields ...

  // Fields added in M8:
  clientSessionId    String?   // Rust client_session_id (links the reported events)
  materialId         String?   // materialId of the material being read
  readingTargetType  String?   @db.VarChar(32)
  totalActiveSeconds Int       @default(0) // cumulative active seconds reported by Rust
  lastPosition       Json?     // last reading position
  lastEventAt        DateTime? // time of the last event
}
```

> The existing `mode` field is kept, and the new `readingTargetType` does not conflict with it. For `durationSeconds` compatibility, `totalActiveSeconds` (from the Rust tracker) takes precedence; when there is no Rust data, the old logic is kept.

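The precedence rule in the note above could look like the following sketch. How "no Rust data" is detected is not specified in this document; the sketch assumes a null `clientSessionId` means the session was not created from Rust-reported events. `SessionDurations` and `effectiveDurationSeconds` are hypothetical names.

```typescript
// Hypothetical subset of LearningSession fields relevant to duration.
interface SessionDurations {
  clientSessionId: string | null; // assumption: null when no Rust events were reported
  totalActiveSeconds: number;     // M8 field, accumulated from Rust events
  durationSeconds: number;        // existing field, computed by the old logic
}

function effectiveDurationSeconds(session: SessionDurations): number {
  const hasRustData = session.clientSessionId !== null;
  // Prefer the Rust tracker value; otherwise fall back to the old field.
  return hasRustData ? session.totalActiveSeconds : session.durationSeconds;
}
```
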
#### DailyLearningActivity (extended fields)

```prisma
model DailyLearningActivity {
  // ... existing fields (durationSeconds, sessionsCount, activeRecallCount, reviewCount, aiAnalysisCount, completedLoopCount, activityLevel) ...

  // Fields added in M8:
  readingSeconds     Int @default(0) // reading time for the day (seconds)
  materialsReadCount Int @default(0) // number of materials read that day
  markedReadCount    Int @default(0) // number of materials marked as read that day
}
```

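Daily counters are keyed by `(userId, activityDate)` in the aggregation chain of Section 4, but this document does not say how `activityDate` is derived. One option is to compute the client's local calendar date from `clientTimestampMs` and `clientTimezoneOffsetMinutes`. The sketch below assumes the offset is minutes east of UTC (UTC+8 is `480`), which is the opposite sign of JavaScript's `Date.prototype.getTimezoneOffset()`; `localActivityDate` is a hypothetical helper.

```typescript
// Assumption: clientTimezoneOffsetMinutes is minutes EAST of UTC (UTC+8 => 480).
// A missing offset is treated as UTC.
function localActivityDate(
  clientTimestampMs: number,
  clientTimezoneOffsetMinutes: number | null,
): string {
  const offsetMs = (clientTimezoneOffsetMinutes ?? 0) * 60000;
  // Shift the instant by the offset, then read the calendar date in UTC.
  return new Date(clientTimestampMs + offsetMs).toISOString().slice(0, 10);
}
```

An event at 20:00 UTC on 2024-01-01 from a UTC+8 client falls on the local date 2024-01-02, so its reading seconds would count toward that day.
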
### 3.3 Reused existing tables

#### LearningRecord (no schema change needed)

New `recordType` values:
- `reading`: reading record (new)
- `read_completed`: reading completed (new)

Additional fields in the `metadata` JSON:

```json
{
  "materialId": "...",
  "readingTargetType": "knowledge_source",
  "knowledgeBaseId": "...",
  "totalActiveSeconds": 120,
  "lastPosition": {...}
}
```

## 4. Core aggregation chain

```
POST /reading/events (batch upload)
  │
  ▼
ReadingEventProcessorService.processBatch(events)
  │
  ├─ 1. Idempotent deduplication (unique on userId + eventId)
  ├─ 2. Validation (activeSecondsDelta must be >= 0; values above 300 are capped at 300)
  ├─ 3. Write to the ReadingEvent table (status: pending → processed)
  │
  ├─ 4. Aggregate → LearningSession
  │     - Look up an existing session by clientSessionId
  │     - Exists: update lastPosition / totalActiveSeconds / lastEventAt
  │     - Does not exist (MaterialOpened): create a new LearningSession
  │     - MaterialClosed: end the session (status=ended)
  │
  ├─ 5. Aggregate → MaterialReadingProgress
  │     - UPSERT (userId, materialId)
  │     - Accumulate totalActiveSeconds / sessionCount
  │     - Update lastPosition / lastProgress
  │     - Update timestamps: firstOpenedAt / lastReadAt / markedReadAt
  │
  ├─ 6. Aggregate → DailyLearningActivity
  │     - UPSERT (userId, activityDate)
  │     - Accumulate readingSeconds / materialsReadCount
  │
  └─ 7. Write LearningRecord (on MarkedAsRead / MaterialClosed / first open)
```

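Steps 1 and 5 of the chain above can be illustrated with a small in-memory sketch. It is not the actual `ReadingEventProcessorService`: a `Set` and a `Map` stand in for the unique constraints on the `ReadingEvent` and `MaterialReadingProgress` tables, and it assumes `sessionCount` increments on each `MaterialOpened`. `InMemoryProcessor`, `IncomingEvent`, and `ProgressRow` are hypothetical names.

```typescript
interface IncomingEvent {
  eventId: string;
  materialId: string;
  eventType: string; // e.g. "MaterialOpened", "PositionChanged", "MaterialClosed"
  position?: unknown;
  activeSecondsDelta: number;
}

interface ProgressRow {
  totalActiveSeconds: number;
  sessionCount: number;
  lastPosition: unknown;
}

type ItemStatus = "processed" | "duplicate";

class InMemoryProcessor {
  // Stands in for the (userId, eventId) unique constraint.
  private readonly seen = new Set<string>();
  // Stands in for the (userId, materialId) unique constraint.
  readonly progress = new Map<string, ProgressRow>();

  processBatch(userId: string, events: IncomingEvent[]): ItemStatus[] {
    return events.map((event): ItemStatus => {
      const eventKey = `${userId}:${event.eventId}`;
      if (this.seen.has(eventKey)) return "duplicate"; // replayed event: skip aggregation
      this.seen.add(eventKey);

      // Upsert the progress row, then fold the event into it.
      const progressKey = `${userId}:${event.materialId}`;
      const row: ProgressRow = this.progress.get(progressKey) ?? {
        totalActiveSeconds: 0,
        sessionCount: 0,
        lastPosition: null,
      };
      row.totalActiveSeconds += event.activeSecondsDelta;
      if (event.eventType === "MaterialOpened") row.sessionCount += 1; // assumption
      if (event.position !== undefined) row.lastPosition = event.position; // no position: keep the old one
      this.progress.set(progressKey, row);
      return "processed";
    });
  }
}
```

In a database-backed version the same steps would presumably run inside one transaction per batch, so that a failed aggregation does not leave an event marked as processed.
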
### Aggregation timing

**Synchronous aggregation** (completed while handling the request):
- Write to ReadingEvent immediately after validation passes
- Aggregate immediately into LearningSession / MaterialReadingProgress / DailyLearningActivity
- No worker or queue is used for now

### Special cases

| Scenario | Handling |
|------|------|
| Duplicate eventId | status=duplicate; aggregation is skipped |
| activeSecondsDelta < 0 | status=failed, errorCode=INVALID_ACTIVE_SECONDS |
| activeSecondsDelta > 300 | Capped at 300 (a single tick never exceeds 5 minutes) |
| activeSecondsDelta = 0 | Valid (MaterialOpened/PositionChanged/MarkedAsRead) |
| MaterialClosed without position | The existing position is not overwritten |
| Out-of-order event (timestamp goes backward) | Not rejected; processed normally (tolerates client clock drift) |

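The `activeSecondsDelta` rows of the table can be sketched as a single check. It uses the `INVALID_ACTIVE_SECONDS` and `ACTIVE_SECONDS_CAPPED` codes from Section 5; rejecting non-integer values is an added assumption based on the `Int` column type. `checkActiveSecondsDelta` and `DeltaCheck` are hypothetical names.

```typescript
const MAX_DELTA_SECONDS = 300; // a single tick never exceeds 5 minutes

type DeltaCheck =
  | { ok: true; activeSecondsDelta: number; warningCodes: string[] }
  | { ok: false; errorCode: "INVALID_ACTIVE_SECONDS" };

function checkActiveSecondsDelta(delta: number): DeltaCheck {
  // Negative deltas are rejected (non-integers too: an assumption).
  if (!Number.isInteger(delta) || delta < 0) {
    return { ok: false, errorCode: "INVALID_ACTIVE_SECONDS" };
  }
  // Oversized deltas are accepted but capped and flagged.
  if (delta > MAX_DELTA_SECONDS) {
    return { ok: true, activeSecondsDelta: MAX_DELTA_SECONDS, warningCodes: ["ACTIVE_SECONDS_CAPPED"] };
  }
  // Zero is valid, e.g. for MaterialOpened / PositionChanged / MarkedAsRead.
  return { ok: true, activeSecondsDelta: delta, warningCodes: [] };
}
```
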
## 5. Error codes and warning codes

### Error codes (event rejected, status=failed)

| Code | Meaning |
|----|------|
| `MATERIAL_NOT_FOUND` | The knowledge_source does not exist |
| `TEMPORARY_MATERIAL_NOT_FOUND` | The temporary_file does not exist |
| `MATERIAL_ACCESS_DENIED` | The material does not belong to the current user |
| `TEMPORARY_MATERIAL_EXPIRED` | The temporary file has expired |
| `INVALID_TARGET_TYPE` | Unknown readingTargetType |
| `INVALID_EVENT_TYPE` | Unknown eventType |
| `INVALID_TIMESTAMP` | Malformed timestamp |
| `INVALID_POSITION` | Malformed position JSON |
| `INVALID_ACTIVE_SECONDS` | activeSecondsDelta < 0 |
| `BATCH_LIMIT_EXCEEDED` | Batch size limit (100) exceeded |
| `MISSING_CLIENT_SESSION` | clientSessionId is missing |
| `MISSING_MATERIAL_ID` | materialId is missing |

### Warning codes (event accepted but flagged)

| Code | Meaning |
|----|------|
| `ACTIVE_SECONDS_CAPPED` | delta > 300; value was capped |
| `CLIENT_TIMESTAMP_SKEWED` | Clock skew > 5 min |
| `POSITION_IGNORED` | position is present but not valid for the eventType |
| `DUPLICATE_EVENT` | Idempotent replay |
| `OUT_OF_ORDER_EVENT` | Out-of-order event |
| `SOURCE_DELETED` | The source material has been deleted |

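The two timestamp-related warnings might be derived as in the sketch below. The reference points are assumptions, since this document does not define them: skew is measured against the server receive time, and ordering against the latest client timestamp already seen for the same session. `timestampWarnings` is a hypothetical helper.

```typescript
const MAX_SKEW_MS = 5 * 60 * 1000; // 5 minutes

function timestampWarnings(
  clientTimestampMs: number,
  serverReceivedMs: number,                 // assumption: skew reference is server receive time
  previousClientTimestampMs: number | null, // assumption: latest timestamp seen for this session
): string[] {
  const warnings: string[] = [];
  if (Math.abs(serverReceivedMs - clientTimestampMs) > MAX_SKEW_MS) {
    warnings.push("CLIENT_TIMESTAMP_SKEWED");
  }
  if (previousClientTimestampMs !== null && clientTimestampMs < previousClientTimestampMs) {
    warnings.push("OUT_OF_ORDER_EVENT"); // flagged only; the event is still processed
  }
  return warnings;
}
```

Events buffered offline and uploaded later would trip the skew check under this assumption, so a real implementation may need a different reference point.
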
## 6. Permission checks

### Upload endpoint
- `readingTargetType=knowledge_source`: verify that the `KnowledgeSource` exists and belongs to the current user
- `readingTargetType=temporary_file`: verify that the `TemporaryReadingMaterial` exists and belongs to the current user
- Unknown materialId: record a warning and still accept the event (to avoid losing data)

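The ownership checks can be sketched as a pure function that returns one of the Section 5 error codes, or `null` when access is allowed. This version follows the error code table and rejects missing materials; under the warn-and-accept rule for unknown materialId, the not-found branch would add a warning instead. It also assumes ownership can be read from a `userId` on the material record. `checkMaterialAccess` and `OwnedMaterial` are hypothetical names.

```typescript
// Hypothetical minimal view of a KnowledgeSource or TemporaryReadingMaterial row.
interface OwnedMaterial {
  userId: string;          // assumption: ownership is stored directly on the record
  expiresAt?: Date | null; // only meaningful for temporary files
}

type AccessError =
  | "INVALID_TARGET_TYPE"
  | "MATERIAL_NOT_FOUND"
  | "TEMPORARY_MATERIAL_NOT_FOUND"
  | "MATERIAL_ACCESS_DENIED"
  | "TEMPORARY_MATERIAL_EXPIRED";

function checkMaterialAccess(
  userId: string,
  readingTargetType: string,
  material: OwnedMaterial | undefined, // result of the lookup by materialId
  now: Date,
): AccessError | null {
  if (readingTargetType !== "knowledge_source" && readingTargetType !== "temporary_file") {
    return "INVALID_TARGET_TYPE";
  }
  if (!material) {
    return readingTargetType === "knowledge_source"
      ? "MATERIAL_NOT_FOUND"
      : "TEMPORARY_MATERIAL_NOT_FOUND";
  }
  if (material.userId !== userId) return "MATERIAL_ACCESS_DENIED";
  if (
    readingTargetType === "temporary_file" &&
    material.expiresAt &&
    material.expiresAt.getTime() <= now.getTime()
  ) {
    return "TEMPORARY_MATERIAL_EXPIRED";
  }
  return null; // access allowed
}
```
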
### Query endpoints
- `GET /reading/progress/:materialId`: verifies the user's permission
- `GET /reading/continue-learning`: returns the current user's materials
- All query endpoints obtain userId through the JWT guard

## 7. Endpoint list

| Method | Path | Description |
|------|------|------|
| POST | `/reading/events` | Batch upload of reading events |
| GET | `/reading/progress/:materialId` | Reading progress for a single material |
| GET | `/reading/continue-learning` | Continue learning on the home page |
| GET | `/reading/summary` | Learning summary |
| GET | `/reading/trend` | Trend (data only) |
| GET | `/reading/heatmap` | Heatmap data |
| GET | `/reading/history` | Learning history |
| POST | `/reading/events/replay` | Event replay/repair |

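Because `POST /reading/events` rejects batches over the limit of 100 with `BATCH_LIMIT_EXCEEDED`, a client needs to split its pending events before uploading. The sketch below shows one way to do that; `chunkEvents` is a hypothetical helper and the request body shape is not covered here.

```typescript
const BATCH_LIMIT = 100; // matches BATCH_LIMIT_EXCEEDED in Section 5

// Splits pending events into consecutive batches of at most `limit` items,
// preserving their original order.
function chunkEvents<T>(events: readonly T[], limit: number = BATCH_LIMIT): T[][] {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < events.length; start += limit) {
    batches.push(events.slice(start, start + limit));
  }
  return batches;
}
```

Each batch would then be sent as its own request, in order, so that `sequence` values arrive roughly monotonically.
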
## 8. Acceptance checklist

- [x] `docs/learning-info-design.md` exists
- [x] readingTargetType defined: knowledge_source / temporary_file
- [x] materialId mapping: → KnowledgeSource.id / TemporaryReadingMaterial.id
- [x] Permission check approach: JWT guard + userId + resource ownership check
- [x] Field mapping from Rust ReadingEventV2 to API ReadingEvent
- [x] Core aggregation chain: ReadingEvent → LearningSession → MaterialReadingProgress → DailyLearningActivity → LearningRecord
- [x] Error code definitions: 12 error codes and 6 warning codes
- [x] Synchronous aggregation strategy