Initial commit: skills library
- 70 skills with code and documentation
- Add .gitignore (ignore __pycache__, output/, temp/, venv/)
- Clean up test intermediates and caches
@@ -0,0 +1,366 @@
|
||||
# Delegation Templates for multimodal-looker
|
||||
|
||||
Ready-to-use prompts for delegating visual analysis tasks.
|
||||
|
||||
## Template Structure
|
||||
|
||||
```markdown
@multimodal-looker [analysis type] this [image type]

**Image**: [path/URL/description]

**User request**: [original request]

**Analysis focus**:
1. [Focus 1]
2. [Focus 2]
3. [Focus 3]

**Output format**: [expected format]
```
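The generic template can also be filled programmatically. The following is a minimal sketch, not part of the skill itself: the `DelegationRequest` class, its field names, and `render_delegation` are hypothetical, and the field labels are rendered in English.

```python
from dataclasses import dataclass, field


@dataclass
class DelegationRequest:
    """Fields of the generic template (all names here are illustrative)."""
    analysis_type: str
    image_type: str
    image: str
    user_request: str
    focus: list[str] = field(default_factory=list)
    output_format: str = "free-form, but structured"


def render_delegation(req: DelegationRequest) -> str:
    """Fill the generic template and return the prompt text."""
    # Number the focus points starting at 1, one per line.
    focus_lines = "\n".join(
        f"{i}. {item}" for i, item in enumerate(req.focus, start=1)
    )
    return (
        f"@multimodal-looker {req.analysis_type} this {req.image_type}\n\n"
        f"**Image**: {req.image}\n\n"
        f"**User request**: {req.user_request}\n\n"
        f"**Analysis focus**:\n{focus_lines}\n\n"
        f"**Output format**: {req.output_format}"
    )
```

Keeping the fields in a dataclass makes it harder to send a delegation that omits the image reference or the user request, since both are required arguments.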
|
||||
|
||||
## Scenario Templates
|
||||
|
||||
### 1. Error Screenshot Analysis
|
||||
|
||||
```markdown
@multimodal-looker Please analyze this error log screenshot

**Image**: [image path]

**User request**: Diagnose the cause of the error and propose a fix

**Analysis focus**:
1. Error type and error code
2. Key information in the stack trace
3. File name and line number where the error occurred
4. Any relevant context

**Output format**:
- Error type: [type]
- Error location: [file:line]
- Likely cause: [analysis]
- Suggested fix: [steps]
```
|
||||
|
||||
### 2. Architecture Diagram Analysis
|
||||
|
||||
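```markdown
@multimodal-looker Please analyze this architecture diagram

**Image**: [image path]

**User request**: [understand the system architecture / produce a deployment plan / identify components]

**Analysis focus**:
1. Name and function of every component/module
2. Connections between components and direction of data flow
3. Technology stack markers (if any)
4. Architecture pattern (microservices, monolith, layered, etc.)

**Output format**:
- Component list: [as a table]
- Connections: [description]
- Architecture pattern: [type]
- Technology stack: [list]
```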
|
||||
|
||||
### 3. Data Chart Analysis
|
||||
|
||||
```markdown
@multimodal-looker Please analyze this data chart

**Image**: [image path]

**User request**: [extract data trends / compare data / understand metrics]

**Analysis focus**:
1. Chart type (bar chart, line chart, pie chart, etc.)
2. Labels and ranges of the X and Y axes
3. Exact values of data points (read as many as possible)
4. Trends, peaks, and troughs
5. Legend and color meanings

**Output format**:
- Chart type: [type]
- Time range: [start and end]
- Data series: [list]
- Key trends: [description]
- Anomalies: [if any]
```
|
||||
|
||||
### 4. UI/UX Mockup Analysis
|
||||
|
||||
```markdown
@multimodal-looker Please analyze this UI design mockup

**Image**: [image path]

**User request**: [implement the interface / evaluate the design / extract requirements]

**Analysis focus**:
1. Layout and region breakdown
2. All UI elements (buttons, input fields, lists, etc.)
3. Copy and labels
4. Color scheme and fonts (if identifiable)
5. Interactive elements and their states

**Output format**:
- Layout structure: [description]
- UI element inventory: [list]
- Color scheme: [color values]
- Interaction notes: [description]
```
|
||||
|
||||
### 5. Flowchart/Process Diagram Analysis
|
||||
|
||||
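```markdown
@multimodal-looker Please analyze this flowchart

**Image**: [image path]

**User request**: [understand the process / generate documentation / implement the logic]

**Analysis focus**:
1. Start and end points of the flow
2. Content of every step/node
3. Decision points and branch conditions
4. Flow direction and meaning of the arrows
5. Parallel flows or loops

**Output format**:
- Process steps: [ordered list]
- Decision points: [condition + branches]
- Flowchart description: [text version]
```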
|
||||
|
||||
### 6. Table/Data Grid Analysis
|
||||
|
||||
```markdown
@multimodal-looker Please analyze this table

**Image**: [image path]

**User request**: [extract data / understand the structure / convert the format]

**Analysis focus**:
1. Row and column structure of the table
2. Header and meaning of each column
3. Data in every cell
4. Merged cells (if any)
5. Total or summary rows

**Output format**:
- Table structure: [rows x columns]
- Column names: [list]
- Data: [Markdown table]
```
|
||||
|
||||
### 7. Code Screenshot Analysis (OCR)
|
||||
|
||||
```markdown
@multimodal-looker Please transcribe the text in this code screenshot

**Image**: [image path]

**User request**: [extract the code / understand the logic / convert the format]

**Analysis focus**:
1. Complete code (transcribed line by line)
2. Programming language (inferred from the syntax)
3. Indentation and formatting
4. Comments
5. Any special symbols

**Output format**:
- Language: [language]
- Code: [code block]
- Key logic: [brief summary]
```
|
||||
|
||||
### 8. Handwritten Notes Analysis
|
||||
|
||||
```markdown
@multimodal-looker Please transcribe these handwritten notes

**Image**: [image path]

**User request**: [transcribe the text / understand the content / organize the notes]

**Analysis focus**:
1. All legible text
2. Headings and sections
3. Lists and key points
4. Diagrams or sketches (if any)
5. Annotations and highlights

**Output format**:
- Title: [title]
- Content: [structured text]
- Key points: [list]
- Notes: [parts that could not be read clearly]
```
|
||||
|
||||
### 9. Comparison Task (Multiple Images)
|
||||
|
||||
```markdown
@multimodal-looker Please compare these two images

**Image 1**: [path 1]
**Image 2**: [path 2]

**User request**: [compare differences / pick the better one / find what changed]

**Analysis focus**:
1. Independent analysis of each image
2. Similarities
3. Differences
4. Strengths and weaknesses of each

**Output format**:
- Image 1 analysis: [description]
- Image 2 analysis: [description]
- Similarities: [list]
- Differences: [list]
- Recommendation: [if any]
```
|
||||
|
||||
### 10. General Purpose (Open-ended)
|
||||
|
||||
```markdown
@multimodal-looker Please analyze this image

**Image**: [image path]

**User request**: [original request]

**Analysis focus**:
1. Overall description of the image
2. Key visual elements
3. Any text
4. Colors, layout, and style
5. Parts relevant to the user's request

**Output format**: Free-form, but please structure the output
```
|
||||
|
||||
## Response Integration Patterns
|
||||
|
||||
After receiving analysis from multimodal-looker, integrate results:
|
||||
|
||||
### Pattern 1: Acknowledge + Connect
|
||||
```markdown
Thanks for the analysis. I can see [brief summary of the image].

Based on your request [xxx] and the [key information] in the image, I suggest...
```
|
||||
|
||||
### Pattern 2: Summary + Action
|
||||
```markdown
Based on the image analysis, the key points are:
1. [Point 1]
2. [Point 2]

Given this, the next step is...
```
|
||||
|
||||
### Pattern 3: Validation + Expansion
|
||||
```markdown
The image analysis confirms [some information].

Beyond that, we also need to consider...
```
|
||||
|
||||
## Error Handling Templates
|
||||
|
||||
### Timeout Response
|
||||
```markdown
Sorry, the image analysis timed out. Possible causes:
- The image file is too large
- Network latency
- The service is busy

You can:
1. Compress the image and retry
2. Describe the key information in text
3. Try again later
```
|
||||
|
||||
### Format Not Supported
|
||||
```markdown
This image format ([format]) may not be supported.

Suggestions:
- Convert it to PNG or JPG
- Or describe the image and I will continue to help
```
|
||||
|
||||
### Analysis Failed
|
||||
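```markdown
Sorry, this image could not be analyzed. Possible causes:
- The image file is corrupted
- The image content could not be recognized
- The service is temporarily unavailable

Please try:
- Uploading a clearer image
- Or describing in text what you want to know
```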
|
||||
|
||||
## Best Practices
|
||||
|
||||
### DO:
1. Be specific about what you want analyzed
2. Provide context about why you're asking
3. Specify output format if you have preferences
4. Acknowledge the analysis when received
5. Connect visual info to the broader task

### DON'T:
1. Don't just say "analyze this" without context
2. Don't expect perfect OCR on blurry images
3. Don't ignore the analysis results
4. Don't delegate if the image isn't crucial
5. Don't forget to thank multimodal-looker (optional but nice)
|
||||
|
||||
## Example Conversations
|
||||
|
||||
### Example 1: Error Debugging
|
||||
```
User: Help me look at this error: screenshot.png

Agent: @multimodal-looker Please analyze this error log screenshot...
[delegation with template 1]

multimodal-looker: Analysis result:
- Error type: NullPointerException
- Location: UserService.java:42
- Cause: the user object is null
- Suggestion: add a null check

Agent: I can see the error: a null pointer exception at line 42 of UserService.
Based on your request, the fix is:
1. Check whether user is null before using it
2. Or wrap it in Optional
...
```
|
||||
|
||||
### Example 2: Architecture Review
|
||||
```
User: Based on this architecture diagram arch.png, help me design a deployment plan

Agent: @multimodal-looker Please analyze this architecture diagram...
[delegation with template 2]

multimodal-looker: Analysis result:
- Components: Frontend(React), API Gateway(Nginx), Backend(Node.js), DB(PostgreSQL)
- Connections: React → Nginx → Node.js → PostgreSQL
- Architecture: typical three-tier architecture

Agent: According to the diagram, the system has three tiers: frontend, API layer, and data layer.
Based on this, the deployment plan is:
1. Frontend: deploy to a CDN or S3
2. API layer: containerize with Docker, add load balancing
3. Database: primary-replica replication, regular backups
...
```
|
||||
@@ -0,0 +1,195 @@
|
||||
# Visual Content Detection Patterns
|
||||
|
||||
Complete reference for detecting visual content in user inputs.
|
||||
|
||||
## File Extension Patterns
|
||||
|
||||
### Image Files (High Priority)
|
||||
```
|
||||
.png, .jpg, .jpeg, .gif, .bmp, .webp, .svg, .ico, .tiff, .tif, .heic, .heif, .raw, .psd, .ai, .eps
|
||||
```
|
||||
|
||||
**Detection Rule**: Case-insensitive match anywhere in input
|
||||
|
||||
### Document Files with Visual Content (Medium Priority)
|
||||
```
|
||||
.pdf (may contain diagrams), .ppt, .pptx (slides with visuals), .vsdx (Visio), .drawio
|
||||
```
|
||||
|
||||
**Detection Rule**: File extension + visual keywords
|
||||
|
||||
## Keyword Patterns
|
||||
|
||||
### Chinese Visual Keywords
|
||||
```
Tier 1 keywords (high priority):
图片,图像,照片,截图,图表,图示,图形,影像,画面

Tier 2 keywords (medium priority):
流程图,架构图,时序图,ER 图,思维导图,柱状图,饼图,折线图
设计图,原型图,线框图,界面,UI,UX
表格,表单,清单,列表

Tier 3 keywords (low priority):
显示,展示,呈现,可视化,看图,读图
```
|
||||
|
||||
### English Visual Keywords
|
||||
```
|
||||
High Priority:
|
||||
image, photo, picture, screenshot, snapshot, capture, diagram, chart, graph, plot, figure
|
||||
|
||||
Medium Priority:
|
||||
flowchart, architecture, sequence diagram, ER diagram, mind map, bar chart, pie chart, line graph
|
||||
design, mockup, wireframe, interface, UI, UX, layout
|
||||
table, form, list, grid
|
||||
|
||||
Low Priority:
|
||||
show, display, visualize, view, look at, see
|
||||
```
|
||||
|
||||
### Technical Visual Keywords
|
||||
```
|
||||
Schema, model, blueprint, spec, technical drawing
|
||||
Dashboard, widget, panel, visualization
|
||||
Map, heatmap, scatter plot, histogram
|
||||
Infographic, poster, banner, thumbnail
|
||||
```
|
||||
|
||||
## Pattern Matching Rules
|
||||
|
||||
### Rule 1: File Path + Extension
|
||||
```regex
|
||||
[\w\-\.\/]+?\.(png|jpg|jpeg|gif|bmp|webp|svg|ico|tiff|heic)
|
||||
```
|
||||
|
||||
**Action**: Immediate delegation to multimodal-looker
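As a rough illustration, Rule 1 could be applied in Python as below. This is a sketch under assumptions: `find_image_paths` is a hypothetical helper name, and the pattern is compiled with `re.IGNORECASE` to honor the case-insensitive detection rule stated for image extensions.

```python
import re

# Rule 1 pattern from the text, compiled case-insensitively so that
# names such as "ERROR.PNG" also match.
IMAGE_PATH = re.compile(
    r"[\w\-\.\/]+?\.(png|jpg|jpeg|gif|bmp|webp|svg|ico|tiff|heic)",
    re.IGNORECASE,
)


def find_image_paths(text: str) -> list[str]:
    """Return every substring of `text` that looks like an image file path."""
    return [match.group(0) for match in IMAGE_PATH.finditer(text)]
```

Note that in Python 3 `\w` also matches CJK characters, so a path written directly after Chinese text with no space or punctuation in between would be captured together with the preceding characters.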
|
||||
|
||||
### Rule 2: Markdown Image Syntax
|
||||
```regex
|
||||
!\[([^\]]*)\]\(([^\)]+)\)
|
||||
```
|
||||
|
||||
**Action**: Extract alt text and URL, delegate to multimodal-looker
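A minimal sketch of the extraction step for Rule 2, assuming a hypothetical helper named `extract_markdown_images`. The two capture groups of the pattern above correspond to the alt text and the link target.

```python
import re

# Rule 2 pattern from the text: group 1 is the alt text, group 2 the target.
MARKDOWN_IMAGE = re.compile(r"!\[([^\]]*)\]\(([^\)]+)\)")


def extract_markdown_images(text: str) -> list[tuple[str, str]]:
    """Return (alt_text, target) pairs for each Markdown image in `text`."""
    return [(m.group(1), m.group(2)) for m in MARKDOWN_IMAGE.finditer(text)]
```

The target is returned as written, so an optional Markdown title (for example `path.png "caption"`) would still be attached and may need to be split off before delegating.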
|
||||
|
||||
### Rule 3: Base64 Image Data
|
||||
```regex
|
||||
data:image\/(png|jpeg|gif|webp);base64,[A-Za-z0-9+/=]+
|
||||
```
|
||||
|
||||
**Action**: Extract base64 data, save to temp file, delegate
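The extract-and-save step for Rule 3 might look like the sketch below. The helper name `save_data_uri_images` is hypothetical, and the pattern adds a second capture group around the payload (an assumption for convenience, not part of the original rule).

```python
import base64
import re
import tempfile
from pathlib import Path

# Rule 3 pattern with an extra capture group around the base64 payload.
DATA_URI = re.compile(
    r"data:image\/(png|jpeg|gif|webp);base64,([A-Za-z0-9+/=]+)"
)


def save_data_uri_images(text: str) -> list[Path]:
    """Decode each base64 image data URI in `text` into a temporary file."""
    paths = []
    for match in DATA_URI.finditer(text):
        extension, payload = match.group(1), match.group(2)
        # validate=True rejects payloads containing non-alphabet characters.
        data = base64.b64decode(payload, validate=True)
        with tempfile.NamedTemporaryFile(
            suffix=f".{extension}", delete=False
        ) as handle:
            handle.write(data)
            paths.append(Path(handle.name))
    return paths
```

The files are created with `delete=False`, so the caller is responsible for removing them once the delegation has finished.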
|
||||
|
||||
### Rule 4: Keyword + File Reference
|
||||
```regex
(图片|图像|diagram|chart|screenshot).*?[\w\-\.\/]+\.(png|jpg|jpeg|gif|bmp|webp)
```
|
||||
|
||||
**Action**: Confirm intent, then delegate
|
||||
|
||||
### Rule 5: Keyword Only (Ambiguous)
|
||||
```regex
(帮我看看这个图|分析这张图片|这个图表显示)
```
|
||||
|
||||
**Action**: Ask for clarification: "请问是哪张图片?" ("Which image do you mean?")
|
||||
|
||||
## Context-Aware Detection
|
||||
|
||||
### Code Development Context
|
||||
When user is working on code:
|
||||
- `architecture.png` → Architecture diagram
|
||||
- `screenshot.png` → Error or UI screenshot
|
||||
- `mockup.jpg` → Design reference
|
||||
|
||||
**Action**: Assume technical visual, delegate with context
|
||||
|
||||
### Data Analysis Context
|
||||
When user mentions data:
|
||||
- `chart`, `graph`, `plot`, `visualization`
|
||||
- `sales_chart.png`, `trend_graph.jpg`
|
||||
|
||||
**Action**: Assume data visualization, request data extraction
|
||||
|
||||
### Design Context
|
||||
When user discusses design:
|
||||
- `mockup`, `wireframe`, `prototype`, `design`
|
||||
- `ui_design.png`, `wireframe.jpg`
|
||||
|
||||
**Action**: Assume design visual, request UI/UX analysis
|
||||
|
||||
## Detection Confidence Levels
|
||||
|
||||
| Level | Confidence | Triggers | Action |
|-------|------------|----------|--------|
| HIGH | 90-100% | Image file + visual keyword | Auto-delegate |
| MEDIUM | 60-89% | Image file OR strong keyword | Confirm then delegate |
| LOW | 30-59% | Weak keyword only | Ask for clarification |
| NONE | 0-29% | No visual signals | Process as text |
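The table can be read as a small decision function. The sketch below is one possible interpretation, not the skill's actual logic: the `Confidence` enum, the action strings, and `classify` are hypothetical, and "visual keyword" in the HIGH row is assumed to mean a strong (high or medium priority) keyword.

```python
from enum import Enum


class Confidence(Enum):
    """Confidence levels; each value names the action from the table."""
    HIGH = "auto_delegate"
    MEDIUM = "confirm_then_delegate"
    LOW = "ask_for_clarification"
    NONE = "process_as_text"


def classify(
    has_image_file: bool, has_strong_keyword: bool, has_weak_keyword: bool
) -> Confidence:
    """Map detection signals to a confidence level following the table."""
    # Assumption: HIGH needs an image file together with a strong keyword.
    if has_image_file and has_strong_keyword:
        return Confidence.HIGH
    if has_image_file or has_strong_keyword:
        return Confidence.MEDIUM
    if has_weak_keyword:
        return Confidence.LOW
    return Confidence.NONE
```

For example, `classify(True, True, False)` yields `Confidence.HIGH`, while a weak keyword with no file reference yields `Confidence.LOW`.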
|
||||
|
||||
## Edge Cases
|
||||
|
||||
### Ambiguous References
|
||||
```
|
||||
"看这个" (without specifying what)
|
||||
"这个文件" (could be text or image)
|
||||
```
|
||||
**Handling**: Ask "请问是哪个文件?是图片吗?" ("Which file do you mean? Is it an image?")
|
||||
|
||||
### Multiple Images
|
||||
```
|
||||
"比较这两张图:img1.png 和 img2.png"
|
||||
```
|
||||
**Handling**: Delegate both, request comparison
|
||||
|
||||
### Image in Code Block
|
||||
````
```
![image](path/to/image.png)
```
````
|
||||
**Handling**: Still detect as visual content (user may be documenting)
|
||||
|
||||
### URL Images
|
||||
```
|
||||
https://example.com/image.png
|
||||
http://cdn.site.com/chart.jpg
|
||||
```
|
||||
**Handling**: Detect as visual, may need download first
|
||||
|
||||
## Implementation Checklist
|
||||
|
||||
- [ ] Scan input for file extensions
|
||||
- [ ] Check for markdown image syntax
|
||||
- [ ] Search for visual keywords
|
||||
- [ ] Evaluate context (code, data, design)
|
||||
- [ ] Assign confidence level
|
||||
- [ ] Execute appropriate action (delegate/confirm/ask)
|
||||
|
||||
## Testing Examples
|
||||
|
||||
### Should Trigger (High Confidence)
|
||||
```
|
||||
分析这个截图:error.png
|
||||
看这张架构图 design/architecture.png
|
||||
 显示什么?
|
||||
帮我看看 data:image/png;base64,...
|
||||
```
|
||||
|
||||
### Should Trigger (Medium Confidence)
|
||||
```
|
||||
这个图片怎么优化?screenshot.png
|
||||
diagram.jpg 有什么改进建议
|
||||
```
|
||||
|
||||
### Should Ask (Low Confidence)
|
||||
```
|
||||
帮我看看这个图 (no file specified)
|
||||
这个设计怎么样?(unclear if visual attached)
|
||||
```
|
||||
|
||||
### Should Not Trigger
|
||||
```
|
||||
帮我写代码
|
||||
这个文本怎么格式化
|
||||
纯文字内容
|
||||
```
|
||||
@@ -0,0 +1,443 @@
|
||||
# Failure Handling and Graceful Degradation
|
||||
|
||||
Comprehensive guide for handling visual processing failures.
|
||||
|
||||
## Failure Scenarios
|
||||
|
||||
### Scenario 1: multimodal-looker Agent Unavailable
|
||||
|
||||
**Symptoms:**
|
||||
- No response from @multimodal-looker
|
||||
- Error: "Agent not found" or "Service unavailable"
|
||||
- Timeout after 60+ seconds
|
||||
|
||||
**Detection:**
|
||||
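```python
# Match case-insensitively: the symptoms above read "Agent not found" / "Service unavailable".
message = error_message.lower()
if agent_response.status == "timeout" or "unavailable" in message or "agent not found" in message:
    trigger_failure_handling("agent_unavailable")
```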
|
||||
|
||||
**Response Template:**
|
||||
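```markdown
Sorry, the visual analysis service is temporarily unavailable.

**Possible causes**:
- The multimodal-looker service is under maintenance
- Network connection problems
- The service is overloaded

**Alternatives**:
1. **Retry later**: wait 5-10 minutes and try again
2. **Text description**: describe the image in text and I can keep helping
3. **Manual analysis**: if you can provide the key information in the image, I can make suggestions based on it

**What would you like me to do?**
- [ ] Retry automatically later
- [ ] Continue with other tasks (skip image analysis for now)
- [ ] You describe it in text and I continue
```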
|
||||
|
||||
**Follow-up Actions:**
|
||||
- Log the failure for monitoring
|
||||
- Offer to retry after delay
|
||||
- Proceed with text-only workflow if possible
|
||||
|
||||
---
|
||||
|
||||
### Scenario 2: Image Format Not Supported
|
||||
|
||||
**Symptoms:**
|
||||
- Error: "Unsupported image format"
|
||||
- Error: "Cannot process file type"
|
||||
- Recognition rate very low
|
||||
|
||||
**Supported Formats:**
|
||||
```
|
||||
✅ Supported: PNG, JPG/JPEG, GIF, BMP, WebP, SVG (rasterized)
|
||||
❌ Unsupported: PSD, AI, EPS, RAW, HEIC (without conversion)
|
||||
⚠️ Limited: PDF (first page only), TIFF (may be slow)
|
||||
```
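The support matrix above can be checked before delegating. This is a minimal sketch: `format_support` is a hypothetical helper, and treating `.tif` the same as `.tiff` is an assumption not stated in the list.

```python
from pathlib import Path

# Extension sets taken from the support matrix above.
SUPPORTED = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".webp", ".svg"}
LIMITED = {".pdf", ".tiff", ".tif"}  # ".tif" is assumed to behave like ".tiff"
UNSUPPORTED = {".psd", ".ai", ".eps", ".raw", ".heic"}


def format_support(path: str) -> str:
    """Return "supported", "limited", "unsupported", or "unknown"."""
    suffix = Path(path).suffix.lower()
    if suffix in SUPPORTED:
        return "supported"
    if suffix in LIMITED:
        return "limited"
    if suffix in UNSUPPORTED:
        return "unsupported"
    return "unknown"
```

Checking the extension up front lets the agent suggest a conversion immediately instead of waiting for an "Unsupported image format" error.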
|
||||
|
||||
**Response Template:**
|
||||
```markdown
This image format (`.xxx`) may not be supported.

**Currently supported formats**:
- ✅ PNG, JPG/JPEG, GIF, BMP, WebP
- ⚠️ SVG, TIFF, PDF (limited support)

**Suggested actions**:
1. **Convert the format**:
   - Use an online tool to convert it to PNG or JPG
   - Recommended tool: [tool name/link]

2. **Use a screenshot instead**:
   - If it is a design file, take a screenshot and save it as PNG

3. **Describe it in text**:
   - Describe the image and I will continue to help

**Need help converting?** I can provide step-by-step conversion instructions.
```
|
||||
|
||||
**Recovery Options:**
|
||||
- Provide format conversion tools
|
||||
- Suggest screenshot as alternative
|
||||
- Accept text description
|
||||
|
||||
---
|
||||
|
||||
### Scenario 3: Image File Not Found
|
||||
|
||||
**Symptoms:**
|
||||
- Error: "File not found"
|
||||
- Error: "Cannot access file"
|
||||
- Empty response
|
||||
|
||||
**Detection:**
|
||||
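```python
# Match case-insensitively: the symptoms above read "File not found" / "Cannot access file".
message = error.lower()
if "file not found" in message or "cannot access" in message:
    trigger_failure_handling("file_not_found")
```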
|
||||
|
||||
**Response Template:**
|
||||
````markdown
Image file not found: `[file path]`

**Possible causes**:
1. ❌ The file path is incorrect
2. ❌ The file was deleted or moved
3. ❌ No access permission
4. ❌ The file is in use by another program

**Please check**:
- [ ] The file path is complete and correct
- [ ] The file actually exists at that location
- [ ] You have permission to read the file
- [ ] The file is not locked by another program

**Solutions**:
1. **Provide the path again**: confirm the correct file path
2. **Upload again**: if possible, upload the image again
3. **Use an absolute path**: try the full absolute path

**Example**:
```
Wrong: ./image.png
Right: D:/projects/my-app/assets/image.png
```
````
|
||||
|
||||
**Debugging Steps:**
|
||||
1. Verify file path syntax
|
||||
2. Check if file exists
|
||||
3. Verify permissions
|
||||
4. Try absolute path
|
||||
|
||||
---
|
||||
|
||||
### Scenario 4: Analysis Timeout
|
||||
|
||||
**Symptoms:**
|
||||
- No response after 60 seconds
|
||||
- Error: "Request timeout"
|
||||
- Partial response then silence
|
||||
|
||||
**Timeout Thresholds:**
|
||||
```
|
||||
Normal image (< 5MB): 10-30 seconds
|
||||
Large image (5-20MB): 30-60 seconds
|
||||
Very large (> 20MB): May timeout
|
||||
Complex diagram: 20-40 seconds
|
||||
```
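The size-based thresholds above could inform a pre-check before delegation. The following is only a sketch: both function names are hypothetical, the boundaries at exactly 5MB and 20MB are assigned to the "large" bucket by assumption, and the separate "complex diagram" row is not modeled.

```python
from typing import Optional


def expected_wait_seconds(size_mb: float) -> Optional[tuple[int, int]]:
    """Return the (min, max) expected analysis time, or None if a timeout is likely."""
    if size_mb < 5:
        return (10, 30)   # normal image
    if size_mb <= 20:
        return (30, 60)   # large image
    return None           # very large: may time out


def should_compress_first(size_mb: float) -> bool:
    """Suggest compression when the image is large enough to risk a timeout."""
    return expected_wait_seconds(size_mb) is None
```

A caller might use `should_compress_first` to recommend compression up front rather than after a 60-second wait.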
|
||||
|
||||
**Response Template:**
|
||||
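```markdown
Image analysis timed out (waited more than 60 seconds).

**Possible causes**:
1. 🐌 The image file is too large (over 20MB)
2. 🌐 Network latency or instability
3. 🔄 The service is busy and the processing queue is long
4. 🖼️ The image content is too complex

**Suggested actions**:
1. **Compress the image**:
   - Target size: < 5MB
   - Recommended tools: TinyPNG, Squoosh

2. **Lower the resolution**:
   - Suggested size: 1920x1080 or lower
   - It only needs to stay legible

3. **Simplify the content**:
   - If it is a long image, send it in segments
   - If it has multiple pages, send one page at a time

4. **Retry**:
   - Wait a few minutes and try again
   - The network may be temporarily unstable

**Want me to help compress the image?** Or describe the key information first and I will continue.
```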
|
||||
|
||||
**Optimization Tips:**
|
||||
- Recommend image compression
|
||||
- Suggest resolution reduction
|
||||
- Offer to split large images
|
||||
|
||||
---
|
||||
|
||||
### Scenario 5: Poor Image Quality
|
||||
|
||||
**Symptoms:**
|
||||
- Low confidence in analysis
|
||||
- "Cannot read text clearly"
|
||||
- Missing details in response
|
||||
- Blurry or distorted recognition
|
||||
|
||||
**Quality Issues:**
|
||||
```
|
||||
❌ Blurry: Out of focus, motion blur
|
||||
❌ Dark: Underexposed, low contrast
|
||||
❌ Small text: Resolution too low
|
||||
❌ Glare: Reflection, overexposed
|
||||
❌ Cropped: Important content cut off
|
||||
```
|
||||
|
||||
**Response Template:**
|
||||
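```markdown
Image analysis finished, but the image quality is low, which may affect accuracy.

**Issues detected**:
- ⚠️ The image is blurry, so text is hard to read
- ⚠️ Poor lighting, some details are unclear
- ⚠️ Low resolution, small text is illegible

**Current analysis result** (for reference only):
[analysis result, annotated with confidence]

**Suggested improvements**:
1. **Retake the photo**:
   - Make sure there is enough light
   - Hold the camera steady
   - Focus sharply

2. **Increase the resolution**:
   - Scan at a higher DPI
   - Take a screenshot instead of a photo

3. **Adjust the angle**:
   - Avoid glare and shadows
   - Shoot straight on, not at an angle

**Alternatively**:
- Tell me the key information in the image directly
- Or point out the specific part you care about and I will do my best to analyze it
```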
|
||||
|
||||
**Workarounds:**
|
||||
- Provide best-effort analysis with confidence disclaimer
|
||||
- Ask user to clarify specific areas
|
||||
- Accept text input for critical information
|
||||
|
||||
---
|
||||
|
||||
### Scenario 6: Partial Analysis (Incomplete Results)
|
||||
|
||||
**Symptoms:**
|
||||
- Response cuts off mid-sentence
|
||||
- Only partial image analyzed
|
||||
- Missing requested information
|
||||
|
||||
**Response Template:**
|
||||
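```markdown
The image analysis appears incomplete; only partial results were obtained.

**Information obtained**:
- [list what was analyzed]

**Missing information**:
- [list what was not analyzed]

**Suggested actions**:
1. **Request a follow-up analysis**:
   @multimodal-looker Please also analyze [specific part] of the image

2. **Process in segments**:
   If the image is long or complex, send it in segments for analysis

3. **Fill in manually**:
   You can supply the missing information and I will merge it into the analysis

**Should I request the follow-up analysis?** Or would you rather provide the extra information?
```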
|
||||
|
||||
**Recovery:**
|
||||
- Request targeted re-analysis
|
||||
- Ask user to provide missing context
|
||||
- Combine partial results with text input
|
||||
|
||||
---
|
||||
|
||||
## Escalation Protocol
|
||||
|
||||
### Level 1: Automatic Retry
|
||||
```
|
||||
First failure → Wait 10 seconds → Retry once
|
||||
```
|
||||
|
||||
### Level 2: Alternative Approach
|
||||
```
|
||||
Second failure → Suggest alternative (compression, format conversion, text description)
|
||||
```
|
||||
|
||||
### Level 3: Human Intervention
|
||||
```
|
||||
Third failure → Acknowledge limitation → Request manual input
|
||||
```
|
||||
|
||||
### Level 4: Task Redesign
|
||||
```
|
||||
Persistent failure → Propose alternative workflow → Bypass visual processing
|
||||
```
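The four levels can be sketched as a failure counter plus a single automatic retry. Everything named below (`ACTIONS`, `escalation_level`, `analyze_with_retry`) is hypothetical, and treating "persistent failure" as four or more failures is an assumption.

```python
import time
from typing import Callable, Optional

RETRY_DELAY_SECONDS = 10  # Level 1: wait 10 seconds before the single retry

# Action per escalation level; level 0 means nothing has failed.
ACTIONS = {
    0: "proceed",
    1: "retry_once",
    2: "suggest_alternative",
    3: "request_manual_input",
    4: "redesign_task",
}


def escalation_level(failure_count: int) -> int:
    """Clamp the failure count to levels 0-4 (4+ failures = persistent)."""
    return min(max(failure_count, 0), 4)


def analyze_with_retry(
    analyze: Callable[[], str],
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[Optional[str], str]:
    """Run `analyze`; after the first failure wait and retry exactly once."""
    failures = 0
    for attempt in range(2):
        try:
            return analyze(), ACTIONS[0]
        except Exception:  # any failure counts toward escalation
            failures += 1
            if attempt == 0:
                sleep(RETRY_DELAY_SECONDS)
    # Both attempts failed: report the action for the current level.
    return None, ACTIONS[escalation_level(failures)]
```

Passing `sleep` as a parameter keeps the function testable without real delays; after two failed attempts it returns the Level 2 action so the caller can suggest compression, conversion, or a text description.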
|
||||
|
||||
---
|
||||
|
||||
## Recovery Strategies
|
||||
|
||||
### Strategy 1: Text-Based Alternative
|
||||
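```markdown
Since image analysis is unavailable, we can continue in text:

**Please describe**:
1. What is the main content of the image?
2. Which parts of the image do you care about?
3. What task do you want to complete based on the image?

**Based on your description, I will**:
- Give targeted suggestions
- Continue with the follow-up work
- Try image analysis again if needed
```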
|
||||
|
||||
### Strategy 2: Incremental Processing
|
||||
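```markdown
We can proceed step by step:

**Step 1**: You describe the image in outline
**Step 2**: I give initial suggestions based on the outline
**Step 3**: For the key points, we try image analysis again
**Step 4**: We combine the results and finish the overall task

This way we can keep moving even if one step fails.
```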
|
||||
|
||||
### Strategy 3: External Tool Assistance
|
||||
```markdown
If the built-in tool is unavailable, you can try external tools:

**Recommended tools**:
- OCR: 白描, ABBYY FineReader
- Chart analysis: ChartExpo, Plotly
- Architecture diagrams: Draw.io, Lucidchart

**Workflow**:
1. Analyze with the external tool
2. Export the analysis result (text/data)
3. I continue from that result
```
|
||||
|
||||
---
|
||||
|
||||
## Monitoring and Logging
|
||||
|
||||
### Failure Metrics to Track
|
||||
```
|
||||
- Total visual requests
|
||||
- Success rate
|
||||
- Average response time
|
||||
- Failure reasons distribution
|
||||
- Retry success rate
|
||||
```
|
||||
|
||||
### Log Format
|
||||
```json
{
  "timestamp": "2026-02-26T10:30:00Z",
  "task_id": "vision-12345",
  "failure_type": "timeout",
  "image_size": "15MB",
  "retry_count": 2,
  "resolution": "text_description",
  "user_satisfaction": "pending"
}
```
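A record in this shape could be produced with the standard library alone. The sketch below assumes a hypothetical `build_failure_log` helper; taking the size in bytes and rounding it to whole megabytes is an assumption made to reproduce the `"15MB"` style shown above.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def build_failure_log(
    task_id: str,
    failure_type: str,
    image_size_bytes: int,
    retry_count: int,
    resolution: str,
    user_satisfaction: str = "pending",
    now: Optional[datetime] = None,
) -> str:
    """Serialize one failure record using the field names of the log format."""
    # Normalize to UTC; a naive `now` is interpreted as local time.
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    record = {
        "timestamp": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "task_id": task_id,
        "failure_type": failure_type,
        "image_size": f"{image_size_bytes / 1_000_000:.0f}MB",
        "retry_count": retry_count,
        "resolution": resolution,
        "user_satisfaction": user_satisfaction,
    }
    return json.dumps(record, ensure_ascii=False)
```

The optional `now` argument exists so that tests and replays can fix the timestamp instead of depending on the wall clock.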
|
||||
|
||||
---
|
||||
|
||||
## User Communication Best Practices
|
||||
|
||||
### DO:
1. ✅ Acknowledge the failure promptly
2. ✅ Explain the reason clearly (without technical jargon)
3. ✅ Provide concrete alternatives
4. ✅ Offer specific next steps
5. ✅ Maintain a positive, helpful tone

### DON'T:
1. ❌ Blame the user ("Your image is the problem")
2. ❌ Make excuses without solutions
3. ❌ Give up without alternatives
4. ❌ Use vague language ("maybe", "perhaps")
5. ❌ Repeat the same failed approach

### Example Phrases

**Good**:
- "Sorry, the image analysis ran into a problem. We could try this alternative..."
- "I could not analyze the image, but if you can describe [key information], I can keep helping..."
- "This format is not supported yet; I suggest converting it to PNG. Would you like me to suggest a conversion tool?"

**Bad**:
- "This image won't work" (too vague)
- "I can't handle this" (gives up)
- "Use a different image" (shifts the blame)
|
||||
|
||||
---
|
||||
|
||||
## Testing Checklist
|
||||
|
||||
Test failure handling for:
|
||||
|
||||
- [ ] Agent timeout (simulate delay)
|
||||
- [ ] Unsupported format (send .psd file)
|
||||
- [ ] File not found (invalid path)
|
||||
- [ ] Large file timeout (> 20MB)
|
||||
- [ ] Blurry image (low quality)
|
||||
- [ ] Partial response (cut off mid-analysis)
|
||||
- [ ] Network error (disconnect)
|
||||
- [ ] Service error (mock 500 response)
|
||||
|
||||
For each test:
|
||||
- [ ] Appropriate error message shown
|
||||
- [ ] Alternatives provided
|
||||
- [ ] User can continue task
|
||||
- [ ] No crash or hang
|
||||
- [ ] Logs recorded
|
||||
|
||||
---
|
||||
|
||||
## Continuous Improvement
|
||||
|
||||
After each failure:
|
||||
1. Record failure type and context
|
||||
2. Analyze root cause
|
||||
3. Update detection/prevention logic
|
||||
4. Improve response templates
|
||||
5. Test with similar scenarios
|
||||
|
||||
Share learnings with:
|
||||
- Skill maintainers
|
||||
- OMO development team
|
||||
- multimodal-looker operators
|
||||