Initial commit: skills library

- 70 skills with code and documentation
- Add .gitignore (ignore __pycache__, output/, temp/, venv/)
- Clean up test intermediates and caches
2026-04-26 19:27:40 +08:00
commit 04db423416
861 changed files with 210414 additions and 0 deletions
@@ -0,0 +1,103 @@
+# Agent Vision Awareness - Usage Guide
+
+## Quick Setup
+
+1. **API key configuration**:
+   - The Volcengine Ark (火山方舟) API key is already set in the OpenCode configuration
+   - Or set the `VOLCENGINE_API_KEY` environment variable
+
+2. **Add skill to OMO configuration**:
+   ```yaml
+   skills:
+     - agent-vision-awareness
+   ```
+
+3. **Use naturally** - just mention images in your requests:
+   - "Analyze this screenshot error.png"
+   - "Describe the contents of temp/image.jpg"
+   - "Generate a deployment plan from the architecture diagram design/architecture.png"
+
+## Integration Examples
+
+### Basic Usage (Automatic)
+```python
+# No special code needed - automatic detection and processing
+user_input = "Help me analyze this error log screenshot: ./logs/error.png"
+# The skill will automatically detect and process the image
+```
+
+### Manual Integration (When Needed)
+```python
+import os
+
+from .scripts.integrate_vision import process_user_input
+
+# Process user input with visual content
+result = process_user_input(
+    user_input="Analyze the chart sales_chart.png",
+    user_request="Extract the sales data trends",
+    config={
+        "api_key": os.environ.get("VOLCENGINE_API_KEY"),
+        "base_url": "https://ark.cn-beijing.volces.com/api/coding/v3",
+        "model": "doubao-seed-code"
+    }
+)
+
+if result["confidence"] != "none":
+    analysis = result["analysis_results"][0]["result"]
+    # Use analysis in your response
+    response = f"Based on the image analysis: {analysis}"
+else:
+    # Handle as normal text-only request
+    response = "Handling text-only request..."
+```
+
+## Configuration
+
+Copy `config/settings.json.example` to `config/settings.json` and update with your API key:
+
+```json
+{
+  "vision_api": {
+    "key": "YOUR_VOLCENGINE_API_KEY",
+    "base_url": "https://ark.cn-beijing.volces.com/api/coding/v3",
+    "model": "doubao-seed-code"
+  }
+}
+```
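
The settings file and the `VOLCENGINE_API_KEY` environment variable can be combined into the config dict that the manual integration example passes in. The following is a minimal sketch, not the skill's own loader: `load_vision_config` is a hypothetical helper, and letting the environment variable take precedence over the file is an assumption.

```python
import json
import os
from pathlib import Path

DEFAULT_BASE_URL = "https://ark.cn-beijing.volces.com/api/coding/v3"
DEFAULT_MODEL = "doubao-seed-code"


def load_vision_config(path="config/settings.json", env=None):
    """Hypothetical helper: build the config dict from file and environment."""
    env = os.environ if env is None else env
    settings = {}
    file = Path(path)
    if file.is_file():
        data = json.loads(file.read_text(encoding="utf-8"))
        settings = data.get("vision_api", {})
    # Assumption: the environment variable wins over the value in the file.
    api_key = env.get("VOLCENGINE_API_KEY") or settings.get("key")
    if not api_key:
        raise ValueError(
            "No API key found: set VOLCENGINE_API_KEY or fill in " + str(path)
        )
    return {
        "api_key": api_key,
        "base_url": settings.get("base_url", DEFAULT_BASE_URL),
        "model": settings.get("model", DEFAULT_MODEL),
    }
```

The returned dict uses the same keys (`api_key`, `base_url`, `model`) as the `config` argument in the manual integration example, so it could be passed there directly.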
+
+## Supported Features
+
+✅ **Automatic Detection**: File extensions, keywords, URLs, markdown syntax  
+✅ **Multiple Analysis Modes**: OCR, chart analysis, product analysis, scene description  
+✅ **Error Handling**: Graceful degradation with clear error messages  
+✅ **File Support**: Local files, relative paths, absolute paths, URLs  
+✅ **Format Support**: PNG, JPG, JPEG, WebP, GIF, BMP  
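
To illustrate what detection by file extension, URL, and markdown syntax might look like, here is a rough sketch. It is not the skill's actual detector: `find_image_references` and both regular expressions are hypothetical, and keyword-based detection is left out.

```python
import re

# Formats listed under "Format Support".
IMAGE_EXTENSIONS = ("png", "jpg", "jpeg", "webp", "gif", "bmp")
_EXT = "|".join(IMAGE_EXTENSIONS)

# Markdown image syntax: ![alt](target)
MARKDOWN_IMAGE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")
# A URL or file path that ends in one of the supported extensions.
PATH_OR_URL = re.compile(
    rf"(?:https?://)?[\w./~\\:-]+\.(?:{_EXT})\b", re.IGNORECASE
)


def find_image_references(text):
    """Hypothetical helper: return image paths/URLs mentioned in text, in order."""
    found = []
    for match in MARKDOWN_IMAGE.finditer(text):
        found.append(match.group(1))
    for match in PATH_OR_URL.finditer(text):
        if match.group(0) not in found:  # skip targets already seen
            found.append(match.group(0))
    return found


print(find_image_references("Analyze this screenshot ./logs/error.png"))
```

Because `\w` matches any Unicode word character in Python 3, a path written directly after non-ASCII text with no separating space would be captured together with that text; a real detector would need a stricter path pattern.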
+
+## Limitations
+
+⚠️ **No Custom Agent Delegation**: The `@multimodal-looker` approach doesn't work with current OMO  
+⚠️ **API Key Required**: Requires a valid Volcengine Ark (火山方舟) API key  
+⚠️ **File Size**: Images should be < 4MB for optimal performance  
+⚠️ **Network**: Requires internet access to `https://ark.cn-beijing.volces.com`
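
The file-related limits above can be checked locally before any API call is made. This is a sketch under stated assumptions: `check_image_file` is a hypothetical helper, and "4MB" is interpreted as 4 MiB.

```python
from pathlib import Path

SUPPORTED_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".gif", ".bmp"}
MAX_BYTES = 4 * 1024 * 1024  # assumption: "4MB" is read as 4 MiB


def check_image_file(path):
    """Hypothetical helper: return a list of problems; empty means usable."""
    file = Path(path)
    if not file.is_file():
        return [f"{path}: file not found or not a regular file"]
    problems = []
    if file.suffix.lower() not in SUPPORTED_SUFFIXES:
        problems.append(f"{path}: unsupported format '{file.suffix}'")
    size = file.stat().st_size
    if size >= MAX_BYTES:
        problems.append(f"{path}: {size} bytes, expected under {MAX_BYTES}")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem with a file in one message.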
+
+## Troubleshooting
+
+**Vision processing not working?**
+- Check `VOLCENGINE_API_KEY` configuration
+- Verify image file exists and is accessible
+- Test with a simple request: "Describe image.png"
+
+**Detection not triggering?**
+- Ensure the input contains detectable patterns (file extensions like `.png`, keywords like "图片", i.e. "image")
+- Use explicit file paths instead of vague references
+
+**API errors?**
+- Check network connectivity to the Volcengine Ark (火山方舟) API
+- Verify API key is valid
+- Check rate limits on your account
+
+## Related Skills
+
+- `image-service`: For image generation and editing
+- `file-reader`: For reading document contents (complementary)
+
+This skill provides **fully automatic visual content processing** without requiring manual intervention or custom agent commands.