# Agent Vision Awareness - Usage Guide

## Quick Setup

1. **Configure the API key**:
   - The 火山方舟 (Volcengine Ark) API key is already set in the OpenCode configuration
   - Or set the `VOLCENGINE_API_KEY` environment variable

2. **Add skill to OMO configuration**:
   ```yaml
   skills:
     - agent-vision-awareness
   ```

3. **Use naturally** - just mention images in your requests:
   - "Analyze this screenshot error.png"
   - "Describe the contents of temp/image.jpg"
   - "Generate a deployment plan from the architecture diagram design/architecture.png"

## Integration Examples

### Basic Usage (Automatic)

```python
# No special code needed - automatic detection and processing
user_input = "Help me analyze this error log screenshot: ./logs/error.png"
# The skill will automatically detect and process the image
```

### Manual Integration (When Needed)

```python
import os

from .scripts.integrate_vision import process_user_input

# Process user input with visual content
result = process_user_input(
    user_input="Analyze the chart sales_chart.png",
    user_request="Extract the sales data trend",
    config={
        "api_key": os.environ.get("VOLCENGINE_API_KEY"),
        "base_url": "https://ark.cn-beijing.volces.com/api/coding/v3",
        "model": "doubao-seed-code"
    }
)

if result["confidence"] != "none":
    # Take the analysis of the first detected image
    analysis = result["analysis_results"][0]["result"]
    # Use analysis in your response
    response = f"Based on the image analysis: {analysis}"
else:
    # Handle as normal text-only request
    response = "Processing text request..."
```

## Configuration

Copy `config/settings.json.example` to `config/settings.json` and replace the placeholder with your API key. Do not commit a real key to version control.

```json
{
  "vision_api": {
    "key": "YOUR_VOLCENGINE_API_KEY",
    "base_url": "https://ark.cn-beijing.volces.com/api/coding/v3",
    "model": "doubao-seed-code"
  }
}
```

## Supported Features

- ✅ **Automatic Detection**: File extensions, keywords, URLs, markdown syntax
- ✅ **Multiple Analysis Modes**: OCR, chart analysis, product analysis, scene description
- ✅ **Error Handling**: Graceful degradation with clear error messages
- ✅ **File Support**: Local files, relative paths, absolute paths, URLs
- ✅ **Format Support**: PNG, JPG, JPEG, WebP, GIF, BMP

## Limitations

- ⚠️ **No Custom Agent Delegation**: The `@multimodal-looker` approach doesn't work with current OMO
- ⚠️ **API Key Required**: A valid 火山方舟 (Volcengine Ark) API key is required
- ⚠️ **File Size**: Keep images under 4MB for best performance
- ⚠️ **Network**: Requires internet access to `https://ark.cn-beijing.volces.com`

## Troubleshooting

**Vision processing not working?**
- Check the `VOLCENGINE_API_KEY` configuration
- Verify that the image file exists and is readable
- Test with a simple request: "Describe image.png"

**Detection not triggering?**
- Ensure the input contains detectable patterns (file extensions like `.png`, keywords like "图片", meaning "image")
- Use explicit file paths instead of vague references

**API errors?**
- Check network connectivity to the 火山方舟 (Volcengine Ark) API
- Verify that the API key is valid
- Check the rate limits on your account

## Related Skills

- `image-service`: For image generation and editing
- `file-reader`: For reading document contents (complementary)

This skill provides **fully automatic visual content processing** without requiring manual intervention or custom agent commands.