Agent Vision Awareness - Usage Guide
Quick Setup
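- API key configuration:
  - The 火山方舟 (Volcengine Ark) API key is already set in the OpenCode configuration
  - Or set the VOLCENGINE_API_KEY environment variable
- Add the skill to the OMO configuration:
  skills:
    - agent-vision-awareness
- Use it naturally by mentioning images in your requests:
  - "Analyze this screenshot error.png"
  - "Describe the contents of temp/image.jpg"
  - "Generate a deployment plan from the architecture diagram design/architecture.png"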
Integration Examples
Basic Usage (Automatic)
# No special code needed - automatic detection and processing
user_input = "Help me analyze this error log screenshot: ./logs/error.png"
# The skill will automatically detect and process the image
Manual Integration (When Needed)
import os

from .scripts.integrate_vision import process_user_input

# Process user input that references visual content
result = process_user_input(
    user_input="Analyze the chart sales_chart.png",
    user_request="Extract sales data trends",
    config={
        "api_key": os.environ.get("VOLCENGINE_API_KEY"),
        "base_url": "https://ark.cn-beijing.volces.com/api/coding/v3",
        "model": "doubao-seed-code",
    },
)
if result["confidence"] != "none":
    analysis = result["analysis_results"][0]["result"]
    # Use the analysis in your response
    response = f"Based on the image analysis: {analysis}"
else:
    # Handle as a normal text-only request
    response = "Processing text request..."
Configuration
Copy config/settings.json.example to config/settings.json and update with your API key:
{
  "vision_api": {
    "key": "YOUR_VOLCENGINE_API_KEY",
    "base_url": "https://ark.cn-beijing.volces.com/api/coding/v3",
    "model": "doubao-seed-code"
  }
}
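As a minimal sketch of how these settings might be read, the helper below loads the vision_api block and lets the VOLCENGINE_API_KEY environment variable take precedence over the file. The function name load_vision_config and the precedence rule are assumptions for illustration; the skill's own loader may behave differently.

```python
import json
import os
from pathlib import Path


def load_vision_config(path="config/settings.json", env=None):
    """Return the vision_api settings, preferring the environment for the key."""
    env = os.environ if env is None else env
    settings = {}
    config_file = Path(path)
    if config_file.is_file():
        with config_file.open(encoding="utf-8") as fh:
            settings = json.load(fh).get("vision_api", {})
    # Assumption: when the environment variable is set, it overrides the file.
    env_key = env.get("VOLCENGINE_API_KEY")
    if env_key:
        settings = {**settings, "key": env_key}
    if not settings.get("key"):
        raise ValueError("No API key in settings.json or VOLCENGINE_API_KEY")
    return settings
```

Keeping the key in the environment rather than in settings.json avoids committing a real credential to the repository.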
Supported Features
✅ Automatic Detection: File extensions, keywords, URLs, markdown syntax
✅ Multiple Analysis Modes: OCR, chart analysis, product analysis, scene description
✅ Error Handling: Graceful degradation with clear error messages
✅ File Support: Local files, relative paths, absolute paths, URLs
✅ Format Support: PNG, JPG, JPEG, WebP, GIF, BMP
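To illustrate what automatic detection could look like, here is a rough sketch that scans a request for markdown image syntax, image URLs and paths with the supported extensions, and keywords. The function detect_images, both regular expressions, and the keyword list are hypothetical; the skill's actual detector is not shown in this guide and may use different rules.

```python
import re

IMAGE_EXTENSIONS = ("png", "jpg", "jpeg", "webp", "gif", "bmp")
# Hypothetical keyword list; the skill's real list may differ.
IMAGE_KEYWORDS = ("图片", "截图", "screenshot", "image")

_EXT = "|".join(IMAGE_EXTENSIONS)
# Markdown image syntax: ![alt](target), target captured in group 1.
MARKDOWN_IMAGE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")
# A URL or an ASCII file path ending in a supported extension.
IMAGE_REFERENCE = re.compile(
    rf"(?:https?://)?[A-Za-z0-9_./\\~%-]+\.(?:{_EXT})\b", re.IGNORECASE
)


def detect_images(text):
    """Return image references found in text and whether a keyword matched."""
    refs = []
    for pattern in (MARKDOWN_IMAGE, IMAGE_REFERENCE):
        for match in pattern.finditer(text):
            # Only the markdown pattern has a capturing group.
            ref = match.group(1) if pattern.groups else match.group(0)
            if ref not in refs:
                refs.append(ref)
    lowered = text.lower()
    has_keyword = any(word in lowered for word in IMAGE_KEYWORDS)
    return {"references": refs, "keyword": has_keyword}
```

A request with a keyword but no reference (for example, one that only says "图片") would still need a follow-up question to locate the file, which is why explicit paths are the more reliable trigger.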
Limitations
⚠️ No Custom Agent Delegation: The @multimodal-looker approach doesn't work with the current OMO
⚠️ API Key Required: A valid 火山方舟 API key is required
⚠️ File Size: Images should be under 4 MB for optimal performance
⚠️ Network: Requires internet access to https://ark.cn-beijing.volces.com
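The size and format constraints can be checked locally before any API call is made. The sketch below is one way to do that; check_image is a hypothetical helper, and treating the 4 MB guideline as 4 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path

SUPPORTED_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".gif", ".bmp"}
# Assumption: the 4 MB guideline is read as 4 * 1024 * 1024 bytes.
MAX_BYTES = 4 * 1024 * 1024


def check_image(path):
    """Return a list of problems; an empty list means the file looks usable."""
    file = Path(path)
    if not file.is_file():
        return [f"{path}: file not found"]
    problems = []
    if file.suffix.lower() not in SUPPORTED_SUFFIXES:
        problems.append(f"{path}: unsupported format {file.suffix or '(none)'}")
    size = file.stat().st_size
    if size >= MAX_BYTES:
        problems.append(f"{path}: {size} bytes, over the 4 MB guideline")
    return problems
```

Returning a list instead of raising lets the caller report every problem at once and fall back to text-only handling.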
Troubleshooting
Vision processing not working?
- Check the VOLCENGINE_API_KEY configuration
- Verify that the image file exists and is accessible
- Test with a simple request: "Describe image.png"
Detection not triggering?
- Ensure the input contains detectable patterns (file extensions like .png, keywords like "图片", meaning "image")
- Use explicit file paths instead of vague references
API errors?
- Check network connectivity to 火山方舟 API
- Verify API key is valid
- Check rate limits on your account
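When the failures are transient (a brief network drop or a rate-limit response), retrying with exponential backoff is a common remedy. The wrapper below is a generic sketch and not part of the skill: call_with_retry is a hypothetical helper, and it retries on any exception for brevity, where real code should narrow this to the network or rate-limit errors raised by the client in use.

```python
import time


def call_with_retry(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying failed calls with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, then 4x, ... before the next try.
            sleep(base_delay * 2 ** attempt)
```

An invalid API key will fail on every attempt, so retries only help with the connectivity and rate-limit cases.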
Related Skills
- image-service: For image generation and editing
- file-reader: For reading document contents (complementary)
This skill provides fully automatic visual content processing without requiring manual intervention or custom agent commands.