Initial commit: skills library
- 70 skills with code and documentation
- Add .gitignore (ignore __pycache__, output/, temp/, venv/)
- Clean up test intermediates and caches
@@ -0,0 +1,103 @@
# Agent Vision Awareness - Usage Guide
## Quick Setup

1. **Configure the API key**:
   - The Volcengine Ark (火山方舟) API key is already set in the OpenCode configuration
   - Or set the `VOLCENGINE_API_KEY` environment variable

2. **Add skill to OMO configuration**:
   ```yaml
   skills:
     - agent-vision-awareness
   ```

3. **Use naturally** - just mention images in your requests:
   - "Analyze this screenshot error.png"
   - "Describe the contents of temp/image.jpg"
   - "Generate a deployment plan from the architecture diagram design/architecture.png"

## Integration Examples

### Basic Usage (Automatic)

```python
# No special code needed - automatic detection and processing
user_input = "Help me analyze this error log screenshot: ./logs/error.png"
# The skill will automatically detect and process the image
```

### Manual Integration (When Needed)

```python
import os

# Relative import: this only resolves when run from inside the skill's package
from .scripts.integrate_vision import process_user_input

# Process user input with visual content
result = process_user_input(
    user_input="Analyze the chart sales_chart.png",
    user_request="Extract the sales data trends",
    config={
        "api_key": os.environ.get("VOLCENGINE_API_KEY"),
        "base_url": "https://ark.cn-beijing.volces.com/api/coding/v3",
        "model": "doubao-seed-code",
    },
)

if result["confidence"] != "none":
    analysis = result["analysis_results"][0]["result"]
    # Use analysis in your response
    response = f"Based on the image analysis: {analysis}"
else:
    # Handle as normal text-only request
    response = "Handling text request..."
```

## Configuration

Copy `config/settings.json.example` to `config/settings.json` and replace the placeholder with your own API key. Do not commit a real key to the repository:

```json
{
  "vision_api": {
    "key": "YOUR_API_KEY",
    "base_url": "https://ark.cn-beijing.volces.com/api/coding/v3",
    "model": "doubao-seed-code"
  }
}
```

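The guide mentions two places a key can come from: `config/settings.json` and the `VOLCENGINE_API_KEY` environment variable. The following is a minimal sketch of how the two could be combined. The function name `load_vision_config` is hypothetical, and the precedence (environment over file) is an assumption; the skill's own loader may behave differently.

```python
import json
import os
from pathlib import Path


def load_vision_config(path="config/settings.json", env=None):
    """Return the `vision_api` settings, letting the environment override the key.

    Hypothetical helper for illustration; not part of the skill's code.
    """
    env = os.environ if env is None else env
    settings = {}
    config_file = Path(path)
    if config_file.is_file():
        with config_file.open(encoding="utf-8") as fh:
            settings = json.load(fh).get("vision_api", {})
    # Assumption: VOLCENGINE_API_KEY takes precedence over the file value.
    env_key = env.get("VOLCENGINE_API_KEY")
    if env_key:
        settings = {**settings, "key": env_key}
    if not settings.get("key"):
        raise RuntimeError(
            "No API key found: set VOLCENGINE_API_KEY or fill in config/settings.json"
        )
    return settings
```

Failing early with a clear message when no key is available keeps a missing credential from surfacing later as an opaque API error.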
## Supported Features

- ✅ **Automatic Detection**: File extensions, keywords, URLs, markdown syntax
- ✅ **Multiple Analysis Modes**: OCR, chart analysis, product analysis, scene description
- ✅ **Error Handling**: Graceful degradation with clear error messages
- ✅ **File Support**: Local files, relative paths, absolute paths, URLs
- ✅ **Format Support**: PNG, JPG, JPEG, WebP, GIF, BMP

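To make the detection feature concrete, here is a rough sketch of how image references could be pulled out of a request. It covers only file extensions, URLs, and markdown image syntax, not keyword matching. The names `detect_image_refs`, `IMAGE_REF`, and `MARKDOWN_IMAGE` and the regular expressions are illustrative assumptions, not the skill's actual detection logic.

```python
import re

# Extensions taken from the Format Support list above
IMAGE_EXTS = ("png", "jpg", "jpeg", "webp", "gif", "bmp")

# Markdown image syntax: ![alt](target)
MARKDOWN_IMAGE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")

# A POSIX-style path or http(s) URL ending in a supported extension.
# re.ASCII keeps adjacent non-ASCII text (e.g. Chinese) out of the match.
IMAGE_REF = re.compile(
    rf"(?:https?://)?[\w./~-]+\.(?:{'|'.join(IMAGE_EXTS)})\b",
    re.IGNORECASE | re.ASCII,
)


def detect_image_refs(text):
    """Return image references in order of appearance, without duplicates."""
    hits = [(m.start(1), m.group(1)) for m in MARKDOWN_IMAGE.finditer(text)]
    hits += [(m.start(), m.group(0)) for m in IMAGE_REF.finditer(text)]
    refs = []
    for _, ref in sorted(hits):
        if ref not in refs:
            refs.append(ref)
    return refs


print(detect_image_refs("分析这个截图 error.png"))
print(detect_image_refs("Compare ![chart](charts/q3.PNG) with https://example.com/a.jpg"))
```

This sketch does not handle Windows-style paths or paths containing spaces; those would need additional patterns.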
## Limitations

- ⚠️ **No Custom Agent Delegation**: The `@multimodal-looker` approach doesn't work with the current OMO
- ⚠️ **API Key Required**: Requires a valid Volcengine Ark (火山方舟) API key
- ⚠️ **File Size**: Keep images under 4 MB for optimal performance
- ⚠️ **Network**: Requires internet access to `https://ark.cn-beijing.volces.com`

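The size and format limits can be checked locally before any request is sent. The sketch below is one way to do that; `check_local_image` is a hypothetical helper, and treating "4 MB" as 4 × 1024 × 1024 bytes is an assumption, since the guide does not define the unit precisely.

```python
import os

# Assumption: "4 MB" is read as 4 MiB; adjust if the API documents otherwise.
MAX_BYTES = 4 * 1024 * 1024
SUPPORTED_EXTS = {".png", ".jpg", ".jpeg", ".webp", ".gif", ".bmp"}


def check_local_image(path):
    """Return a list of problems with a local image; an empty list means none found."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTS:
        problems.append(f"unsupported format: {ext or '(no extension)'}")
    if not os.path.isfile(path):
        problems.append(f"file not found: {path}")
    else:
        size = os.path.getsize(path)
        if size >= MAX_BYTES:
            problems.append(f"file is {size} bytes; keep images under {MAX_BYTES} bytes")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue at once, which fits the guide's goal of clear error messages.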
## Troubleshooting

**Vision processing not working?**
- Check the `VOLCENGINE_API_KEY` configuration
- Verify that the image file exists and is accessible
- Test with a simple request: "Describe image.png"

**Detection not triggering?**
- Ensure the input contains detectable patterns (file extensions like `.png`, keywords like "图片", meaning "image")
- Use explicit file paths instead of vague references

**API errors?**
- Check network connectivity to the Volcengine Ark (火山方舟) API
- Verify that the API key is valid
- Check the rate limits on your account

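For transient API errors such as rate limiting, a common mitigation is to retry with exponential backoff. The sketch below shows the general pattern only. `call_with_backoff` is a hypothetical wrapper, and because this guide does not say which exceptions `process_user_input` raises, the exception types to retry on are left as a parameter.

```python
import time


def call_with_backoff(fn, retries=3, base_delay=1.0, retry_on=(Exception,), sleep=time.sleep):
    """Call fn(); on a retryable error wait base_delay * 2**attempt, then try again.

    Re-raises the last error once `retries` retries have been used up.
    """
    for attempt in range(retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)
```

A call would then look like `call_with_backoff(lambda: process_user_input(...), retry_on=(SomeApiError,))`, where `SomeApiError` stands in for whatever error type the API client actually raises.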
## Related Skills

- `image-service`: For image generation and editing
- `file-reader`: For reading document contents (complementary)

This skill provides **fully automatic visual content processing** without requiring manual intervention or custom agent commands.