commit cf10ab64732fc161f60cadf5929c9c914cba33f9
Author: hmo
Date:   Wed Feb 11 22:02:47 2026 +0800

    Initial commit to git.yoin

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..c4a22d6
--- /dev/null
+++ b/README.md
@@ -0,0 +1,26 @@
+# OpenCode Skills
+
+A collection of OpenCode skills that extend an AI agent's specialized capabilities.
+
+## Skill List
+
+| Skill | Purpose |
+|-------|---------|
+| csv-data-summarizer | CSV data analysis and statistics |
+| deep-research | In-depth research report generation |
+| image-service | Image generation/editing/analysis |
+| log-analyzer | Intelligent log analysis |
+| mcp-builder | MCP server creation |
+| searchnews | AI news search and curation |
+| skill-creator | Skill creation guide and tooling |
+| smart-query | Intelligent database querying |
+| story-to-scenes | Story-to-storyboard image generation |
+| uni-agent | Unified agent orchestration |
+| video-creator | Video generation |
+| videocut-* | Video editing tool suite |
+
+## Usage
+
+Copy a skill directory into `.opencode/skills/` and it is ready to use.
+
+See the README.md in each skill's directory for detailed usage instructions.

diff --git a/csv-data-summarizer/README.md b/csv-data-summarizer/README.md
new file mode 100644
index 0000000..772b01e
--- /dev/null
+++ b/csv-data-summarizer/README.md
@@ -0,0 +1,198 @@
+
+ +[![Join AI Community](https://img.shields.io/badge/🚀_Join-AI_Community_(FREE)-4F46E5?style=for-the-badge)](https://www.skool.com/ai-for-your-business) +[![GitHub Profile](https://img.shields.io/badge/GitHub-@coffeefuelbump-181717?style=for-the-badge&logo=github)](https://github.com/coffeefuelbump) + +[![Link Tree](https://img.shields.io/badge/Linktree-Everything-green?style=for-the-badge&logo=linktree&logoColor=white)](https://linktr.ee/corbin_brown) +[![YouTube Membership](https://img.shields.io/badge/YouTube-Become%20a%20Builder-red?style=for-the-badge&logo=youtube&logoColor=white)](https://www.youtube.com/channel/UCJFMlSxcvlZg5yZUYJT0Pug/join) + +
+ +--- + +# 📊 CSV Data Summarizer - Claude Skill + +A powerful Claude Skill that automatically analyzes CSV files and generates comprehensive insights with visualizations. Upload any CSV and get instant, intelligent analysis without being asked what you want! + +
+ +[![Version](https://img.shields.io/badge/version-2.1.0-blue.svg)](https://github.com/coffeefuelbump/csv-data-summarizer-claude-skill) +[![Python](https://img.shields.io/badge/python-3.8+-green.svg)](https://www.python.org/) +[![License](https://img.shields.io/badge/license-MIT-orange.svg)](LICENSE) + +
+ +## 🚀 Features + +- **🤖 Intelligent & Adaptive** - Automatically detects data type (sales, customer, financial, survey, etc.) and applies relevant analysis +- **📈 Comprehensive Analysis** - Generates statistics, correlations, distributions, and trends +- **🎨 Auto Visualizations** - Creates multiple charts based on what's in your data: + - Time-series plots for date-based data + - Correlation heatmaps for numeric relationships + - Distribution histograms + - Categorical breakdowns +- **⚡ Proactive** - No questions asked! Just upload CSV and get complete analysis immediately +- **🔍 Data Quality Checks** - Automatically detects and reports missing values +- **📊 Multi-Industry Support** - Adapts to e-commerce, healthcare, finance, operations, surveys, and more + +## 📥 Quick Download + +
+ +### Get Started in 2 Steps + +**1️⃣ Download the Skill** +[![Download Skill](https://img.shields.io/badge/Download-CSV%20Data%20Summarizer%20Skill-blue?style=for-the-badge&logo=download)](https://github.com/coffeefuelbump/csv-data-summarizer-claude-skill/raw/main/csv-data-summarizer.zip) + +**2️⃣ Try the Demo Data** +[![Download Demo CSV](https://img.shields.io/badge/Download-Sample%20P%26L%20Financial%20Data-green?style=for-the-badge&logo=data)](https://github.com/coffeefuelbump/csv-data-summarizer-claude-skill/raw/main/examples/showcase_financial_pl_data.csv) + +
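+
+If you'd rather try the skill locally before uploading it to Claude.ai, the analysis engine can also be driven straight from Python. A minimal sketch, assuming you have cloned this repository, installed the packages from `requirements.txt`, and are running from the repo root (the chart PNGs are written to the current working directory):
+
+```python
+# Minimal local run of the bundled analysis engine (see analyze.py).
+# Assumes: repo root as the working directory, dependencies installed.
+from analyze import summarize_csv
+
+# Point it at the demo P&L dataset shipped in examples/.
+report = summarize_csv("examples/showcase_financial_pl_data.csv")
+print(report)  # formatted text summary; charts are saved as PNG files
+```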
+ +--- + +## 📦 What's Included + +``` +csv-data-summarizer-claude-skill/ +├── SKILL.md # Claude Skill definition +├── analyze.py # Comprehensive analysis engine +├── requirements.txt # Python dependencies +├── examples/ +│ └── showcase_financial_pl_data.csv # Demo P&L financial dataset (15 months, 25 metrics) +└── resources/ + ├── sample.csv # Example dataset + └── README.md # Usage documentation +``` + +## 🎯 How It Works + +1. **Upload** any CSV file to Claude.ai +2. **Skill activates** automatically when CSV is detected +3. **Analysis runs** immediately - inspects data structure and adapts +4. **Results delivered** - Complete analysis with multiple visualizations + +No prompting needed. No options to choose. Just instant, comprehensive insights! + +## 📥 Installation + +### For Claude.ai Users + +1. Download the latest release: [`csv-data-summarizer.zip`](https://github.com/coffeefuelbump/csv-data-summarizer-claude-skill/releases) +2. Go to [Claude.ai](https://claude.ai) → Settings → Capabilities → Skills +3. Upload the zip file +4. Enable the skill +5. Done! Upload any CSV and watch it work ✨ + +### For Developers + +```bash +git clone git@github.com:coffeefuelbump/csv-data-summarizer-claude-skill.git +cd csv-data-summarizer-claude-skill +pip install -r requirements.txt +``` + +## 📊 Sample Dataset Highlights + +The included demo CSV contains **15 months of P&L data** with: +- 3 product lines (SaaS, Enterprise, Services) +- 25 financial metrics including revenue, expenses, margins, CAC, LTV +- Quarterly trends showing business growth +- Perfect for showcasing time-series analysis, correlations, and financial insights + +## 🎨 Example Use Cases + +- **📊 Sales Data** → Revenue trends, product performance, regional analysis +- **👥 Customer Data** → Demographics, segmentation, geographic patterns +- **💰 Financial Data** → Transaction analysis, trend detection, correlations +- **⚙️ Operational Data** → Performance metrics, time-series analysis +- **📋 Survey Data** → Response distributions, cross-tabulations + +## 🛠️ Technical Details + +**Dependencies:** +- Python 3.8+ +- pandas 2.0+ +- matplotlib 3.7+ +- seaborn 0.12+ + +**Visualizations Generated:** +- Time-series trend plots +- Correlation heatmaps +- Distribution histograms +- Categorical bar charts + +## 📝 Example Output + +``` +============================================================ +📊 DATA OVERVIEW +============================================================ +Rows: 100 | Columns: 15 + +📋 DATA TYPES: + • order_date: object + • total_revenue: float64 + • customer_segment: object + ... + +🔍 DATA QUALITY: +✓ No missing values - dataset is complete! + +📈 NUMERICAL ANALYSIS: +[Summary statistics for all numeric columns] + +🔗 CORRELATIONS: +[Correlation matrix showing relationships] + +📅 TIME SERIES ANALYSIS: +Date range: 2024-01-05 to 2024-04-11 +Span: 97 days + +📊 VISUALIZATIONS CREATED: + ✓ correlation_heatmap.png + ✓ time_series_analysis.png + ✓ distributions.png + ✓ categorical_distributions.png +``` + +## 🌟 Connect & Learn More + +
+ +[![Join AI Community](https://img.shields.io/badge/Join-AI%20Community%20(FREE)-blue?style=for-the-badge&logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHdpZHRoPSIyNCIgaGVpZ2h0PSIyNCIgdmlld0JveD0iMCAwIDI0IDI0IiBmaWxsPSJ3aGl0ZSI+PHBhdGggZD0iTTEyIDJDNi40OCAyIDIgNi40OCAyIDEyczQuNDggMTAgMTAgMTAgMTAtNC40OCAxMC0xMFMxNy41MiAyIDEyIDJ6bTAgM2MxLjY2IDAgMyAxLjM0IDMgM3MtMS4zNCAzLTMgMy0zLTEuMzQtMy0zIDEuMzQtMyAzLTN6bTAgMTQuMmMtMi41IDAtNC43MS0xLjI4LTYtMy4yMi4wMy0xLjk5IDQtMy4wOCA2LTMuMDggMS45OSAwIDUuOTcgMS4wOSA2IDMuMDgtMS4yOSAxLjk0LTMuNSAzLjIyLTYgMy4yMnoiLz48L3N2Zz4=)](https://www.skool.com/ai-for-your-business/about) + +[![Link Tree](https://img.shields.io/badge/Linktree-Everything-green?style=for-the-badge&logo=linktree&logoColor=white)](https://linktr.ee/corbin_brown) + +[![YouTube Membership](https://img.shields.io/badge/YouTube-Become%20a%20Builder-red?style=for-the-badge&logo=youtube&logoColor=white)](https://www.youtube.com/channel/UCJFMlSxcvlZg5yZUYJT0Pug/join) + +[![Twitter Follow](https://img.shields.io/badge/Twitter-Follow%20@corbin__braun-1DA1F2?style=for-the-badge&logo=twitter&logoColor=white)](https://twitter.com/corbin_braun) + +
+ +## 🤝 Contributing + +Contributions are welcome! Feel free to: +- Report bugs +- Suggest new features +- Submit pull requests +- Share your use cases + +## 📄 License + +MIT License - feel free to use this skill for personal or commercial projects! + +## 🙏 Acknowledgments + +Built for the Claude Skills platform by [Anthropic](https://www.anthropic.com/news/skills). + +--- + +
+ +**Made with ❤️ for the AI community** + +⭐ Star this repo if you find it useful! + +
+ diff --git a/csv-data-summarizer/SKILL.md b/csv-data-summarizer/SKILL.md
new file mode 100644
index 0000000..c55e7f8
--- /dev/null
+++ b/csv-data-summarizer/SKILL.md
@@ -0,0 +1,148 @@
+---
+name: csv-data-summarizer
+description: CSV data analysis skill. Uses Python and pandas to analyze CSV files and produce statistical summaries and quick visualizations. Use automatically when the user uploads or mentions a CSV file or needs tabular data analyzed.
+metadata:
+  version: "2.1.0"
+  dependencies: python>=3.8, pandas>=2.0.0, matplotlib>=3.7.0, seaborn>=0.12.0
+---
+
+# CSV Data Summarizer
+
+This skill analyzes CSV files and provides a comprehensive summary with statistical insights and visualizations.
+
+## When to Use This Skill
+
+Use when the user:
+- Uploads or mentions a CSV file
+- Asks to summarize, analyze, or visualize tabular data
+- Requests insights from CSV data
+- Wants to understand data structure and quality
+
+## How It Works
+
+## ⚠️ Critical Behavior Requirements ⚠️
+
+**Do NOT ask the user what they want to do with the data.**
+**Do NOT offer options or choices.**
+**Do NOT say "What would you like me to help you with?"**
+**Do NOT list possible analysis options.**
+
+**Immediately and automatically:**
+1. Run the comprehensive analysis
+2. Generate all relevant visualizations
+3. Present the complete results
+4. No questions, no options, no waiting for user input
+
+**The user wants the complete analysis right away, so just do it.**
+
+### Automatic Analysis Steps:
+
+**The skill adapts intelligently to different data types and industries by inspecting the data first, then deciding which analyses are most relevant.**
+
+1. **Load and inspect** the CSV file into a pandas DataFrame
+2. **Identify the data structure**: column types, date columns, numeric columns, categories
+3. **Choose relevant analyses based on the data content**:
+   - **Sales/e-commerce data** (order dates, revenue, products): time-series trends, revenue analysis, product performance
+   - **Customer data** (demographics, segments, regions): distribution analysis, segmentation, geographic patterns
+   - **Financial data** (transactions, amounts, dates): trend analysis, statistical summaries, correlations
+   - **Operational data** (timestamps, metrics, status): time series, performance metrics, distributions
+   - **Survey data** (categorical responses, ratings): frequency analysis, cross-tabulations, distributions
+   - **Generic tabular data**: adapt to whatever column types are found
+
+4. **Create only the visualizations that make sense for the specific dataset**:
+   - Time-series plots only when date/timestamp columns exist
+   - Correlation heatmaps only when multiple numeric columns exist
+   - Categorical distributions only when categorical columns exist
+   - Histograms of numeric distributions (where relevant)
+
+5. **Automatically generate comprehensive output** including:
+   - Data overview (row count, column count, types)
+   - Key statistics and metrics relevant to the data type
+   - Missing-data analysis
+   - Multiple relevant visualizations (only the applicable ones)
+   - Actionable insights based on the patterns found in this specific dataset
+
+6. **Present everything at once**, with no follow-up questions
+
+**Adaptation examples:**
+- Medical data with patient IDs → focus on demographics, treatment patterns, temporal trends
+- Inventory data with stock levels → focus on quantity distributions, restocking patterns, SKU analysis
+- Website analytics with timestamps → focus on traffic patterns, conversion metrics, time-of-day analysis
+- Survey responses → focus on response distributions, demographic breakdowns, sentiment patterns
+
+### Behavior Guidelines
+
+✅ **Correct approach, say things like:**
+- "I'll run a comprehensive analysis on this data now."
+- "Here is the complete analysis with visualizations:"
+- "I identified this as [type] data and generated the relevant insights:"
+- Then immediately present the full analysis
+
+✅ **Do:**
+- Run the analysis script immediately
+- Generate all relevant charts automatically
+- Provide complete insights without being asked
+- Be thorough and complete in the very first response
+- Act decisively without asking for permission
+
+❌ **Never say any of the following:**
+- "What would you like to do with this data?"
+- "What would you like me to help you with?"
+- "Here are some common options:"
+- "Let me know what you'd like help with"
+- "I can create a comprehensive analysis if you'd like!"
+- Any sentence ending in "?" that asks the user for direction
+- Any list of options or choices
+- Any conditional "I could do X if you want"
+
+❌ **Forbidden behavior:**
+- Asking the user what they want
+- Listing options for the user to choose from
+- Waiting for user instructions before analyzing
+- Delivering a partial analysis that requires follow-up
+- Describing what you could do instead of doing it
+
+### Usage
+
+The skill provides a Python function `summarize_csv(file_path)`:
+- Takes the path to a CSV file
+- Returns a comprehensive text summary with statistics
+- Automatically generates multiple visualizations based on the data structure
+
+### Example Prompts
+
+> "Here is `sales_data.csv`. Can you summarize this file?"
+
+> "Analyze this customer data CSV and show the trends."
+
+> "What insights can you find in `orders.csv`?"
+
+### Example Output
+
+**Dataset Overview**
+- 5,000 rows × 8 columns
+- 3 numeric columns, 1 date column
+
+**Statistical Summary**
+- Average order value: $58.2
+- Standard deviation: $12.4
+- Missing values: 2% (100 cells)
+
+**Insights**
+- Sales trend upward over time
+- Peak during the Q4 campaign
+*(Attached: trend chart)*
+
+## Files
+
+- `analyze.py` - core analysis logic
+- `requirements.txt` - Python dependencies
+- `resources/sample.csv` - sample dataset for testing
+- `resources/README.md` - additional documentation
+
+## Notes
+
+- Date columns are detected automatically (columns whose names contain 'date')
+- Missing data is handled gracefully
+- Time-series plots are generated only when a date column is present; the other charts are created whenever the matching column types exist
+- All numeric columns are included in the statistical summary

diff --git a/csv-data-summarizer/analyze.py b/csv-data-summarizer/analyze.py
new file mode 100644
index 0000000..5931524
--- /dev/null
+++ b/csv-data-summarizer/analyze.py
@@ -0,0 +1,182 @@
+import pandas as pd
+import matplotlib.pyplot as plt
+import seaborn as sns
+from pathlib import Path
+
+def summarize_csv(file_path):
+    """
+    Comprehensively analyzes a CSV file and generates multiple visualizations.
+ + Args: + file_path (str): Path to the CSV file + + Returns: + str: Formatted comprehensive analysis of the dataset + """ + df = pd.read_csv(file_path) + summary = [] + charts_created = [] + + # Basic info + summary.append("=" * 60) + summary.append("📊 DATA OVERVIEW") + summary.append("=" * 60) + summary.append(f"Rows: {df.shape[0]:,} | Columns: {df.shape[1]}") + summary.append(f"\nColumns: {', '.join(df.columns.tolist())}") + + # Data types + summary.append(f"\n📋 DATA TYPES:") + for col, dtype in df.dtypes.items(): + summary.append(f" • {col}: {dtype}") + + # Missing data analysis + missing = df.isnull().sum().sum() + missing_pct = (missing / (df.shape[0] * df.shape[1])) * 100 + summary.append(f"\n🔍 DATA QUALITY:") + if missing: + summary.append(f"Missing values: {missing:,} ({missing_pct:.2f}% of total data)") + summary.append("Missing by column:") + for col in df.columns: + col_missing = df[col].isnull().sum() + if col_missing > 0: + col_pct = (col_missing / len(df)) * 100 + summary.append(f" • {col}: {col_missing:,} ({col_pct:.1f}%)") + else: + summary.append("✓ No missing values - dataset is complete!") + + # Numeric analysis + numeric_cols = df.select_dtypes(include='number').columns.tolist() + if numeric_cols: + summary.append(f"\n📈 NUMERICAL ANALYSIS:") + summary.append(str(df[numeric_cols].describe())) + + # Correlations if multiple numeric columns + if len(numeric_cols) > 1: + summary.append(f"\n🔗 CORRELATIONS:") + corr_matrix = df[numeric_cols].corr() + summary.append(str(corr_matrix)) + + # Create correlation heatmap + plt.figure(figsize=(10, 8)) + sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', center=0, + square=True, linewidths=1) + plt.title('Correlation Heatmap') + plt.tight_layout() + plt.savefig('correlation_heatmap.png', dpi=150) + plt.close() + charts_created.append('correlation_heatmap.png') + + # Categorical analysis + categorical_cols = df.select_dtypes(include=['object']).columns.tolist() + categorical_cols = [c for c in categorical_cols if 'id' not in c.lower()] + + if categorical_cols: + summary.append(f"\n📊 CATEGORICAL ANALYSIS:") + for col in categorical_cols[:5]: # Limit to first 5 + value_counts = df[col].value_counts() + summary.append(f"\n{col}:") + for val, count in value_counts.head(10).items(): + pct = (count / len(df)) * 100 + summary.append(f" • {val}: {count:,} ({pct:.1f}%)") + + # Time series analysis + date_cols = [c for c in df.columns if 'date' in c.lower() or 'time' in c.lower()] + if date_cols: + summary.append(f"\n📅 TIME SERIES ANALYSIS:") + date_col = date_cols[0] + df[date_col] = pd.to_datetime(df[date_col], errors='coerce') + + date_range = df[date_col].max() - df[date_col].min() + summary.append(f"Date range: {df[date_col].min()} to {df[date_col].max()}") + summary.append(f"Span: {date_range.days} days") + + # Create time-series plots for numeric columns + if numeric_cols: + fig, axes = plt.subplots(min(3, len(numeric_cols)), 1, + figsize=(12, 4 * min(3, len(numeric_cols)))) + if len(numeric_cols) == 1: + axes = [axes] + + for idx, num_col in enumerate(numeric_cols[:3]): + ax = axes[idx] if len(numeric_cols) > 1 else axes[0] + daily_data = df.groupby(date_col)[num_col].agg(['mean', 'sum', 'count']) + daily_data['mean'].plot(ax=ax, label='Average', linewidth=2) + ax.set_title(f'{num_col} Over Time') + ax.set_xlabel('Date') + ax.set_ylabel(num_col) + ax.legend() + ax.grid(True, alpha=0.3) + + plt.tight_layout() + plt.savefig('time_series_analysis.png', dpi=150) + plt.close() + charts_created.append('time_series_analysis.png') + + # 
Distribution plots for numeric columns + if numeric_cols: + n_cols = min(4, len(numeric_cols)) + fig, axes = plt.subplots(2, 2, figsize=(12, 10)) + axes = axes.flatten() + + for idx, col in enumerate(numeric_cols[:4]): + axes[idx].hist(df[col].dropna(), bins=30, edgecolor='black', alpha=0.7) + axes[idx].set_title(f'Distribution of {col}') + axes[idx].set_xlabel(col) + axes[idx].set_ylabel('Frequency') + axes[idx].grid(True, alpha=0.3) + + # Hide unused subplots + for idx in range(len(numeric_cols[:4]), 4): + axes[idx].set_visible(False) + + plt.tight_layout() + plt.savefig('distributions.png', dpi=150) + plt.close() + charts_created.append('distributions.png') + + # Categorical distributions + if categorical_cols: + fig, axes = plt.subplots(2, 2, figsize=(14, 10)) + axes = axes.flatten() + + for idx, col in enumerate(categorical_cols[:4]): + value_counts = df[col].value_counts().head(10) + axes[idx].barh(range(len(value_counts)), value_counts.values) + axes[idx].set_yticks(range(len(value_counts))) + axes[idx].set_yticklabels(value_counts.index) + axes[idx].set_title(f'Top Values in {col}') + axes[idx].set_xlabel('Count') + axes[idx].grid(True, alpha=0.3, axis='x') + + # Hide unused subplots + for idx in range(len(categorical_cols[:4]), 4): + axes[idx].set_visible(False) + + plt.tight_layout() + plt.savefig('categorical_distributions.png', dpi=150) + plt.close() + charts_created.append('categorical_distributions.png') + + # Summary of visualizations + if charts_created: + summary.append(f"\n📊 VISUALIZATIONS CREATED:") + for chart in charts_created: + summary.append(f" ✓ {chart}") + + summary.append("\n" + "=" * 60) + summary.append("✅ COMPREHENSIVE ANALYSIS COMPLETE") + summary.append("=" * 60) + + return "\n".join(summary) + + +if __name__ == "__main__": + # Test with sample data + import sys + if len(sys.argv) > 1: + file_path = sys.argv[1] + else: + file_path = "resources/sample.csv" + + print(summarize_csv(file_path)) + diff --git a/csv-data-summarizer/examples/showcase_financial_pl_data.csv b/csv-data-summarizer/examples/showcase_financial_pl_data.csv new file mode 100644 index 0000000..395b0e5 --- /dev/null +++ b/csv-data-summarizer/examples/showcase_financial_pl_data.csv @@ -0,0 +1,46 @@ +month,year,quarter,product_line,total_revenue,cost_of_goods_sold,gross_profit,gross_margin_pct,marketing_expense,sales_expense,rd_expense,admin_expense,total_operating_expenses,operating_income,operating_margin_pct,interest_expense,tax_expense,net_income,net_margin_pct,customer_acquisition_cost,customer_lifetime_value,units_sold,avg_selling_price,headcount,revenue_per_employee +Jan,2023,Q1,SaaS Platform,450000,135000,315000,70.0,65000,85000,45000,35000,230000,85000,18.9,5000,16000,64000,14.2,125,2400,1200,375,45,10000 +Jan,2023,Q1,Enterprise Solutions,280000,112000,168000,60.0,35000,55000,25000,20000,135000,33000,11.8,3000,6600,23400,8.4,450,8500,450,622,45,6222 +Jan,2023,Q1,Professional Services,125000,50000,75000,60.0,15000,22000,8000,12000,57000,18000,14.4,1500,3600,12900,10.3,200,3200,95,1316,45,2778 +Feb,2023,Q1,SaaS Platform,475000,142500,332500,70.0,68000,89000,47000,36000,240000,92500,19.5,5200,18500,68800,14.5,120,2500,1300,365,47,10106 +Feb,2023,Q1,Enterprise Solutions,295000,118000,177000,60.0,38000,58000,27000,22000,145000,32000,10.8,3200,6400,22400,7.6,440,8600,470,628,47,6277 +Feb,2023,Q1,Professional Services,135000,54000,81000,60.0,16000,24000,9000,13000,62000,19000,14.1,1600,3800,13600,10.1,195,3300,105,1286,47,2872 +Mar,2023,Q1,SaaS 
Platform,520000,156000,364000,70.0,75000,95000,52000,40000,262000,102000,19.6,5500,19250,77250,14.9,115,2650,1450,359,50,10400 +Mar,2023,Q1,Enterprise Solutions,325000,130000,195000,60.0,42000,63000,30000,25000,160000,35000,10.8,3500,7000,24500,7.5,425,8800,520,625,50,6500 +Mar,2023,Q1,Professional Services,148000,59200,88800,60.0,18000,26000,10000,14000,68000,20800,14.1,1800,4160,14840,10.0,190,3400,115,1287,50,2960 +Apr,2023,Q2,SaaS Platform,555000,166500,388500,70.0,80000,100000,55000,42000,277000,111500,20.1,5800,22300,83400,15.0,110,2750,1550,358,52,10673 +Apr,2023,Q2,Enterprise Solutions,340000,136000,204000,60.0,45000,65000,32000,26000,168000,36000,10.6,3700,7200,25100,7.4,420,9000,540,630,52,6538 +Apr,2023,Q2,Professional Services,158000,63200,94800,60.0,19000,27000,11000,15000,72000,22800,14.4,1900,4560,16340,10.3,185,3500,125,1264,52,3038 +May,2023,Q2,SaaS Platform,590000,177000,413000,70.0,85000,105000,58000,44000,292000,121000,20.5,6000,24200,90800,15.4,105,2850,1650,358,55,10727 +May,2023,Q2,Enterprise Solutions,365000,146000,219000,60.0,48000,68000,35000,28000,179000,40000,11.0,4000,8000,28000,7.7,410,9200,580,629,55,6636 +May,2023,Q2,Professional Services,172000,68800,103200,60.0,21000,29000,12000,16000,78000,25200,14.7,2100,5040,18060,10.5,180,3600,135,1274,55,3127 +Jun,2023,Q2,SaaS Platform,625000,187500,437500,70.0,90000,110000,62000,46000,308000,129500,20.7,6200,25850,97450,15.6,100,2950,1750,357,58,10776 +Jun,2023,Q2,Enterprise Solutions,385000,154000,231000,60.0,50000,70000,37000,29000,186000,45000,11.7,4200,9000,31800,8.3,400,9400,610,631,58,6638 +Jun,2023,Q2,Professional Services,185000,74000,111000,60.0,22000,31000,13000,17000,83000,28000,15.1,2200,5580,20220,10.9,175,3700,145,1276,58,3190 +Jul,2023,Q3,SaaS Platform,665000,199500,465500,70.0,95000,115000,65000,48000,323000,142500,21.4,6500,28500,107500,16.2,95,3050,1850,359,60,11083 +Jul,2023,Q3,Enterprise Solutions,410000,164000,246000,60.0,53000,73000,40000,31000,197000,49000,12.0,4400,9800,34800,8.5,390,9600,650,631,60,6833 +Jul,2023,Q3,Professional Services,198000,79200,118800,60.0,24000,33000,14000,18000,89000,29800,15.1,2400,5960,21440,10.8,170,3800,155,1277,60,3300 +Aug,2023,Q3,SaaS Platform,705000,211500,493500,70.0,100000,120000,68000,50000,338000,155500,22.1,6800,31100,117600,16.7,90,3150,1950,362,63,11190 +Aug,2023,Q3,Enterprise Solutions,435000,174000,261000,60.0,56000,76000,42000,33000,207000,54000,12.4,4600,10800,38600,8.9,380,9800,690,630,63,6905 +Aug,2023,Q3,Professional Services,210000,84000,126000,60.0,25000,35000,15000,19000,94000,32000,15.2,2500,6400,23100,11.0,165,3900,165,1273,63,3333 +Sep,2023,Q3,SaaS Platform,750000,225000,525000,70.0,108000,128000,72000,53000,361000,164000,21.9,7200,33360,123440,16.5,88,3250,2080,360,65,11538 +Sep,2023,Q3,Enterprise Solutions,465000,186000,279000,60.0,60000,80000,45000,35000,220000,59000,12.7,5000,11800,42200,9.1,370,10000,735,633,65,7154 +Sep,2023,Q3,Professional Services,225000,90000,135000,60.0,27000,37000,16000,20000,100000,35000,15.6,2700,6920,25380,11.3,160,4000,175,1286,65,3462 +Oct,2023,Q4,SaaS Platform,795000,238500,556500,70.0,115000,135000,75000,55000,380000,176500,22.2,7500,35870,133130,16.7,85,3350,2200,361,68,11691 +Oct,2023,Q4,Enterprise Solutions,490000,196000,294000,60.0,63000,83000,47000,36000,229000,65000,13.3,5200,13000,46800,9.6,360,10200,770,636,68,7206 +Oct,2023,Q4,Professional Services,238000,95200,142800,60.0,29000,39000,17000,21000,106000,36800,15.5,2800,7360,26640,11.2,158,4100,185,1286,68,3500 +Nov,2023,Q4,SaaS 
Platform,840000,252000,588000,70.0,122000,142000,78000,58000,400000,188000,22.4,7800,38440,141760,16.9,82,3450,2320,362,70,12000 +Nov,2023,Q4,Enterprise Solutions,520000,208000,312000,60.0,67000,87000,50000,38000,242000,70000,13.5,5500,14100,50400,9.7,355,10400,815,638,70,7429 +Nov,2023,Q4,Professional Services,252000,100800,151200,60.0,31000,41000,18000,22000,112000,39200,15.6,3000,7728,28472,11.3,155,4200,195,1292,70,3600 +Dec,2023,Q4,SaaS Platform,895000,268500,626500,70.0,130000,150000,82000,62000,424000,202500,22.6,8200,41145,153155,17.1,80,3550,2480,361,72,12431 +Dec,2023,Q4,Enterprise Solutions,555000,222000,333000,60.0,72000,92000,53000,40000,257000,76000,13.7,6000,15400,54600,9.8,350,10600,870,638,72,7708 +Dec,2023,Q4,Professional Services,268000,107200,160800,60.0,33000,43000,19000,23000,118000,42800,16.0,3200,8352,31248,11.7,152,4300,205,1307,72,3722 +Jan,2024,Q1,SaaS Platform,925000,277500,647500,70.0,135000,155000,85000,64000,439000,208500,22.5,8500,42070,157930,17.1,78,3650,2550,363,75,12333 +Jan,2024,Q1,Enterprise Solutions,575000,230000,345000,60.0,75000,95000,55000,42000,267000,78000,13.6,6200,15760,56040,9.7,345,10800,900,639,75,7667 +Jan,2024,Q1,Professional Services,280000,112000,168000,60.0,34000,45000,20000,24000,123000,45000,16.1,3300,8770,32930,11.8,150,4400,215,1302,75,3733 +Feb,2024,Q1,SaaS Platform,965000,289500,675500,70.0,140000,160000,88000,66000,454000,221500,23.0,8800,44510,168190,17.4,75,3750,2660,363,77,12532 +Feb,2024,Q1,Enterprise Solutions,600000,240000,360000,60.0,78000,98000,57000,43000,276000,84000,14.0,6400,16800,60800,10.1,340,11000,940,638,77,7792 +Feb,2024,Q1,Professional Services,295000,118000,177000,60.0,36000,47000,21000,25000,129000,48000,16.3,3500,9420,35080,11.9,148,4500,225,1311,77,3831 +Mar,2024,Q1,SaaS Platform,1020000,306000,714000,70.0,148000,168000,92000,69000,477000,237000,23.2,9200,47880,179920,17.6,73,3850,2810,363,80,12750 +Mar,2024,Q1,Enterprise Solutions,635000,254000,381000,60.0,82000,103000,60000,45000,290000,91000,14.3,6800,18200,66000,10.4,335,11200,990,641,80,7938 +Mar,2024,Q1,Professional Services,312000,124800,187200,60.0,38000,49000,22000,26000,135000,52200,16.7,3700,10230,38270,12.3,145,4600,240,1300,80,3900 diff --git a/csv-data-summarizer/requirements.txt b/csv-data-summarizer/requirements.txt new file mode 100644 index 0000000..24647d8 --- /dev/null +++ b/csv-data-summarizer/requirements.txt @@ -0,0 +1,4 @@ +pandas>=2.0.0 +matplotlib>=3.7.0 +seaborn>=0.12.0 + diff --git a/csv-data-summarizer/resources/README.md b/csv-data-summarizer/resources/README.md new file mode 100644 index 0000000..6e9c613 --- /dev/null +++ b/csv-data-summarizer/resources/README.md @@ -0,0 +1,83 @@ +# CSV Data Summarizer - Resources + +--- + +## 🌟 Connect & Learn More + +
+ +### 🚀 **Join Our Community** +[![Join AI Community](https://img.shields.io/badge/Join-AI%20Community%20(FREE)-blue?style=for-the-badge&logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHdpZHRoPSIyNCIgaGVpZ2h0PSIyNCIgdmlld0JveD0iMCAwIDI0IDI0IiBmaWxsPSJ3aGl0ZSI+PHBhdGggZD0iTTEyIDJDNi40OCAyIDIgNi40OCAyIDEyczQuNDggMTAgMTAgMTAgMTAtNC40OCAxMC0xMFMxNy41MiAyIDEyIDJ6bTAgM2MxLjY2IDAgMyAxLjM0IDMgM3MtMS4zNCAzLTMgMy0zLTEuMzQtMy0zIDEuMzQtMyAzLTN6bTAgMTQuMmMtMi41IDAtNC43MS0xLjI4LTYtMy4yMi4wMy0xLjk5IDQtMy4wOCA2LTMuMDggMS45OSAwIDUuOTcgMS4wOSA2IDMuMDgtMS4yOSAxLjk0LTMuNSAzLjIyLTYgMy4yMnoiLz48L3N2Zz4=)](https://www.skool.com/ai-for-your-business/about) + +### 🔗 **All My Links** +[![Link Tree](https://img.shields.io/badge/Linktree-Everything-green?style=for-the-badge&logo=linktree&logoColor=white)](https://linktr.ee/corbin_brown) + +### 🛠️ **Become a Builder** +[![YouTube Membership](https://img.shields.io/badge/YouTube-Become%20a%20Builder-red?style=for-the-badge&logo=youtube&logoColor=white)](https://www.youtube.com/channel/UCJFMlSxcvlZg5yZUYJT0Pug/join) + +### 🐦 **Follow on Twitter** +[![Twitter Follow](https://img.shields.io/badge/Twitter-Follow%20@corbin__braun-1DA1F2?style=for-the-badge&logo=twitter&logoColor=white)](https://twitter.com/corbin_braun) + +
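+
+---
+
+## Quick Sanity Check
+
+Before wiring the skill into Claude, it can help to poke at the sample data by hand. Here is a minimal pandas sketch (the column layout is described in the next section); run it from this `resources/` directory:
+
+```python
+# Manual look at sample.csv: the same first steps analyze.py automates.
+import pandas as pd
+
+df = pd.read_csv("sample.csv", parse_dates=["date"])
+print(df.shape)                               # (rows, columns)
+print(df.groupby("region")["revenue"].sum())  # total revenue per region
+```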
+ +--- + +## Sample Data + +The `sample.csv` file contains example sales data with the following columns: + +- **date**: Transaction date +- **product**: Product name (Widget A, B, or C) +- **quantity**: Number of items sold +- **revenue**: Total revenue from the transaction +- **customer_id**: Unique customer identifier +- **region**: Geographic region (North, South, East, West) + +## Usage Examples + +### Basic Summary +``` +Analyze sample.csv +``` + +### With Custom CSV +``` +Here's my sales_data.csv file. Can you summarize it? +``` + +### Focus on Specific Insights +``` +What are the revenue trends in this dataset? +``` + +## Testing the Skill + +You can test the skill locally before uploading to Claude: + +```bash +# Install dependencies +pip install -r ../requirements.txt + +# Run the analysis +python ../analyze.py sample.csv +``` + +## Expected Output + +The analysis will provide: + +1. **Dataset dimensions** - Row and column counts +2. **Column information** - Names and data types +3. **Summary statistics** - Mean, median, std dev, min/max for numeric columns +4. **Data quality** - Missing value detection and counts +5. **Visualizations** - Time-series plots when date columns are present + +## Customization + +To adapt this skill for your specific use case: + +1. Modify `analyze.py` to include domain-specific calculations +2. Add custom visualization types in the plotting section +3. Include validation rules specific to your data +4. Add more sample datasets to test different scenarios + diff --git a/csv-data-summarizer/resources/sample.csv b/csv-data-summarizer/resources/sample.csv new file mode 100644 index 0000000..348a814 --- /dev/null +++ b/csv-data-summarizer/resources/sample.csv @@ -0,0 +1,22 @@ +date,product,quantity,revenue,customer_id,region +2024-01-15,Widget A,5,129.99,C001,North +2024-01-16,Widget B,3,89.97,C002,South +2024-01-17,Widget A,7,181.98,C003,East +2024-01-18,Widget C,2,199.98,C001,North +2024-01-19,Widget B,4,119.96,C004,West +2024-01-20,Widget A,6,155.94,C005,South +2024-01-21,Widget C,1,99.99,C002,South +2024-01-22,Widget B,8,239.92,C006,East +2024-01-23,Widget A,3,77.97,C007,North +2024-01-24,Widget C,5,499.95,C003,East +2024-01-25,Widget B,2,59.98,C008,West +2024-01-26,Widget A,9,233.91,C004,West +2024-01-27,Widget C,3,299.97,C009,North +2024-01-28,Widget B,6,179.94,C010,South +2024-01-29,Widget A,4,103.96,C005,South +2024-01-30,Widget C,7,699.93,C011,East +2024-01-31,Widget B,5,149.95,C012,West +2024-02-01,Widget A,8,207.92,C013,North +2024-02-02,Widget C,2,199.98,C014,South +2024-02-03,Widget B,10,299.90,C015,East + diff --git a/deep-research/README.md b/deep-research/README.md new file mode 100644 index 0000000..0dad1c3 --- /dev/null +++ b/deep-research/README.md @@ -0,0 +1,18 @@ +# Deep Research + +深度调研技能,对技术主题进行调研,输出 Markdown + Word 报告。 + +## 依赖 + +```bash +# 必需 +brew install pandoc +pip install python-docx + +# 可选(生成配图) +# 需要 image-service skill 并配置 API Key +``` + +## 使用 + +加载 skill 后,直接告诉 Agent 要调研的主题即可。 diff --git a/deep-research/SKILL.md b/deep-research/SKILL.md new file mode 100644 index 0000000..69b3578 --- /dev/null +++ b/deep-research/SKILL.md @@ -0,0 +1,346 @@ +--- +name: deep-research +description: 当用户要求"调研"、"深度调研"、"帮我研究"、"调研下这个",或提到需要搜索、整理、汇总指定主题的技术内容时,应使用此技能。 +metadata: + version: "1.0.0" +--- + +# 深度调研技能(Deep Research Skill) + +## 技能概述 + +此技能用于对技术主题进行深度调研,输出专业的调研报告文档。 + +| 能力 | 说明 | +|-----|------| +| 内容提取 | 从 URL、文档中提取核心信息 | +| 深度调研 | 联网搜索补充背景、对比、最新进展 | +| 报告生成 | **默认生成 Markdown 和 Word 两个版本** | +| 图解生成 | 为核心概念生成技术信息图 | +| Word 格式化 | 
自动处理目录、标题加粗、表格实线等样式 | + +## 触发规则 + +当用户消息包含以下关键词时使用此技能: +- 调研、深度调研、调研报告 +- 帮我研究、帮我分析 +- 调研下这个、看看这个 + +## 输出规范 + +每次调研任务必须同时提供: +1. **Markdown 版本**:用于 Obsidian 知识库沉淀和双链关联 +2. **Word 版本**:用于正式汇报和外部分享,需经过脚本格式化处理 + +## 目录结构 + +每个调研主题创建独立文件夹,保持整洁: + +``` +{output_dir}/ +├── Ralph-Loop/ # 主题文件夹(英文短横线命名) +│ ├── images/ # 该主题的信息图 +│ │ ├── architecture.png +│ │ └── comparison.png +│ ├── Ralph-Loop调研报告.md # Markdown 报告 +│ └── Ralph-Loop调研报告.docx # Word 报告 +├── MCP-Protocol/ +│ ├── images/ +│ ├── MCP-Protocol调研报告.md +│ └── MCP-Protocol调研报告.docx +└── ... +``` + +命名规范: +- 文件夹名:英文,单词间用短横线连接,如 `Ralph-Loop`、`MCP-Protocol` +- 报告文件:`{主题名}调研报告.md` 和 `{主题名}调研报告.docx` +- 图片目录:每个主题文件夹下单独的 `images/` 目录 + +## 调研流程 + +### 第一步:创建主题目录 + +根据调研主题创建独立文件夹: + +```bash +mkdir -p "{output_dir}/{主题名}/images" +``` + +### 第二步:内容获取 + +1. 如果用户提供 URL,使用 webfetch 获取内容 +2. 提炼核心概念、技术原理、关键信息 +3. 识别需要深入调研的点 + +### 第三步:深度调研 + +使用 Task 工具进行联网搜索,补充: +- 技术背景和发展历程 +- 竞品对比和差异化 +- 社区讨论和实际案例 +- GitHub 仓库和开源实现 +- 最新进展和趋势 + +### 第四步:图解生成 + +使用预设风格脚本生成统一手绘风格的信息图。 + +#### 生图触发规则 + +| 内容类型 | 是否生图 | 图解类型 | 说明 | +|---------|---------|---------|------| +| 核心架构/原理 | 必须 | arch | 系统结构、技术栈、模块组成 | +| 流程/步骤 | 必须 | flow | 工作流、执行顺序、操作步骤 | +| A vs B 对比 | 必须 | compare | 两种方案/技术的对比 | +| 3个以上要素 | 建议 | concept | 核心概念、多个方面组成 | +| 纯文字表格 | 不需要 | - | 用 Markdown 表格即可 | +| 代码示例 | 不需要 | - | 用代码块即可 | + +#### 预设风格模板 + +所有配图统一使用手绘体可视化风格,保持系列一致性: + +| 类型 | 命令参数 | 配色 | 布局 | +|------|---------|------|------| +| 架构图 | `-t arch` | 科技蓝 #4A90D9 | 分层/模块化 | +| 流程图 | `-t flow` | 蓝+绿+橙 | 从上到下 | +| 对比图 | `-t compare` | 蓝 vs 橙 | 左右分栏 | +| 概念图 | `-t concept` | 蓝紫渐变 | 中心发散 | + +#### 生成命令 + +使用 `research_image.py` 脚本生成: + +```bash +# 架构图 +python .opencode/skills/image-service/scripts/research_image.py \ + -t arch \ + -n "Ralph Loop 核心架构" \ + -c "展示 Prompt、Agent、Stop Hook、Files 四个模块的循环关系" \ + -o "{output_dir}/{主题名}/images/architecture.png" + +# 流程图 +python .opencode/skills/image-service/scripts/research_image.py \ + -t flow \ + -n "Stop Hook 工作流程" \ + -c "Agent尝试退出、Hook触发、检查条件、允许或阻止退出的完整流程" \ + -o "{output_dir}/{主题名}/images/flow.png" + +# 对比图 +python .opencode/skills/image-service/scripts/research_image.py \ + -t compare \ + -n "ReAct vs Ralph Loop" \ + -c "左侧ReAct依赖自我评估停止,右侧Ralph使用外部Hook控制" \ + -o "{output_dir}/{主题名}/images/comparison.png" + +# 概念图 +python .opencode/skills/image-service/scripts/research_image.py \ + -t concept \ + -n "状态持久化要素" \ + -c "中心是Agent,周围是progress.txt、prd.json、Git历史、代码文件" \ + -o "{output_dir}/{主题名}/images/concept.png" +``` + +#### 图片命名规范 + +| 图解类型 | 文件名 | +|---------|--------| +| 架构图 | `architecture.png` 或 `{具体名称}_arch.png` | +| 流程图 | `flow.png` 或 `{具体名称}_flow.png` | +| 对比图 | `comparison.png` 或 `{A}_vs_{B}.png` | +| 概念图 | `concept.png` 或 `{具体名称}_concept.png` | + +### 第五步:报告撰写 + +按标准模板撰写 Markdown 报告,存放到主题文件夹: + +``` +{output_dir}/{主题名}/{主题名}调研报告.md +``` + +报告中引用图片使用相对路径: +```markdown +![架构图](images/architecture.png) +``` + +### 第六步:Word 导出 + +```bash +# 进入主题目录 +cd "{output_dir}/{主题名}" + +# 生成 Word(--resource-path=. 确保图片正确引用) +# 注意:不要使用 --toc 参数,因为 Markdown 中已有手写目录 +pandoc "{主题名}调研报告.md" -o "{主题名}调研报告.docx" --resource-path=. + +# 格式化 Word +python ../../../.opencode/skills/deep-research/scripts/format_docx.py "{主题名}调研报告.docx" +``` + +## 写作原则 + +调研报告的核心价值:深入研究、降低团队吸收成本、提供专家级建议。 + +1. 理解透彻:不能一知半解或大段拷贝,必须消化吸收后用自己的话表达 +2. 体现思考:有判断、有建议,而非仅仅陈述现状 +3. 细节佐证:有过程和细节支撑结论,不空谈 +4. 逻辑清晰:有分段、有结构、有编号 +5. 配图说明:核心概念必须配信息图 +6. 
去除 AI 味: + - 不使用「」、" " 等特殊符号 + - 不用过多强调符号和 emoji + - 行文自然流畅,像人写的专业文档 + - 避免"首先、其次、总之"等套话 + +## 报告模板 + +```markdown +--- +date: YYYY-MM-DD +type: 调研报告 +领域: {技术领域} +tags: [调研, {主题关键词}] +--- + +# XX调研报告 + +> 调研日期:YYYY年M月D日 + +--- + +## 目录 + +- 一、简介 +- 二、启示 +- 三、核心介绍 + - 3.1 XXX + - 3.2 XXX +- 四、附录 + - 4.1 详细文档 + - 4.2 参考资料 + +--- + +## 一、简介 + +(快速说明调研内容,简短重点) + +是什么,主要用来做什么,属于什么类别。有哪些能力,有什么特点。和竞品相比,有哪些区别,主打什么。 + +1. 要点一 +2. 要点二 +3. 要点三 + +--- + +## 二、启示 + +(调研内容带来的启示、值得学习借鉴之处、与现有产品如何结合、是否值得推荐) + +1. 启示一 +2. 启示二 +3. 启示三 + +--- + +## 三、核心介绍 + +(正文部分,详细说明调研内容的原理/搭建/操作/使用过程,含信息图及流程说明) + +### 3.1 XXX + +![图解说明](images/xxx.png) + +上图展示了...(图解说明,让读者看图就能理解) + +详细内容... + +### 3.2 XXX + +详细内容... + +--- + +## 四、附录 + +### 4.1 详细文档 + +(更详细的配置/操作过程) + +### 4.2 参考资料 + +**官方文档** + +- 文档名称: https://xxx + +**开源实现** + +- 项目名称: https://github.com/xxx + +**社区讨论** + +- 讨论来源: https://xxx +``` + +## 脚本说明 + +### format_docx.py + +Word 文档格式化脚本,功能包括: + +1. 标题居中,黑色字体(去除 pandoc 默认蓝色) +2. "Table of Contents" 替换为中文"目录" +3. 目录页单独一页 +4. 一级标题(简介、启示等)前自动分页 +5. 表格保持完整不跨页断开 +6. 代码块保持完整不断开 +7. 日期行居中 + +用法: +```bash +python .opencode/skills/deep-research/scripts/format_docx.py "输入.docx" ["输出.docx"] +``` + +## 完整调研示例 + +用户输入: +> 调研下 Ralph Loop + +执行流程: + +```bash +# 1. 创建主题目录 +mkdir -p "{output_dir}/Ralph-Loop/images" + +# 2. 获取内容(如有 URL) +webfetch https://example.com/article + +# 3. 深度调研(使用 Task 工具联网搜索) + +# 4. 生成信息图 +python .opencode/skills/image-service/scripts/text_to_image.py "技术架构图..." --output "{output_dir}/Ralph-Loop/images/architecture.png" + +# 5. 撰写报告 +# 写入 {output_dir}/Ralph-Loop/Ralph-Loop调研报告.md + +# 6. 导出 Word(不使用 --toc,Markdown 已有手写目录) +cd "{output_dir}/Ralph-Loop" +pandoc "Ralph-Loop调研报告.md" -o "Ralph-Loop调研报告.docx" --resource-path=. +python ../../../.opencode/skills/deep-research/scripts/format_docx.py "Ralph-Loop调研报告.docx" +``` + +输出文件: +``` +{output_dir}/Ralph-Loop/ +├── images/ +│ ├── architecture.png +│ └── comparison.png +├── Ralph-Loop调研报告.md +└── Ralph-Loop调研报告.docx +``` + +## 依赖 + +- pandoc:Markdown 转 Word +- python-docx:Word 格式化 +- image-service 技能:生成信息图 diff --git a/deep-research/scripts/format_docx.py b/deep-research/scripts/format_docx.py new file mode 100644 index 0000000..d1961a5 --- /dev/null +++ b/deep-research/scripts/format_docx.py @@ -0,0 +1,332 @@ +#!/usr/bin/env python3 +""" +Word 文档格式化脚本 + +Author: 翟星人 + +功能: +1. 标题居中,黑色字体,加粗 +2. 在日期后插入目录 +3. 一级标题前分页 +4. 表格实线边框,不跨页断开 +5. 日期居中 +6. 图片说明小字居中 +7. 1/2/3级标题加粗 +8. 
附录参考文献左对齐 +""" + +import sys +import re +from docx import Document +from docx.shared import Pt, RGBColor +from docx.enum.text import WD_ALIGN_PARAGRAPH +from docx.enum.table import WD_TABLE_ALIGNMENT +from docx.oxml.ns import qn +from docx.oxml import OxmlElement + + +def add_page_break_before(paragraph): + """在段落前添加分页符""" + p = paragraph._p + pPr = p.get_or_add_pPr() + pageBreakBefore = OxmlElement('w:pageBreakBefore') + pPr.insert(0, pageBreakBefore) + + +def set_table_border(table): + """设置表格实线边框""" + tbl = table._tbl + tblPr = tbl.tblPr if tbl.tblPr is not None else OxmlElement('w:tblPr') + + tblBorders = OxmlElement('w:tblBorders') + + for border_name in ['top', 'left', 'bottom', 'right', 'insideH', 'insideV']: + border = OxmlElement(f'w:{border_name}') + border.set(qn('w:val'), 'single') + border.set(qn('w:sz'), '4') + border.set(qn('w:space'), '0') + border.set(qn('w:color'), '000000') + tblBorders.append(border) + + tblPr.append(tblBorders) + if tbl.tblPr is None: + tbl.insert(0, tblPr) + + +def keep_table_together(table): + """保持表格不跨页断开""" + for row in table.rows: + for cell in row.cells: + for paragraph in cell.paragraphs: + pPr = paragraph._p.get_or_add_pPr() + keepNext = OxmlElement('w:keepNext') + keepLines = OxmlElement('w:keepLines') + pPr.append(keepNext) + pPr.append(keepLines) + + +def keep_paragraph_together(paragraph): + """保持段落不断开""" + pPr = paragraph._p.get_or_add_pPr() + keepNext = OxmlElement('w:keepNext') + keepLines = OxmlElement('w:keepLines') + pPr.append(keepNext) + pPr.append(keepLines) + + +def set_heading_style(paragraph, level=1): + """设置标题样式:黑色加粗""" + for run in paragraph.runs: + run.font.color.rgb = RGBColor(0, 0, 0) + run.font.bold = True + if level == 1: + run.font.size = Pt(16) + elif level == 2: + run.font.size = Pt(14) + elif level == 3: + run.font.size = Pt(12) + + +def set_caption_style(paragraph): + """设置图片说明样式:小字居中""" + paragraph.alignment = WD_ALIGN_PARAGRAPH.CENTER + for run in paragraph.runs: + run.font.size = Pt(9) + run.font.color.rgb = RGBColor(80, 80, 80) + + +def is_image_caption(text, prev_has_image): + """判断是否为图片说明""" + if prev_has_image and text and len(text) < 100: + # 必须以特定词开头才算图片说明 + if text.startswith("上图") or text.startswith("图:") or text.startswith("图:"): + return True + return False + + +def paragraph_has_image(paragraph): + """检查段落是否包含图片""" + for run in paragraph.runs: + if run._element.xpath('.//w:drawing') or run._element.xpath('.//w:pict'): + return True + return False + + +def is_horizontal_rule(paragraph): + """检查是否为分割线(文本或绘图元素)""" + text = paragraph.text.strip() + # 检查文本形式的分割线 + if text == "---" or text == "***" or text == "___" or (len(text) > 0 and all(c == '-' for c in text)): + return True + # 检查 pandoc 生成的绘图形式水平线(包含 line 或 rect 且文本为空,但不包含图片) + if text == "": + xml_str = paragraph._p.xml + has_drawing = 'w:pict' in xml_str or 'w:drawing' in xml_str + has_line = 'v:line' in xml_str or 'v:rect' in xml_str or ' [output.docx]") + sys.exit(1) + + input_file = sys.argv[1] + output_file = sys.argv[2] if len(sys.argv) > 2 else None + + format_docx(input_file, output_file) diff --git a/image-service/README.md b/image-service/README.md new file mode 100644 index 0000000..6b77495 --- /dev/null +++ b/image-service/README.md @@ -0,0 +1,27 @@ +# Image Service + +图像生成/编辑/分析服务。 + +## 依赖 + +```bash +pip install httpx pillow numpy +``` + +## 配置 + +编辑 `config/settings.json` 或设置环境变量: + +```bash +export IMAGE_API_KEY="your_key" +export IMAGE_API_BASE_URL="https://api.openai.com/v1" +export VISION_API_KEY="your_key" +export 
VISION_API_BASE_URL="https://api.openai.com/v1" +``` + +## 功能 + +- 文生图 (text_to_image.py) +- 图生图 (image_to_image.py) +- 图片理解 (image_to_text.py) +- 长图拼接 (merge_long_image.py) diff --git a/image-service/SKILL.md b/image-service/SKILL.md new file mode 100644 index 0000000..c84f3c8 --- /dev/null +++ b/image-service/SKILL.md @@ -0,0 +1,132 @@ +--- +name: image-service +description: 多模态图像处理技能,支持文生图、图生图、图生文、长图拼接。当用户提到图片、图像、生成图、信息图、OCR 等关键词时触发。 +--- + +# 图像处理技能 + +## 概述 + +| 能力 | 说明 | 脚本 | +|-----|------|------| +| 文生图 | 根据中文文本描述生成图片 | `scripts/text_to_image.py` | +| 图生图 | 在已有图片基础上进行编辑 | `scripts/image_to_image.py` | +| 图生文 | 分析图片内容(描述、OCR、图表等) | `scripts/image_to_text.py` | +| 长图拼接 | 将多张图片垂直拼接为微信长图 | `scripts/merge_long_image.py` | +| 调研配图 | 预设手绘风格的调研报告信息图 | `scripts/research_image.py` | + +## 配置 + +配置文件:`config/settings.json` + +| 配置项 | 值 | +|-------|-----| +| IMAGE_API_BASE_URL | `${IMAGE_API_BASE_URL}` | +| IMAGE_MODEL | `lyra-flash-9` | +| VISION_MODEL | `qwen2.5-vl-72b-instruct` | + +## 执行规范 + +**图片默认保存到命令执行时的当前工作目录**: + +1. **不要**使用 `workdir` 切换到 skill 目录执行命令 +2. **始终**在用户的工作目录下执行,使用脚本的绝对路径 +3. 脚本路径:`.opencode/skills/image-service/scripts/` + +```bash +# 正确示例 +python .opencode/skills/image-service/scripts/text_to_image.py "描述" -r 3:4 -o output.png +``` + +## 快速使用 + +### 文生图 + +```bash +python .opencode/skills/image-service/scripts/text_to_image.py "信息图风格,标题:AI技术趋势" -r 16:9 +python .opencode/skills/image-service/scripts/text_to_image.py "竖版海报,产品展示" -r 3:4 -o poster.png +``` + +参数:`-r` 宽高比 | `-s` 尺寸 | `-o` 输出路径 + +支持比例:`1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9` + +### 图生图 + +```bash +python .opencode/skills/image-service/scripts/image_to_image.py input.png "编辑描述" -r 3:4 +``` + +### 图生文 + +```bash +python .opencode/skills/image-service/scripts/image_to_text.py image.jpg -m describe +python .opencode/skills/image-service/scripts/image_to_text.py screenshot.png -m ocr +``` + +模式:`describe` | `ocr` | `chart` | `fashion` | `product` | `scene` + +### 长图拼接 + +```bash +python .opencode/skills/image-service/scripts/merge_long_image.py img1.png img2.png -o output.png --blend 20 +python .opencode/skills/image-service/scripts/merge_long_image.py -p "*.png" -o long.png --sort name +``` + +参数:`-p` 通配符 | `-o` 输出 | `-w` 宽度 | `-g` 间隔 | `--blend` 融合 | `--sort` 排序 + +### 调研配图 + +```bash +python .opencode/skills/image-service/scripts/research_image.py -t arch -n "标题" -c "内容" -o output.png +``` + +类型:`arch` 架构图 | `flow` 流程图 | `compare` 对比图 | `concept` 概念图 + +## 执行前必做:需求类型判断(铁律) + +**收到图片生成需求后,必须先判断是哪种类型,再决定执行方式:** + +### 长图识别规则 + +提示词中出现以下任一特征,即判定为**长图需求**: + +| 特征类型 | 识别关键词/模式 | +|---------|---------------| +| **明确声明** | 长图、长图海报、垂直长图、微信长图、Infographic、Long Banner | +| **分段结构** | 提示词包含多个段落(如"第1部分"、"顶部"、"中间"、"底部")| +| **编号列表** | 使用 `### 1.`、`### 2.` 等编号分段 | +| **多屏内容** | 描述了3个及以上独立画面/模块 | +| **从上至下** | 出现"从上至下"、"从上到下"等描述 | + +### 判断后的执行路径 + +``` +识别为长图 → 必须先读取 references/long-image-guide.md → 按长图流程执行 +识别为单图 → 直接使用 text_to_image.py 生成 +``` + +**铁律:识别为长图后,禁止直接生成!必须先加载长图指南,按指南流程执行。** + +## 详细指南(按需加载) + +| 场景 | 触发条件 | 参考文档 | +|------|---------|---------| +| 生成多屏长图 | 命中上述长图识别规则 | `references/long-image-guide.md`(必须加载)| +| 图片含中文文字 | 提示词要求图片包含中文标题/文字 | `references/text-rendering-guide.md` | +| 为 PPT/文档配图 | 用户提供了配色要求或参考文档 | `references/color-sync-guide.md` | +| API 接口细节 | 需要了解底层实现 | `docs/api-reference.md` | +| 提示词技巧 | 需要优化提示词效果 | `docs/prompt-guide.md` | + +## 提示词要点 + +1. **必须使用中文**撰写提示词 +2. 图片中的标题、标签**必须为中文** +3. 默认宽高比 **16:9**,可通过 `-r` 参数调整 +4. 
推荐风格:信息图、数据可视化、手绘文字、科技插画 + +## 触发关键词 + +- **生成类**:生成图片、创建图片、文生图、图生图、信息图、数据可视化 +- **分析类**:分析图片、OCR、识别文字、图生文 +- **拼接类**:长图、微信长图、拼接图片 diff --git a/image-service/config/settings.json b/image-service/config/settings.json new file mode 100644 index 0000000..765032f --- /dev/null +++ b/image-service/config/settings.json @@ -0,0 +1,42 @@ +{ + "image_api": { + "key": "your_image_api_key", + "base_url": "https://api.openai.com/v1", + "model": "dall-e-3" + }, + "vision_api": { + "key": "your_vision_api_key", + "base_url": "https://api.openai.com/v1", + "model": "gpt-4o" + }, + "defaults": { + "text_to_image": { + "size": "1792x1024", + "response_format": "b64_json" + }, + "image_to_image": { + "size": "1792x1024", + "response_format": "b64_json" + }, + "image_to_text": { + "max_tokens": 2000, + "temperature": 0.7, + "mode": "describe" + } + }, + "limits": { + "max_file_size_mb": 4, + "supported_formats": ["png", "jpg", "jpeg", "webp", "gif"], + "max_prompt_length": 1000, + "timeout_seconds": { + "text_to_image": 180, + "image_to_image": 180, + "image_to_text": 120 + } + }, + "retry": { + "max_attempts": 3, + "backoff_multiplier": 2, + "initial_delay_seconds": 1 + } +} diff --git a/image-service/docs/api-reference.md b/image-service/docs/api-reference.md new file mode 100644 index 0000000..202ea0f --- /dev/null +++ b/image-service/docs/api-reference.md @@ -0,0 +1,233 @@ +# API 参考文档 + +## 概述 + +本技能使用两套 API: +1. **Lyra Flash API** - 用于图像生成和编辑(文生图、图生图) +2. **Qwen2.5-VL API** - 用于视觉识别(图生文) + +--- + +## 一、Lyra Flash API(图像生成) + +### 1.1 基础配置 + +| 配置项 | 值 | +|-------|-----| +| Base URL | `${IMAGE_API_BASE_URL}` | +| Model | `lyra-flash-9` | +| 认证方式 | Bearer Token | + +### 1.2 文生图接口 + +**端点** +``` +POST /images/generations +``` + +**请求头** +```json +{ + "Content-Type": "application/json", + "Authorization": "Bearer ${IMAGE_API_KEY}" +} +``` + +**请求体** +```json +{ + "model": "lyra-flash-9", + "prompt": "中文图像描述", + "size": "1792x1024", + "response_format": "b64_json" +} +``` + +**参数说明** + +| 参数 | 类型 | 必填 | 说明 | +|-----|------|-----|------| +| model | string | 是 | 固定使用 `lyra-flash-9` | +| prompt | string | 是 | 中文图像生成提示词 | +| size | string | 否 | 图片尺寸,默认 `1792x1024` | +| response_format | string | 否 | 响应格式,推荐 `b64_json` | + +**响应体** +```json +{ + "created": 1641234567, + "data": [ + { + "b64_json": "base64编码的图片数据" + } + ] +} +``` + +### 1.3 图生图接口 + +**端点** +``` +POST /images/edits +``` + +**请求体** +```json +{ + "model": "lyra-flash-9", + "prompt": "中文编辑指令", + "image": "data:image/png;base64,{base64数据}", + "size": "1792x1024", + "response_format": "b64_json" +} +``` + +**参数说明** + +| 参数 | 类型 | 必填 | 说明 | +|-----|------|-----|------| +| model | string | 是 | 固定使用 `lyra-flash-9` | +| prompt | string | 是 | 中文图片编辑指令 | +| image | string | 是 | Base64 编码的参考图片(含 data URL 前缀) | +| size | string | 否 | 输出尺寸 | +| response_format | string | 否 | 响应格式 | + +**响应体** +```json +{ + "data": [ + { + "b64_json": "base64编码的生成图片" + } + ] +} +``` + +--- + +## 二、Qwen2.5-VL API(视觉识别) + +### 2.1 基础配置 + +| 配置项 | 值 | +|-------|-----| +| Base URL | `${IMAGE_API_BASE_URL}` | +| Model | `qwen2.5-vl-72b-instruct` | +| 认证方式 | Bearer Token | + +### 2.2 图生文接口 + +**端点** +``` +POST /chat/completions +``` + +**请求头** +```json +{ + "Content-Type": "application/json", + "Authorization": "Bearer ${VISION_API_KEY}" +} +``` + +**请求体** +```json +{ + "model": "qwen2.5-vl-72b-instruct", + "messages": [ + { + "role": "user", + "content": [ + { + "type": "text", + "text": "请描述这张图片" + }, + { + "type": "image_url", + "image_url": { + "url": 
"data:image/jpeg;base64,{base64数据}" + } + } + ] + } + ], + "max_tokens": 2000, + "temperature": 0.7 +} +``` + +**参数说明** + +| 参数 | 类型 | 必填 | 说明 | +|-----|------|-----|------| +| model | string | 是 | 视觉模型名称 | +| messages | array | 是 | 消息列表,包含文本和图片 | +| max_tokens | int | 否 | 最大输出 token 数 | +| temperature | float | 否 | 温度参数(0-1) | + +**响应体** +```json +{ + "id": "chatcmpl-xxx", + "object": "chat.completion", + "created": 1641234567, + "choices": [ + { + "index": 0, + "message": { + "role": "assistant", + "content": "这是一张..." + }, + "finish_reason": "stop" + } + ], + "usage": { + "prompt_tokens": 100, + "completion_tokens": 50, + "total_tokens": 150 + } +} +``` + +--- + +## 三、错误码说明 + +| 状态码 | 说明 | 处理建议 | +|-------|------|---------| +| 400 | 请求参数错误 | 检查请求体格式和参数 | +| 401 | API 密钥无效 | 检查 API Key 是否正确 | +| 403 | 权限不足 | 检查 API Key 权限 | +| 429 | 请求频率限制 | 等待后重试 | +| 500 | 服务器内部错误 | 稍后重试 | +| 503 | 服务不可用 | 稍后重试 | + +--- + +## 四、最佳实践 + +### 4.1 超时设置 + +- 文生图:建议 120-180 秒 +- 图生图:建议 180-300 秒 +- 图生文:建议 60-120 秒 + +### 4.2 重试策略 + +建议实现指数退避重试: +1. 首次重试:等待 1 秒 +2. 第二次重试:等待 2 秒 +3. 第三次重试:等待 4 秒 + +### 4.3 图片格式 + +- 支持格式:PNG、JPG、JPEG、WebP、GIF +- 推荐格式:PNG(无损)或 JPEG(有损但体积小) +- 最大文件大小:建议不超过 4MB + +### 4.4 Base64 编码 + +图片必须使用完整的 Data URL 格式: +``` +data:image/png;base64,iVBORw0KGgo... +``` diff --git a/image-service/docs/prompt-guide.md b/image-service/docs/prompt-guide.md new file mode 100644 index 0000000..8cf5ac4 --- /dev/null +++ b/image-service/docs/prompt-guide.md @@ -0,0 +1,215 @@ +# 提示词指南 + +## 概述 + +本指南提供文生图、图生图和图生文三种场景的提示词编写规范和最佳实践。 + +--- + +## 一、文生图提示词 + +### 1.1 基本规则 + +1. **必须使用中文**撰写提示词 +2. 图片中的标题、说明、标签**必须为中文** +3. 默认尺寸为 **16:9(1792x1024)** +4. 结构化描述效果更好 + +### 1.2 标准模板 + +``` +[风格类型],[艺术效果],[分辨率]。 +标题:[中文标题]。 +视觉元素:[主体对象、结构、场景描述]。 +配色:[主色调方案]。 +类型:[具体类型]。 +``` + +### 1.3 推荐风格 + +| 风格 | 适用场景 | +|-----|---------| +| 信息图风格 | 数据展示、流程说明 | +| 数据可视化 | 图表、统计数据 | +| 手绘文字风格 | 笔记、教程 | +| 科技插画风 | 技术文章配图 | +| 扁平化设计 | UI/UX 展示 | +| 3D 渲染风格 | 产品展示 | + +### 1.4 示例 + +**信息图类** +``` +信息图风格插图,手绘文字风格,高清16:9。 +标题:AI技术发展趋势。 +视觉元素:中央AI芯片图标,周围连接云计算、大数据、机器学习图标。 +配色:科技蓝和白色。 +类型:信息图。 +``` + +**数据可视化类** +``` +数据可视化风格,中文标注,高清16:9。 +标题:2026年AI投资趋势。 +视觉元素:柱状图、增长箭头、美元符号。 +配色:金色和科技蓝。 +类型:数据可视化。 +``` + +**产品展示类** +``` +3D产品渲染风格,光影效果,高清16:9。 +标题:智能手表新品发布。 +视觉元素:手表主体居中,周围展示核心功能图标。 +配色:深空灰和玫瑰金。 +类型:产品展示。 +``` + +--- + +## 二、图生图提示词 + +### 2.1 基本规则 + +1. 明确指出**保留什么**和**修改什么** +2. 描述**目标风格**和**期望效果** +3. 提供具体的**细节要求** + +### 2.2 标准模板 + +``` +基于原图进行编辑,[编辑描述]。 +保持:[需要保留的元素]。 +修改:[需要修改的部分]。 +风格:[目标风格]。 +细节:[具体的细节要求]。 +``` + +### 2.3 编辑类型 + +| 类型 | 说明 | 示例 | +|-----|------|-----| +| 风格迁移 | 改变整体风格 | 转为油画风格 | +| 背景替换 | 更换背景 | 将背景改为海滩 | +| 元素添加 | 添加新元素 | 添加文字标题 | +| 元素删除 | 移除元素 | 删除背景人物 | +| 色调调整 | 改变颜色 | 转为暖色调 | +| 质量增强 | 提升质量 | 增加细节和清晰度 | + +### 2.4 示例 + +**风格迁移** +``` +基于原图进行编辑,将整体风格改为科技蓝色调的信息图。 +保持:主体元素和构图。 +修改:所有文字替换为中文标注,背景改为深蓝渐变。 +风格:现代科技感信息图。 +细节:添加数据流动效果和光点装饰。 +``` + +**人物编辑** +``` +基于原图进行编辑,将人物转换为3D科幻风格。 +保持:人物姿态和面部特征。 +修改:服装改为未来感战斗服,增加全息UI界面。 +风格:类似钢铁侠贾维斯系统。 +细节:添加蓝色全息光效和数据面板。 +``` + +**背景替换** +``` +基于原图进行编辑,替换背景为深色科技空间。 +保持:原图主体比例和清晰度。 +修改:背景完全替换,添加中文标题与数据标签。 +风格:深色科技风格。 +细节:背景添加星空和网格线条。 +``` + +--- + +## 三、图生文提示词 + +### 3.1 分析模式 + +| 模式 | 用途 | 提示词 | +|-----|------|-------| +| describe | 通用描述 | 详细描述图片内容 | +| ocr | 文字识别 | 识别图片中的所有文字 | +| chart | 图表分析 | 分析图表数据和趋势 | +| fashion | 穿搭分析 | 分析人物服装搭配 | +| product | 产品分析 | 分析产品特征 | +| scene | 场景分析 | 描述场景环境 | + +### 3.2 自定义提示词示例 + +**详细描述** +``` +请详细描述这张图片的内容,包括: +1. 人物特征和表情 +2. 服装样式和颜色 +3. 画面布局和构图 +4. 艺术风格或摄影风格 +5. 任何文字标注或说明 +6. 背景环境和其他细节 +``` + +**OCR识别** +``` +请仔细识别这张图片中的所有文字内容,包括: +1. 
标题和副标题 +2. 正文内容 +3. 图表标签 +4. 按钮文字 +5. 其他任何可见的文字 + +请按照文字在图片中的位置顺序,以清晰的格式输出识别结果。 +``` + +**图表分析** +``` +请分析这张图表的内容,包括: +1. 图表类型(柱状图、折线图、饼图等) +2. 主要数据趋势 +3. 关键数据点 +4. 图表标题和标签 +5. 数据的结论或洞察 + +请用中文详细描述图表传达的信息。 +``` + +**穿搭分析** +``` +请分析这张图片中人物的穿搭,包括: +1. 上装:款式、颜色、材质 +2. 下装:款式、颜色、材质 +3. 鞋履:类型、颜色 +4. 配饰:包包、帽子、眼镜、饰品等 +5. 整体风格:休闲/商务/运动/时尚等 +6. 搭配建议和点评 +``` + +--- + +## 四、最佳实践 + +### 4.1 提示词优化技巧 + +1. **具体明确**:避免模糊描述,使用具体词汇 +2. **结构清晰**:使用分点或模板结构 +3. **重点突出**:将最重要的要求放在前面 +4. **适度详细**:提供足够细节但不要过于冗长 + +### 4.2 常见问题 + +| 问题 | 原因 | 解决方案 | +|-----|------|---------| +| 生成结果与描述不符 | 提示词不够具体 | 添加更多细节描述 | +| 中文显示异常 | 未强调中文要求 | 明确指定"中文标注" | +| 风格不统一 | 风格描述模糊 | 使用具体的风格参考 | +| 元素缺失 | 未明确列出元素 | 逐一列出所需元素 | + +### 4.3 提示词长度建议 + +- 文生图:100-300 字 +- 图生图:50-200 字 +- 图生文:50-150 字 diff --git a/image-service/references/color-sync-guide.md b/image-service/references/color-sync-guide.md new file mode 100644 index 0000000..7426556 --- /dev/null +++ b/image-service/references/color-sync-guide.md @@ -0,0 +1,76 @@ +# 配色协同机制 + +当 image-service 与其他 skill 配合使用时(如 pptx、docx、obsidian 等),**必须感知上下文配色方案并自动适配**,确保生成的图片与目标载体风格统一。 + +## 协同原则 + +1. **主动感知**:生成配图前,先确认目标载体的配色方案 +2. **自动适配**:将配色信息融入图片生成提示词 +3. **风格统一**:背景色、主色调、强调色保持一致 + +## 配色来源优先级 + +| 优先级 | 来源 | 说明 | +|-------|------|------| +| 1 | 用户明确指定 | 用户直接提供的颜色值 | +| 2 | 当前任务上下文 | 正在制作的 PPT/文档的配色方案 | +| 3 | 项目配置文件 | `.design/palette.json` 或类似配置 | +| 4 | 默认风格 | 手绘白底风格(无特殊要求时) | + +## 与 PPTX 协同 + +制作 PPT 配图时,从 pptx skill 的设计方案中提取配色: + +```markdown +# 示例:PPT 配色方案 +- 背景色:#181B24(深蓝黑) +- 主色:#B165FB(紫色) +- 辅助色:#40695B(翡翠绿) +- 文字色:#FFFFFF / #AAAAAA +``` + +生成图片时,将配色融入提示词: + +```bash +# 错误示例(不考虑配色) +python scripts/text_to_image.py "流程图,用户路径变化" -r 16:9 + +# 正确示例(融入配色) +python scripts/text_to_image.py "信息图风格,深色背景#181B24,科技感流程图。用紫色#B165FB和翡翠绿#40695B作为强调色,展示用户路径变化,发光线条风格,中文标签" -r 16:9 +``` + +## 与其他 Skill 协同 + +| 目标载体 | 配色来源 | 适配要点 | +|---------|---------|---------| +| **PPTX** | HTML slides 的 CSS 配色 | 背景色、强调色、文字色统一 | +| **DOCX** | 文档主题色或用户指定 | 配合文档正式/活泼风格 | +| **Obsidian** | Vault 主题(深色/浅色) | 适配笔记阅读体验 | +| **小红书** | 品牌色或内容调性 | 竖版 3:4,吸睛配色 | +| **调研报告** | 统一手绘风格 | 使用 research_image.py 预设 | + +## 配色提示词模板 + +``` +信息图风格,{背景描述}背景{背景色},{风格描述}。 +使用{主色}作为主色调,{辅助色}作为辅助色。 +{内容描述},{视觉风格},中文标签。 +``` + +**示例**: +``` +信息图风格,深色背景#181B24,科技感对比图。 +使用紫色#B165FB作为主色调,翡翠绿#40695B作为辅助色。 +左侧展示SEO特点,右侧展示GEO特点,发光连接线风格,中文标签。 +``` + +## Agent 执行规范 + +1. **识别协同场景**:检测是否在为其他 skill 生成配图 +2. **提取配色方案**:从上下文/HTML/配置中获取颜色值 +3. **构建适配提示词**:将配色信息自然融入生成描述 +4. **验证风格一致**:生成后确认与目标载体视觉协调 + +## 协同执行流程 + +1. 确认目标载体 → 2. 提取配色方案 → 3. 融入提示词 → 4. 生成适配图片 diff --git a/image-service/references/long-image-guide.md b/image-service/references/long-image-guide.md new file mode 100644 index 0000000..e4e2f4a --- /dev/null +++ b/image-service/references/long-image-guide.md @@ -0,0 +1,135 @@ +# 长图生成规范 + +生成需要拼接的长图时,采用**叠罗汉式串行生成**,每张图参考上一张图生成,确保风格一致、衔接自然。 + +## 铁律:执行前必须分析+确认 + +**收到长图需求后,禁止直接开始生成!必须先完成以下步骤:** + +### 第一步:分析提示词结构 + +仔细阅读提示词,识别以下信息: +1. **分屏数量**:提示词中有几个明确的段落/模块? +2. **每屏内容**:每一屏具体要展示什么? +3. **全局风格**:色调、风格、光影等统一要素 +4. **衔接元素**:段落之间用什么元素过渡? + +### 第二步:输出分屏规划表 + +必须用表格形式输出规划,让用户一目了然: + +```markdown +| 屏数 | 内容概要 | 关键元素 | +|-----|---------|---------| +| 1 | 主视觉+标题 | xxx | +| 2 | xxx特写 | xxx | +| ... | ... | ... 
| + +**全局风格**:xxx风格、xxx色调、xxx布光 +**输出比例**:3:4 +**预计生成**:N张图 → 拼接为长图 +``` + +### 第三步:等待用户确认 + +**必须等用户说"OK"、"开始"、"没问题"后才能开始生成!** + +用户可能会: +- 调整分屏数量 +- 修改某屏内容 +- 补充遗漏的要素 + +## 核心原则:叠罗汉式串行生成 + +**为什么用串行而不是并发?** +- 每张图的顶部颜色需要与上一张图的底部颜色衔接 +- 只有等上一张图生成完成,才能提取其底部色调 +- 串行生成确保每一屏之间的过渡自然无缝 + +**为什么参考上一张而不是首图?** +- 参考首图会导致中间屏幕风格跳跃 +- 叠罗汉式参考让风格逐屏延续,过渡更平滑 +- 每张图只需关心与相邻图的衔接 + +## 生成前校验清单 + +| 检查项 | 要求 | 示例 | +|-------|------|------| +| **比例统一** | 所有分图使用相同 `-r` 参数 | 全部 `-r 3:4` | +| **风格描述统一** | 使用相同的风格关键词 | 全部 `电影级美食摄影风格` | +| **色调统一** | 定义主色调范围 | 全部 `深红色、暖棕色、金色` | + +## Agent 执行流程(铁律) + +``` +1. 收到长图需求 +2. 【分析】仔细阅读提示词,识别分屏结构 +3. 【规划】输出分屏规划表(表格形式) +4. 【确认】等待用户确认后才开始生成(铁律!) +5. 定义全局风格变量(主色调、风格词) +6. 串行生成每一屏: + a. 首屏:用 text_to_image.py 生成,定调 + b. 第2屏:用 image_to_image.py 参考第1屏生成 + c. 第3屏:用 image_to_image.py 参考第2屏生成 + d. 以此类推...每屏参考上一屏 +7. 每屏生成后等待完成,再生成下一屏(串行,不可并发) +8. 全部完成后,使用 --blend 20 拼接输出 +``` + +## 图生图 Prompt 规范 + +**核心要点:顶部衔接上一张底部** + +后续图片的 prompt 必须包含: +1. **顶部衔接声明**:明确顶部颜色/氛围与上一张底部衔接 +2. **风格继承**:参考上一张图的整体风格、光影 +3. **本屏内容**:描述当前屏幕要展示的内容 + +**Prompt 模板:** +``` +参考模板图的整体风格、色调和光影氛围。本屏顶部与上一屏底部自然衔接。{本屏具体内容描述} +``` + +**更精确的写法(推荐):** +``` +参考模板图的{风格}、{色调}、{光影}。顶部延续上一屏底部的{颜色/氛围}。{本屏具体内容描述} +``` + +## 分屏位置规范 + +| 位置 | 处理方式 | +|------|---------| +| **首屏** | 顶部正常开始,底部内容自然过渡(无需刻意留白) | +| **中间屏** | 顶部衔接上一屏底部颜色,底部内容自然过渡 | +| **尾屏** | 顶部衔接上一屏底部颜色,底部正常收尾 | + +**关键:不要预留固定百分比的留白区域,让内容自然过渡即可** + +## 执行示例 + +```bash +# 步骤1:生成首屏(文生图,定调) +python .opencode/skills/image-service/scripts/text_to_image.py "高端美食摄影风格,深红暖棕金色调,电影级布光..." -r 3:4 -o 01_hero.png +# 等待完成 + +# 步骤2:生成第2屏(参考第1屏) +python .opencode/skills/image-service/scripts/image_to_image.py 01_hero.png "参考模板图的美食摄影风格、深红暖棕色调、电影级布光。顶部延续上一屏底部的暖色氛围。本屏内容:酥皮特写..." -r 3:4 -o 02_crisp.png +# 等待完成 + +# 步骤3:生成第3屏(参考第2屏) +python .opencode/skills/image-service/scripts/image_to_image.py 02_crisp.png "参考模板图的美食摄影风格、深红暖棕色调、电影级布光。顶部延续上一屏底部的色调。本屏内容:牛排特写..." -r 3:4 -o 03_tenderloin.png +# 等待完成 + +# ...以此类推 + +# 最后:拼接(推荐 blend 20) +python .opencode/skills/image-service/scripts/merge_long_image.py 01_hero.png 02_crisp.png 03_tenderloin.png ... -o final.png --blend 20 +``` + +## 铁律 + +1. **必须串行生成**:每屏生成完成后再生成下一屏,禁止并发 +2. **叠罗汉式参考**:第N屏参考第N-1屏,不是全部参考首屏 +3. **顶部衔接**:每屏的顶部颜色/氛围必须与上一屏底部衔接 +4. **不留固定留白**:不要预留4%/8%等固定留白,让内容自然过渡 +5. **脚本区分**:首屏用 `text_to_image.py`,后续全部用 `image_to_image.py` diff --git a/image-service/references/text-rendering-guide.md b/image-service/references/text-rendering-guide.md new file mode 100644 index 0000000..bbaf6b0 --- /dev/null +++ b/image-service/references/text-rendering-guide.md @@ -0,0 +1,41 @@ +# 文字清晰规范 + +生成包含中文文字的图片时,**必须在 prompt 末尾追加文字清晰指令**,确保文字可读、无乱码。 + +## 文字清晰后缀(必加) + +``` +【文字渲染要求】 +- 所有中文文字必须清晰可读,笔画完整,无模糊、无乱码、无伪文字 +- 文字边缘锐利,呈现印刷级清晰度,彻底消除压缩噪点与边缘溢色 +- 字体风格统一,字距适中,排版规整 +- 严禁出现无法阅读的乱码字符或残缺笔画 +``` + +## 完整 Prompt 结构 + +``` +{风格描述}。{内容描述}。{布局描述}。 + +【文字渲染要求】 +- 所有中文文字必须清晰可读,笔画完整,无模糊、无乱码、无伪文字 +- 文字边缘锐利,呈现印刷级清晰度 +- 字体风格统一,排版规整 +``` + +## 生成后校验流程 + +1. 生成图片后,用 `image_to_text.py -m ocr` 校验文字是否清晰 +2. 如果 OCR 识别结果与预期文字不符,使用图生图迭代修复 +3. 修复 prompt 使用以下模板 + +## 文字修复 Prompt(图生图迭代修复用) + +``` +执行语意级图像重构。针对图中模糊或乱码的文字区域进行修复: +1. 保持原图的版面配置、物体座标、配色风格完全不变 +2. 将模糊文字修复为清晰的简体中文:{预期文字内容} +3. 文字笔画必须呈现印刷级清晰度,边缘锐利,无压缩噪点 +4. 
严禁产生无法阅读的伪文字或乱码 +直接输出修复后的图像。 +``` diff --git a/image-service/scripts/image_to_image.py b/image-service/scripts/image_to_image.py new file mode 100644 index 0000000..57e963a --- /dev/null +++ b/image-service/scripts/image_to_image.py @@ -0,0 +1,273 @@ +#!/usr/bin/env python3 +""" +图生图脚本 (Image-to-Image) +使用 Lyra Flash API 基于参考图片和中文指令进行图片编辑 + +Author: 翟星人 +""" + +import httpx +import base64 +import json +import os +from typing import Dict, Any, Optional, Union +from pathlib import Path + +VALID_ASPECT_RATIOS = [ + "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9" +] + +VALID_SIZES = [ + "1024x1024", + "1536x1024", "1792x1024", "1344x768", "1248x832", "1184x864", "1152x896", "1536x672", + "1024x1536", "1024x1792", "768x1344", "832x1248", "864x1184", "896x1152" +] + +RATIO_TO_SIZE = { + "1:1": "1024x1024", + "2:3": "832x1248", + "3:2": "1248x832", + "3:4": "1024x1536", + "4:3": "1536x1024", + "4:5": "864x1184", + "5:4": "1184x864", + "9:16": "1024x1792", + "16:9": "1792x1024", + "21:9": "1536x672" +} + + +class ImageToImageEditor: + """图生图编辑器""" + + def __init__(self, config: Optional[Dict[str, str]] = None): + """ + 初始化编辑器 + + Args: + config: 配置字典,包含 api_key, base_url, model + 如果不传则从环境变量或配置文件读取 + """ + if config is None: + config = self._load_config() + + self.api_key = config.get('api_key') or config.get('IMAGE_API_KEY') + self.base_url = config.get('base_url') or config.get('IMAGE_API_BASE_URL') + self.model = config.get('model') or config.get('IMAGE_MODEL') or 'lyra-flash-9' + + if not self.api_key or not self.base_url: + raise ValueError("缺少必要的 API 配置:api_key 和 base_url") + + def _load_config(self) -> Dict[str, str]: + """从配置文件或环境变量加载配置""" + config = {} + + # 尝试从配置文件加载 + config_path = Path(__file__).parent.parent / 'config' / 'settings.json' + if config_path.exists(): + with open(config_path, 'r', encoding='utf-8') as f: + settings = json.load(f) + api_config = settings.get('image_api', {}) + config['api_key'] = api_config.get('key') + config['base_url'] = api_config.get('base_url') + config['model'] = api_config.get('model') + + # 环境变量优先级更高 + config['api_key'] = os.getenv('IMAGE_API_KEY', config.get('api_key')) + config['base_url'] = os.getenv('IMAGE_API_BASE_URL', config.get('base_url')) + config['model'] = os.getenv('IMAGE_MODEL', config.get('model')) + + return config + + @staticmethod + def image_to_base64(image_path: str, with_prefix: bool = True) -> str: + """ + 将图片文件转换为 base64 编码 + + Args: + image_path: 图片文件路径 + with_prefix: 是否添加 data URL 前缀 + + Returns: + base64 编码字符串 + """ + path = Path(image_path) + if not path.exists(): + raise FileNotFoundError(f"图片文件不存在: {image_path}") + + # 获取 MIME 类型 + suffix = path.suffix.lower() + mime_types = { + '.jpg': 'image/jpeg', + '.jpeg': 'image/jpeg', + '.png': 'image/png', + '.gif': 'image/gif', + '.webp': 'image/webp' + } + mime_type = mime_types.get(suffix, 'image/png') + + with open(image_path, 'rb') as f: + b64_str = base64.b64encode(f.read()).decode('utf-8') + + if with_prefix: + return f"data:{mime_type};base64,{b64_str}" + return b64_str + + def edit( + self, + image: Union[str, bytes], + prompt: str, + aspect_ratio: Optional[str] = None, + size: Optional[str] = None, + output_path: Optional[str] = None, + response_format: str = "b64_json" + ) -> Dict[str, Any]: + """ + 编辑图片 + + Args: + image: 图片路径或 base64 字符串 + prompt: 中文编辑指令 + aspect_ratio: 宽高比 (如 3:4, 16:9) + size: 传统尺寸 (如 1024x1792) + output_path: 输出文件路径 + response_format: 响应格式 + + Returns: + 包含编辑结果的字典 + """ + # 处理图片输入 + if isinstance(image, str): + if 
os.path.isfile(image): + image_b64 = self.image_to_base64(image) + elif image.startswith('data:'): + image_b64 = image + else: + # 假设是纯 base64 字符串 + image_b64 = f"data:image/png;base64,{image}" + else: + image_b64 = f"data:image/png;base64,{base64.b64encode(image).decode('utf-8')}" + + payload: Dict[str, Any] = { + "model": self.model, + "prompt": prompt, + "image": image_b64, + "response_format": response_format + } + + # 确定尺寸:优先用 aspect_ratio 映射,其次用 size + if aspect_ratio: + payload["size"] = RATIO_TO_SIZE.get(aspect_ratio, "1024x1536") + elif size: + payload["size"] = size + else: + payload["size"] = "1024x1536" # 默认 3:4 + + headers = { + "Content-Type": "application/json", + "Authorization": f"Bearer {self.api_key}" + } + + try: + with httpx.Client(timeout=180.0) as client: + response = client.post( + f"{self.base_url}/images/edits", + headers=headers, + json=payload + ) + response.raise_for_status() + result = response.json() + + # 如果指定了输出路径,保存图片 + if output_path and result.get("data"): + b64_data = result["data"][0].get("b64_json") + if b64_data: + self._save_image(b64_data, output_path) + result["saved_path"] = output_path + + return { + "success": True, + "data": result, + "saved_path": output_path if output_path else None + } + + except httpx.HTTPStatusError as e: + return { + "success": False, + "error": f"HTTP 错误: {e.response.status_code}", + "detail": str(e) + } + except Exception as e: + return { + "success": False, + "error": "编辑失败", + "detail": str(e) + } + + def _save_image(self, b64_data: str, output_path: str) -> None: + """保存 base64 图片到文件""" + image_data = base64.b64decode(b64_data) + Path(output_path).parent.mkdir(parents=True, exist_ok=True) + with open(output_path, 'wb') as f: + f.write(image_data) + + +def main(): + """命令行入口""" + import argparse + import time + + parser = argparse.ArgumentParser( + description='图生图编辑工具', + formatter_class=argparse.RawDescriptionHelpFormatter, + epilog=f''' +尺寸参数说明: + -r/--ratio 宽高比(推荐),支持: {", ".join(VALID_ASPECT_RATIOS)} + -s/--size 传统尺寸,支持: {", ".join(VALID_SIZES[:4])}... 
+ +示例: + python image_to_image.py input.png "编辑描述" -r 3:4 + python image_to_image.py input.png "编辑描述" -s 1024x1536 +''' + ) + parser.add_argument('image', help='输入图片路径') + parser.add_argument('prompt', help='中文编辑指令') + parser.add_argument('-o', '--output', help='输出文件路径(默认保存到当前目录)') + parser.add_argument('-r', '--ratio', help=f'宽高比(推荐)。可选: {", ".join(VALID_ASPECT_RATIOS)}') + parser.add_argument('-s', '--size', help='传统尺寸,如 1024x1536') + + args = parser.parse_args() + + if args.ratio and args.ratio not in VALID_ASPECT_RATIOS: + print(f"错误: 不支持的宽高比 '{args.ratio}'") + print(f"支持的宽高比: {', '.join(VALID_ASPECT_RATIOS)}") + return + + if args.size and args.size not in VALID_SIZES: + print(f"警告: 尺寸 '{args.size}' 可能不被支持") + + output_path = args.output + if not output_path: + timestamp = time.strftime("%Y%m%d_%H%M%S") + output_path = f"edited_{timestamp}.png" + + editor = ImageToImageEditor() + result = editor.edit( + image=args.image, + prompt=args.prompt, + aspect_ratio=args.ratio, + size=args.size, + output_path=output_path + ) + + if result["success"]: + print(f"编辑成功!") + if result.get("saved_path"): + print(f"图片已保存到: {result['saved_path']}") + else: + print(f"编辑失败: {result['error']}") + print(f"详情: {result.get('detail', 'N/A')}") + + +if __name__ == "__main__": + main() diff --git a/image-service/scripts/image_to_text.py b/image-service/scripts/image_to_text.py new file mode 100644 index 0000000..4ae66df --- /dev/null +++ b/image-service/scripts/image_to_text.py @@ -0,0 +1,287 @@ +#!/usr/bin/env python3 +""" +图生文脚本 (Image-to-Text) - 视觉识别 +使用 Qwen2.5-VL 模型分析图片内容并生成文字描述 + +Author: 翟星人 +""" + +import httpx +import base64 +import json +import os +from typing import Dict, Any, Optional, Union, List +from pathlib import Path + + +class ImageToTextAnalyzer: + """图生文分析器 - 视觉识别""" + + # 预定义的分析模式 + ANALYSIS_MODES = { + "describe": "请详细描述这张图片的内容,包括:人物、场景、物品、颜色、布局等所有细节。", + "ocr": "请仔细识别这张图片中的所有文字内容,按照文字在图片中的位置顺序输出。如果是中文,请保持原文输出。", + "chart": "请分析这张图表的内容,包括:图表类型、数据趋势、关键数据点、标题标签、以及数据的结论或洞察。", + "fashion": "请分析这张图片中人物的穿搭,包括:服装款式、颜色搭配、配饰、整体风格等。", + "product": "请分析这张产品图片,包括:产品类型、外观特征、功能特点、品牌信息等。", + "scene": "请描述这张图片的场景,包括:地点、环境、氛围、时间(白天/夜晚)等。" + } + + def __init__(self, config: Optional[Dict[str, str]] = None): + """ + 初始化分析器 + + Args: + config: 配置字典,包含 api_key, base_url, model + 如果不传则从环境变量或配置文件读取 + """ + if config is None: + config = self._load_config() + + self.api_key = config.get('api_key') or config.get('VISION_API_KEY') or config.get('IMAGE_API_KEY') + self.base_url = config.get('base_url') or config.get('VISION_API_BASE_URL') or config.get('IMAGE_API_BASE_URL') + self.model = config.get('model') or config.get('VISION_MODEL') or 'qwen2.5-vl-72b-instruct' + + if not self.api_key or not self.base_url: + raise ValueError("缺少必要的 API 配置:api_key 和 base_url") + + def _load_config(self) -> Dict[str, str]: + """从配置文件或环境变量加载配置""" + config = {} + + # 尝试从配置文件加载 + config_path = Path(__file__).parent.parent / 'config' / 'settings.json' + if config_path.exists(): + with open(config_path, 'r', encoding='utf-8') as f: + settings = json.load(f) + # 优先使用 vision_api 配置 + vision_config = settings.get('vision_api', {}) + if vision_config: + config['api_key'] = vision_config.get('key') + config['base_url'] = vision_config.get('base_url') + config['model'] = vision_config.get('model') + else: + # 回退到 image_api 配置 + api_config = settings.get('image_api', {}) + config['api_key'] = api_config.get('key') + config['base_url'] = api_config.get('base_url') + + # 环境变量优先级更高 + config['api_key'] = os.getenv('VISION_API_KEY', 
os.getenv('IMAGE_API_KEY', config.get('api_key'))) + config['base_url'] = os.getenv('VISION_API_BASE_URL', os.getenv('IMAGE_API_BASE_URL', config.get('base_url'))) + config['model'] = os.getenv('VISION_MODEL', config.get('model', 'qwen2.5-vl-72b-instruct')) + + return config + + @staticmethod + def image_to_base64(image_path: str) -> str: + """ + 将图片文件转换为 base64 编码(带 data URL 前缀) + + Args: + image_path: 图片文件路径 + + Returns: + base64 编码字符串(含 data URL 前缀) + """ + path = Path(image_path) + if not path.exists(): + raise FileNotFoundError(f"图片文件不存在: {image_path}") + + # 获取 MIME 类型 + suffix = path.suffix.lower() + mime_types = { + '.jpg': 'image/jpeg', + '.jpeg': 'image/jpeg', + '.png': 'image/png', + '.gif': 'image/gif', + '.webp': 'image/webp' + } + mime_type = mime_types.get(suffix, 'image/png') + + with open(image_path, 'rb') as f: + b64_str = base64.b64encode(f.read()).decode('utf-8') + + return f"data:{mime_type};base64,{b64_str}" + + def analyze( + self, + image: Union[str, bytes], + prompt: Optional[str] = None, + mode: str = "describe", + max_tokens: int = 2000, + temperature: float = 0.7 + ) -> Dict[str, Any]: + """ + 分析图片并生成文字描述 + + Args: + image: 图片路径、URL 或 base64 字符串 + prompt: 自定义分析提示词(如果提供则忽略 mode) + mode: 分析模式 (describe/ocr/chart/fashion/product/scene) + max_tokens: 最大输出 token 数 + temperature: 温度参数 + + Returns: + 包含分析结果的字典 + """ + # 确定使用的提示词 + if prompt is None: + prompt = self.ANALYSIS_MODES.get(mode, self.ANALYSIS_MODES["describe"]) + + # 处理图片输入 + if isinstance(image, str): + if os.path.isfile(image): + image_url = self.image_to_base64(image) + elif image.startswith('data:') or image.startswith('http'): + image_url = image + else: + # 假设是纯 base64 字符串 + image_url = f"data:image/png;base64,{image}" + else: + image_url = f"data:image/png;base64,{base64.b64encode(image).decode('utf-8')}" + + # 构建请求 + payload = { + "model": self.model, + "messages": [ + { + "role": "user", + "content": [ + { + "type": "text", + "text": prompt + }, + { + "type": "image_url", + "image_url": { + "url": image_url + } + } + ] + } + ], + "max_tokens": max_tokens, + "temperature": temperature + } + + headers = { + "Content-Type": "application/json", + "Authorization": f"Bearer {self.api_key}" + } + + try: + with httpx.Client(timeout=120.0) as client: + response = client.post( + f"{self.base_url}/chat/completions", + headers=headers, + json=payload + ) + response.raise_for_status() + result = response.json() + + # 提取文本内容 + content = result.get("choices", [{}])[0].get("message", {}).get("content", "") + + return { + "success": True, + "content": content, + "mode": mode, + "usage": result.get("usage", {}) + } + + except httpx.HTTPStatusError as e: + return { + "success": False, + "error": f"HTTP 错误: {e.response.status_code}", + "detail": str(e) + } + except Exception as e: + return { + "success": False, + "error": "分析失败", + "detail": str(e) + } + + def describe(self, image: Union[str, bytes]) -> Dict[str, Any]: + """通用图片描述""" + return self.analyze(image, mode="describe") + + def ocr(self, image: Union[str, bytes]) -> Dict[str, Any]: + """文字识别 (OCR)""" + return self.analyze(image, mode="ocr") + + def analyze_chart(self, image: Union[str, bytes]) -> Dict[str, Any]: + """图表分析""" + return self.analyze(image, mode="chart") + + def analyze_fashion(self, image: Union[str, bytes]) -> Dict[str, Any]: + """穿搭分析""" + return self.analyze(image, mode="fashion") + + def analyze_product(self, image: Union[str, bytes]) -> Dict[str, Any]: + """产品分析""" + return self.analyze(image, mode="product") + + def analyze_scene(self, 
image: Union[str, bytes]) -> Dict[str, Any]: + """场景分析""" + return self.analyze(image, mode="scene") + + def batch_analyze( + self, + images: List[str], + mode: str = "describe" + ) -> List[Dict[str, Any]]: + """ + 批量分析多张图片 + + Args: + images: 图片路径列表 + mode: 分析模式 + + Returns: + 分析结果列表 + """ + results = [] + for image in images: + result = self.analyze(image, mode=mode) + result["image"] = image + results.append(result) + return results + + +def main(): + """命令行入口""" + import argparse + + parser = argparse.ArgumentParser(description='图生文分析工具(视觉识别)') + parser.add_argument('image', help='输入图片路径') + parser.add_argument('-m', '--mode', default='describe', + choices=['describe', 'ocr', 'chart', 'fashion', 'product', 'scene'], + help='分析模式') + parser.add_argument('-p', '--prompt', help='自定义分析提示词') + parser.add_argument('--max-tokens', type=int, default=2000, help='最大输出 token 数') + + args = parser.parse_args() + + analyzer = ImageToTextAnalyzer() + result = analyzer.analyze( + image=args.image, + prompt=args.prompt, + mode=args.mode, + max_tokens=args.max_tokens + ) + + if result["success"]: + print(f"\n=== 分析结果 ({result['mode']}) ===\n") + print(result["content"]) + print(f"\n=== Token 使用 ===") + print(f"输入: {result['usage'].get('prompt_tokens', 'N/A')}") + print(f"输出: {result['usage'].get('completion_tokens', 'N/A')}") + else: + print(f"分析失败: {result['error']}") + print(f"详情: {result.get('detail', 'N/A')}") + + +if __name__ == "__main__": + main() diff --git a/image-service/scripts/merge_long_image.py b/image-service/scripts/merge_long_image.py new file mode 100644 index 0000000..a5dead9 --- /dev/null +++ b/image-service/scripts/merge_long_image.py @@ -0,0 +1,251 @@ +#!/usr/bin/env python3 +""" +长图拼接脚本 (Merge Long Image) +将多张图片按顺序垂直拼接成一张微信长图 + +Author: 翟星人 +""" + +import argparse +import os +import glob as glob_module +from pathlib import Path +from typing import List, Optional, Dict, Any + +from PIL import Image +import numpy as np + + +class LongImageMerger: + """长图拼接器""" + + def __init__(self, target_width: int = 1080): + """ + 初始化拼接器 + + Args: + target_width: 目标宽度,默认1080(微信推荐宽度) + """ + self.target_width = target_width + + def _blend_images(self, img_top: Image.Image, img_bottom: Image.Image, blend_height: int) -> Image.Image: + """ + 在两张图的接缝处创建渐变融合过渡 + + Args: + img_top: 上方图片 + img_bottom: 下方图片 + blend_height: 融合区域高度(像素) + + Returns: + 融合后的下方图片(顶部已与上方图片底部融合) + """ + blend_height = min(blend_height, img_top.height // 4, img_bottom.height // 4) + + top_region = img_top.crop((0, img_top.height - blend_height, img_top.width, img_top.height)) + bottom_region = img_bottom.crop((0, 0, img_bottom.width, blend_height)) + + top_array = np.array(top_region, dtype=np.float32) + bottom_array = np.array(bottom_region, dtype=np.float32) + + alpha = np.linspace(1, 0, blend_height).reshape(-1, 1, 1) + + blended_array = top_array * alpha + bottom_array * (1 - alpha) + blended_array = np.clip(blended_array, 0, 255).astype(np.uint8) + + blended_region = Image.fromarray(blended_array) + + result = img_bottom.copy() + result.paste(blended_region, (0, 0)) + + return result + + def merge( + self, + image_paths: List[str], + output_path: str, + gap: int = 0, + background_color: str = "white", + blend: int = 0 + ) -> Dict[str, Any]: + """ + 拼接多张图片为长图 + + Args: + image_paths: 图片路径列表,按顺序拼接 + output_path: 输出文件路径 + gap: 图片之间的间隔像素,默认0 + background_color: 背景颜色,默认白色 + blend: 接缝融合过渡区域高度(像素),默认0不融合,推荐30-50 + + Returns: + 包含拼接结果的字典 + """ + if not image_paths: + return {"success": False, "error": "没有提供图片路径"} + + valid_paths = 
[] + for p in image_paths: + if os.path.exists(p): + valid_paths.append(p) + else: + print(f"警告: 文件不存在,跳过 - {p}") + + if not valid_paths: + return {"success": False, "error": "没有有效的图片文件"} + + try: + imgs = [Image.open(p) for p in valid_paths] + + resized_imgs = [] + for img in imgs: + if img.mode in ('RGBA', 'P'): + img = img.convert('RGB') + ratio = self.target_width / img.width + new_height = int(img.height * ratio) + resized = img.resize((self.target_width, new_height), Image.Resampling.LANCZOS) + resized_imgs.append(resized) + + if blend > 0 and len(resized_imgs) > 1: + for i in range(1, len(resized_imgs)): + resized_imgs[i] = self._blend_images(resized_imgs[i-1], resized_imgs[i], blend) + + total_height = sum(img.height for img in resized_imgs) + gap * (len(resized_imgs) - 1) + + long_image = Image.new('RGB', (self.target_width, total_height), background_color) + + y_offset = 0 + for img in resized_imgs: + long_image.paste(img, (0, y_offset)) + y_offset += img.height + gap + + Path(output_path).parent.mkdir(parents=True, exist_ok=True) + long_image.save(output_path, quality=95) + + for img in imgs: + img.close() + for img in resized_imgs: + img.close() + + return { + "success": True, + "saved_path": output_path, + "width": self.target_width, + "height": total_height, + "image_count": len(resized_imgs) + } + + except Exception as e: + return {"success": False, "error": str(e)} + + def merge_from_pattern( + self, + pattern: str, + output_path: str, + sort_by: str = "name", + gap: int = 0, + background_color: str = "white", + blend: int = 0 + ) -> Dict[str, Any]: + """ + 通过 glob 模式匹配图片并拼接 + + Args: + pattern: glob 模式,如 "*.png" 或 "generated_*.png" + output_path: 输出文件路径 + sort_by: 排序方式 - "name"(文件名) / "time"(修改时间) / "none"(不排序) + gap: 图片间隔 + background_color: 背景颜色 + blend: 接缝融合过渡高度 + + Returns: + 包含拼接结果的字典 + """ + image_paths = glob_module.glob(pattern) + + if not image_paths: + return {"success": False, "error": f"没有找到匹配 '{pattern}' 的图片"} + + if sort_by == "name": + image_paths.sort() + elif sort_by == "time": + image_paths.sort(key=lambda x: os.path.getmtime(x)) + + print(f"找到 {len(image_paths)} 张图片:") + for i, p in enumerate(image_paths, 1): + print(f" {i}. 
{os.path.basename(p)}") + + return self.merge(image_paths, output_path, gap, background_color, blend) + + +def main(): + """命令行入口""" + parser = argparse.ArgumentParser( + description='长图拼接工具 - 将多张图片垂直拼接成微信长图', + formatter_class=argparse.RawDescriptionHelpFormatter, + epilog=""" +示例用法: + # 拼接指定图片 + python merge_long_image.py img1.png img2.png img3.png -o output.png + + # 使用通配符匹配 + python merge_long_image.py -p "generated_*.png" -o long_image.png + + # 指定宽度和间隔 + python merge_long_image.py -p "*.png" -o out.png -w 750 -g 20 + + # 按修改时间排序 + python merge_long_image.py -p "*.png" -o out.png --sort time + + # 启用接缝融合过渡(推荐40px) + python merge_long_image.py img1.png img2.png -o out.png --blend 40 + """ + ) + + parser.add_argument('images', nargs='*', help='要拼接的图片路径列表') + parser.add_argument('-p', '--pattern', help='glob 模式匹配图片,如 "*.png"') + parser.add_argument('-o', '--output', required=True, help='输出文件路径') + parser.add_argument('-w', '--width', type=int, default=1080, help='目标宽度,默认1080') + parser.add_argument('-g', '--gap', type=int, default=0, help='图片间隔像素,默认0') + parser.add_argument('--sort', choices=['name', 'time', 'none'], default='name', + help='排序方式:name(文件名)/time(修改时间)/none') + parser.add_argument('--bg', default='white', help='背景颜色,默认 white') + parser.add_argument('--blend', type=int, default=0, + help='接缝融合过渡高度(像素),推荐30-50,默认0不融合') + + args = parser.parse_args() + + if not args.images and not args.pattern: + parser.error("请提供图片路径列表或使用 -p 指定匹配模式") + + merger = LongImageMerger(target_width=args.width) + + if args.pattern: + result = merger.merge_from_pattern( + pattern=args.pattern, + output_path=args.output, + sort_by=args.sort, + gap=args.gap, + background_color=args.bg, + blend=args.blend + ) + else: + result = merger.merge( + image_paths=args.images, + output_path=args.output, + gap=args.gap, + background_color=args.bg, + blend=args.blend + ) + + if result["success"]: + print(f"\n拼接成功!") + print(f"输出文件: {result['saved_path']}") + print(f"尺寸: {result['width']} x {result['height']}") + print(f"共 {result['image_count']} 张图片") + else: + print(f"\n拼接失败: {result['error']}") + + +if __name__ == "__main__": + main() diff --git a/image-service/scripts/research_image.py b/image-service/scripts/research_image.py new file mode 100644 index 0000000..af5f053 --- /dev/null +++ b/image-service/scripts/research_image.py @@ -0,0 +1,140 @@ +#!/usr/bin/env python3 +""" +调研报告专用信息图生成脚本 +预设手绘风格可视化模板,保持系列配图风格统一 + +Author: 翟星人 +""" + +import argparse +import subprocess +import sys +import os + +# 预设风格模板 - 手绘体可视化风格 +STYLE_TEMPLATES = { + "arch": { + "name": "架构图", + "prefix": "手绘风格技术架构信息图,简洁扁平设计,", + "suffix": "手绘线条感,柔和的科技蓝配色(#4A90D9),浅灰白色背景,模块化分层布局,圆角矩形框,手写体中文标签,简约图标,整体清新专业。", + "trigger": "核心架构、系统结构、技术栈、模块组成" + }, + "flow": { + "name": "流程图", + "prefix": "手绘风格流程信息图,简洁扁平设计,", + "suffix": "手绘线条和箭头,科技蓝(#4A90D9)主色调,浅绿色(#81C784)表示成功节点,浅橙色(#FFB74D)表示判断节点,浅灰白色背景,从上到下或从左到右布局,手写体中文标签,步骤清晰。", + "trigger": "流程、步骤、工作流、执行顺序" + }, + "compare": { + "name": "对比图", + "prefix": "手绘风格对比信息图,左右分栏设计,", + "suffix": "手绘线条感,左侧用柔和蓝色(#4A90D9),右侧用柔和橙色(#FF8A65),中间VS分隔,浅灰白色背景,手写体中文标签,对比项目清晰列出,简约图标点缀。", + "trigger": "对比、vs、区别、差异" + }, + "concept": { + "name": "概念图", + "prefix": "手绘风格概念信息图,中心发散设计,", + "suffix": "手绘线条感,中心主题用科技蓝(#4A90D9),周围要素用柔和的蓝紫渐变色系,浅灰白色背景,连接线条有手绘感,手写体中文标签,布局均衡美观。", + "trigger": "核心概念、要素组成、多个方面" + } +} + +# 基础路径 +BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) +TEXT_TO_IMAGE_SCRIPT = os.path.join(BASE_DIR, "scripts", "text_to_image.py") + + +def generate_image(style: str, title: str, content: str, 
output: str): + """ + 使用预设风格生成信息图 + + Args: + style: 风格类型 (arch/flow/compare/concept) + title: 图表标题 + content: 图表内容描述 + output: 输出路径 + """ + if style not in STYLE_TEMPLATES: + print(f"错误: 未知风格 '{style}'") + print(f"可用风格: {', '.join(STYLE_TEMPLATES.keys())}") + sys.exit(1) + + template = STYLE_TEMPLATES[style] + + # 组装完整提示词 + prompt = f"{template['prefix']}标题:{title},{content},{template['suffix']}" + + print(f"生成 {template['name']}: {title}") + print(f"风格: 手绘体可视化") + print(f"输出: {output}") + + # 调用 text_to_image.py + cmd = [ + sys.executable, + TEXT_TO_IMAGE_SCRIPT, + prompt, + "--output", output + ] + + result = subprocess.run(cmd, capture_output=False) + + if result.returncode != 0: + print(f"生成失败") + sys.exit(1) + + +def list_styles(): + """列出所有可用风格""" + print("可用风格模板(手绘体可视化):\n") + for key, template in STYLE_TEMPLATES.items(): + print(f" {key:10} - {template['name']}") + print(f" 触发场景: {template['trigger']}") + print() + + +def main(): + parser = argparse.ArgumentParser( + description="调研报告专用信息图生成(手绘风格)", + formatter_class=argparse.RawDescriptionHelpFormatter, + epilog=""" +示例: + # 生成架构图 + python research_image.py -t arch -n "Ralph Loop 核心架构" -c "展示 Prompt、Agent、Stop Hook、Files 四个模块的循环关系" -o images/arch.png + + # 生成流程图 + python research_image.py -t flow -n "Stop Hook 工作流程" -c "Agent尝试退出、Hook触发、检查条件、允许或阻止退出" -o images/flow.png + + # 生成对比图 + python research_image.py -t compare -n "ReAct vs Ralph Loop" -c "左侧ReAct自我评估停止,右侧Ralph外部Hook控制" -o images/compare.png + + # 生成概念图 + python research_image.py -t concept -n "状态持久化" -c "中心是Agent,周围是progress.txt、prd.json、Git历史、代码文件四个要素" -o images/concept.png + + # 查看所有风格 + python research_image.py --list + """ + ) + + parser.add_argument("-t", "--type", choices=list(STYLE_TEMPLATES.keys()), + help="图解类型: arch(架构图), flow(流程图), compare(对比图), concept(概念图)") + parser.add_argument("-n", "--name", help="图表标题") + parser.add_argument("-c", "--content", help="图表内容描述") + parser.add_argument("-o", "--output", help="输出文件路径") + parser.add_argument("--list", action="store_true", help="列出所有可用风格") + + args = parser.parse_args() + + if args.list: + list_styles() + return + + if not all([args.type, args.name, args.content, args.output]): + parser.print_help() + print("\n错误: 必须提供 -t, -n, -c, -o 参数") + sys.exit(1) + + generate_image(args.type, args.name, args.content, args.output) + + +if __name__ == "__main__": + main() diff --git a/image-service/scripts/text_to_image.py b/image-service/scripts/text_to_image.py new file mode 100644 index 0000000..b6ff833 --- /dev/null +++ b/image-service/scripts/text_to_image.py @@ -0,0 +1,350 @@ +#!/usr/bin/env python3 +""" +文生图脚本 (Text-to-Image) +使用 Lyra Flash API 根据中文文本描述生成图片 +支持参考图风格生成 + +Author: 翟星人 +""" + +import httpx +import base64 +import json +import os +from typing import Dict, Any, Optional, Union +from pathlib import Path + +VALID_ASPECT_RATIOS = [ + "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9" +] + +VALID_SIZES = [ + "1024x1024", + "1536x1024", "1792x1024", "1344x768", "1248x832", "1184x864", "1152x896", "1536x672", + "1024x1536", "1024x1792", "768x1344", "832x1248", "864x1184", "896x1152" +] + +RATIO_TO_SIZE = { + "1:1": "1024x1024", + "2:3": "832x1248", + "3:2": "1248x832", + "3:4": "1024x1536", + "4:3": "1536x1024", + "4:5": "864x1184", + "5:4": "1184x864", + "9:16": "1024x1792", + "16:9": "1792x1024", + "21:9": "1536x672" +} + + +class TextToImageGenerator: + """文生图生成器""" + + def __init__(self, config: Optional[Dict[str, str]] = None): + """ + 初始化生成器 + + Args: + config: 配置字典,包含 api_key, base_url, 
model + 如果不传则从环境变量或配置文件读取 + """ + if config is None: + config = self._load_config() + + self.api_key = config.get('api_key') or config.get('IMAGE_API_KEY') + self.base_url = config.get('base_url') or config.get('IMAGE_API_BASE_URL') + self.model = config.get('model') or config.get('IMAGE_MODEL') or 'lyra-flash-9' + + if not self.api_key or not self.base_url: + raise ValueError("缺少必要的 API 配置:api_key 和 base_url") + + def _load_config(self) -> Dict[str, str]: + """从配置文件或环境变量加载配置""" + config = {} + + config_path = Path(__file__).parent.parent / 'config' / 'settings.json' + if config_path.exists(): + with open(config_path, 'r', encoding='utf-8') as f: + settings = json.load(f) + api_config = settings.get('image_api', {}) + config['api_key'] = api_config.get('key') + config['base_url'] = api_config.get('base_url') + config['model'] = api_config.get('model') + + config['api_key'] = os.getenv('IMAGE_API_KEY', config.get('api_key')) + config['base_url'] = os.getenv('IMAGE_API_BASE_URL', config.get('base_url')) + config['model'] = os.getenv('IMAGE_MODEL', config.get('model')) + + return config + + @staticmethod + def image_to_base64(image_path: str, with_prefix: bool = True) -> str: + """将图片文件转换为 base64 编码""" + path = Path(image_path) + if not path.exists(): + raise FileNotFoundError(f"图片文件不存在: {image_path}") + + suffix = path.suffix.lower() + mime_types = { + '.jpg': 'image/jpeg', + '.jpeg': 'image/jpeg', + '.png': 'image/png', + '.gif': 'image/gif', + '.webp': 'image/webp' + } + mime_type = mime_types.get(suffix, 'image/png') + + with open(image_path, 'rb') as f: + b64_str = base64.b64encode(f.read()).decode('utf-8') + + if with_prefix: + return f"data:{mime_type};base64,{b64_str}" + return b64_str + + def generate( + self, + prompt: str, + size: Optional[str] = None, + aspect_ratio: Optional[str] = None, + image_size: Optional[str] = None, + output_path: Optional[str] = None, + response_format: str = "b64_json", + ref_image: Optional[str] = None + ) -> Dict[str, Any]: + """ + 生成图片 + + Args: + prompt: 中文图像描述提示词 + size: 图片尺寸 (如 1792x1024),与 aspect_ratio 二选一 + aspect_ratio: 宽高比 (如 16:9, 3:4),推荐使用 + image_size: 分辨率 (1K/2K/4K),仅 gemini-3.0-pro-image-preview 支持 + output_path: 输出文件路径,如果提供则保存图片 + response_format: 响应格式,默认 b64_json + ref_image: 参考图片路径,用于风格参考 + + Returns: + 包含生成结果的字典 + """ + if ref_image: + return self._generate_with_reference( + prompt=prompt, + ref_image=ref_image, + aspect_ratio=aspect_ratio, + size=size, + output_path=output_path, + response_format=response_format + ) + + payload: Dict[str, Any] = { + "model": self.model, + "prompt": prompt, + "response_format": response_format + } + + # 确定尺寸:优先用 aspect_ratio 映射,其次用 size + if aspect_ratio: + payload["size"] = RATIO_TO_SIZE.get(aspect_ratio, "1024x1024") + elif size: + payload["size"] = size + else: + payload["size"] = "1792x1024" # 默认 16:9 + + headers = { + "Content-Type": "application/json", + "Authorization": f"Bearer {self.api_key}" + } + + try: + with httpx.Client(timeout=180.0) as client: + response = client.post( + f"{self.base_url}/images/generations", + headers=headers, + json=payload + ) + response.raise_for_status() + result = response.json() + + if output_path and result.get("data"): + b64_data = result["data"][0].get("b64_json") + if b64_data: + self._save_image(b64_data, output_path) + result["saved_path"] = output_path + + return { + "success": True, + "data": result, + "saved_path": output_path if output_path else None + } + + except httpx.HTTPStatusError as e: + return { + "success": False, + "error": f"HTTP 错误: 
{e.response.status_code}", + "detail": str(e) + } + except Exception as e: + return { + "success": False, + "error": "生成失败", + "detail": str(e) + } + + def _generate_with_reference( + self, + prompt: str, + ref_image: str, + aspect_ratio: Optional[str] = None, + size: Optional[str] = None, + output_path: Optional[str] = None, + response_format: str = "b64_json" + ) -> Dict[str, Any]: + """ + 参考图片风格生成新图 + + Args: + prompt: 新图内容描述 + ref_image: 参考图片路径 + aspect_ratio: 宽高比 + size: 尺寸 + output_path: 输出路径 + response_format: 响应格式 + """ + image_b64 = self.image_to_base64(ref_image) + + enhanced_prompt = f"参考这张图片的背景风格、配色方案和视觉设计,保持完全一致的风格,生成新内容:{prompt}" + + # 确定尺寸:优先用 aspect_ratio 映射,其次用 size + if size is None: + size = RATIO_TO_SIZE.get(aspect_ratio, "1024x1792") if aspect_ratio else "1024x1792" + + payload = { + "model": self.model, + "prompt": enhanced_prompt, + "image": image_b64, + "size": size, + "response_format": response_format + } + + headers = { + "Content-Type": "application/json", + "Authorization": f"Bearer {self.api_key}" + } + + try: + with httpx.Client(timeout=180.0) as client: + response = client.post( + f"{self.base_url}/images/edits", + headers=headers, + json=payload + ) + response.raise_for_status() + result = response.json() + + if output_path and result.get("data"): + b64_data = result["data"][0].get("b64_json") + if b64_data: + self._save_image(b64_data, output_path) + result["saved_path"] = output_path + + return { + "success": True, + "data": result, + "saved_path": output_path if output_path else None + } + + except httpx.HTTPStatusError as e: + return { + "success": False, + "error": f"HTTP 错误: {e.response.status_code}", + "detail": str(e) + } + except Exception as e: + return { + "success": False, + "error": "生成失败", + "detail": str(e) + } + + def _save_image(self, b64_data: str, output_path: str) -> None: + """保存 base64 图片到文件""" + image_data = base64.b64decode(b64_data) + Path(output_path).parent.mkdir(parents=True, exist_ok=True) + with open(output_path, 'wb') as f: + f.write(image_data) + + +def main(): + """命令行入口""" + import argparse + import time + + parser = argparse.ArgumentParser( + description='文生图工具', + formatter_class=argparse.RawDescriptionHelpFormatter, + epilog=f''' +尺寸参数说明: + -r/--ratio 推荐使用,支持: {", ".join(VALID_ASPECT_RATIOS)} + -s/--size 传统尺寸,支持: {", ".join(VALID_SIZES[:4])}... 
+ --resolution 分辨率(1K/2K/4K),仅 gemini-3.0-pro-image-preview 支持 + --ref 参考图片路径,后续图片将参考首图风格生成 + +示例: + python text_to_image.py "描述" -r 3:4 # 竖版 3:4 + python text_to_image.py "描述" -r 9:16 -o out.png # 竖屏 9:16 + python text_to_image.py "描述" -s 1024x1792 # 传统尺寸 + + # 长图场景:首图定调,后续参考首图风格 + python text_to_image.py "首屏内容" -r 3:4 -o 01.png + python text_to_image.py "第二屏内容" -r 3:4 --ref 01.png -o 02.png +''' + ) + parser.add_argument('prompt', help='中文图像描述提示词') + parser.add_argument('-o', '--output', help='输出文件路径(默认保存到当前目录)') + parser.add_argument('-r', '--ratio', help=f'宽高比,推荐使用。可选: {", ".join(VALID_ASPECT_RATIOS)}') + parser.add_argument('-s', '--size', help='图片尺寸 (如 1792x1024)') + parser.add_argument('--resolution', help='分辨率 (1K/2K/4K),仅部分模型支持') + parser.add_argument('--ref', help='参考图片路径,用于风格参考(长图场景)') + + args = parser.parse_args() + + if args.ratio and args.ratio not in VALID_ASPECT_RATIOS: + print(f"错误: 不支持的宽高比 '{args.ratio}'") + print(f"支持的宽高比: {', '.join(VALID_ASPECT_RATIOS)}") + return + + if args.size and args.size not in VALID_SIZES: + print(f"警告: 尺寸 '{args.size}' 可能不被支持") + print(f"推荐使用 -r/--ratio 参数指定宽高比") + + if args.ref and not os.path.exists(args.ref): + print(f"错误: 参考图片不存在: {args.ref}") + return + + output_path = args.output + if not output_path: + timestamp = time.strftime("%Y%m%d_%H%M%S") + output_path = f"generated_{timestamp}.png" + + generator = TextToImageGenerator() + result = generator.generate( + prompt=args.prompt, + size=args.size, + aspect_ratio=args.ratio, + image_size=args.resolution, + output_path=output_path, + ref_image=args.ref + ) + + if result["success"]: + print(f"生成成功!") + if result.get("saved_path"): + print(f"图片已保存到: {result['saved_path']}") + else: + print(f"生成失败: {result['error']}") + print(f"详情: {result.get('detail', 'N/A')}") + + +if __name__ == "__main__": + main() diff --git a/log-analyzer/README.md b/log-analyzer/README.md new file mode 100644 index 0000000..11cb05a --- /dev/null +++ b/log-analyzer/README.md @@ -0,0 +1,20 @@ +# Log Analyzer + +智能日志分析器,支持多种日志类型。 + +## 依赖 + +无需额外安装,纯 Python 标准库实现。 + +## 功能 + +- 自动识别日志类型(Java App / MySQL Binlog / Nginx / Trace / Alert) +- 提取 20+ 种实体(IP、thread_id、user_id、表名等) +- 敏感操作检测、异常洞察 +- 支持 100M+ 大文件流式处理 + +## 使用 + +```bash +python scripts/preprocess.py <日志文件> -o ./log_analysis +``` diff --git a/log-analyzer/SKILL.md b/log-analyzer/SKILL.md new file mode 100644 index 0000000..d32f480 --- /dev/null +++ b/log-analyzer/SKILL.md @@ -0,0 +1,109 @@ +--- +name: log-analyzer +description: 全维度日志分析技能。自动识别日志类型(Java应用/MySQL Binlog/Nginx/Trace/告警),提取关键实体(IP、thread_id、trace_id、用户、表名等),进行根因定位、告警分析、异常洞察。支持100M+大文件。触发词:分析日志、日志排查、根因定位、告警分析、异常分析。 +--- + +# 日志分析器 + +基于 RAPHL(Recursive Analysis Pattern for Hierarchical Logs)的全维度智能日志分析技能。流式处理,内存占用低,100M+ 日志秒级分析。 + +## 核心能力 + +| 能力 | 说明 | +|------|------| +| 自动识别 | 自动识别日志类型:Java App / MySQL Binlog / Nginx / Trace / Alert | +| 实体提取 | IP、thread_id、trace_id、user_id、session_id、bucket、URL、表名等 20+ 种 | +| 操作分析 | DELETE/UPDATE/INSERT/DROP 等敏感操作检测 | +| 关联分析 | 时间线、因果链、操作链构建 | +| 智能洞察 | 自动生成分析结论、证据、建议 | + +## 支持的日志类型 + +| 类型 | 识别特征 | 提取内容 | +|------|----------|----------| +| **Java App** | ERROR/WARN + 堆栈 | 异常类型、堆栈、logger、时间 | +| **MySQL Binlog** | server id、GTID、Table_map | 表操作、thread_id、server_id、数据变更 | +| **Nginx Access** | IP + HTTP 方法 + 状态码 | 请求IP、URL、状态码、耗时 | +| **Trace** | trace_id、span_id | 链路追踪、调用关系、耗时 | +| **Alert** | CRITICAL/告警 | 告警级别、来源、消息 | +| **General** | 通用 | 时间、IP、关键词 | + +## 使用方法 + +```bash +python .opencode/skills/log-analyzer/scripts/preprocess.py <日志文件> -o ./log_analysis +``` + 
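除命令行外,也可以在 Python 中直接调用分析器。以下为示意性片段(假设 skill 已复制到 `.opencode/skills/` 下,并以 `sys.path` 方式导入 `preprocess` 模块;`mysql-bin.000123.txt` 为占位的示例输入),`SmartLogAnalyzer` 的完整实现见下文 `scripts/preprocess.py`:

```python
import sys

# 假设的安装路径,实际以 skill 所在目录为准
sys.path.insert(0, ".opencode/skills/log-analyzer/scripts")

from preprocess import SmartLogAnalyzer

# 等价于:python preprocess.py mysql-bin.000123.txt -o ./log_analysis
analyzer = SmartLogAnalyzer("mysql-bin.000123.txt", "./log_analysis")
summary = analyzer.run()

# run() 返回摘要字典:log_type、total_lines、entity_types、operation_count、insight_count、output_dir
print(f"类型: {summary['log_type']}, 洞察: {summary['insight_count']} 条")
```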
+## 输出文件 + +| 文件 | 内容 | 用途 | +|------|------|------| +| `summary.md` | 完整分析报告 | **优先阅读** | +| `entities.md` | 实体详情(IP、用户、表名等) | 追溯操作来源 | +| `operations.md` | 操作详情 | 查看具体操作 | +| `insights.md` | 智能洞察 | 问题定位和建议 | +| `analysis.json` | 结构化数据 | 程序处理 | + +## 实体提取清单 + +### 网络/连接类 +- IP 地址、IP:Port、URL、MAC 地址 + +### 追踪/会话类 +- trace_id、span_id、request_id、session_id、thread_id + +### 用户/权限类 +- user_id、ak(access_key)、bucket + +### 数据库类 +- database.table、server_id + +### 性能/状态类 +- duration(耗时)、http_status、error_code + +## 敏感操作检测 + +| 类型 | 检测模式 | 风险级别 | +|------|----------|----------| +| 数据删除 | DELETE, DROP, TRUNCATE | HIGH | +| 数据修改 | UPDATE, ALTER, MODIFY | MEDIUM | +| 权限变更 | GRANT, REVOKE, chmod | HIGH | +| 认证操作 | LOGIN, LOGOUT, AUTH | MEDIUM | + +## 智能洞察类型 + +| 类型 | 说明 | +|------|------| +| security | 大批量删除/修改、权限变更 | +| anomaly | 高频 IP、异常时间段操作 | +| error | 严重异常、错误聚类 | +| audit | 操作来源、用户行为 | + +## 分析流程 + +``` +Phase 1: 日志类型识别(采样前100行) + ↓ +Phase 2: 全量扫描提取(流式处理) + ↓ +Phase 3: 关联分析(时间排序、聚合统计) + ↓ +Phase 4: 智能洞察(异常检测、生成结论) + ↓ +Phase 5: 生成报告(Markdown + JSON) +``` + +## 技术特点 + +| 特点 | 说明 | +|------|------| +| 流式处理 | 逐行读取,100M 文件只占几 MB 内存 | +| 正则预编译 | 20+ 种实体模式预编译,匹配快 | +| 一次遍历 | 提取 + 统计 + 分类一次完成 | +| 类型适配 | 不同日志类型用专用解析器 | + +## 注意事项 + +1. **Binlog 不记录客户端 IP**:只有 server_id 和 thread_id,需结合 general_log 确认操作者 +2. **敏感信息脱敏**:报告中注意不要暴露密码、密钥 +3. **结合多源日志**:binlog + 应用日志 + 审计日志 才能完整还原 diff --git a/log-analyzer/scripts/preprocess.py b/log-analyzer/scripts/preprocess.py new file mode 100644 index 0000000..61d5fbc --- /dev/null +++ b/log-analyzer/scripts/preprocess.py @@ -0,0 +1,849 @@ +#!/usr/bin/env python3 +""" +RAPHL 日志分析器 - 全维度智能分析 + +Author: 翟星人 +Created: 2026-01-18 + +支持多种日志类型的自动识别和深度分析: +- Java/应用日志:异常堆栈、ERROR/WARN +- MySQL Binlog:DDL/DML 操作、表变更、事务分析 +- 审计日志:用户操作、权限变更 +- 告警日志:告警级别、告警源 +- Trace 日志:链路追踪、调用关系 +- 通用日志:IP、时间、关键词提取 + +核心能力: +1. 自动识别日志类型 +2. 提取关键实体(IP、用户、表名、thread_id 等) +3. 时间线分析 +4. 关联分析(因果链、操作链) +5. 智能洞察和异常检测 +""" + +import argparse +import re +import hashlib +import json +from pathlib import Path +from collections import defaultdict, Counter +from datetime import datetime +from dataclasses import dataclass, field +from typing import Optional +from enum import Enum + + +class LogType(Enum): + JAVA_APP = "java_app" + MYSQL_BINLOG = "mysql_binlog" + NGINX_ACCESS = "nginx_access" + AUDIT = "audit" + TRACE = "trace" + ALERT = "alert" + GENERAL = "general" + + +@dataclass +class Entity: + """提取的实体""" + type: str # ip, user, table, thread_id, trace_id, bucket, etc. + value: str + line_num: int + context: str = "" + + +@dataclass +class Operation: + """操作记录""" + line_num: int + time: str + op_type: str # DELETE, UPDATE, INSERT, DROP, SELECT, API_CALL, etc. 
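    # Binlog 场景下 target 为 "db.table";操作来源需借助 entities 中的 thread_id/server_id 追溯(Binlog 不记录客户端 IP)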
+ target: str # 表名、接口名等 + detail: str + entities: list = field(default_factory=list) # 关联的实体 + raw_content: str = "" + + +@dataclass +class Alert: + """告警记录""" + line_num: int + time: str + level: str # CRITICAL, WARNING, INFO + source: str + message: str + entities: list = field(default_factory=list) + + +@dataclass +class Trace: + """链路追踪""" + trace_id: str + span_id: str + parent_id: str + service: str + operation: str + duration: float + status: str + line_num: int + + +@dataclass +class Insight: + """分析洞察""" + category: str # security, performance, error, anomaly + severity: str # critical, high, medium, low + title: str + description: str + evidence: list = field(default_factory=list) + recommendation: str = "" + + +class SmartLogAnalyzer: + """智能日志分析器 - 全维度感知""" + + # ============ 实体提取模式 ============ + ENTITY_PATTERNS = { + 'ip': re.compile(r'\b(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\b'), + 'ip_port': re.compile(r'\b(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d+)\b'), + 'mac': re.compile(r'\b([0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}\b'), + 'email': re.compile(r'\b[\w.-]+@[\w.-]+\.\w+\b'), + 'url': re.compile(r'https?://[^\s<>"\']+'), + 'uuid': re.compile(r'\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b', re.I), + 'trace_id': re.compile(r'\b(?:trace[_-]?id|traceid|x-trace-id)[=:\s]*([a-zA-Z0-9_-]{16,64})\b', re.I), + 'span_id': re.compile(r'\b(?:span[_-]?id|spanid)[=:\s]*([a-zA-Z0-9_-]{8,32})\b', re.I), + 'request_id': re.compile(r'\b(?:request[_-]?id|req[_-]?id)[=:\s]*([a-zA-Z0-9_-]{8,64})\b', re.I), + 'user_id': re.compile(r'\b(?:user[_-]?id|uid|userid)[=:\s]*([a-zA-Z0-9_-]+)\b', re.I), + 'thread_id': re.compile(r'\bthread[_-]?id[=:\s]*(\d+)\b', re.I), + 'session_id': re.compile(r'\b(?:session[_-]?id|sid)[=:\s]*([a-zA-Z0-9_-]+)\b', re.I), + 'ak': re.compile(r'\b(?:ak|access[_-]?key)[=:\s]*([a-zA-Z0-9]{16,64})\b', re.I), + 'bucket': re.compile(r'\bbucket[=:\s]*([a-zA-Z0-9_.-]+)\b', re.I), + 'database': re.compile(r'`([a-zA-Z_][a-zA-Z0-9_]*)`\.`([a-zA-Z_][a-zA-Z0-9_]*)`'), + 'duration_ms': re.compile(r'\b(?:duration|cost|elapsed|time)[=:\s]*(\d+(?:\.\d+)?)\s*(?:ms|毫秒)\b', re.I), + 'duration_s': re.compile(r'\b(?:duration|cost|elapsed|time)[=:\s]*(\d+(?:\.\d+)?)\s*(?:s|秒)\b', re.I), + 'error_code': re.compile(r'\b(?:error[_-]?code|errno|code)[=:\s]*([A-Z0-9_-]+)\b', re.I), + 'http_status': re.compile(r'\b(?:status|http[_-]?code)[=:\s]*([1-5]\d{2})\b', re.I), + } + + # ============ 时间格式 ============ + TIME_PATTERNS = [ + (re.compile(r'(\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d{3})?)'), '%Y-%m-%d %H:%M:%S'), + (re.compile(r'(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2})'), '%d/%b/%Y:%H:%M:%S'), + (re.compile(r'#(\d{6} \d{2}:\d{2}:\d{2})'), '%y%m%d %H:%M:%S'), # MySQL binlog + (re.compile(r'\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2})'), '%d/%b/%Y:%H:%M:%S'), # Nginx + ] + + # ============ 日志类型识别 ============ + LOG_TYPE_SIGNATURES = { + LogType.MYSQL_BINLOG: [ + re.compile(r'server id \d+.*end_log_pos'), + re.compile(r'GTID.*last_committed'), + re.compile(r'Table_map:.*mapped to number'), + re.compile(r'(Delete_rows|Update_rows|Write_rows|Query).*table id'), + ], + LogType.JAVA_APP: [ + re.compile(r'(ERROR|WARN|INFO|DEBUG)\s+[\w.]+\s+-'), + re.compile(r'^\s+at\s+[\w.$]+\([\w.]+:\d+\)'), + re.compile(r'Exception|Error|Throwable'), + ], + LogType.NGINX_ACCESS: [ + re.compile(r'\d+\.\d+\.\d+\.\d+\s+-\s+-\s+\['), + re.compile(r'"(GET|POST|PUT|DELETE|HEAD|OPTIONS)\s+'), + ], + LogType.TRACE: [ + re.compile(r'trace[_-]?id', re.I), + re.compile(r'span[_-]?id', re.I), + re.compile(r'parent[_-]?id', 
re.I), + ], + LogType.ALERT: [ + re.compile(r'(CRITICAL|ALERT|EMERGENCY)', re.I), + re.compile(r'告警|报警|alarm', re.I), + ], + } + + # ============ MySQL Binlog 分析 ============ + BINLOG_PATTERNS = { + 'gtid': re.compile(r"GTID_NEXT=\s*'([^']+)'"), + 'thread_id': re.compile(r'thread_id=(\d+)'), + 'server_id': re.compile(r'server id (\d+)'), + 'table_map': re.compile(r'Table_map:\s*`(\w+)`\.`(\w+)`\s*mapped to number (\d+)'), + 'delete_rows': re.compile(r'Delete_rows:\s*table id (\d+)'), + 'update_rows': re.compile(r'Update_rows:\s*table id (\d+)'), + 'write_rows': re.compile(r'Write_rows:\s*table id (\d+)'), + 'query': re.compile(r'Query\s+thread_id=(\d+)'), + 'xid': re.compile(r'Xid\s*=\s*(\d+)'), + 'delete_from': re.compile(r'###\s*DELETE FROM\s*`(\w+)`\.`(\w+)`'), + 'update': re.compile(r'###\s*UPDATE\s*`(\w+)`\.`(\w+)`'), + 'insert': re.compile(r'###\s*INSERT INTO\s*`(\w+)`\.`(\w+)`'), + 'time': re.compile(r'#(\d{6} \d{2}:\d{2}:\d{2})'), + } + + # ============ 告警级别 ============ + ALERT_PATTERNS = { + 'CRITICAL': re.compile(r'\b(CRITICAL|FATAL|EMERGENCY|P0|严重|致命)\b', re.I), + 'HIGH': re.compile(r'\b(ERROR|ALERT|P1|高|错误)\b', re.I), + 'MEDIUM': re.compile(r'\b(WARN|WARNING|P2|中|警告)\b', re.I), + 'LOW': re.compile(r'\b(INFO|NOTICE|P3|低|提示)\b', re.I), + } + + # ============ 敏感操作 ============ + SENSITIVE_OPS = { + 'data_delete': re.compile(r'\b(DELETE|DROP|TRUNCATE|REMOVE)\b', re.I), + 'data_modify': re.compile(r'\b(UPDATE|ALTER|MODIFY|REPLACE)\b', re.I), + 'permission': re.compile(r'\b(GRANT|REVOKE|chmod|chown|赋权|权限)\b', re.I), + 'auth': re.compile(r'\b(LOGIN|LOGOUT|AUTH|认证|登录|登出)\b', re.I), + 'config_change': re.compile(r'\b(SET|CONFIG|配置变更)\b', re.I), + } + + def __init__(self, input_path: str, output_dir: str): + self.input_path = Path(input_path) + self.output_dir = Path(output_dir) + self.output_dir.mkdir(parents=True, exist_ok=True) + + # 分析结果 + self.log_type: LogType = LogType.GENERAL + self.total_lines = 0 + self.file_size_mb = 0 + self.time_range = {'start': '', 'end': ''} + + # 提取的数据 + self.entities: dict[str, list[Entity]] = defaultdict(list) + self.operations: list[Operation] = [] + self.alerts: list[Alert] = [] + self.traces: list[Trace] = [] + self.insights: list[Insight] = [] + + # 统计数据 + self.stats = defaultdict(Counter) + + # Binlog 特有 + self.table_map: dict[str, tuple[str, str]] = {} # table_id -> (db, table) + self.current_thread_id = "" + self.current_server_id = "" + self.current_time = "" + + def run(self) -> dict: + print(f"\n{'='*60}") + print(f"RAPHL 智能日志分析器") + print(f"{'='*60}") + + self.file_size_mb = self.input_path.stat().st_size / (1024 * 1024) + print(f"文件: {self.input_path.name}") + print(f"大小: {self.file_size_mb:.2f} MB") + + # Phase 1: 识别日志类型 + print(f"\n{'─'*40}") + print("Phase 1: 日志类型识别") + print(f"{'─'*40}") + self._detect_log_type() + print(f" ✓ 类型: {self.log_type.value}") + + # Phase 2: 全量扫描提取 + print(f"\n{'─'*40}") + print("Phase 2: 全量扫描提取") + print(f"{'─'*40}") + self._full_scan() + + # Phase 3: 关联分析 + print(f"\n{'─'*40}") + print("Phase 3: 关联分析") + print(f"{'─'*40}") + self._correlate() + + # Phase 4: 生成洞察 + print(f"\n{'─'*40}") + print("Phase 4: 智能洞察") + print(f"{'─'*40}") + self._generate_insights() + + # Phase 5: 生成报告 + print(f"\n{'─'*40}") + print("Phase 5: 生成报告") + print(f"{'─'*40}") + self._generate_reports() + + print(f"\n{'='*60}") + print("分析完成") + print(f"{'='*60}") + + return self._get_summary() + + def _detect_log_type(self): + """识别日志类型""" + sample_lines = [] + with open(self.input_path, 'r', encoding='utf-8', errors='ignore') as f: + for 
i, line in enumerate(f): + sample_lines.append(line) + if i >= 100: + break + + sample_text = '\n'.join(sample_lines) + + scores = {t: 0 for t in LogType} + for log_type, patterns in self.LOG_TYPE_SIGNATURES.items(): + for pattern in patterns: + matches = pattern.findall(sample_text) + scores[log_type] += len(matches) + + best_type = max(scores.keys(), key=lambda x: scores[x]) + if scores[best_type] > 0: + self.log_type = best_type + else: + self.log_type = LogType.GENERAL + + def _full_scan(self): + """全量扫描提取""" + if self.log_type == LogType.MYSQL_BINLOG: + self._scan_binlog() + elif self.log_type == LogType.JAVA_APP: + self._scan_java_app() + else: + self._scan_general() + + print(f" ✓ 总行数: {self.total_lines:,}") + print(f" ✓ 时间范围: {self.time_range['start']} ~ {self.time_range['end']}") + + # 实体统计 + for entity_type, entities in self.entities.items(): + unique = len(set(e.value for e in entities)) + print(f" ✓ {entity_type}: {unique} 个唯一值, {len(entities)} 次出现") + + if self.operations: + print(f" ✓ 操作记录: {len(self.operations)} 条") + if self.alerts: + print(f" ✓ 告警记录: {len(self.alerts)} 条") + + def _scan_binlog(self): + """扫描 MySQL Binlog""" + current_op = None + + with open(self.input_path, 'r', encoding='utf-8', errors='ignore') as f: + for line_num, line in enumerate(f, 1): + self.total_lines += 1 + + # 提取时间 + time_match = self.BINLOG_PATTERNS['time'].search(line) + if time_match: + self.current_time = time_match.group(1) + self._update_time_range(self.current_time) + + # 提取 server_id + server_match = self.BINLOG_PATTERNS['server_id'].search(line) + if server_match: + self.current_server_id = server_match.group(1) + self._add_entity('server_id', self.current_server_id, line_num, line) + + # 提取 thread_id + thread_match = self.BINLOG_PATTERNS['thread_id'].search(line) + if thread_match: + self.current_thread_id = thread_match.group(1) + self._add_entity('thread_id', self.current_thread_id, line_num, line) + + # 提取 table_map + table_match = self.BINLOG_PATTERNS['table_map'].search(line) + if table_match: + db, table, table_id = table_match.groups() + self.table_map[table_id] = (db, table) + self._add_entity('database', f"{db}.{table}", line_num, line) + + # 识别操作类型 + for op_name, pattern in [ + ('DELETE', self.BINLOG_PATTERNS['delete_from']), + ('UPDATE', self.BINLOG_PATTERNS['update']), + ('INSERT', self.BINLOG_PATTERNS['insert']), + ]: + match = pattern.search(line) + if match: + db, table = match.groups() + self.stats['operations'][op_name] += 1 + self.stats['tables'][f"{db}.{table}"] += 1 + + if current_op is None or current_op.target != f"{db}.{table}": + if current_op: + self.operations.append(current_op) + current_op = Operation( + line_num=line_num, + time=self.current_time, + op_type=op_name, + target=f"{db}.{table}", + detail="", + entities=[ + Entity('thread_id', self.current_thread_id, line_num), + Entity('server_id', self.current_server_id, line_num), + ], + raw_content=line + ) + + # 提取行内实体(IP、用户等) + self._extract_entities(line, line_num) + + if line_num % 50000 == 0: + print(f" 已处理 {line_num:,} 行...") + + if current_op: + self.operations.append(current_op) + + def _scan_java_app(self): + """扫描 Java 应用日志""" + current_exception = None + context_buffer = [] + + error_pattern = re.compile( + r'^(\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d{3})?)\s+' + r'(FATAL|ERROR|WARN|WARNING|INFO|DEBUG)\s+' + r'([\w.]+)\s+-\s+(.+)$' + ) + stack_pattern = re.compile(r'^\s+at\s+') + exception_pattern = re.compile(r'^([a-zA-Z_$][\w.$]*(?:Exception|Error|Throwable)):\s*(.*)$') + + with 
open(self.input_path, 'r', encoding='utf-8', errors='ignore') as f: + for line_num, line in enumerate(f, 1): + self.total_lines += 1 + line = line.rstrip() + + # 提取实体 + self._extract_entities(line, line_num) + + error_match = error_pattern.match(line) + if error_match: + time_str, level, logger, message = error_match.groups() + self._update_time_range(time_str) + + if level in ('ERROR', 'FATAL', 'WARN', 'WARNING'): + if current_exception: + self._finalize_exception(current_exception) + + current_exception = { + 'line_num': line_num, + 'time': time_str, + 'level': level, + 'logger': logger, + 'message': message, + 'stack': [], + 'context': list(context_buffer), + 'entities': [], + } + context_buffer.clear() + elif current_exception: + if stack_pattern.match(line) or exception_pattern.match(line): + current_exception['stack'].append(line) + elif line.startswith('Caused by:'): + current_exception['stack'].append(line) + else: + self._finalize_exception(current_exception) + current_exception = None + context_buffer.append(line) + else: + context_buffer.append(line) + if len(context_buffer) > 5: + context_buffer.pop(0) + + if line_num % 50000 == 0: + print(f" 已处理 {line_num:,} 行...") + + if current_exception: + self._finalize_exception(current_exception) + + def _finalize_exception(self, exc: dict): + """完成异常记录""" + level_map = {'FATAL': 'CRITICAL', 'ERROR': 'HIGH', 'WARN': 'MEDIUM', 'WARNING': 'MEDIUM'} + + self.alerts.append(Alert( + line_num=exc['line_num'], + time=exc['time'], + level=level_map.get(exc['level'], 'LOW'), + source=exc['logger'], + message=exc['message'], + entities=exc.get('entities', []) + )) + + if exc['stack']: + self.stats['exceptions'][exc['stack'][0].split(':')[0] if ':' in exc['stack'][0] else exc['level']] += 1 + + def _scan_general(self): + """通用日志扫描""" + with open(self.input_path, 'r', encoding='utf-8', errors='ignore') as f: + for line_num, line in enumerate(f, 1): + self.total_lines += 1 + + # 提取时间 + for pattern, fmt in self.TIME_PATTERNS: + match = pattern.search(line) + if match: + self._update_time_range(match.group(1)) + break + + # 提取实体 + self._extract_entities(line, line_num) + + # 识别告警 + for level, pattern in self.ALERT_PATTERNS.items(): + if pattern.search(line): + self.stats['alert_levels'][level] += 1 + break + + # 识别敏感操作 + for op_type, pattern in self.SENSITIVE_OPS.items(): + if pattern.search(line): + self.stats['sensitive_ops'][op_type] += 1 + + if line_num % 50000 == 0: + print(f" 已处理 {line_num:,} 行...") + + def _extract_entities(self, line: str, line_num: int): + """提取行内实体""" + for entity_type, pattern in self.ENTITY_PATTERNS.items(): + for match in pattern.finditer(line): + value = match.group(1) if match.lastindex else match.group(0) + self._add_entity(entity_type, value, line_num, line[:200]) + + def _add_entity(self, entity_type: str, value: str, line_num: int, context: str = ""): + """添加实体""" + # 过滤无效值 + if entity_type == 'ip' and value in ('0.0.0.0', '127.0.0.1', '255.255.255.255'): + return + if entity_type == 'duration_ms' and float(value) == 0: + return + + self.entities[entity_type].append(Entity( + type=entity_type, + value=value, + line_num=line_num, + context=context + )) + self.stats[f'{entity_type}_count'][value] += 1 + + def _update_time_range(self, time_str: str): + """更新时间范围""" + if not self.time_range['start'] or time_str < self.time_range['start']: + self.time_range['start'] = time_str + if not self.time_range['end'] or time_str > self.time_range['end']: + self.time_range['end'] = time_str + + def _correlate(self): + """关联分析""" + 
# 操作按时间排序 + if self.operations: + self.operations.sort(key=lambda x: x.time) + print(f" ✓ 操作时间线: {len(self.operations)} 条") + + # 聚合相同操作 + if self.log_type == LogType.MYSQL_BINLOG: + op_summary: dict[str, dict] = {} + for op in self.operations: + key = op.op_type + if key not in op_summary: + op_summary[key] = {'count': 0, 'tables': Counter(), 'thread_ids': set()} + op_summary[key]['count'] += 1 + op_summary[key]['tables'][op.target] += 1 + for e in op.entities: + if e.type == 'thread_id': + op_summary[key]['thread_ids'].add(e.value) + + for op_type, data in op_summary.items(): + tables_count = len(data['tables']) + thread_count = len(data['thread_ids']) + print(f" ✓ {op_type}: {data['count']} 次, 涉及 {tables_count} 个表, {thread_count} 个 thread_id") + + # IP 活动分析 + if 'ip' in self.entities: + ip_activity = Counter(e.value for e in self.entities['ip']) + top_ips = ip_activity.most_common(5) + if top_ips: + print(f" ✓ Top IP:") + for ip, count in top_ips: + print(f" {ip}: {count} 次") + + def _generate_insights(self): + """生成智能洞察""" + + # Binlog 洞察 + if self.log_type == LogType.MYSQL_BINLOG: + # 大批量删除检测 + delete_count = self.stats['operations'].get('DELETE', 0) + if delete_count > 100: + tables = self.stats['tables'].most_common(5) + thread_ids = list(set(e.value for e in self.entities.get('thread_id', []))) + server_ids = list(set(e.value for e in self.entities.get('server_id', []))) + + self.insights.append(Insight( + category='security', + severity='high', + title=f'大批量删除操作检测', + description=f'检测到 {delete_count} 条 DELETE 操作', + evidence=[ + f"时间范围: {self.time_range['start']} ~ {self.time_range['end']}", + f"涉及表: {', '.join(f'{t[0]}({t[1]}次)' for t in tables)}", + f"Server ID: {', '.join(server_ids)}", + f"Thread ID: {', '.join(thread_ids[:5])}{'...' if len(thread_ids) > 5 else ''}", + ], + recommendation='确认操作来源:1. 根据 thread_id 查询应用连接 2. 检查对应时间段的应用日志 3. 确认是否为正常业务行为' + )) + + # 操作来源分析 + if self.entities.get('server_id'): + unique_servers = set(e.value for e in self.entities['server_id']) + if len(unique_servers) == 1: + server_id = list(unique_servers)[0] + self.insights.append(Insight( + category='audit', + severity='medium', + title='操作来源确认', + description=f'所有操作来自同一数据库实例 server_id={server_id}', + evidence=[ + f"Server ID: {server_id}", + f"这是数据库主库的标识,不是客户端 IP", + f"Binlog 不记录客户端 IP,需查 general_log 或审计日志", + ], + recommendation='如需确认操作者 IP,请检查:1. MySQL general_log 2. 审计插件日志 3. 
应用服务连接日志' + )) + + # 异常洞察 + if self.alerts: + critical_count = sum(1 for a in self.alerts if a.level == 'CRITICAL') + if critical_count > 0: + self.insights.append(Insight( + category='error', + severity='critical', + title=f'严重异常检测', + description=f'检测到 {critical_count} 个严重级别异常', + evidence=[f"L{a.line_num}: {a.message[:100]}" for a in self.alerts if a.level == 'CRITICAL'][:5], + recommendation='立即检查相关服务状态' + )) + + # IP 异常检测 + if 'ip' in self.entities: + ip_counter = Counter(e.value for e in self.entities['ip']) + for ip, count in ip_counter.most_common(3): + if count > 100: + self.insights.append(Insight( + category='anomaly', + severity='medium', + title=f'高频 IP 活动', + description=f'IP {ip} 出现 {count} 次', + evidence=[e.context[:100] for e in self.entities['ip'] if e.value == ip][:3], + recommendation='确认该 IP 的活动是否正常' + )) + + print(f" ✓ 生成 {len(self.insights)} 条洞察") + for insight in self.insights: + print(f" [{insight.severity.upper()}] {insight.title}") + + def _generate_reports(self): + """生成报告""" + self._write_summary() + self._write_entities() + self._write_operations() + self._write_insights() + self._write_json() + + print(f"\n输出文件:") + for f in sorted(self.output_dir.iterdir()): + size = f.stat().st_size + print(f" - {f.name} ({size/1024:.1f} KB)") + + def _write_summary(self): + """写入摘要报告""" + path = self.output_dir / "summary.md" + with open(path, 'w', encoding='utf-8') as f: + f.write(f"# 日志分析报告\n\n") + + f.write(f"## 概览\n\n") + f.write(f"| 项目 | 内容 |\n|------|------|\n") + f.write(f"| 文件 | {self.input_path.name} |\n") + f.write(f"| 大小 | {self.file_size_mb:.2f} MB |\n") + f.write(f"| 类型 | {self.log_type.value} |\n") + f.write(f"| 总行数 | {self.total_lines:,} |\n") + f.write(f"| 时间范围 | {self.time_range['start']} ~ {self.time_range['end']} |\n\n") + + # 实体统计 + if self.entities: + f.write(f"## 实体统计\n\n") + f.write(f"| 类型 | 唯一值 | 出现次数 | Top 值 |\n|------|--------|----------|--------|\n") + for entity_type, entities in sorted(self.entities.items()): + counter = Counter(e.value for e in entities) + unique = len(counter) + total = len(entities) + top = counter.most_common(1)[0] if counter else ('', 0) + f.write(f"| {entity_type} | {unique} | {total} | {top[0][:30]}({top[1]}) |\n") + f.write(f"\n") + + # 操作统计 + if self.stats['operations']: + f.write(f"## 操作统计\n\n") + f.write(f"| 操作类型 | 次数 |\n|----------|------|\n") + for op, count in self.stats['operations'].most_common(): + f.write(f"| {op} | {count:,} |\n") + f.write(f"\n") + + if self.stats['tables']: + f.write(f"## 表操作统计\n\n") + f.write(f"| 表名 | 操作次数 |\n|------|----------|\n") + for table, count in self.stats['tables'].most_common(10): + f.write(f"| {table} | {count:,} |\n") + f.write(f"\n") + + # 洞察 + if self.insights: + f.write(f"## 分析洞察\n\n") + for i, insight in enumerate(self.insights, 1): + f.write(f"### {i}. 
[{insight.severity.upper()}] {insight.title}\n\n") + f.write(f"{insight.description}\n\n") + if insight.evidence: + f.write(f"**证据:**\n") + for e in insight.evidence: + f.write(f"- {e}\n") + f.write(f"\n") + if insight.recommendation: + f.write(f"**建议:** {insight.recommendation}\n\n") + f.write(f"---\n\n") + + def _write_entities(self): + """写入实体详情""" + path = self.output_dir / "entities.md" + with open(path, 'w', encoding='utf-8') as f: + f.write(f"# 实体详情\n\n") + + for entity_type, entities in sorted(self.entities.items()): + counter = Counter(e.value for e in entities) + f.write(f"## {entity_type} ({len(counter)} 个唯一值)\n\n") + f.write(f"| 值 | 出现次数 | 首次行号 |\n|-----|----------|----------|\n") + + first_occurrence = {} + for e in entities: + if e.value not in first_occurrence: + first_occurrence[e.value] = e.line_num + + for value, count in counter.most_common(50): + f.write(f"| {value[:50]} | {count} | {first_occurrence[value]} |\n") + f.write(f"\n") + + def _write_operations(self): + """写入操作详情""" + if not self.operations: + return + + path = self.output_dir / "operations.md" + with open(path, 'w', encoding='utf-8') as f: + f.write(f"# 操作详情\n\n") + f.write(f"共 {len(self.operations)} 条操作记录\n\n") + + # 按表分组 + by_table = defaultdict(list) + for op in self.operations: + by_table[op.target].append(op) + + for table, ops in sorted(by_table.items(), key=lambda x: len(x[1]), reverse=True): + f.write(f"## {table} ({len(ops)} 次操作)\n\n") + + op_types = Counter(op.op_type for op in ops) + f.write(f"操作类型: {dict(op_types)}\n\n") + + thread_ids = set() + for op in ops: + for e in op.entities: + if e.type == 'thread_id': + thread_ids.add(e.value) + + if thread_ids: + f.write(f"Thread IDs: {', '.join(sorted(thread_ids))}\n\n") + + f.write(f"时间范围: {ops[0].time} ~ {ops[-1].time}\n\n") + f.write(f"---\n\n") + + def _write_insights(self): + """写入洞察报告""" + if not self.insights: + return + + path = self.output_dir / "insights.md" + with open(path, 'w', encoding='utf-8') as f: + f.write(f"# 分析洞察\n\n") + + # 按严重程度分组 + by_severity = defaultdict(list) + for insight in self.insights: + by_severity[insight.severity].append(insight) + + for severity in ['critical', 'high', 'medium', 'low']: + if severity not in by_severity: + continue + + f.write(f"## {severity.upper()} 级别\n\n") + for insight in by_severity[severity]: + f.write(f"### {insight.title}\n\n") + f.write(f"**类别:** {insight.category}\n\n") + f.write(f"**描述:** {insight.description}\n\n") + + if insight.evidence: + f.write(f"**证据:**\n") + for e in insight.evidence: + f.write(f"- {e}\n") + f.write(f"\n") + + if insight.recommendation: + f.write(f"**建议:** {insight.recommendation}\n\n") + + f.write(f"---\n\n") + + def _write_json(self): + """写入 JSON 数据""" + path = self.output_dir / "analysis.json" + + data = { + 'file': str(self.input_path), + 'size_mb': self.file_size_mb, + 'log_type': self.log_type.value, + 'total_lines': self.total_lines, + 'time_range': self.time_range, + 'entities': { + k: { + 'unique': len(set(e.value for e in v)), + 'total': len(v), + 'top': Counter(e.value for e in v).most_common(10) + } + for k, v in self.entities.items() + }, + 'stats': {k: dict(v) for k, v in self.stats.items()}, + 'insights': [ + { + 'category': i.category, + 'severity': i.severity, + 'title': i.title, + 'description': i.description, + 'evidence': i.evidence, + 'recommendation': i.recommendation + } + for i in self.insights + ] + } + + with open(path, 'w', encoding='utf-8') as f: + json.dump(data, f, ensure_ascii=False, indent=2) + + def _get_summary(self) -> dict: + 
return { + 'log_type': self.log_type.value, + 'total_lines': self.total_lines, + 'entity_types': len(self.entities), + 'operation_count': len(self.operations), + 'insight_count': len(self.insights), + 'output_dir': str(self.output_dir) + } + + +def main(): + parser = argparse.ArgumentParser(description='RAPHL 智能日志分析器') + parser.add_argument('input', help='输入日志文件') + parser.add_argument('-o', '--output', default='./log_analysis', help='输出目录') + + args = parser.parse_args() + + analyzer = SmartLogAnalyzer(args.input, args.output) + result = analyzer.run() + + print(f"\n请查看 {result['output_dir']}/summary.md") + + +if __name__ == '__main__': + main() diff --git a/mcp-builder/SKILL.md b/mcp-builder/SKILL.md new file mode 100644 index 0000000..7fbc41d --- /dev/null +++ b/mcp-builder/SKILL.md @@ -0,0 +1,326 @@ +--- +name: mcp-builder +description: Guide for creating high-quality MCP (Model Context Protocol) servers that enable LLMs to interact with external services through well-designed tools. Use when building MCP servers to integrate external APIs or services, whether in Python (FastMCP) or Node/TypeScript (MCP SDK). +license: Complete terms in LICENSE.txt +--- + +# MCP Server Development Guide + +## Overview + +To create high-quality MCP (Model Context Protocol) servers that enable LLMs to effectively interact with external services, use this skill. An MCP server provides tools that allow LLMs to access external services and APIs. The quality of an MCP server is measured by how well it enables LLMs to accomplish real-world tasks using the tools provided. + +--- + +# Process + +## High-Level Workflow + +Creating a high-quality MCP server involves four main phases: + +### Phase 1: Deep Research and Planning + +#### 1.1 Understand Agent-Centric Design Principles + +Before diving into implementation, understand how to design tools for AI agents by reviewing these principles: + +**Build for Workflows, Not Just API Endpoints:** +- Don't simply wrap existing API endpoints - build thoughtful, high-impact workflow tools +- Consolidate related operations (e.g., `schedule_event` that both checks availability and creates event) +- Focus on tools that enable complete tasks, not just individual API calls +- Consider what workflows agents actually need to accomplish + +**Optimize for Limited Context:** +- Agents have constrained context windows - make every token count +- Return high-signal information, not exhaustive data dumps +- Provide "concise" vs "detailed" response format options +- Default to human-readable identifiers over technical codes (names over IDs) +- Consider the agent's context budget as a scarce resource + +**Design Actionable Error Messages:** +- Error messages should guide agents toward correct usage patterns +- Suggest specific next steps: "Try using filter='active_only' to reduce results" +- Make errors educational, not just diagnostic +- Help agents learn proper tool usage through clear feedback + +**Follow Natural Task Subdivisions:** +- Tool names should reflect how humans think about tasks +- Group related tools with consistent prefixes for discoverability +- Design tools around natural workflows, not just API structure + +**Use Evaluation-Driven Development:** +- Create realistic evaluation scenarios early +- Let agent feedback drive tool improvements +- Prototype quickly and iterate based on actual agent performance + +#### 1.2 Study MCP Protocol Documentation + +**Fetch the latest MCP protocol documentation:** + +Use WebFetch to load: 
`https://modelcontextprotocol.io/llms-full.txt` + +This comprehensive document contains the complete MCP specification and guidelines. + +#### 1.3 Study Framework Documentation + +**Load and read the following reference files:** + +- **MCP Best Practices**: [View Best Practices](./reference/mcp_best_practices.md) - Core guidelines for all MCP servers + +**For Python implementations, also load:** +- **Python SDK Documentation**: Use WebFetch to load `https://raw.githubusercontent.com/modelcontextprotocol/python-sdk/main/README.md` +- [Python Implementation Guide](./reference/python_mcp_server.md) - Python-specific best practices and examples + +**For Node/TypeScript implementations, also load:** +- **TypeScript SDK Documentation**: Use WebFetch to load `https://raw.githubusercontent.com/modelcontextprotocol/typescript-sdk/main/README.md` +- [TypeScript Implementation Guide](./reference/node_mcp_server.md) - Node/TypeScript-specific best practices and examples + +#### 1.4 Exhaustively Study API Documentation + +To integrate a service, read through **ALL** available API documentation: +- Official API reference documentation +- Authentication and authorization requirements +- Rate limiting and pagination patterns +- Error responses and status codes +- Available endpoints and their parameters +- Data models and schemas + +**To gather comprehensive information, use web search and the WebFetch tool as needed.** + +#### 1.5 Create a Comprehensive Implementation Plan + +Based on your research, create a detailed plan that includes: + +**Tool Selection:** +- List the most valuable endpoints/operations to implement +- Prioritize tools that enable the most common and important use cases +- Consider which tools work together to enable complex workflows + +**Shared Utilities and Helpers:** +- Identify common API request patterns +- Plan pagination helpers +- Design filtering and formatting utilities +- Plan error handling strategies + +**Input/Output Design:** +- Define input validation models (Pydantic for Python, Zod for TypeScript) +- Design consistent response formats (e.g., JSON or Markdown), and configurable levels of detail (e.g., Detailed or Concise) +- Plan for large-scale usage (thousands of users/resources) +- Implement character limits and truncation strategies (e.g., 25,000 tokens) + +**Error Handling Strategy:** +- Plan graceful failure modes +- Design clear, actionable, LLM-friendly, natural language error messages which prompt further action +- Consider rate limiting and timeout scenarios +- Handle authentication and authorization errors + +--- + +### Phase 2: Implementation + +Now that you have a comprehensive plan, begin implementation following language-specific best practices. 
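
+
+As a concrete reference for the subsections below, a minimal Python server skeleton might look like the following sketch (assuming the FastMCP API from the MCP Python SDK; the service name and tool are illustrative placeholders):
+
+```python
+# Minimal FastMCP sketch - replace the placeholder tool with real workflow tools.
+from mcp.server.fastmcp import FastMCP
+
+mcp = FastMCP("example-service")
+
+@mcp.tool()
+async def search_items(query: str, limit: int = 10) -> str:
+    """Search items and return a concise Markdown summary."""
+    # A real tool would call the external API through shared helper functions.
+    return f"No results for {query!r} (placeholder, limit={limit})"
+
+if __name__ == "__main__":
+    mcp.run()
+```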
+ +#### 2.1 Set Up Project Structure + +**For Python:** +- Create a single `.py` file or organize into modules if complex (see [Python Guide](./reference/python_mcp_server.md)) +- Use the MCP Python SDK for tool registration +- Define Pydantic models for input validation + +**For Node/TypeScript:** +- Create proper project structure (see [TypeScript Guide](./reference/node_mcp_server.md)) +- Set up `package.json` and `tsconfig.json` +- Use MCP TypeScript SDK +- Define Zod schemas for input validation + +#### 2.2 Implement Core Infrastructure First + +**To begin implementation, create shared utilities before implementing tools:** +- API request helper functions +- Error handling utilities +- Response formatting functions (JSON and Markdown) +- Pagination helpers +- Authentication/token management + +#### 2.3 Implement Tools Systematically + +For each tool in the plan: + +**Define Input Schema:** +- Use Pydantic (Python) or Zod (TypeScript) for validation +- Include proper constraints (min/max length, regex patterns, min/max values, ranges) +- Provide clear, descriptive field descriptions +- Include diverse examples in field descriptions + +**Write Comprehensive Docstrings/Descriptions:** +- One-line summary of what the tool does +- Detailed explanation of purpose and functionality +- Explicit parameter types with examples +- Complete return type schema +- Usage examples (when to use, when not to use) +- Error handling documentation, which outlines how to proceed given specific errors + +**Implement Tool Logic:** +- Use shared utilities to avoid code duplication +- Follow async/await patterns for all I/O +- Implement proper error handling +- Support multiple response formats (JSON and Markdown) +- Respect pagination parameters +- Check character limits and truncate appropriately + +**Add Tool Annotations:** +- `readOnlyHint`: true (for read-only operations) +- `destructiveHint`: false (for non-destructive operations) +- `idempotentHint`: true (if repeated calls have same effect) +- `openWorldHint`: true (if interacting with external systems) + +#### 2.4 Follow Language-Specific Best Practices + +**For Python: Load [Python Implementation Guide](./reference/python_mcp_server.md) and ensure the following:** +- Using MCP Python SDK with proper tool registration +- Pydantic v2 models with `model_config` +- Type hints throughout +- Async/await for all I/O operations +- Proper imports organization +- Module-level constants (CHARACTER_LIMIT, API_BASE_URL) + +**For Node/TypeScript: Load [TypeScript Implementation Guide](./reference/node_mcp_server.md) and ensure the following:** +- Using `server.registerTool` properly +- Zod schemas with `.strict()` +- TypeScript strict mode enabled +- No `any` types - use proper types +- Explicit Promise return types +- Build process configured (`npm run build`) + +--- + +### Phase 3: Review and Refine + +After initial implementation: + +#### 3.1 Code Quality Review + +To ensure quality, review the code for: +- **DRY Principle**: No duplicated code between tools +- **Composability**: Shared logic extracted into functions +- **Consistency**: Similar operations return similar formats +- **Error Handling**: All external calls have error handling +- **Type Safety**: Full type coverage (Python type hints, TypeScript types) +- **Documentation**: Every tool has comprehensive docstrings/descriptions + +#### 3.2 Test and Build + +**Important:** MCP servers are long-running processes that wait for requests over stdio/stdin or sse/http. 
Running them directly in your main process (e.g., `python server.py` or `node dist/index.js`) will cause your process to hang indefinitely.
+
+**Safe ways to test the server:**
+- Use the evaluation harness (see Phase 4) - recommended approach
+- Run the server in tmux to keep it outside your main process
+- Use a timeout when testing: `timeout 5s python server.py`
+
+**For Python:**
+- Verify Python syntax: `python -m py_compile your_server.py`
+- Check imports work correctly by reviewing the file
+- To manually test: Run server in tmux, then test with evaluation harness in main process
+- Or use the evaluation harness directly (it manages the server for stdio transport)
+
+**For Node/TypeScript:**
+- Run `npm run build` and ensure it completes without errors
+- Verify dist/index.js is created
+- To manually test: Run server in tmux, then test with evaluation harness in main process
+- Or use the evaluation harness directly (it manages the server for stdio transport)
+
+#### 3.3 Use Quality Checklist
+
+To verify implementation quality, load the appropriate checklist from the language-specific guide:
+- Python: see "Quality Checklist" in [Python Guide](./reference/python_mcp_server.md)
+- Node/TypeScript: see "Quality Checklist" in [TypeScript Guide](./reference/node_mcp_server.md)
+
+---
+
+### Phase 4: Create Evaluations
+
+After implementing your MCP server, create comprehensive evaluations to test its effectiveness.
+
+**Load [Evaluation Guide](./reference/evaluation.md) for complete evaluation guidelines.**
+
+#### 4.1 Understand Evaluation Purpose
+
+Evaluations test whether LLMs can effectively use your MCP server to answer realistic, complex questions.
+
+#### 4.2 Create 10 Evaluation Questions
+
+To create effective evaluations, follow the process outlined in the evaluation guide:
+
+1. **Tool Inspection**: List available tools and understand their capabilities
+2. **Content Exploration**: Use READ-ONLY operations to explore available data
+3. **Question Generation**: Create 10 complex, realistic questions
+4. **Answer Verification**: Solve each question yourself to verify answers
+
+#### 4.3 Evaluation Requirements
+
+Each question must be:
+- **Independent**: Not dependent on other questions
+- **Read-only**: Only non-destructive operations required
+- **Complex**: Requiring multiple tool calls and deep exploration
+- **Realistic**: Based on real use cases humans would care about
+- **Verifiable**: Single, clear answer that can be verified by string comparison
+- **Stable**: Answer won't change over time
+
+#### 4.4 Output Format
+
+Create an XML file with this structure:
+
+```xml
+<evaluation>
+  <qa_pair>
+    <question>Find discussions about AI model launches with animal codenames. One model needed a specific safety designation that uses the format ASL-X. What number X was being determined for the model named after a spotted wild cat?</question>
+    <answer>3</answer>
+  </qa_pair>
+</evaluation>
+```
+
+---
+
+# Reference Files
+
+## Documentation Library
+
+Load these resources as needed during development:
+
+### Core MCP Documentation (Load First)
+- **MCP Protocol**: Fetch from `https://modelcontextprotocol.io/llms-full.txt` - Complete MCP specification
+- [MCP Best Practices](./reference/mcp_best_practices.md) - Universal MCP guidelines including:
+  - Server and tool naming conventions
+  - Response format guidelines (JSON vs Markdown)
+  - Pagination best practices
+  - Character limits and truncation strategies
+  - Tool development guidelines
+  - Security and error handling standards
+
+### SDK Documentation (Load During Phase 1/2)
+- **Python SDK**: Fetch from `https://raw.githubusercontent.com/modelcontextprotocol/python-sdk/main/README.md`
+- **TypeScript SDK**: Fetch from `https://raw.githubusercontent.com/modelcontextprotocol/typescript-sdk/main/README.md`
+
+### Language-Specific Implementation Guides (Load During Phase 2)
+- [Python Implementation Guide](./reference/python_mcp_server.md) - Complete Python/FastMCP guide with:
+  - Server initialization patterns
+  - Pydantic model examples
+  - Tool registration with `@mcp.tool`
+  - Complete working examples
+  - Quality checklist
+
+- [TypeScript Implementation Guide](./reference/node_mcp_server.md) - Complete TypeScript guide with:
+  - Project structure
+  - Zod schema patterns
+  - Tool registration with `server.registerTool`
+  - Complete working examples
+  - Quality checklist
+
+### Evaluation Guide (Load During Phase 4)
+- [Evaluation Guide](./reference/evaluation.md) - Complete evaluation creation guide with:
+  - Question creation guidelines
+  - Answer verification strategies
+  - XML format specifications
+  - Example questions and answers
+  - Running an evaluation with the provided scripts
diff --git a/searchnews/README.md b/searchnews/README.md
new file mode 100644
index 0000000..be448ec
--- /dev/null
+++ b/searchnews/README.md
@@ -0,0 +1,21 @@
+# Search News
+
+AI 新闻搜索整理技能,从多个新闻源抓取 AI 相关新闻。
+
+## 依赖
+
+```bash
+brew install jq # macOS 通常已预装
+```
+
+## 新闻源
+
+- AIBase 日报
+- IT 之家
+- 36氪
+- 机器之心
+- 量子位
+
+## 使用
+
+加载 skill 后,告诉 Agent 要搜索的日期即可,输出到 `dailynews/YYYY-MM-DD/` 目录。
diff --git a/searchnews/SKILL.md b/searchnews/SKILL.md
new file mode 100644
index 0000000..efc37ef
--- /dev/null
+++ b/searchnews/SKILL.md
@@ -0,0 +1,376 @@
+---
+name: searchnews
+description: 当用户要求"搜索新闻"、"查询AI新闻"、"整理新闻"、"获取某天的新闻",或提到需要搜索、整理、汇总指定日期的AI行业新闻时,应使用此技能。
+metadata:
+  version: "0.4.0"
+---
+
+# AI新闻搜索技能 (Ralph Loop 增强版)
+
+## 概述
+
+此技能用于从多个AI新闻源精确搜索指定日期的新闻,采用 Ralph Loop 模式进行地毯式迭代,确保不留死角。
+
+## 核心机制 (Ralph Loop)
+
+### 1. 任务清单 (prd.json)
+记录待爬取的源网站及其状态。
+- `date`: 目标日期,格式 YYYY-MM-DD
+- `keywords_ref`: 引用关键词库文件路径(如 `references/keywords.md`),搜索时加载 10 大分类和 100+ 标签进行筛选
+- `sources`: 每个源包含 `name`, `url`, `status` (pending/done/failed), `retry_count` (max 3)
+
+### 2. 
退出逻辑 (目标导向) +- **成功退出**:当所有 `sources` 状态均为 `done` 时,输出 `COMPLETE` 并立即停止循环。 +- **失败容错**:单个源抓取失败时,最多尝试 **3次**。若 3 次均失败,将状态标记为 `failed`,记录失败原因,跳过该源。 +- **高效收工**:一旦所有源都处理完毕(状态为 `done` 或 `failed`),立即生成最终日报并交付,不强制跑完预设的最大轮次。 + +## 必抓源列表(按优先级排序) + +| 优先级 | 源名称 | URL | 说明 | +|--------|--------|-----|------| +| **高** | AIBase日报 | https://news.aibase.com/zh/daily | 每日AI新闻汇总,必抓!内容精炼、覆盖全面 | +| 中 | IT之家AI频道 | https://next.ithome.com/ | 国内科技资讯,AI专栏 | +| 中 | 36氪AI频道 | https://36kr.com/information/AI/ | 创投视角,AI产业报道 | +| 中 | 机器之心 | https://www.jiqizhixin.com/articles | 专业AI媒体,技术深度 | +| 中 | 量子位 | https://www.qbitai.com | AI前沿,产品报道 | + +> **注意**:AIBase日报通常在当天发布,内容即为当天新闻汇总,是最高效的信息源。 + +## 工作流程 + +### ⚠️ 铁律:必须使用 Ralph 脚本启动! + +**禁止手动乱抓!** 必须严格按以下流程执行: + +```bash +# 第零步:启动 Ralph Loop(必须执行!) +bash .opencode/skills/searchnews/scripts/ralph/ralph.sh 2026-01-19 +``` + +脚本会自动初始化 `prd.json`,然后 Agent 按任务清单逐个处理。 + +### 第一步:初始化任务清单(由脚本完成) +脚本会在 `.opencode/skills/searchnews/scripts/ralph/prd.json` 中生成源网站列表,初始状态均为 `pending`。**AIBase日报必须放在第一位,优先抓取。** + +### 第二步:地毯式循环搜索 +1. 读取 `prd.json` 中处于 `pending` 状态的源。 +2. **每处理一个源,必须更新 prd.json 状态**(pending → done/failed)。 +3. **每轮迭代必须写入 progress.txt**,记录进度和失败原因。 +4. 严格校验日期,仅保留目标日期的内容。 +5. 抓取失败时 `retry_count + 1`,最多重试3次。 + +### 第二点五步:深度检索(重要!) +**禁止只抓列表页!** 对于筛选出的重要新闻,必须深入到详情页抓取: +1. 从列表页提取新闻详情 URL +2. **逐条访问详情页**,获取完整内容 +3. 提取关键信息: + - 完整正文(不是摘要) + - 技术细节、数据指标 + - 原始来源/论文链接 + - 划重点/要点总结 +4. 深度检索的新闻质量远高于列表页复制粘贴 + +> **示例**:AIBase日报列表页只有标题和简介,但详情页有完整的技术解读、数据对比、划重点等深度内容。 + +### 第三步:去重与聚合 +合并不同来源的相同新闻,保留详情最丰富的版本,合并标注来源。 + +### 第四步:输出结构化文档 +文件存储在 `dailynews/YYYY-MM-DD/YYYY-MM-DD.md`(每日独立文件夹)。 + +#### 输出格式模板(必须严格遵守!) + +```markdown +--- +date: YYYY-MM-DD +type: 新闻日报 +tags: [AI新闻, 日报] +--- + +# AI新闻日报 - YYYY-MM-DD + +> 日期校验: 已通过 | 仅包含YYYY-MM-DD发布的新闻 | 已去重 + +--- + +## 1. 新闻标题 + +**分类**: 分类标签 | **来源**: 来源网站 | **时间**: YYYY/M/D HH:MM + +一句话摘要,概括新闻核心内容。 + +**详情**: +- 详情要点1(包含具体数据、指标) +- 详情要点2 +- 详情要点3 +- 详情要点4(可选) +- 详情要点5(可选) + +--- + +## 2. 下一条新闻标题 +... + +*数据来源: 来源列表 | 整理时间: YYYY-MM-DD* +``` + +#### 格式要点 +1. **每条新闻必须包含**:编号标题、分类|来源|时间、摘要、详情要点(3-5条) +2. **详情要点必须包含具体数据**:金额、百分比、时间节点、技术指标等 +3. **分类标签参考**:AI基础设施、AI产品、投融资、机器人、商业化、AI监管、行业观点、企业战略、AI能力、趣闻等 +4. **时间格式**:精确到分钟(如 2026/1/19 15:28) +5. **新闻数量要求**:每日至少整理 10-20 条新闻,不得偷懒只抓几条! + +### 第五步:确认完成 +当所有源状态均为 `done` 或 `failed` 时,输出: +``` +COMPLETE +``` + +## 质量要求 + +- [ ] **必用脚本**:必须先执行 `ralph.sh` 初始化,禁止手动乱抓! +- [ ] **状态追踪**:每处理一个源必须更新 `prd.json` 状态。 +- [ ] **进度记录**:每轮迭代必须写入 `progress.txt`。 +- [ ] **必抓AIBase**:AIBase日报是必抓源,每次整理新闻必须首先访问。 +- [ ] **深度检索**:禁止只抓列表页!重要新闻必须深入详情页获取完整内容。 +- [ ] **全量覆盖**:必须尝试清单中所有的源网站。 +- [ ] **日期铁律**:严禁混入非目标日期的新闻。 +- [ ] **标签映射**:必须对照 10 大分类进行精准打标。 +- [ ] **详情完整**:包含标题、摘要、3-5条详情要点、溯源链接、精确时间。 +- [ ] **循环退出**:所有源 done/failed 后才输出 `COMPLETE`。 + +## ⛔ 输出铁律(违反即解雇!) + +### 被剔除的新闻禁止输出! +1. **只输出符合日期的新闻**:最终日报中只能出现目标日期的新闻 +2. **剔除的不要提**:因日期不符被剔除的新闻,**禁止在任何地方输出或提及** +3. **不要显示剔除过程**:不要告诉用户"我剔除了 xx 条"、"以下是被过滤的"等废话 +4. **静默过滤**:日期校验是内部逻辑,用户只需要看到最终结果,不需要知道你筛掉了什么 +5. **简洁交付**:只输出干净的、符合日期的新闻列表,没有任何多余说明 + +**错误示例(禁止!)**: +``` +以下新闻因日期不符已剔除: +- xxx(1月18日) +- yyy(1月20日) +``` + +**正确做法**: +静默跳过不符合日期的新闻,只输出符合的,一个字都不要多说。 + +## 资源引用 + +- **scripts/ralph/ralph.sh** - 启动主循环。 +- **scripts/ralph/prd.json** - 动态任务清单。 +- **scripts/ralph/progress.txt** - 迭代进度与重试日志。 +- **references/keywords.md** - 10 大分类 100+ 标签地图。 +- **templates/** - 视频风格模板库。 + +--- + +## 第六步:生成新闻视频(可选) + +新闻日报整理完成后,可生成AI新闻视频。 + +### 6.1 交互流程(必须询问!) + +收到"生成新闻视频"请求后,**必须依次询问**: + +#### 问题1:确认日期 +``` +生成哪天的新闻视频? 
+- 今天 (YYYY-MM-DD) +- 昨天 (YYYY-MM-DD) +- 自定义日期 +``` + +#### 问题2:是否使用风格模板 +``` +是否使用提示词库中的风格模板? +- 是,使用模板 (推荐) - 从21种预设风格中选择,风格统一 +- 否,自由生成 - 不使用模板,AI自由发挥 +``` + +**如果选择"使用模板",继续问题3;否则跳到问题4** + +#### 问题3:选择视觉风格(21种) + +**风格提示词库位置**:`{prompts_dir}/图片生成风格/AI新闻早报风格/` + +``` +选择配图风格: + +【科技感】 +- 默认风格-Dashboard (推荐) - 科技仪表盘,数据可视化 +- 赛博未来风 - 霓虹赛博朋克 +- 科技媒体封面风 - 新闻媒体封面感 +- AI操作系统界面风 - JARVIS控制台风格 +- 深色金融终端风 - Bloomberg终端感 +- 全息投影风 - 全息科幻 +- 量子科幻风 - 量子粒子效果 + +【简约风】 +- 毛玻璃拟态风 - 苹果风毛玻璃 +- 信息图表风 - 数据信息图 +- 极简信息设计风 - 扁平极简 + +【特色风】 +- 未来报纸头版 - 报纸版式 +- 杂志封面风 - 杂志风格 +- 漫画分镜风 - 漫画格子 +- 太空宇宙风 - 星空宇宙 +- 水墨国风 - 中国风水墨 +- 复古像素风 - 8bit像素 +- 霓虹波普风 - 波普艺术 +- 工程蓝图风 - 技术蓝图 +- 自然有机风 - 环保自然 +- 未来实验室风 - 实验室科研 +- 社交媒体爆款风 - 抖音小红书 +``` + +#### 问题4:生成模式 +``` +选择生成模式: +- 完整版(总览+详情)(推荐) - 1张总览图 + N张详情图 +- 仅总览 - 只生成1张总览图 +- 仅详情 - 只生成N张详情图 +``` + +#### 问题5:新闻数量(如果超过10条) +``` +日报共有XX条新闻,如何处理? +- 全部生成 +- 精选10条 - 自动挑选最重要的 +- 精选5条 - 只做头条 +``` + +### 6.2 加载并使用风格模板 + +#### 步骤1:读取模板文件 +``` +{prompts_dir}/图片生成风格/AI新闻早报风格/{风格名}.md +``` + +#### 步骤2:提取"完整提示词模板" +每个风格文件都包含 `## 完整提示词模板` 段落,提取其中的提示词。 + +#### 步骤3:替换变量 +| 变量 | 替换内容 | 示例 | +|------|----------|------| +| `{日期}` | 日报日期 | 2026年01月23日 | +| `{N}` | 新闻条数 | 25 | +| `{新闻列表}` | 编号+标题列表 | 1. ChatGPT Atlas更新... | + +#### 步骤4:生成总览图 +用替换后的提示词调用 image-service: +```bash +python .opencode/skills/image-service/scripts/text_to_image.py \ + "{替换变量后的完整提示词}" -r 16:9 -o "assets/video/{日期}/00_overview.png" +``` + +#### 步骤5:生成详情图 +每条新闻单独生成,提示词结构: +``` +AI新闻详情配图 - {风格名} + +【新闻标题】{标题} + +【新闻要点】 +- {要点1} +- {要点2} +- {要点3} + +【视觉要求】 +- 沿用{风格名}的视觉风格 +- 中心突出新闻主题的3D/扁平化插图 +- 标题大字清晰,要点用图标化卡片展示 +- 底部水印:{your_watermark} + +输出尺寸:2560x1440 横版 16:9 +``` + +### 6.3 视频生成流程 + +#### 目录结构 +``` +assets/video/{YYYY-MM-DD}/ +├── 00_overview.png # 总览图 +├── 01_xxx.png # 详情图1 +├── 02_xxx.png # 详情图2 +├── ... +├── audio/ +│ ├── 00_overview.mp3 # 总览配音 +│ ├── 01.mp3 # 详情配音1 +│ └── ... +├── video.yaml # 合成配置 +└── {日期}_ai_news.mp4 # 最终视频 +``` + +#### 生成命令 + +```bash +# 1. 创建目录 +mkdir -p "assets/video/{日期}/audio" + +# 2. 并发生成配图(使用 text_to_image) +python .opencode/skills/image-service/scripts/text_to_image.py \ + "{风格提示词}" -r 16:9 -o "assets/video/{日期}/00_overview.png" + +# 3. 并发生成配音 +python .opencode/skills/video-creator/scripts/tts_generator.py \ + --text "{配音文本}" \ + --voice zh-CN-YunyangNeural \ + --output "assets/video/{日期}/audio/XX.mp3" + +# 4. 合成视频 +python .opencode/skills/video-creator/scripts/video_maker.py \ + assets/video/{日期}/video.yaml +``` + +### 6.4 配音规范 + +| 场景 | 文本模板 | +|------|----------| +| 总览 | "AI早报,{日期}。今天共有{N}条AI行业重磅新闻,让我们一起来看看!" | +| 详情 | "第X条,{标题}。{摘要}" | +| 结尾 | 最后一条追加"以上就是今天的AI早报,感谢收看!" | + +**音色选择**: +- `zh-CN-YunyangNeural` - 男声,新闻播报(推荐) +- `zh-CN-YunxiNeural` - 男声,阳光活泼 +- `zh-CN-XiaoxiaoNeural` - 女声,温暖自然 + +### 6.5 视频配置模板 (video.yaml) + +```yaml +output: {YYYY-MM-DD}_ai_news.mp4 + +scenes: + - image: 00_overview.png + audio: audio/00_overview.mp3 + - image: 01_xxx.png + audio: audio/01.mp3 + # ... 依次列出所有场景 +``` + +### 6.6 完成后输出 + +``` +✅ 视频生成完成! + +📍 位置:assets/video/{日期}/{日期}_ai_news.mp4 +⏱️ 时长:X分X秒 +🎬 场景:X个(1总览 + X详情) +🎨 风格:{选择的风格} + +是否打开预览? +``` + +### 6.7 注意事项 + +1. **并发生成**:配图和配音都要并发,提升效率 +2. **水印**:所有配图底部必须添加水印 +3. **片尾**:视频自动拼接通用片尾 +4. **BGM**:自动添加科技风背景音乐 +5. 
**比例**:所有配图使用 16:9 横版 diff --git a/searchnews/references/keywords.md b/searchnews/references/keywords.md new file mode 100644 index 0000000..78cdf7e --- /dev/null +++ b/searchnews/references/keywords.md @@ -0,0 +1,31 @@ +# AI 全维度关键词库 (Ralph Loop 搜索基准) + +## 一、基础 & 通用 AI 标签 +#AI #人工智能 #智能科技 #前沿科技 #未来科技 #数字智能 #智能时代 #科技趋势 #智能革命 #下一代科技 + +## 二、大模型 / 底层能力 +#大模型 #基础模型 #通用人工智能 #AGI #多模态 #语言模型 #视觉模型 #生成模型 #模型训练 #模型推理 + +## 三、生成式 AI +#生成式AI #AIGC #AI绘画 #AI写作 #AI视频 #AI设计 #AI作曲 #AI配音 #AI图像生成 #AI内容创作 + +## 四、智能体 / Agent 体系 +#智能体 #AI智能体 #Agent #多智能体 #AI自动化 #任务型智能体 #自主智能体 #工具调用 #AI协作 #AI执行引擎 + +## 五、提示词 & 人机交互 +#提示词 #Prompt #Prompt工程 #提示词设计 #人机交互 #自然语言交互 #对话式AI #指令工程 #AI沟通方式 #AI思维 + +## 六、AI 工程 / 开发 / 技术向 +#AI工程 #AI开发 #模型部署 #模型微调 #RAG #向量数据库 #AI架构 #AI系统设计 #AI中台 #AI产品化 + +## 七、AI 产品 & 应用落地 +#AI产品 #AI应用 #AI助手 #AI工具 #智能办公 #AI营销 #AI客服 #AI教育 #AI医疗 #AI商业化 + +## 八、趋势 / 认知 / 思想层 +#AI趋势 #AI认知升级 #AI时代 #AI变革 #AI生产力 #AI替代 #AI赋能 #人与AI #智能社会 #技术浪潮 + +## 九、内容传播 & 平台友好标签 +#科技科普 #硬核科技 #科技博主 #科技认知 #效率工具 #认知提升 #工具推荐 #生产力工具 #数字生活 #未来职业 + +## 十、偏前沿 & 概念向 +#数字生命 #虚拟智能 #AI意识 #机器智能 #智能进化 #人机共生 #智能文明 #类人智能 #AI哲学 #未来已来 diff --git a/searchnews/scripts/ralph/prd.json b/searchnews/scripts/ralph/prd.json new file mode 100644 index 0000000..539c9b7 --- /dev/null +++ b/searchnews/scripts/ralph/prd.json @@ -0,0 +1,39 @@ +{ + "date": "2026-01-25", + "keywords_ref": "references/keywords.md", + "sources": [ + { + "name": "AIBase日报", + "url": "https://news.aibase.com/zh/daily", + "status": "failed", + "priority": "high", + "note": "1月25日日报尚未发布" + }, + { + "name": "IT之家智能时代", + "url": "https://next.ithome.com/", + "status": "done", + "news_count": 10 + }, + { + "name": "36氪", + "url": "https://36kr.com/information/AI/", + "status": "done", + "news_count": 0, + "note": "列表页无1月25日新闻" + }, + { + "name": "机器之心", + "url": "https://www.jiqizhixin.com/articles", + "status": "done", + "news_count": 0 + }, + { + "name": "量子位", + "url": "https://www.qbitai.com", + "status": "done", + "news_count": 3 + } + ], + "is_complete": true +} diff --git a/searchnews/scripts/ralph/prd.template.json b/searchnews/scripts/ralph/prd.template.json new file mode 100644 index 0000000..def116f --- /dev/null +++ b/searchnews/scripts/ralph/prd.template.json @@ -0,0 +1,12 @@ +{ + "date": "YYYY-MM-DD", + "keywords_ref": "references/keywords.md", + "sources": [ + {"name": "AIBase日报", "url": "https://news.aibase.com/zh/daily", "status": "pending", "priority": "high"}, + {"name": "IT之家智能时代", "url": "https://next.ithome.com/", "status": "pending"}, + {"name": "36氪", "url": "https://36kr.com/information/AI/", "status": "pending"}, + {"name": "机器之心", "url": "https://www.jiqizhixin.com/articles", "status": "pending"}, + {"name": "量子位", "url": "https://www.qbitai.com", "status": "pending"} + ], + "is_complete": false +} diff --git a/searchnews/scripts/ralph/progress.txt b/searchnews/scripts/ralph/progress.txt new file mode 100644 index 0000000..14363bf --- /dev/null +++ b/searchnews/scripts/ralph/progress.txt @@ -0,0 +1,16 @@ +Ralph Loop Progress - 2026-01-24 +================================ + +[Round 1] 2026-01-24 15:xx +- AIBase日报: done (最新日报为1月23日发布,提取相关内容) +- IT之家: done (获取10+条1月24日新闻) +- 36氪: done (获取多条AI相关新闻) +- 机器之心: done (页面加载成功) +- 量子位: done (获取多条热门新闻) + +[Summary] +- 所有源抓取完成 +- 共整理17条新闻 +- 日报已生成: dailynews/2026-01-24/news.md + +COMPLETE diff --git a/searchnews/scripts/ralph/ralph.sh b/searchnews/scripts/ralph/ralph.sh new file mode 100644 index 0000000..806aa96 --- /dev/null +++ b/searchnews/scripts/ralph/ralph.sh @@ -0,0 +1,21 @@ +#!/bin/bash +# Ralph Loop 启动脚本 - 
仅初始化,不循环 + +DATE=${1:-$(date +%Y-%m-%d)} +PRD_FILE=".opencode/skills/searchnews/scripts/ralph/prd.json" +PRD_TEMPLATE=".opencode/skills/searchnews/scripts/ralph/prd.template.json" + +# 从模板创建/重置 prd.json +if [ -f "$PRD_TEMPLATE" ]; then + jq --arg date "$DATE" '.date = $date | .sources[].status = "pending" | .is_complete = false' "$PRD_TEMPLATE" > "$PRD_FILE" + echo "Initialized prd.json for $DATE" +else + echo "Error: Template not found!" + exit 1 +fi + +echo "" +echo "Sources to crawl:" +jq -r '.sources[] | " [\(.priority // "normal")] \(.name): \(.url)"' "$PRD_FILE" +echo "" +echo "Ready! Agent will now process each source and update prd.json status." diff --git a/searchnews/templates/README.md b/searchnews/templates/README.md new file mode 100644 index 0000000..654c6e9 --- /dev/null +++ b/searchnews/templates/README.md @@ -0,0 +1,50 @@ +# 新闻视频模板库 + +## 可用模板 + +| 模板名称 | 文件 | 风格描述 | 适用场景 | +|---------|------|----------|----------| +| 黑板报粉笔风 | `blackboard_chalk.png` | 深绿黑板背景、彩色粉笔手绘、温馨有趣 | 小红书、抖音、日常分享 | + +## 使用方法 + +生成视频时,Agent 会询问选择哪个模板,然后基于模板风格生成所有新闻卡片。 + +## 添加新模板 + +1. 将模板图片放入此目录 +2. 更新此 README 的模板列表 +3. 在 SKILL.md 中添加模板的提示词描述 + +## 模板提示词参考 + +### blackboard_chalk(黑板报粉笔风) + +``` +手绘黑板报风格AI新闻卡片,3:4竖版。深绿色黑板背景带粉笔质感。 +左上角{颜色}粉笔标签'{分类}'。 +白色粉笔大标题'{标题}'。 +用彩色粉笔手绘可视化:{根据内容描述简笔画图标}。 +要点用黄色粉笔:{要点1}、{要点2}、{要点3}、{要点4}。 +右下角小字'{来源}' +``` + +### 封面模板 + +``` +手绘黑板报风格AI新闻日报封面,3:4竖版。深绿色黑板背景带粉笔质感。 +顶部用超大白色粉笔手写'AI日报',下方黄色粉笔写'{日期}'。 +中间用白色粉笔整齐列出所有新闻标题(手写风格): +{新闻标题列表} +周围点缀粉笔星星和小装饰 +``` + +### 结尾模板 + +``` +手绘黑板报风格AI新闻日报结尾页,3:4竖版。深绿色黑板背景粉笔质感。 +中间用超大白色粉笔手写'今日份AI已送达'。 +下方用黄色粉笔画一个可爱的机器人挥手。 +用粉色粉笔写'点赞+收藏+关注',旁边画爱心、星星、加号图标。 +底部用蓝色粉笔写'明天见!',周围点缀彩色粉笔星星和小花装饰。 +``` diff --git a/searchnews/templates/blackboard_chalk.png b/searchnews/templates/blackboard_chalk.png new file mode 100644 index 0000000..ffef82b Binary files /dev/null and b/searchnews/templates/blackboard_chalk.png differ diff --git a/skill-creator/SKILL.md b/skill-creator/SKILL.md new file mode 100644 index 0000000..4069935 --- /dev/null +++ b/skill-creator/SKILL.md @@ -0,0 +1,209 @@ +--- +name: skill-creator +description: Guide for creating effective skills. This skill should be used when users want to create a new skill (or update an existing skill) that extends Claude's capabilities with specialized knowledge, workflows, or tool integrations. +license: Complete terms in LICENSE.txt +--- + +# Skill Creator + +This skill provides guidance for creating effective skills. + +## About Skills + +Skills are modular, self-contained packages that extend Claude's capabilities by providing +specialized knowledge, workflows, and tools. Think of them as "onboarding guides" for specific +domains or tasks—they transform Claude from a general-purpose agent into a specialized agent +equipped with procedural knowledge that no model can fully possess. + +### What Skills Provide + +1. Specialized workflows - Multi-step procedures for specific domains +2. Tool integrations - Instructions for working with specific file formats or APIs +3. Domain expertise - Company-specific knowledge, schemas, business logic +4. Bundled resources - Scripts, references, and assets for complex and repetitive tasks + +### Anatomy of a Skill + +Every skill consists of a required SKILL.md file and optional bundled resources: + +``` +skill-name/ +├── SKILL.md (required) +│ ├── YAML frontmatter metadata (required) +│ │ ├── name: (required) +│ │ └── description: (required) +│ └── Markdown instructions (required) +└── Bundled Resources (optional) + ├── scripts/ - Executable code (Python/Bash/etc.) 
+ ├── references/ - Documentation intended to be loaded into context as needed + └── assets/ - Files used in output (templates, icons, fonts, etc.) +``` + +#### SKILL.md (required) + +**Metadata Quality:** The `name` and `description` in YAML frontmatter determine when Claude will use the skill. Be specific about what the skill does and when to use it. Use the third-person (e.g. "This skill should be used when..." instead of "Use this skill when..."). + +#### Bundled Resources (optional) + +##### Scripts (`scripts/`) + +Executable code (Python/Bash/etc.) for tasks that require deterministic reliability or are repeatedly rewritten. + +- **When to include**: When the same code is being rewritten repeatedly or deterministic reliability is needed +- **Example**: `scripts/rotate_pdf.py` for PDF rotation tasks +- **Benefits**: Token efficient, deterministic, may be executed without loading into context +- **Note**: Scripts may still need to be read by Claude for patching or environment-specific adjustments + +##### References (`references/`) + +Documentation and reference material intended to be loaded as needed into context to inform Claude's process and thinking. + +- **When to include**: For documentation that Claude should reference while working +- **Examples**: `references/finance.md` for financial schemas, `references/mnda.md` for company NDA template, `references/policies.md` for company policies, `references/api_docs.md` for API specifications +- **Use cases**: Database schemas, API documentation, domain knowledge, company policies, detailed workflow guides +- **Benefits**: Keeps SKILL.md lean, loaded only when Claude determines it's needed +- **Best practice**: If files are large (>10k words), include grep search patterns in SKILL.md +- **Avoid duplication**: Information should live in either SKILL.md or references files, not both. Prefer references files for detailed information unless it's truly core to the skill—this keeps SKILL.md lean while making information discoverable without hogging the context window. Keep only essential procedural instructions and workflow guidance in SKILL.md; move detailed reference material, schemas, and examples to references files. + +##### Assets (`assets/`) + +Files not intended to be loaded into context, but rather used within the output Claude produces. + +- **When to include**: When the skill needs files that will be used in the final output +- **Examples**: `assets/logo.png` for brand assets, `assets/slides.pptx` for PowerPoint templates, `assets/frontend-template/` for HTML/React boilerplate, `assets/font.ttf` for typography +- **Use cases**: Templates, images, icons, boilerplate code, fonts, sample documents that get copied or modified +- **Benefits**: Separates output resources from documentation, enables Claude to use files without loading them into context + +### Progressive Disclosure Design Principle + +Skills use a three-level loading system to manage context efficiently: + +1. **Metadata (name + description)** - Always in context (~100 words) +2. **SKILL.md body** - When skill triggers (<5k words) +3. **Bundled resources** - As needed by Claude (Unlimited*) + +*Unlimited because scripts can be executed without reading into context window. + +## Skill Creation Process + +To create a skill, follow the "Skill Creation Process" in order, skipping steps only if there is a clear reason why they are not applicable. 
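
+
+For example, frontmatter that follows the metadata guidance above might look like this (hypothetical skill name and description):
+
+```yaml
+---
+name: pdf-editor
+description: This skill should be used when users need to rotate, merge, or split PDF files. Provides scripts and guidance for common PDF manipulation tasks.
+---
+```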
+ +### Step 1: Understanding the Skill with Concrete Examples + +Skip this step only when the skill's usage patterns are already clearly understood. It remains valuable even when working with an existing skill. + +To create an effective skill, clearly understand concrete examples of how the skill will be used. This understanding can come from either direct user examples or generated examples that are validated with user feedback. + +For example, when building an image-editor skill, relevant questions include: + +- "What functionality should the image-editor skill support? Editing, rotating, anything else?" +- "Can you give some examples of how this skill would be used?" +- "I can imagine users asking for things like 'Remove the red-eye from this image' or 'Rotate this image'. Are there other ways you imagine this skill being used?" +- "What would a user say that should trigger this skill?" + +To avoid overwhelming users, avoid asking too many questions in a single message. Start with the most important questions and follow up as needed for better effectiveness. + +Conclude this step when there is a clear sense of the functionality the skill should support. + +### Step 2: Planning the Reusable Skill Contents + +To turn concrete examples into an effective skill, analyze each example by: + +1. Considering how to execute on the example from scratch +2. Identifying what scripts, references, and assets would be helpful when executing these workflows repeatedly + +Example: When building a `pdf-editor` skill to handle queries like "Help me rotate this PDF," the analysis shows: + +1. Rotating a PDF requires re-writing the same code each time +2. A `scripts/rotate_pdf.py` script would be helpful to store in the skill + +Example: When designing a `frontend-webapp-builder` skill for queries like "Build me a todo app" or "Build me a dashboard to track my steps," the analysis shows: + +1. Writing a frontend webapp requires the same boilerplate HTML/React each time +2. An `assets/hello-world/` template containing the boilerplate HTML/React project files would be helpful to store in the skill + +Example: When building a `big-query` skill to handle queries like "How many users have logged in today?" the analysis shows: + +1. Querying BigQuery requires re-discovering the table schemas and relationships each time +2. A `references/schema.md` file documenting the table schemas would be helpful to store in the skill + +To establish the skill's contents, analyze each concrete example to create a list of the reusable resources to include: scripts, references, and assets. + +### Step 3: Initializing the Skill + +At this point, it is time to actually create the skill. + +Skip this step only if the skill being developed already exists, and iteration or packaging is needed. In this case, continue to the next step. + +When creating a new skill from scratch, always run the `init_skill.py` script. The script conveniently generates a new template skill directory that automatically includes everything a skill requires, making the skill creation process much more efficient and reliable. 

+
+Usage:
+
+```bash
+scripts/init_skill.py <skill-name> --path <output-directory>
+```
+
+The script:
+
+- Creates the skill directory at the specified path
+- Generates a SKILL.md template with proper frontmatter and TODO placeholders
+- Creates example resource directories: `scripts/`, `references/`, and `assets/`
+- Adds example files in each directory that can be customized or deleted
+
+After initialization, customize or remove the generated SKILL.md and example files as needed.
+
+### Step 4: Edit the Skill
+
+When editing the (newly-generated or existing) skill, remember that the skill is being created for another instance of Claude to use. Focus on including information that would be beneficial and non-obvious to Claude. Consider what procedural knowledge, domain-specific details, or reusable assets would help another Claude instance execute these tasks more effectively.
+
+#### Start with Reusable Skill Contents
+
+To begin implementation, start with the reusable resources identified above: `scripts/`, `references/`, and `assets/` files. Note that this step may require user input. For example, when implementing a `brand-guidelines` skill, the user may need to provide brand assets or templates to store in `assets/`, or documentation to store in `references/`.
+
+Also, delete any example files and directories not needed for the skill. The initialization script creates example files in `scripts/`, `references/`, and `assets/` to demonstrate structure, but most skills won't need all of them.
+
+#### Update SKILL.md
+
+**Writing Style:** Write the entire skill using **imperative/infinitive form** (verb-first instructions), not second person. Use objective, instructional language (e.g., "To accomplish X, do Y" rather than "You should do X" or "If you need to do X"). This maintains consistency and clarity for AI consumption.
+
+To complete SKILL.md, answer the following questions:
+
+1. What is the purpose of the skill, in a few sentences?
+2. When should the skill be used?
+3. In practice, how should Claude use the skill? All reusable skill contents developed above should be referenced so that Claude knows how to use them.
+
+### Step 5: Packaging a Skill
+
+Once the skill is ready, it should be packaged into a distributable zip file that gets shared with the user. The packaging process automatically validates the skill first to ensure it meets all requirements:
+
+```bash
+scripts/package_skill.py <path/to/skill>
+```
+
+Optional output directory specification:
+
+```bash
+scripts/package_skill.py <path/to/skill> ./dist
+```
+
+The packaging script will:
+
+1. **Validate** the skill automatically, checking:
+   - YAML frontmatter format and required fields
+   - Skill naming conventions and directory structure
+   - Description completeness and quality
+   - File organization and resource references
+
+2. **Package** the skill if validation passes, creating a zip file named after the skill (e.g., `my-skill.zip`) that includes all files and maintains the proper directory structure for distribution.
+
+If validation fails, the script will report the errors and exit without creating a package. Fix any validation errors and run the packaging command again.
+
+### Step 6: Iterate
+
+After testing the skill, users may request improvements. Often this happens right after using the skill, with fresh context of how the skill performed.
+
+**Iteration workflow:**
+1. Use the skill on real tasks
+2. Notice struggles or inefficiencies
+3. Identify how SKILL.md or bundled resources should be updated
+4. 
Implement changes and test again diff --git a/skill-creator/scripts/init_skill.py b/skill-creator/scripts/init_skill.py new file mode 100644 index 0000000..af22ef1 --- /dev/null +++ b/skill-creator/scripts/init_skill.py @@ -0,0 +1,108 @@ +#!/usr/bin/env python3 +""" +Skill Initialization Script + +Creates a new skill directory with the proper structure and template files. + +Usage: + python init_skill.py --path + +Example: + python init_skill.py my-awesome-skill --path ~/.opencode/skills/ +""" + +import argparse +import os +from pathlib import Path + + +SKILL_TEMPLATE = '''--- +name: {skill_name} +description: TODO: Add a clear description of what this skill does and when it should be used. Use third-person (e.g., "This skill should be used when...") +--- + +# {skill_title} + +TODO: Add the main content of your skill here. + +## When to Use This Skill + +TODO: Describe the scenarios when this skill should be triggered. + +## How to Use + +TODO: Provide instructions on how to use this skill effectively. + +## References + +TODO: List any reference files in the `references/` directory that Claude should load when needed. + +## Scripts + +TODO: List any scripts in the `scripts/` directory that can be executed. + +## Assets + +TODO: List any assets in the `assets/` directory that are used in output. +''' + + +def create_skill(skill_name: str, output_path: str) -> None: + """Create a new skill directory with template files.""" + + skill_dir = Path(output_path) / skill_name + + if skill_dir.exists(): + print(f"Error: Directory already exists: {skill_dir}") + return + + # Create directory structure + skill_dir.mkdir(parents=True) + (skill_dir / "scripts").mkdir() + (skill_dir / "references").mkdir() + (skill_dir / "assets").mkdir() + + # Create SKILL.md + skill_title = skill_name.replace("-", " ").title() + skill_content = SKILL_TEMPLATE.format( + skill_name=skill_name, + skill_title=skill_title + ) + (skill_dir / "SKILL.md").write_text(skill_content) + + # Create example files + (skill_dir / "scripts" / "example.py").write_text( + '#!/usr/bin/env python3\n"""Example script - delete if not needed."""\n\nprint("Hello from skill!")\n' + ) + (skill_dir / "references" / "example.md").write_text( + "# Example Reference\n\nThis is an example reference file. Delete if not needed.\n" + ) + (skill_dir / "assets" / ".gitkeep").write_text("") + + print(f"Created skill: {skill_dir}") + print(f" - SKILL.md (edit this file)") + print(f" - scripts/ (add executable scripts)") + print(f" - references/ (add reference documentation)") + print(f" - assets/ (add templates, images, etc.)") + + +def main(): + parser = argparse.ArgumentParser( + description="Initialize a new skill directory" + ) + parser.add_argument( + "skill_name", + help="Name of the skill (use kebab-case, e.g., my-awesome-skill)" + ) + parser.add_argument( + "--path", + default=".", + help="Output directory for the skill (default: current directory)" + ) + + args = parser.parse_args() + create_skill(args.skill_name, args.path) + + +if __name__ == "__main__": + main() diff --git a/skill-creator/scripts/package_skill.py b/skill-creator/scripts/package_skill.py new file mode 100644 index 0000000..2c2ec8a --- /dev/null +++ b/skill-creator/scripts/package_skill.py @@ -0,0 +1,138 @@ +#!/usr/bin/env python3 +""" +Skill Packaging Script + +Validates and packages a skill into a distributable zip file. 
+ +Usage: + python package_skill.py [output-directory] + +Example: + python package_skill.py ./my-skill + python package_skill.py ./my-skill ./dist +""" + +import argparse +import os +import re +import sys +import zipfile +from pathlib import Path + + +def validate_skill(skill_path: Path) -> list[str]: + """Validate a skill and return a list of errors.""" + errors = [] + + # Check SKILL.md exists + skill_md = skill_path / "SKILL.md" + if not skill_md.exists(): + errors.append("SKILL.md not found") + return errors + + content = skill_md.read_text() + + # Check YAML frontmatter + if not content.startswith("---"): + errors.append("SKILL.md must start with YAML frontmatter (---)") + return errors + + # Extract frontmatter + parts = content.split("---", 2) + if len(parts) < 3: + errors.append("Invalid YAML frontmatter format") + return errors + + frontmatter = parts[1] + + # Check required fields + if "name:" not in frontmatter: + errors.append("Missing 'name' field in frontmatter") + if "description:" not in frontmatter: + errors.append("Missing 'description' field in frontmatter") + + # Check description quality + if "TODO" in frontmatter: + errors.append("Frontmatter contains TODO placeholders - please complete the description") + + # Check name matches directory + name_match = re.search(r"name:\s*(.+)", frontmatter) + if name_match: + skill_name = name_match.group(1).strip() + if skill_name != skill_path.name: + errors.append(f"Skill name '{skill_name}' doesn't match directory name '{skill_path.name}'") + + # Check body content + body = parts[2] + if "TODO" in body: + errors.append("SKILL.md body contains TODO placeholders - please complete the content") + + return errors + + +def package_skill(skill_path: Path, output_dir: Path) -> Path | None: + """Package a skill into a zip file.""" + + # Validate first + errors = validate_skill(skill_path) + if errors: + print("Validation failed:") + for error in errors: + print(f" - {error}") + return None + + # Create output directory + output_dir.mkdir(parents=True, exist_ok=True) + + # Create zip file + zip_path = output_dir / f"{skill_path.name}.zip" + + with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf: + for file_path in skill_path.rglob("*"): + if file_path.is_file(): + # Skip hidden files and __pycache__ + if any(part.startswith(".") or part == "__pycache__" + for part in file_path.parts): + continue + arcname = file_path.relative_to(skill_path.parent) + zf.write(file_path, arcname) + + return zip_path + + +def main(): + parser = argparse.ArgumentParser( + description="Validate and package a skill" + ) + parser.add_argument( + "skill_path", + help="Path to the skill directory" + ) + parser.add_argument( + "output_dir", + nargs="?", + default=".", + help="Output directory for the zip file (default: current directory)" + ) + + args = parser.parse_args() + + skill_path = Path(args.skill_path).resolve() + output_dir = Path(args.output_dir).resolve() + + if not skill_path.is_dir(): + print(f"Error: Not a directory: {skill_path}") + sys.exit(1) + + print(f"Validating skill: {skill_path.name}") + + zip_path = package_skill(skill_path, output_dir) + + if zip_path: + print(f"Successfully packaged: {zip_path}") + else: + sys.exit(1) + + +if __name__ == "__main__": + main() diff --git a/smart-query/README.md b/smart-query/README.md new file mode 100644 index 0000000..be24272 --- /dev/null +++ b/smart-query/README.md @@ -0,0 +1,37 @@ +# Smart Query + +数据库智能查询技能,支持 SSH 隧道连接。 + +## 依赖 + +```bash +pip install pymysql paramiko sshtunnel +``` + 
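+完成下方配置后,可先运行连接测试脚本验证依赖与连通性:
+
+```bash
+# 在 smart-query 目录下执行
+python scripts/db_connector.py
+```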

+## 配置
+
+编辑 `config/settings.json`,填写数据库连接信息(字段名需与 `settings.json.example` 一致):
+
+```json
+{
+  "ssh": {
+    "host": "跳板机地址",
+    "port": 22,
+    "username": "用户名",
+    "password": "密码",
+    "key_file": null
+  },
+  "database": {
+    "type": "mysql",
+    "host": "数据库地址",
+    "port": 3306,
+    "username": "数据库用户",
+    "password": "密码",
+    "database": "数据库名"
+  }
+}
+```
+
+## 功能
+
+- 执行 SQL 查询
+- 自然语言转 SQL
+- 生成表结构文档
diff --git a/smart-query/SKILL.md b/smart-query/SKILL.md
new file mode 100644
index 0000000..fa265f9
--- /dev/null
+++ b/smart-query/SKILL.md
@@ -0,0 +1,108 @@
+---
+name: smart-query
+description: 智能数据库查询技能。通过SSH隧道连接线上数据库,支持自然语言转SQL、执行查询、表结构探索。当用户需要查询数据库、问数据、看表结构时使用此技能。
+---
+
+# Smart Query - 智能问数
+
+通过 SSH 隧道安全连接线上数据库,支持自然语言查询和 SQL 执行。
+
+## 触发场景
+
+- 用户问"查一下xxx数据"、"帮我看看xxx表"
+- 用户需要查询线上数据库
+- 用户问"有哪些表"、"表结构是什么"
+
+## 快速使用
+
+### 1. 测试连接
+
+```bash
+python .opencode/skills/smart-query/scripts/db_connector.py
+```
+
+### 2. 执行SQL查询
+
+```bash
+python .opencode/skills/smart-query/scripts/query.py "SELECT * FROM table_name LIMIT 10"
+python .opencode/skills/smart-query/scripts/query.py "SHOW TABLES"
+python .opencode/skills/smart-query/scripts/query.py "DESC table_name"
+```
+
+参数:
+- `-n 50`:限制返回行数
+- `-f json`:JSON格式输出
+- `--raw`:输出原始结果(含元信息)
+
+### 3. 生成表结构文档
+
+```bash
+python .opencode/skills/smart-query/scripts/schema_loader.py
+```
+
+生成 `references/schema.md`,包含所有表结构信息。
+
+## 自然语言查询流程
+
+1. **理解用户意图**:分析用户想查什么数据
+2. **查阅表结构**:读取 `references/schema.md` 了解表结构
+3. **生成SQL**:根据表结构编写正确的SQL
+4. **执行查询**:使用 `query.py` 执行
+5. **解读结果**:用通俗语言解释查询结果
+
+## 配置说明
+
+配置文件:`config/settings.json`
+
+```json
+{
+  "ssh": {
+    "host": "SSH跳板机地址",
+    "port": 22,
+    "username": "用户名",
+    "password": "密码",
+    "key_file": null
+  },
+  "database": {
+    "type": "mysql",
+    "host": "数据库内网地址",
+    "port": 3306,
+    "database": "库名",
+    "username": "数据库用户",
+    "password": "数据库密码"
+  }
+}
+```
+
+## 分享给同事
+
+1. 复制整个 `smart-query/` 目录
+2. 同事复制 `config/settings.json.example` 为 `settings.json`
+3. 填入自己的 SSH 和数据库连接信息
+4. 
安装依赖:`pip install paramiko sshtunnel pymysql` + +## 安全提示 + +- `config/settings.json` 包含敏感信息,**不要提交到 Git** +- 建议将 `config/settings.json` 加入 `.gitignore` +- 只执行 SELECT 查询,避免 UPDATE/DELETE 操作 + +## 依赖安装 + +```bash +pip install paramiko sshtunnel pymysql +``` + +## 脚本清单 + +| 脚本 | 用途 | +|------|------| +| `scripts/db_connector.py` | SSH隧道+数据库连接,可单独运行测试连接 | +| `scripts/query.py` | 执行SQL查询,支持表格/JSON输出 | +| `scripts/schema_loader.py` | 加载表结构,生成 schema.md | + +## 参考文档 + +| 文档 | 说明 | +|------|------| +| `references/schema.md` | 数据库表结构(运行 schema_loader.py 生成) | diff --git a/smart-query/assets/.gitkeep b/smart-query/assets/.gitkeep new file mode 100644 index 0000000..e69de29 diff --git a/smart-query/config/settings.json b/smart-query/config/settings.json new file mode 100644 index 0000000..3e08ba3 --- /dev/null +++ b/smart-query/config/settings.json @@ -0,0 +1,21 @@ +{ + "ssh": { + "host": "", + "port": 22, + "username": "", + "password": "", + "key_file": null + }, + "database": { + "type": "mysql", + "host": "127.0.0.1", + "port": 3306, + "database": "your_database", + "username": "your_db_user", + "password": "your_db_password" + }, + "query": { + "max_rows": 100, + "timeout": 30 + } +} diff --git a/smart-query/config/settings.json.example b/smart-query/config/settings.json.example new file mode 100644 index 0000000..8ab0a94 --- /dev/null +++ b/smart-query/config/settings.json.example @@ -0,0 +1,21 @@ +{ + "ssh": { + "host": "your-ssh-host.example.com", + "port": 22, + "username": "your-username", + "password": "your-password", + "key_file": null + }, + "database": { + "type": "mysql", + "host": "127.0.0.1", + "port": 3306, + "database": "your_database", + "username": "your_db_user", + "password": "your_db_password" + }, + "query": { + "max_rows": 100, + "timeout": 30 + } +} diff --git a/smart-query/references/schema.md b/smart-query/references/schema.md new file mode 100644 index 0000000..694b2c5 --- /dev/null +++ b/smart-query/references/schema.md @@ -0,0 +1,3 @@ +# 表结构文档 + +运行 `python scripts/schema_loader.py` 生成此文件。 diff --git a/smart-query/scripts/db_connector.py b/smart-query/scripts/db_connector.py new file mode 100644 index 0000000..7d1bab7 --- /dev/null +++ b/smart-query/scripts/db_connector.py @@ -0,0 +1,124 @@ +#!/usr/bin/env python3 +"""数据库连接器 - 支持直连和SSH隧道两种模式""" + +import json +import sys +from pathlib import Path +from contextlib import contextmanager + +try: + import pymysql +except ImportError as e: + print(f"缺少依赖: {e}") + print("请运行: pip install pymysql") + sys.exit(1) + + +def load_config(): + """加载配置文件""" + config_path = Path(__file__).parent.parent / "config" / "settings.json" + if not config_path.exists(): + print(f"配置文件不存在: {config_path}") + print("请复制 settings.json.example 为 settings.json 并填写配置") + sys.exit(1) + + with open(config_path, "r", encoding="utf-8") as f: + return json.load(f) + + +@contextmanager +def get_db_connection(): + """获取数据库连接(自动判断直连或SSH隧道)""" + config = load_config() + ssh_config = config.get("ssh") + db_config = config["database"] + + use_ssh = ssh_config and ssh_config.get("host") + + if use_ssh: + try: + from sshtunnel import SSHTunnelForwarder + import paramiko + except ImportError: + print("SSH隧道需要额外依赖: pip install paramiko sshtunnel") + sys.exit(1) + + tunnel = SSHTunnelForwarder( + (ssh_config["host"], ssh_config["port"]), + ssh_username=ssh_config["username"], + ssh_password=ssh_config.get("password"), + ssh_pkey=ssh_config.get("key_file"), + remote_bind_address=(db_config["host"], db_config["port"]), + local_bind_address=("127.0.0.1",), + ) + + 
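# 先启动 SSH 隧道,再通过本地转发端口建立数据库连接;退出时先关连接、再停隧道
+        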
try: + tunnel.start() + + connection = pymysql.connect( + host="127.0.0.1", + port=tunnel.local_bind_port, + user=db_config["username"], + password=db_config["password"], + database=db_config["database"], + charset="utf8mb4", + cursorclass=pymysql.cursors.DictCursor, + connect_timeout=config["query"]["timeout"], + ) + + try: + yield connection + finally: + connection.close() + finally: + tunnel.stop() + else: + connection = pymysql.connect( + host=db_config["host"], + port=db_config["port"], + user=db_config["username"], + password=db_config["password"], + database=db_config["database"], + charset="utf8mb4", + cursorclass=pymysql.cursors.DictCursor, + connect_timeout=config["query"]["timeout"], + ) + + try: + yield connection + finally: + connection.close() + + +def test_connection(): + """测试数据库连接""" + config = load_config() + use_ssh = config.get("ssh") and config["ssh"].get("host") + + if use_ssh: + print("正在建立SSH隧道...") + else: + print("正在直连数据库...") + + try: + with get_db_connection() as conn: + print("数据库连接成功!") + with conn.cursor() as cursor: + cursor.execute("SELECT 1 as test") + result = cursor.fetchone() + print(f"测试查询结果: {result}") + + cursor.execute("SHOW TABLES") + tables = cursor.fetchall() + print(f"\n数据库中共有 {len(tables)} 张表:") + for t in tables: + table_name = list(t.values())[0] + print(f" - {table_name}") + return True + except Exception as e: + print(f"连接失败: {e}") + return False + + +if __name__ == "__main__": + test_connection() diff --git a/smart-query/scripts/query.py b/smart-query/scripts/query.py new file mode 100644 index 0000000..6247e69 --- /dev/null +++ b/smart-query/scripts/query.py @@ -0,0 +1,107 @@ +#!/usr/bin/env python3 +"""智能查询主脚本 - 执行SQL并返回结果""" + +import argparse +import json +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).parent)) +from db_connector import get_db_connection, load_config + + +def execute_query(sql: str, max_rows: int = None) -> dict: + """执行SQL查询""" + config = load_config() + if max_rows is None: + max_rows = config["query"]["max_rows"] + + result = { + "success": False, + "sql": sql, + "data": [], + "row_count": 0, + "columns": [], + "message": "" + } + + try: + with get_db_connection() as conn: + with conn.cursor() as cursor: + cursor.execute(sql) + + if sql.strip().upper().startswith("SELECT") or sql.strip().upper().startswith("SHOW") or sql.strip().upper().startswith("DESC"): + rows = cursor.fetchmany(max_rows) + result["data"] = rows + result["row_count"] = len(rows) + if rows: + result["columns"] = list(rows[0].keys()) + + total = cursor.fetchall() + if total: + result["message"] = f"返回 {result['row_count']} 行(共 {result['row_count'] + len(total)} 行,已截断)" + else: + result["message"] = f"返回 {result['row_count']} 行" + else: + conn.commit() + result["row_count"] = cursor.rowcount + result["message"] = f"影响 {cursor.rowcount} 行" + + result["success"] = True + + except Exception as e: + result["message"] = f"查询失败: {str(e)}" + + return result + + +def format_result(result: dict, output_format: str = "table") -> str: + """格式化查询结果""" + if not result["success"]: + return f"错误: {result['message']}" + + if not result["data"]: + return result["message"] + + if output_format == "json": + return json.dumps(result["data"], ensure_ascii=False, indent=2, default=str) + + columns = result["columns"] + rows = result["data"] + + col_widths = {col: len(str(col)) for col in columns} + for row in rows: + for col in columns: + col_widths[col] = max(col_widths[col], len(str(row.get(col, "")))) + + header = " | 
".join(str(col).ljust(col_widths[col]) for col in columns) + separator = "-+-".join("-" * col_widths[col] for col in columns) + + lines = [header, separator] + for row in rows: + line = " | ".join(str(row.get(col, "")).ljust(col_widths[col]) for col in columns) + lines.append(line) + + lines.append(f"\n{result['message']}") + return "\n".join(lines) + + +def main(): + parser = argparse.ArgumentParser(description="智能数据库查询") + parser.add_argument("sql", help="要执行的SQL语句") + parser.add_argument("-n", "--max-rows", type=int, help="最大返回行数") + parser.add_argument("-f", "--format", choices=["table", "json"], default="table", help="输出格式") + parser.add_argument("--raw", action="store_true", help="输出原始JSON结果") + + args = parser.parse_args() + + result = execute_query(args.sql, args.max_rows) + + if args.raw: + print(json.dumps(result, ensure_ascii=False, indent=2, default=str)) + else: + print(format_result(result, args.format)) + + +if __name__ == "__main__": + main() diff --git a/smart-query/scripts/schema_loader.py b/smart-query/scripts/schema_loader.py new file mode 100644 index 0000000..30bc86b --- /dev/null +++ b/smart-query/scripts/schema_loader.py @@ -0,0 +1,111 @@ +#!/usr/bin/env python3 +"""数据库表结构加载器 - 生成表结构文档""" + +import sys +from pathlib import Path +from datetime import datetime + +sys.path.insert(0, str(Path(__file__).parent)) +from db_connector import get_db_connection + + +def get_table_schema(cursor, table_name: str) -> dict: + """获取单张表的结构信息""" + cursor.execute(f"DESCRIBE `{table_name}`") + columns = cursor.fetchall() + + cursor.execute(f"SHOW CREATE TABLE `{table_name}`") + create_sql = cursor.fetchone() + + try: + cursor.execute(f"SELECT COUNT(*) as cnt FROM `{table_name}`") + row_count = cursor.fetchone()["cnt"] + except: + row_count = "未知" + + return { + "name": table_name, + "columns": columns, + "create_sql": create_sql.get("Create Table", ""), + "row_count": row_count + } + + +def generate_schema_markdown(tables: list) -> str: + """生成Markdown格式的表结构文档""" + lines = [ + "# 数据库表结构", + "", + f"> 自动生成于 {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}", + "", + "## 表清单", + "", + "| 表名 | 行数 | 说明 |", + "|------|------|------|", + ] + + for t in tables: + lines.append(f"| `{t['name']}` | {t['row_count']} | |") + + lines.extend(["", "---", ""]) + + for t in tables: + lines.extend([ + f"## {t['name']}", + "", + f"行数: {t['row_count']}", + "", + "### 字段", + "", + "| 字段名 | 类型 | 可空 | 键 | 默认值 | 备注 |", + "|--------|------|------|-----|--------|------|", + ]) + + for col in t["columns"]: + field = col.get("Field", "") + col_type = col.get("Type", "") + null = col.get("Null", "") + key = col.get("Key", "") + default = col.get("Default", "") or "" + extra = col.get("Extra", "") + lines.append(f"| `{field}` | {col_type} | {null} | {key} | {default} | {extra} |") + + lines.extend(["", "---", ""]) + + return "\n".join(lines) + + +def main(): + print("正在连接数据库...") + + try: + with get_db_connection() as conn: + with conn.cursor() as cursor: + cursor.execute("SHOW TABLES") + table_list = cursor.fetchall() + + tables = [] + for t in table_list: + table_name = list(t.values())[0] + print(f" 加载表结构: {table_name}") + schema = get_table_schema(cursor, table_name) + tables.append(schema) + + markdown = generate_schema_markdown(tables) + + output_path = Path(__file__).parent.parent / "references" / "schema.md" + output_path.parent.mkdir(parents=True, exist_ok=True) + + with open(output_path, "w", encoding="utf-8") as f: + f.write(markdown) + + print(f"\n表结构文档已生成: {output_path}") + print(f"共 {len(tables)} 张表") + 
+ except Exception as e: + print(f"加载失败: {e}") + sys.exit(1) + + +if __name__ == "__main__": + main() diff --git a/story-to-scenes/SKILL.md b/story-to-scenes/SKILL.md new file mode 100644 index 0000000..44afa54 --- /dev/null +++ b/story-to-scenes/SKILL.md @@ -0,0 +1,317 @@ +--- +name: story-to-scenes +description: 长文本拆镜批量生图引擎。将故事、课程、连环画脚本智能拆分场景,批量生成风格统一、角色一致的配图。当用户提到「拆镜生图」「故事配图」「批量场景图」「连环画生成」「绘本生成」时使用此技能。 +--- + +# Story To Scenes + +长文本拆镜批量生图引擎,用于将故事、教学课程、连环画脚本等长文本智能拆分成场景,并批量生成风格统一、角色一致的配图。 + +## 核心流程 + +``` +输入长文本 → 角色提取 → 生成角色胚子图 → 确认锁定 + ↓ + 智能拆镜 → 场景清单 → 确认调整 + ↓ + 风格胚子图(第一张场景)→ 确认锁定 + ↓ + 场景胚子图(复用场景,可选)→ 确认锁定 + ↓ + 批量生成场景图(引用胚子)→ 输出图集 +``` + +## 铁律 + +1. **单图原则**:每个场景/角色生成独立单图,禁止多格拼接、分镜框、边框组合 +2. **先人后景**:必须先生成并锁定角色胚子,再进行场景生图 +3. **确认才锁定**:角色胚子、风格胚子必须用户确认后才算锁定 +4. **引用生成**:场景中出现已锁定角色时,必须引用其胚子图 +5. **提示词记录**:每张图的完整提示词必须记录,方便复用和微调 +6. **进度持久化**:生成过程实时保存进度,支持断点续传 + +## 详细步骤 + +### Step 1: 项目初始化 + +收集项目基本信息: + +```yaml +project_name: "" # 项目名称(必填) +style_preset: "" # 预设风格或自定义描述 +aspect_ratio: "3:4" # 尺寸:3:4 / 16:9 / 1:1 +source_text: "" # 原文内容或文件路径 +``` + +创建项目目录: +``` +assets/generated/{项目名}/ +├── characters/ # 角色胚子图 +├── locations/ # 场景胚子图 +├── scenes/ # 场景配图 +├── characters.md # 角色索引 +├── progress.json # 生成进度 +└── gallery.md # 完整图集索引 +``` + +### Step 2: 文本解析与角色提取 + +1. 自动识别文本类型:故事/课程/脚本/连环画 +2. 按语义分割场景(非机械按段落切) +3. 提取所有人物/动物/生物,生成角色清单表 + +输出角色清单表格式: + +| 角色名 | 类型 | 外貌特征 | 性格标签 | 出场场景 | +|--------|------|----------|----------|----------| +| 示例 | 人物/动物 | 详细外貌描述 | 性格关键词 | 1,2,3 | + +**交互点**:展示清单,让用户补充/修正角色描述,确认后进入下一步。 + +### Step 3: 生成角色胚子图 + +为每个角色生成标准立绘: + +- **构图要求**:正面或四分之三侧面,干净纯色背景 +- **画面要求**:单图,突出角色本体特征,禁止多角度拼接 +- **命名规范**:`{项目名}_char_{角色名}.png` +- **存储位置**:`assets/generated/{项目名}/characters/` + +生图提示词结构: +``` +[风格关键词], single character portrait of [角色描述], +[姿态], clean solid [背景色] background, +full body shot, character design sheet style, +--no multiple views, turnaround, collage, grid, panels, border +``` + +**交互点**:逐个展示角色胚子图,用户确认"OK"后锁定,不满意则重新生成。 + +将确认的角色信息写入 `characters.md`: +```markdown +# 角色索引 + +## {角色名} +- **胚子图**:characters/{角色名}.png +- **外貌描述**:{详细外貌} +- **出场场景**:{场景序号列表} +``` + +### Step 4: 智能拆镜 + +根据文本语义划分场景,生成场景清单表: + +| 序号 | 场景名称 | 画面描述 | 出场角色 | 镜头类型 | 情绪氛围 | +|------|----------|----------|----------|----------|----------| +| 01 | 场景名 | 具体画面内容 | 角色列表 | 远景/中景/特写 | 情绪关键词 | + +**镜头类型说明**: +- **远景**:交代环境,角色较小 +- **中景**:角色半身或全身,主体突出 +- **特写**:面部表情或关键物品细节 + +**交互点**:展示拆镜表,让用户调整场景划分、镜头选择,确认后进入下一步。 + +### Step 5: 生成风格胚子图 + +用第一个场景生成**风格定调图**: + +1. 根据用户选择的风格预设(或自定义描述)构建提示词 +2. 生成第一张场景图 +3. 展示给用户确认 + +**交互点**: +- 确认OK → 提取风格关键词,记录到 `progress.json`,全程复用 +- 不满意 → 调整风格描述,重新生成 + +### Step 6: 场景胚子图(可选) + +识别文本中**反复出现的重要场景**,如: +- 主角的家 +- 重要地标建筑 +- 反复出现的场所 + +为这些场景单独生成环境图: +- **构图要求**:无人物,纯场景环境 +- **存储位置**:`assets/generated/{项目名}/locations/` +- **命名规范**:`{场景名}.png` + +**交互点**:展示场景胚子图,确认或跳过。 + +### Step 7: 批量生成场景图 + +逐场景生成配图,采用**图生图**方式保证角色一致性。 + +#### 单角色场景 + +直接基于该角色胚子图做图生图: +```bash +python image_to_image.py "characters/角色A.png" "场景描述,保持角色形象..." -o "scenes/scene_xx.png" +``` + +#### 多角色场景(串行替换规则) + +当场景包含多个角色时,必须**串行轮流替换**,逐步锁定每个角色: + +``` +步骤1:基于角色A胚子图 + 场景描述 → 生成 role1.png(角色A锁定,其他角色可能不一致) +步骤2:基于角色B胚子图 + role1.png → 生成 role2.png(角色A+B锁定) +步骤3:基于角色C胚子图 + role2.png → 生成 role3.png(角色A+B+C锁定) +...依此类推,直到所有角色都替换完成 +最终输出:role{n}.png 作为该场景的最终图 +``` + +**执行示例**(3角色场景): +```bash +# 第1轮:锁定孙悟空 +python image_to_image.py "characters/孙悟空.png" "场景描述,孙悟空xxx,唐僧xxx,猪八戒xxx..." 
-o "scenes/scene_xx_role1.png" + +# 第2轮:基于role1锁定唐僧 +python image_to_image.py "characters/唐僧.png" "保持场景和其他角色,替换唐僧形象与参考图一致" --ref "scenes/scene_xx_role1.png" -o "scenes/scene_xx_role2.png" + +# 第3轮:基于role2锁定猪八戒 +python image_to_image.py "characters/猪八戒.png" "保持场景和其他角色,替换猪八戒形象与参考图一致" --ref "scenes/scene_xx_role2.png" -o "scenes/scene_xx_role3.png" + +# 最终重命名 +mv "scenes/scene_xx_role3.png" "scenes/scene_xx_场景名.png" +# 清理中间文件 +rm "scenes/scene_xx_role1.png" "scenes/scene_xx_role2.png" +``` + +**角色替换顺序**:按重要性或画面占比从大到小排序 + +#### 提示词规范 + +提示词结构: +``` +[风格关键词], [场景描述], +[角色A描述], [角色B描述], +[镜头构图], [情绪氛围], +--no multiple panels, comic layout, grid, collage, split frame, border, manga panels, text, caption, title, subtitle, watermark, signature, letters, words, writing +``` + +**铁律**: +- 禁止输出任何文字、标题、水印、签名 +- 排除词必须包含 `text, caption, title, subtitle, watermark, signature, letters, words, writing` + +**命名规范**:`scene_{序号}_{场景名}.png` +**存储位置**:`assets/generated/{项目名}/scenes/` + +**进度追踪**: +- 每生成一张,更新 `progress.json` +- 失败自动重试(最多3次) + +### Step 8: 输出整理 + +生成完成后,创建图集索引文档 `gallery.md`: + +```markdown +# {项目名} 场景图集 + +## 项目信息 +- **风格**:{风格描述} +- **尺寸**:{尺寸} +- **场景数**:{总数} +- **生成日期**:{日期} + +## 角色一览 +| 角色 | 胚子图 | +|------|--------| +| {角色名} | ![[characters/{角色名}.png]] | + +## 场景图集 + +### Scene 01:{场景名} +![[scenes/scene_01_{场景名}.png]] +> {场景描述} + +
+<details>
+<summary>提示词</summary>
+
+{完整提示词}
+
+</details>
+
+``` + +## 特殊操作命令 + +| 命令 | 说明 | +|------|------| +| `重新生成 {角色名}` | 重新生成指定角色胚子图 | +| `重新生成 scene_{序号}` | 重新生成指定场景图 | +| `从 scene_{序号} 继续` | 断点续传,从指定场景继续 | +| `更换风格` | 重新选择风格,需重新生成风格胚子 | +| `导出图集` | 生成最终索引文档 | + +## 预设风格 + +可选择以下预设风格,或提供自定义风格描述: + +### 日系治愈绘本 +``` +soft watercolor illustration, warm pastel colors, gentle lighting, +Studio Ghibli inspired, dreamy atmosphere, delicate linework +``` + +### 国风水墨淡彩 +``` +traditional Chinese ink wash painting, subtle watercolor tints, +elegant brushwork, Song dynasty aesthetic, zen atmosphere +``` + +### 欧美儿童插画 +``` +vibrant children's book illustration, bold colors, expressive characters, +playful style, Pixar-inspired, warm and inviting +``` + +### 赛博朋克 +``` +cyberpunk aesthetic, neon lights, dark atmosphere, +high contrast, futuristic cityscape, Blade Runner inspired +``` + +### 扁平矢量风 +``` +flat vector illustration, clean geometric shapes, +modern minimalist, limited color palette, graphic design style +``` + +### 水彩手绘风 +``` +traditional watercolor painting, visible brush strokes, +organic textures, artistic imperfections, soft edges +``` + +## 文件结构 + +生成项目的完整目录结构: + +``` +assets/generated/{项目名}/ +├── characters/ # 角色胚子图 +│ ├── {角色A}.png +│ └── {角色B}.png +├── locations/ # 场景胚子图(可选) +│ └── {场景名}.png +├── scenes/ # 场景配图 +│ ├── scene_01_{场景名}.png +│ ├── scene_02_{场景名}.png +│ └── ... +├── characters.md # 角色索引表 +├── progress.json # 生成进度 +└── gallery.md # 完整图集索引 +``` + +## 依赖技能 + +- `image-service`:实际生图执行 +- `obsidian-markdown`:生成索引文档(可选) + +## References + +- `references/prompt_templates.md`:提示词模板库 +- `references/style_presets.md`:风格预设详情 + +## Assets + +- `assets/templates/gallery_template.md`:图集索引模板 +- `assets/templates/characters_template.md`:角色索引模板 diff --git a/story-to-scenes/assets/.gitkeep b/story-to-scenes/assets/.gitkeep new file mode 100644 index 0000000..e69de29 diff --git a/story-to-scenes/assets/templates/characters_template.md b/story-to-scenes/assets/templates/characters_template.md new file mode 100644 index 0000000..b6dafbe --- /dev/null +++ b/story-to-scenes/assets/templates/characters_template.md @@ -0,0 +1,53 @@ +# {{PROJECT_NAME}} 角色索引 + +## 项目信息 + +- **项目名称**:{{PROJECT_NAME}} +- **角色总数**:{{CHARACTER_COUNT}} +- **创建日期**:{{DATE}} + +--- + +{{#CHARACTERS}} +## {{NAME}} + +![[characters/{{NAME}}.png|300]] + +### 基本信息 + +| 属性 | 描述 | +|------|------| +| **类型** | {{TYPE}} | +| **外貌特征** | {{APPEARANCE}} | +| **性格标签** | {{PERSONALITY}} | +| **出场场景** | {{SCENES}} | + +### 详细描述 + +{{FULL_DESCRIPTION}} + +### 生成提示词 + +``` +{{PROMPT}} +``` + +### 状态 + +- [x] 胚子图已生成 +- [x] 用户已确认 +- **锁定时间**:{{LOCKED_TIME}} + +--- + +{{/CHARACTERS}} + +## 角色关系图(可选) + +``` +待补充角色关系描述 +``` + +--- + +*由 story-to-scenes 技能自动生成* diff --git a/story-to-scenes/assets/templates/gallery_template.md b/story-to-scenes/assets/templates/gallery_template.md new file mode 100644 index 0000000..3e223ba --- /dev/null +++ b/story-to-scenes/assets/templates/gallery_template.md @@ -0,0 +1,60 @@ +# {{PROJECT_NAME}} 场景图集 + +## 项目信息 + +| 属性 | 值 | +|------|------| +| **项目名称** | {{PROJECT_NAME}} | +| **风格** | {{STYLE_PRESET}} | +| **尺寸** | {{ASPECT_RATIO}} | +| **场景数** | {{SCENE_COUNT}} | +| **生成日期** | {{DATE}} | + +--- + +## 角色一览 + +| 角色 | 胚子图 | 描述 | +|------|--------|------| +{{#CHARACTERS}} +| {{NAME}} | ![[characters/{{NAME}}.png\|200]] | {{DESCRIPTION}} | +{{/CHARACTERS}} + +--- + +## 场景图集 + +{{#SCENES}} +### Scene {{INDEX}}:{{SCENE_NAME}} + +![[scenes/scene_{{INDEX}}_{{SCENE_NAME}}.png]] + +> {{SCENE_DESCRIPTION}} + +**出场角色**:{{CHARACTERS}} +**镜头类型**:{{SHOT_TYPE}} 
+**情绪氛围**:{{MOOD}}
+
+<details>
+<summary>生成提示词</summary>
+
+```
+{{PROMPT}}
+```
+
+</details>
+ +--- + +{{/SCENES}} + +## 生成记录 + +- **开始时间**:{{START_TIME}} +- **完成时间**:{{END_TIME}} +- **总耗时**:{{DURATION}} +- **重试次数**:{{RETRY_COUNT}} + +--- + +*由 story-to-scenes 技能自动生成* diff --git a/story-to-scenes/assets/templates/progress_template.json b/story-to-scenes/assets/templates/progress_template.json new file mode 100644 index 0000000..ae396a7 --- /dev/null +++ b/story-to-scenes/assets/templates/progress_template.json @@ -0,0 +1,38 @@ +{ + "project_name": "{{PROJECT_NAME}}", + "created_at": "{{DATE}}", + "updated_at": "{{DATE}}", + "config": { + "style_preset": "{{STYLE_PRESET}}", + "aspect_ratio": "{{ASPECT_RATIO}}", + "source_file": "{{SOURCE_FILE}}" + }, + "status": "in_progress", + "phase": "init", + "characters": { + "total": 0, + "completed": 0, + "locked": [], + "pending": [], + "items": [] + }, + "locations": { + "total": 0, + "completed": 0, + "locked": [], + "items": [] + }, + "scenes": { + "total": 0, + "completed": 0, + "current": 0, + "failed": [], + "items": [] + }, + "style": { + "locked": false, + "keywords": "", + "reference_image": "" + }, + "logs": [] +} diff --git a/story-to-scenes/references/prompt_templates.md b/story-to-scenes/references/prompt_templates.md new file mode 100644 index 0000000..3dcdab6 --- /dev/null +++ b/story-to-scenes/references/prompt_templates.md @@ -0,0 +1,157 @@ +# 提示词模板库 + +本文档包含 story-to-scenes 技能使用的标准提示词模板。 + +## 角色胚子图模板 + +### 人物角色 +``` +[风格关键词], single character portrait of [角色名], +[性别] [年龄段] [种族/民族], +[发型发色], [眼睛描述], [服装描述], [配饰描述], +[表情/姿态], standing pose, +clean solid [背景色] background, +full body shot, character design reference, +high quality, detailed, +--no multiple views, turnaround, collage, grid, panels, border, frame, split image +``` + +### 动物角色 +``` +[风格关键词], single character portrait of [动物名], +[动物种类], [体型大小], [毛色/皮肤颜色], +[特殊特征如斑纹、角、翅膀等], +[拟人化服装/配饰(如有)], +[表情/姿态], +clean solid [背景色] background, +full body shot, character design reference, +high quality, detailed, +--no multiple views, turnaround, collage, grid, panels, border, frame, split image +``` + +### 幻想生物 +``` +[风格关键词], single character portrait of [生物名], +[生物类型], [体型], [颜色], +[独特特征描述], +[魔法元素/光效(如有)], +[表情/姿态], +clean solid [背景色] background, +full body shot, creature design reference, +high quality, detailed, +--no multiple views, turnaround, collage, grid, panels, border, frame, split image +``` + +## 场景胚子图模板 + +### 室内场景 +``` +[风格关键词], interior scene of [场景名], +[房间类型], [建筑风格], +[主要家具/物品], [装饰细节], +[光线条件], [时间氛围], +[情绪氛围], empty scene without characters, +wide shot, establishing shot, +high quality, detailed environment, +--no people, characters, figures, panels, border, frame, collage +``` + +### 室外场景 +``` +[风格关键词], exterior scene of [场景名], +[地点类型], [自然/城市环境], +[主要地标/元素], [植被/建筑], +[天气条件], [时间(日/夜)], +[情绪氛围], empty scene without characters, +wide shot, establishing shot, +high quality, detailed environment, +--no people, characters, figures, panels, border, frame, collage +``` + +## 故事场景图模板 + +### 远景 +``` +[风格关键词], wide establishing shot, +[场景环境描述], +[角色A描述] and [角色B描述] in the distance, +[角色动作/位置关系], +[光线条件], [天气/时间], +[情绪氛围], +cinematic composition, environmental storytelling, +--no multiple panels, comic layout, grid, collage, split frame, border, manga panels +``` + +### 中景 +``` +[风格关键词], medium shot, +[场景环境简述], +[角色A描述] [动作/表情], +[角色B描述] [动作/表情], +[角色互动/位置关系], +[光线条件], [情绪氛围], +balanced composition, narrative scene, +--no multiple panels, comic layout, grid, collage, split frame, border, manga panels +``` + +### 特写 +``` +[风格关键词], close-up shot, +[角色描述] [表情细节], 
+[关键物品/细节(如有)], +[背景虚化/简化处理], +[光线条件], [情绪氛围], +emotional focus, intimate framing, +--no multiple panels, comic layout, grid, collage, split frame, border, manga panels +``` + +## 排除词标准集 + +### 反多格拼接 +``` +--no multiple panels, comic layout, grid, collage, split frame, border, manga panels, +comic book style, sequential art, storyboard, multi-panel, divided image, frames +``` + +### 反多视角 +``` +--no multiple views, turnaround, front and back, side view combination, +character sheet with poses, reference sheet, model sheet +``` + +### 反人物(用于纯场景) +``` +--no people, characters, figures, humans, animals, creatures, silhouettes +``` + +## 风格关键词组合 + +### 温馨治愈系 +``` +soft lighting, warm color palette, gentle atmosphere, +cozy feeling, heartwarming scene, peaceful mood +``` + +### 紧张悬疑系 +``` +dramatic lighting, high contrast, tense atmosphere, +mysterious shadows, suspenseful mood, cinematic tension +``` + +### 欢快活泼系 +``` +bright colors, dynamic composition, joyful atmosphere, +energetic mood, playful scene, vibrant lighting +``` + +### 忧伤抒情系 +``` +muted colors, soft focus, melancholic atmosphere, +gentle rain or mist, contemplative mood, emotional depth +``` + +### 史诗宏大系 +``` +epic scale, dramatic sky, grand composition, +majestic atmosphere, awe-inspiring, cinematic scope +``` diff --git a/story-to-scenes/references/style_presets.md b/story-to-scenes/references/style_presets.md new file mode 100644 index 0000000..87b46e8 --- /dev/null +++ b/story-to-scenes/references/style_presets.md @@ -0,0 +1,235 @@ +# 风格预设详情 + +本文档包含 story-to-scenes 技能支持的风格预设完整描述。 + +## 日系治愈绘本 + +**适用场景**:儿童故事、治愈系绘本、温馨日常 + +**核心关键词**: +``` +soft watercolor illustration, warm pastel colors, gentle lighting, +Studio Ghibli inspired, dreamy atmosphere, delicate linework, +hand-painted texture, nostalgic feeling, cozy atmosphere +``` + +**色调特点**: +- 主色调:暖黄、淡粉、天蓝、草绿 +- 饱和度:中低 +- 对比度:柔和 + +**参考风格**:宫崎骏动画、绘本插画家 iwasaki chihiro + +--- + +## 国风水墨淡彩 + +**适用场景**:中国传统故事、古风题材、诗词配图 + +**核心关键词**: +``` +traditional Chinese ink wash painting, subtle watercolor tints, +elegant brushwork, Song dynasty aesthetic, zen atmosphere, +xieyi style, flowing ink, bamboo paper texture, +oriental landscape, classical Chinese art +``` + +**色调特点**: +- 主色调:墨黑、宣纸白、淡青、赭石 +- 饱和度:低 +- 对比度:通过墨色浓淡体现 + +**参考风格**:齐白石、张大千、古代工笔画 + +--- + +## 欧美儿童插画 + +**适用场景**:欧美风格童话、教育类绘本、角色动画 + +**核心关键词**: +``` +vibrant children's book illustration, bold colors, expressive characters, +playful style, Pixar-inspired, warm and inviting, +rounded shapes, friendly characters, storybook illustration, +rich textures, imaginative scene +``` + +**色调特点**: +- 主色调:明亮的原色、糖果色 +- 饱和度:中高 +- 对比度:清晰但不刺眼 + +**参考风格**:皮克斯、迪士尼、Mary Blair + +--- + +## 赛博朋克 + +**适用场景**:科幻故事、未来都市、反乌托邦题材 + +**核心关键词**: +``` +cyberpunk aesthetic, neon lights, dark atmosphere, +high contrast, futuristic cityscape, Blade Runner inspired, +holographic displays, rain-slicked streets, megacity, +tech noir, dystopian future, glowing signs +``` + +**色调特点**: +- 主色调:霓虹粉、电光蓝、毒药绿、深黑 +- 饱和度:高(霓虹区域)vs 低(暗部) +- 对比度:极高 + +**参考风格**:银翼杀手、攻壳机动队、赛博朋克2077 + +--- + +## 扁平矢量风 + +**适用场景**:商业插画、信息图、现代简约风格 + +**核心关键词**: +``` +flat vector illustration, clean geometric shapes, +modern minimalist, limited color palette, graphic design style, +bold outlines, simple shapes, contemporary illustration, +digital art, clean edges, stylized +``` + +**色调特点**: +- 主色调:根据项目定制,通常3-5色 +- 饱和度:中 +- 对比度:清晰的色块分割 + +**参考风格**:Airbnb插画、Slack插画、现代UI设计 + +--- + +## 水彩手绘风 + +**适用场景**:艺术感强的绘本、诗意场景、自然题材 + +**核心关键词**: +``` +traditional watercolor painting, visible brush strokes, 
+organic textures, artistic imperfections, soft edges, +wet-on-wet technique, pigment blooms, natural flow, +handmade quality, painterly style +``` + +**色调特点**: +- 主色调:自然色系为主 +- 饱和度:随水彩浓淡变化 +- 对比度:柔和渐变 + +**参考风格**:传统水彩画家、自然插画 + +--- + +## 复古美漫风 + +**适用场景**:超级英雄故事、美式冒险、怀旧题材 + +**核心关键词**: +``` +vintage American comic book style, bold ink lines, +halftone dots, primary colors, retro illustration, +classic superhero comics, 1960s aesthetic, +dynamic poses, action lines, speech bubble ready +``` + +**色调特点**: +- 主色调:红、蓝、黄、黑 +- 饱和度:高 +- 对比度:强烈 + +**参考风格**:Jack Kirby、经典漫威/DC漫画 + +--- + +## 韩系唯美风 + +**适用场景**:浪漫故事、青春题材、都市情感 + +**核心关键词**: +``` +Korean webtoon style, soft gradients, romantic atmosphere, +beautiful character design, sparkling effects, +pastel backgrounds, emotional scenes, manhwa inspired, +lens flare, dreamy bokeh, aesthetic lighting +``` + +**色调特点**: +- 主色调:粉色系、天蓝、薰衣草紫 +- 饱和度:中 +- 对比度:柔和 + +**参考风格**:韩国Webtoon、少女漫画 + +--- + +## 暗黑哥特风 + +**适用场景**:恐怖故事、黑暗童话、悬疑题材 + +**核心关键词**: +``` +dark gothic illustration, moody atmosphere, +Victorian aesthetic, haunting beauty, shadowy scenes, +ornate details, macabre elements, candlelight, +fog and mist, dramatic shadows, eerie mood +``` + +**色调特点**: +- 主色调:黑、深紫、暗红、烛光黄 +- 饱和度:低 +- 对比度:高(光影对比) + +**参考风格**:Tim Burton、Edward Gorey + +--- + +## 像素复古风 + +**适用场景**:游戏相关故事、怀旧题材、8-bit美学 + +**核心关键词**: +``` +pixel art style, retro game aesthetic, 16-bit graphics, +limited color palette, nostalgic gaming, +crisp pixels, chiptune era, classic RPG style, +sprite-like characters, pixelated environment +``` + +**色调特点**: +- 主色调:根据复古游戏调色板 +- 饱和度:中 +- 对比度:清晰的像素边界 + +**参考风格**:经典任天堂、像素艺术家 eBoy + +--- + +## 自定义风格指南 + +如果预设风格不满足需求,可提供自定义风格描述: + +### 描述要素 + +1. **画风类型**:油画/水彩/数字绘画/素描等 +2. **色调倾向**:暖/冷/中性,饱和度高低 +3. **光影风格**:柔和/戏剧性/平面化 +4. **笔触特点**:细腻/粗犷/可见/隐藏 +5. **参考艺术家/作品**:具体的风格参照 +6. **情绪氛围**:整体想要传达的感觉 + +### 示例 + +``` +自定义风格:梵高星空风 +描述:post-impressionist style, swirling brushstrokes, +vibrant yellows and blues, expressive texture, +Van Gogh inspired, emotional intensity, visible paint strokes, +starry night aesthetic, dynamic movement +``` diff --git a/uni-agent/README.md b/uni-agent/README.md new file mode 100644 index 0000000..4dc72ae --- /dev/null +++ b/uni-agent/README.md @@ -0,0 +1,189 @@ +# UniAgent - 统一智能体协议适配层 + +"Connect Any Agent, Any Protocol" + +一套 API 调用所有 Agent 协议(ANP/MCP/A2A/AITP/LMOS/Agent Protocol)。 + +## 一键部署 + +```bash +# 1. 运行安装脚本 +./setup.sh + +# 2. 
开始使用 +python scripts/uni_cli.py list +``` + +## 使用方式 + +### 调用 Agent + +```bash +# Agent ID 格式: @ + +# ANP - 去中心化 Agent 网络 +python scripts/uni_cli.py call amap@anp maps_weather '{"city":"北京"}' +python scripts/uni_cli.py call amap@anp maps_text_search '{"keywords":"咖啡厅","city":"上海"}' + +# MCP - LLM 工具调用 (需配置) +python scripts/uni_cli.py call filesystem@mcp read_file '{"path":"/tmp/a.txt"}' + +# A2A - Google Agent 协作 (需配置) +python scripts/uni_cli.py call assistant@a2a tasks/send '{"message":{"role":"user","content":"hello"}}' + +# AITP - NEAR 交互交易 (需配置) +python scripts/uni_cli.py call shop@aitp message '{"content":"我要买咖啡"}' + +# Agent Protocol - REST API (需配置) +python scripts/uni_cli.py call autogpt@ap create_task '{"input":"写一个hello world"}' + +# LMOS - 企业级 Agent (需配置) +python scripts/uni_cli.py call sales@lmos invoke '{"capability":"sales","input":{}}' +``` + +### 查看 Agent 方法 + +```bash +python scripts/uni_cli.py methods amap@anp +``` + +### 发现 Agent + +```bash +python scripts/uni_cli.py discover weather +``` + +### 列出已注册 Agent + +```bash +python scripts/uni_cli.py list +``` + +## 支持的协议 + +| 协议 | 状态 | 说明 | +|------|------|------| +| **ANP** | ✅ 已实现 | Agent Network Protocol - 去中心化身份 + Agent 网络 | +| **MCP** | ✅ 已实现 | Model Context Protocol - LLM 工具调用 | +| **A2A** | ✅ 已实现 | Agent-to-Agent - Google 的 Agent 间协作协议 | +| **AITP** | ✅ 已实现 | Agent Interaction & Transaction - 交互 + 交易 | +| **Agent Protocol** | ✅ 已实现 | AI Engineer Foundation REST API 标准 | +| **LMOS** | ✅ 已实现 | Language Model OS - Eclipse 企业级 Agent 平台 | + +## 内置 ANP Agent + +| ID | 名称 | 功能 | +|----|------|------| +| amap@anp | 高德地图 | 地点搜索、路线规划、天气查询 | +| kuaidi@anp | 快递查询 | 快递单号追踪 | +| hotel@anp | 酒店预订 | 搜索酒店、查询房价 | +| juhe@anp | 聚合查询 | 多种生活服务 | +| navigation@anp | Agent导航 | 发现更多 Agent | + +## 添加自定义 Agent + +编辑 `config/agents.yaml`: + +```yaml +agents: + # ANP Agent + - id: my_agent + protocol: anp + name: 我的 Agent + ad_url: https://example.com/ad.json + + # MCP Server + - id: filesystem + protocol: mcp + name: 文件系统 + command: npx + args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"] + + # A2A Agent + - id: assistant + protocol: a2a + name: AI Assistant + endpoint: https://example.com/.well-known/agent.json + auth: + type: api_key + api_key: "${A2A_API_KEY}" + + # AITP Agent + - id: shop + protocol: aitp + name: NEAR Shop + endpoint: https://shop.near.ai/api + wallet: + type: near + account_id: "${NEAR_ACCOUNT_ID}" + + # Agent Protocol + - id: autogpt + protocol: agent_protocol # 或 ap + name: AutoGPT + endpoint: http://localhost:8000 + + # LMOS Agent + - id: sales + protocol: lmos + name: 销售 Agent + endpoint: http://sales.internal:8080 +``` + +## 架构设计 + +``` +┌─────────────────────────────────────────────────────────┐ +│ UniAgent │ +│ 统一调用接口 │ +├─────────────────────────────────────────────────────────┤ +│ call(agent_id, method, params) -> result │ +└────────────────────────┬────────────────────────────────┘ + │ + ┌──────────┴──────────┐ + │ Protocol Router │ + └──────────┬──────────┘ + │ + ┌─────────┬───────────┼───────────┬─────────┬─────────┐ + ▼ ▼ ▼ ▼ ▼ ▼ +┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ +│ ANP │ │ MCP │ │ A2A │ │ AITP │ │ AP │ │ LMOS │ +└──────┘ └──────┘ └──────┘ └──────┘ └──────┘ └──────┘ +``` + +## 目录结构 + +``` +uni-agent/ +├── README.md +├── SKILL.md # AI 助手技能描述 +├── setup.sh # 一键安装 +├── requirements.txt +├── config/ +│ ├── agents.yaml # Agent 注册表 +│ └── .gitignore +├── adapters/ +│ ├── __init__.py # 适配器注册 +│ ├── base.py # 适配器基类 +│ ├── anp.py # ANP 适配器 +│ ├── mcp.py # MCP 适配器 +│ ├── a2a.py # A2A 
适配器 +│ ├── aitp.py # AITP 适配器 +│ ├── agent_protocol.py # Agent Protocol 适配器 +│ └── lmos.py # LMOS 适配器 +└── scripts/ + └── uni_cli.py # CLI 工具 +``` + +## 扩展新协议 + +1. 创建 `adapters/new_protocol.py` +2. 继承 `ProtocolAdapter` 基类 +3. 实现 `connect`、`call`、`discover`、`close` 方法 +4. 在 `adapters/__init__.py` 注册 + +详见 [SKILL.md](SKILL.md) + +## License + +MIT diff --git a/uni-agent/SKILL.md b/uni-agent/SKILL.md new file mode 100644 index 0000000..6c51b5c --- /dev/null +++ b/uni-agent/SKILL.md @@ -0,0 +1,279 @@ +--- +name: uni-agent +description: 统一智能体协议适配层。一套 API 调用所有 Agent 协议(ANP/MCP/A2A/AITP 等)。当用户需要调用 Agent、跨协议通信、连接工具时触发此技能。 +--- + +# UniAgent - 统一智能体协议适配层 + +"Connect Any Agent, Any Protocol" + +## 设计理念 + +### 问题 +当前 Agent 协议生态割裂: +- **MCP**:Anthropic 的工具调用协议 +- **A2A**:Google 的 Agent 间协作协议 +- **ANP**:去中心化身份 + Agent 网络协议 +- **AITP**:NEAR 的交互交易协议 +- ... + +开发者需要为每个协议学习不同的 SDK、实现不同的调用逻辑。 + +### 解决方案 +UniAgent 提供统一抽象层,一套 API 适配所有协议: + +```python +from uni_agent import UniAgent + +agent = UniAgent() + +# 调用 ANP Agent +agent.call("amap@anp", "maps_weather", {"city": "北京"}) + +# 调用 MCP Server +agent.call("filesystem@mcp", "read_file", {"path": "/tmp/a.txt"}) + +# 调用 A2A Agent +agent.call("assistant@a2a", "chat", {"message": "hello"}) + +# 调用 AITP Agent(带支付) +agent.call("shop@aitp", "purchase", {"item": "coffee", "amount": 10}) +``` + +## 架构设计 + +``` +┌─────────────────────────────────────────────────────────┐ +│ UniAgent │ +│ 统一调用接口 │ +├─────────────────────────────────────────────────────────┤ +│ call(agent_id, method, params) -> result │ +│ discover(capability) -> List[Agent] │ +│ connect(agent_id) -> Connection │ +└────────────────────────┬────────────────────────────────┘ + │ + ┌──────────┴──────────┐ + │ Protocol Router │ + │ 协议路由 & 适配 │ + └──────────┬──────────┘ + │ + ┌─────────┬───────────┼───────────┬─────────┐ + ▼ ▼ ▼ ▼ ▼ +┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ +│ ANP │ │ MCP │ │ A2A │ │ AITP │ │ ... │ +│Adapter│ │Adapter│ │Adapter│ │Adapter│ │Adapter│ +└──────┘ └──────┘ └──────┘ └──────┘ └──────┘ +``` + +## 核心概念 + +### 1. Agent ID 格式 +``` +@ + +示例: +- amap@anp # ANP 协议的高德地图 Agent +- filesystem@mcp # MCP 协议的文件系统 Server +- gemini@a2a # A2A 协议的 Gemini Agent +- shop@aitp # AITP 协议的商店 Agent +``` + +### 2. 统一调用接口 +```python +result = agent.call( + agent_id="amap@anp", # Agent 标识 + method="maps_weather", # 方法名 + params={"city": "北京"}, # 参数 + timeout=30 # 可选超时 +) +``` + +### 3. 能力发现 +```python +# 发现所有能提供天气服务的 Agent +agents = agent.discover("weather") +# 返回: [ +# {"id": "amap@anp", "protocol": "anp", "methods": [...]}, +# {"id": "weather@mcp", "protocol": "mcp", "methods": [...]} +# ] +``` + +### 4. 
协议适配器接口 +```python +class ProtocolAdapter(ABC): + """协议适配器基类""" + + @abstractmethod + def connect(self, agent_config: dict) -> Connection: + """建立连接""" + pass + + @abstractmethod + def call(self, connection: Connection, method: str, params: dict) -> dict: + """调用方法""" + pass + + @abstractmethod + def discover(self, capability: str) -> List[AgentInfo]: + """发现 Agent""" + pass + + @abstractmethod + def close(self, connection: Connection): + """关闭连接""" + pass +``` + +## 支持的协议 + +| 协议 | 状态 | 适配器 | 说明 | +|------|------|--------|------| +| ANP | ✅ 已实现 | `adapters/anp.py` | 去中心化身份 + Agent 网络 | +| MCP | ✅ 已实现 | `adapters/mcp.py` | LLM 工具调用 | +| A2A | ✅ 已实现 | `adapters/a2a.py` | Agent 间协作 | +| AITP | ✅ 已实现 | `adapters/aitp.py` | 交互 + 交易 | +| Agent Protocol | ✅ 已实现 | `adapters/agent_protocol.py` | REST API | +| LMOS | ✅ 已实现 | `adapters/lmos.py` | 企业级平台 | + +## 使用方式 + +### CLI 调用 + +```bash +# 调用 ANP Agent +python scripts/uni_cli.py call amap@anp maps_weather '{"city":"北京"}' + +# 调用 MCP Server +python scripts/uni_cli.py call filesystem@mcp read_file '{"path":"/tmp/a.txt"}' + +# 发现 Agent +python scripts/uni_cli.py discover weather + +# 列出已注册 Agent +python scripts/uni_cli.py list +``` + +### Python SDK + +```python +from uni_agent import UniAgent + +# 初始化 +agent = UniAgent(config_path="config/agents.yaml") + +# 调用 +result = agent.call("amap@anp", "maps_weather", {"city": "北京"}) +print(result) + +# 批量调用 +results = agent.batch_call([ + ("amap@anp", "maps_weather", {"city": "北京"}), + ("amap@anp", "maps_weather", {"city": "上海"}), +]) +``` + +## 配置文件 + +### config/agents.yaml +```yaml +agents: + # ANP Agents + - id: amap + protocol: anp + ad_url: https://agent-connect.ai/mcp/agents/amap/ad.json + + - id: hotel + protocol: anp + ad_url: https://agent-connect.ai/agents/hotel-assistant/ad.json + + # MCP Servers + - id: filesystem + protocol: mcp + command: npx + args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"] + + - id: github + protocol: mcp + command: npx + args: ["-y", "@modelcontextprotocol/server-github"] + env: + GITHUB_TOKEN: "${GITHUB_TOKEN}" + + # A2A Agents + - id: assistant + protocol: a2a + endpoint: https://example.com/.well-known/agent.json +``` + +### config/identity.yaml +```yaml +# 身份配置(跨协议通用) +identity: + # ANP DID 身份 + anp: + did_document: config/did.json + private_key: config/private-key.pem + + # A2A 认证 + a2a: + auth_type: oauth2 + client_id: "${A2A_CLIENT_ID}" + client_secret: "${A2A_CLIENT_SECRET}" +``` + +## 目录结构 + +``` +uni-agent/ +├── SKILL.md # 本文件 +├── README.md # 使用文档 +├── setup.sh # 一键安装 +├── requirements.txt # Python 依赖 +├── config/ +│ ├── agents.yaml # Agent 注册表 +│ ├── identity.yaml # 身份配置 +│ └── .gitignore +├── adapters/ +│ ├── __init__.py +│ ├── base.py # 适配器基类 +│ ├── anp.py # ANP 适配器 +│ ├── mcp.py # MCP 适配器 +│ ├── a2a.py # A2A 适配器 +│ └── aitp.py # AITP 适配器 +├── scripts/ +│ └── uni_cli.py # CLI 工具 +└── docs/ + ├── architecture.md # 架构文档 + └── adapters.md # 适配器开发指南 +``` + +## 扩展新协议 + +1. 创建适配器文件 `adapters/new_protocol.py` +2. 继承 `ProtocolAdapter` 基类 +3. 实现 `connect`、`call`、`discover`、`close` 方法 +4. 在 `adapters/__init__.py` 注册 + +```python +# adapters/new_protocol.py +from .base import ProtocolAdapter + +class NewProtocolAdapter(ProtocolAdapter): + protocol_name = "new_protocol" + + def connect(self, agent_config): + # 实现连接逻辑 + pass + + def call(self, connection, method, params): + # 实现调用逻辑 + pass + + # ... 
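+
+# 完成后在 adapters/__init__.py 注册(register_adapter 为包内提供的注册函数):
+# from adapters.new_protocol import NewProtocolAdapter
+# register_adapter("new_protocol", NewProtocolAdapter)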
+``` + +## 依赖 + +```bash +pip install anp aiohttp mcp pyyaml +``` diff --git a/uni-agent/adapters/__init__.py b/uni-agent/adapters/__init__.py new file mode 100644 index 0000000..fc86376 --- /dev/null +++ b/uni-agent/adapters/__init__.py @@ -0,0 +1,60 @@ +""" +UniAgent 协议适配器 + +支持的协议: +- ANP: Agent Network Protocol (去中心化身份 + Agent 网络) +- MCP: Model Context Protocol (LLM 工具调用) +- A2A: Agent-to-Agent (Google Agent 间协作) +- AITP: Agent Interaction & Transaction Protocol (交互 + 交易) +- Agent Protocol: 统一 REST API +- LMOS: Language Model OS (企业级 Agent 平台) +""" + +from .base import ProtocolAdapter, Connection, AgentInfo +from .anp import ANPAdapter +from .mcp import MCPAdapter +from .a2a import A2AAdapter +from .aitp import AITPAdapter +from .agent_protocol import AgentProtocolAdapter +from .lmos import LMOSAdapter + +ADAPTERS = { + "anp": ANPAdapter, + "mcp": MCPAdapter, + "a2a": A2AAdapter, + "aitp": AITPAdapter, + "agent_protocol": AgentProtocolAdapter, + "ap": AgentProtocolAdapter, + "lmos": LMOSAdapter, +} + +def get_adapter(protocol: str) -> ProtocolAdapter: + """获取协议适配器""" + adapter_class = ADAPTERS.get(protocol) + if not adapter_class: + raise ValueError(f"不支持的协议: {protocol},可用协议: {list(ADAPTERS.keys())}") + return adapter_class() + +def register_adapter(protocol: str, adapter_class: type): + """注册新的协议适配器""" + ADAPTERS[protocol] = adapter_class + +def list_protocols() -> list: + """列出所有支持的协议""" + return list(set(ADAPTERS.keys())) + +__all__ = [ + "ProtocolAdapter", + "Connection", + "AgentInfo", + "ANPAdapter", + "MCPAdapter", + "A2AAdapter", + "AITPAdapter", + "AgentProtocolAdapter", + "LMOSAdapter", + "get_adapter", + "register_adapter", + "list_protocols", + "ADAPTERS", +] diff --git a/uni-agent/adapters/a2a.py b/uni-agent/adapters/a2a.py new file mode 100644 index 0000000..7c9821f --- /dev/null +++ b/uni-agent/adapters/a2a.py @@ -0,0 +1,225 @@ +""" +A2A (Agent-to-Agent) 适配器 +Google 提出的 Agent 间协作协议 + +参考: https://github.com/google/a2a +""" + +import json +import uuid +from pathlib import Path +from typing import Any, Dict, List, Optional + +import aiohttp + +from .base import ProtocolAdapter, Connection, AgentInfo + + +class A2AAdapter(ProtocolAdapter): + """A2A 协议适配器""" + + protocol_name = "a2a" + + def __init__(self, config_dir: Optional[Path] = None): + self.config_dir = config_dir or Path(__file__).parent.parent / "config" + self._agent_cards: Dict[str, dict] = {} + + async def _fetch_agent_card(self, endpoint: str) -> dict: + """获取 Agent Card""" + if endpoint in self._agent_cards: + return self._agent_cards[endpoint] + + agent_json_url = endpoint.rstrip("/") + if not agent_json_url.endswith("agent.json"): + agent_json_url = f"{agent_json_url}/.well-known/agent.json" + + async with aiohttp.ClientSession() as session: + async with session.get(agent_json_url, timeout=aiohttp.ClientTimeout(total=15)) as resp: + if resp.status == 200: + card = await resp.json() + self._agent_cards[endpoint] = card + return card + raise Exception(f"获取 Agent Card 失败: HTTP {resp.status}") + + async def connect(self, agent_config: dict) -> Connection: + """建立连接""" + endpoint = agent_config.get("endpoint") + if not endpoint: + raise ValueError("A2A Agent 配置必须包含 endpoint") + + agent_card = await self._fetch_agent_card(endpoint) + + rpc_url = None + if "url" in agent_card: + rpc_url = agent_card["url"] + elif "capabilities" in agent_card: + caps = agent_card.get("capabilities", {}) + if "streaming" in caps: + rpc_url = caps.get("streaming", {}).get("streamingUrl") + + if not rpc_url: + rpc_url = 
endpoint.rstrip("/") + "/rpc" + + return Connection( + agent_id=agent_config.get("id", ""), + protocol=self.protocol_name, + endpoint=rpc_url, + session=None, + metadata={ + "agent_card": agent_card, + "original_endpoint": endpoint, + "auth": agent_config.get("auth", {}), + } + ) + + async def call( + self, + connection: Connection, + method: str, + params: dict, + timeout: float = 30.0 + ) -> dict: + """调用 A2A Agent 方法""" + rpc_url = connection.endpoint + auth_config = connection.metadata.get("auth", {}) + + headers = { + "Content-Type": "application/json", + } + + if auth_config.get("type") == "api_key": + headers["Authorization"] = f"Bearer {auth_config.get('api_key', '')}" + elif auth_config.get("type") == "oauth2": + token = await self._get_oauth_token(auth_config) + headers["Authorization"] = f"Bearer {token}" + + task_id = str(uuid.uuid4()) + + if method == "tasks/send": + payload = { + "jsonrpc": "2.0", + "id": task_id, + "method": "tasks/send", + "params": { + "id": task_id, + "message": params.get("message", {}), + } + } + elif method == "tasks/get": + payload = { + "jsonrpc": "2.0", + "id": task_id, + "method": "tasks/get", + "params": { + "id": params.get("task_id", task_id), + } + } + else: + payload = { + "jsonrpc": "2.0", + "id": task_id, + "method": method, + "params": params, + } + + async with aiohttp.ClientSession() as session: + async with session.post( + rpc_url, + json=payload, + headers=headers, + timeout=aiohttp.ClientTimeout(total=timeout) + ) as resp: + if resp.status == 200: + result = await resp.json() + return { + "success": True, + "result": result.get("result", result), + "task_id": task_id, + } + else: + error_text = await resp.text() + return { + "success": False, + "error": f"HTTP {resp.status}: {error_text}", + } + + async def _get_oauth_token(self, auth_config: dict) -> str: + """获取 OAuth2 令牌""" + token_url = auth_config.get("token_url") + client_id = auth_config.get("client_id") + client_secret = auth_config.get("client_secret") + + if not all([token_url, client_id, client_secret]): + raise ValueError("OAuth2 配置不完整") + + async with aiohttp.ClientSession() as session: + async with session.post( + token_url, + data={ + "grant_type": "client_credentials", + "client_id": client_id, + "client_secret": client_secret, + } + ) as resp: + if resp.status == 200: + result = await resp.json() + return result.get("access_token", "") + raise Exception(f"获取 OAuth2 令牌失败: HTTP {resp.status}") + + async def discover(self, capability: str = "") -> List[AgentInfo]: + """发现 Agent""" + agents_file = self.config_dir / "agents.yaml" + if not agents_file.exists(): + return [] + + import yaml + with open(agents_file) as f: + config = yaml.safe_load(f) + + agents = [] + for agent in config.get("agents", []): + if agent.get("protocol") != "a2a": + continue + + if capability and capability.lower() not in agent.get("id", "").lower(): + continue + + agents.append(AgentInfo( + id=f"{agent['id']}@a2a", + protocol="a2a", + name=agent.get("name", agent["id"]), + endpoint=agent.get("endpoint", ""), + metadata=agent + )) + + return agents + + async def close(self, connection: Connection): + """关闭连接""" + pass + + async def get_methods(self, connection: Connection) -> List[dict]: + """获取 Agent 支持的方法(从 Agent Card 的 skills)""" + agent_card = connection.metadata.get("agent_card", {}) + skills = agent_card.get("skills", []) + + methods = [] + for skill in skills: + methods.append({ + "name": skill.get("id", skill.get("name", "unknown")), + "description": skill.get("description", ""), + 
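+                # Agent Card 中的 skill 可能缺少 schema 声明,此处以空对象兜底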
"inputSchema": skill.get("inputSchema", {}), + "outputSchema": skill.get("outputSchema", {}), + }) + + methods.extend([ + {"name": "tasks/send", "description": "发送任务消息"}, + {"name": "tasks/get", "description": "获取任务状态"}, + {"name": "tasks/cancel", "description": "取消任务"}, + ]) + + return methods + + def validate_config(self, agent_config: dict) -> bool: + """验证配置""" + return "endpoint" in agent_config diff --git a/uni-agent/adapters/agent_protocol.py b/uni-agent/adapters/agent_protocol.py new file mode 100644 index 0000000..6e26f78 --- /dev/null +++ b/uni-agent/adapters/agent_protocol.py @@ -0,0 +1,211 @@ +""" +Agent Protocol 适配器 +AI Engineer Foundation 提出的 Agent 统一 REST API + +参考: https://agentprotocol.ai +""" + +import json +import uuid +from pathlib import Path +from typing import Any, Dict, List, Optional + +import aiohttp + +from .base import ProtocolAdapter, Connection, AgentInfo + + +class AgentProtocolAdapter(ProtocolAdapter): + """Agent Protocol 适配器""" + + protocol_name = "agent_protocol" + + def __init__(self, config_dir: Optional[Path] = None): + self.config_dir = config_dir or Path(__file__).parent.parent / "config" + self._tasks: Dict[str, dict] = {} + + async def connect(self, agent_config: dict) -> Connection: + """建立连接""" + endpoint = agent_config.get("endpoint") + if not endpoint: + raise ValueError("Agent Protocol 配置必须包含 endpoint") + + endpoint = endpoint.rstrip("/") + if not endpoint.endswith("/ap/v1"): + endpoint = f"{endpoint}/ap/v1" + + return Connection( + agent_id=agent_config.get("id", ""), + protocol=self.protocol_name, + endpoint=endpoint, + session=None, + metadata=agent_config + ) + + async def call( + self, + connection: Connection, + method: str, + params: dict, + timeout: float = 30.0 + ) -> dict: + """调用 Agent Protocol API""" + endpoint = connection.endpoint + + headers = { + "Content-Type": "application/json", + } + + api_key = connection.metadata.get("api_key") + if api_key: + headers["Authorization"] = f"Bearer {api_key}" + + if method == "create_task": + async with aiohttp.ClientSession() as session: + async with session.post( + f"{endpoint}/agent/tasks", + json={"input": params.get("input", "")}, + headers=headers, + timeout=aiohttp.ClientTimeout(total=timeout) + ) as resp: + if resp.status in [200, 201]: + result = await resp.json() + task_id = result.get("task_id") + self._tasks[task_id] = result + return {"success": True, "result": result, "task_id": task_id} + else: + return {"success": False, "error": f"HTTP {resp.status}"} + + elif method == "execute_step": + task_id = params.get("task_id") + if not task_id: + return {"success": False, "error": "缺少 task_id"} + + async with aiohttp.ClientSession() as session: + async with session.post( + f"{endpoint}/agent/tasks/{task_id}/steps", + json={"input": params.get("input", "")}, + headers=headers, + timeout=aiohttp.ClientTimeout(total=timeout) + ) as resp: + if resp.status in [200, 201]: + result = await resp.json() + return {"success": True, "result": result} + else: + return {"success": False, "error": f"HTTP {resp.status}"} + + elif method == "get_task": + task_id = params.get("task_id") + if not task_id: + return {"success": False, "error": "缺少 task_id"} + + async with aiohttp.ClientSession() as session: + async with session.get( + f"{endpoint}/agent/tasks/{task_id}", + headers=headers, + timeout=aiohttp.ClientTimeout(total=timeout) + ) as resp: + if resp.status == 200: + result = await resp.json() + return {"success": True, "result": result} + else: + return {"success": False, "error": f"HTTP 
{resp.status}"} + + elif method == "list_tasks": + async with aiohttp.ClientSession() as session: + async with session.get( + f"{endpoint}/agent/tasks", + headers=headers, + timeout=aiohttp.ClientTimeout(total=timeout) + ) as resp: + if resp.status == 200: + result = await resp.json() + return {"success": True, "result": result} + else: + return {"success": False, "error": f"HTTP {resp.status}"} + + elif method == "get_artifacts": + task_id = params.get("task_id") + if not task_id: + return {"success": False, "error": "缺少 task_id"} + + async with aiohttp.ClientSession() as session: + async with session.get( + f"{endpoint}/agent/tasks/{task_id}/artifacts", + headers=headers, + timeout=aiohttp.ClientTimeout(total=timeout) + ) as resp: + if resp.status == 200: + result = await resp.json() + return {"success": True, "result": result} + else: + return {"success": False, "error": f"HTTP {resp.status}"} + + else: + return {"success": False, "error": f"未知方法: {method}"} + + async def discover(self, capability: str = "") -> List[AgentInfo]: + """发现 Agent""" + agents_file = self.config_dir / "agents.yaml" + if not agents_file.exists(): + return [] + + import yaml + with open(agents_file) as f: + config = yaml.safe_load(f) + + agents = [] + for agent in config.get("agents", []): + if agent.get("protocol") != "agent_protocol": + continue + + if capability and capability.lower() not in agent.get("id", "").lower(): + continue + + agents.append(AgentInfo( + id=f"{agent['id']}@agent_protocol", + protocol="agent_protocol", + name=agent.get("name", agent["id"]), + endpoint=agent.get("endpoint", ""), + metadata=agent + )) + + return agents + + async def close(self, connection: Connection): + """关闭连接""" + pass + + async def get_methods(self, connection: Connection) -> List[dict]: + """获取支持的方法""" + return [ + { + "name": "create_task", + "description": "创建新任务", + "inputSchema": {"input": "string"}, + }, + { + "name": "execute_step", + "description": "执行任务步骤", + "inputSchema": {"task_id": "string", "input": "string"}, + }, + { + "name": "get_task", + "description": "获取任务状态", + "inputSchema": {"task_id": "string"}, + }, + { + "name": "list_tasks", + "description": "列出所有任务", + "inputSchema": {}, + }, + { + "name": "get_artifacts", + "description": "获取任务产物", + "inputSchema": {"task_id": "string"}, + }, + ] + + def validate_config(self, agent_config: dict) -> bool: + """验证配置""" + return "endpoint" in agent_config diff --git a/uni-agent/adapters/aitp.py b/uni-agent/adapters/aitp.py new file mode 100644 index 0000000..af003fa --- /dev/null +++ b/uni-agent/adapters/aitp.py @@ -0,0 +1,217 @@ +""" +AITP (Agent Interaction & Transaction Protocol) 适配器 +NEAR 基金会提出的 Agent 交互与交易协议 + +参考: https://aitp.dev +""" + +import json +import uuid +from pathlib import Path +from typing import Any, Dict, List, Optional + +import aiohttp + +from .base import ProtocolAdapter, Connection, AgentInfo + + +class AITPAdapter(ProtocolAdapter): + """AITP 协议适配器""" + + protocol_name = "aitp" + + def __init__(self, config_dir: Optional[Path] = None): + self.config_dir = config_dir or Path(__file__).parent.parent / "config" + self._threads: Dict[str, dict] = {} + + async def connect(self, agent_config: dict) -> Connection: + """建立连接 - 创建 Thread""" + endpoint = agent_config.get("endpoint") + if not endpoint: + raise ValueError("AITP Agent 配置必须包含 endpoint") + + thread_id = str(uuid.uuid4()) + + self._threads[thread_id] = { + "id": thread_id, + "messages": [], + "status": "open", + } + + return Connection( + agent_id=agent_config.get("id", ""), + 
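+            # session 字段存放 thread_id,call() 据此将消息发往同一会话线程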
protocol=self.protocol_name, + endpoint=endpoint, + session=thread_id, + metadata={ + "thread_id": thread_id, + "wallet": agent_config.get("wallet", {}), + } + ) + + async def call( + self, + connection: Connection, + method: str, + params: dict, + timeout: float = 30.0 + ) -> dict: + """调用 AITP Agent""" + endpoint = connection.endpoint + thread_id = connection.session + wallet_config = connection.metadata.get("wallet", {}) + + headers = { + "Content-Type": "application/json", + } + + if method == "message": + payload = { + "thread_id": thread_id, + "message": { + "role": "user", + "content": params.get("content", ""), + "parts": params.get("parts", []), + } + } + elif method == "payment": + payload = { + "thread_id": thread_id, + "capability": "aitp-01", + "payment_request": { + "amount": params.get("amount"), + "currency": params.get("currency", "NEAR"), + "recipient": params.get("recipient"), + "memo": params.get("memo", ""), + } + } + + if wallet_config.get("type") == "near": + payload["wallet"] = { + "type": "near", + "account_id": wallet_config.get("account_id"), + } + elif method == "decision": + payload = { + "thread_id": thread_id, + "capability": "aitp-02", + "decision_request": { + "question": params.get("question"), + "options": params.get("options", []), + "allow_custom": params.get("allow_custom", False), + } + } + elif method == "data_request": + payload = { + "thread_id": thread_id, + "capability": "aitp-03", + "data_request": { + "schema": params.get("schema", {}), + "description": params.get("description", ""), + } + } + else: + payload = { + "thread_id": thread_id, + "method": method, + "params": params, + } + + async with aiohttp.ClientSession() as session: + async with session.post( + f"{endpoint}/threads/{thread_id}/messages", + json=payload, + headers=headers, + timeout=aiohttp.ClientTimeout(total=timeout) + ) as resp: + if resp.status == 200: + result = await resp.json() + + if thread_id in self._threads: + self._threads[thread_id]["messages"].append(payload) + self._threads[thread_id]["messages"].append(result) + + return { + "success": True, + "result": result, + "thread_id": thread_id, + } + else: + error_text = await resp.text() + return { + "success": False, + "error": f"HTTP {resp.status}: {error_text}", + } + + async def discover(self, capability: str = "") -> List[AgentInfo]: + """发现 Agent""" + agents_file = self.config_dir / "agents.yaml" + if not agents_file.exists(): + return [] + + import yaml + with open(agents_file) as f: + config = yaml.safe_load(f) + + agents = [] + for agent in config.get("agents", []): + if agent.get("protocol") != "aitp": + continue + + if capability and capability.lower() not in agent.get("id", "").lower(): + continue + + agents.append(AgentInfo( + id=f"{agent['id']}@aitp", + protocol="aitp", + name=agent.get("name", agent["id"]), + endpoint=agent.get("endpoint", ""), + metadata=agent + )) + + return agents + + async def close(self, connection: Connection): + """关闭连接 - 关闭 Thread""" + thread_id = connection.session + if thread_id in self._threads: + self._threads[thread_id]["status"] = "closed" + + async def get_methods(self, connection: Connection) -> List[dict]: + """获取支持的方法(AITP 能力)""" + return [ + { + "name": "message", + "description": "发送对话消息", + "inputSchema": {"content": "string"}, + }, + { + "name": "payment", + "description": "AITP-01: 发起支付请求", + "inputSchema": { + "amount": "number", + "currency": "string", + "recipient": "string", + }, + }, + { + "name": "decision", + "description": "AITP-02: 请求用户决策", + "inputSchema": 
{ + "question": "string", + "options": "array", + }, + }, + { + "name": "data_request", + "description": "AITP-03: 请求结构化数据", + "inputSchema": { + "schema": "object", + "description": "string", + }, + }, + ] + + def validate_config(self, agent_config: dict) -> bool: + """验证配置""" + return "endpoint" in agent_config diff --git a/uni-agent/adapters/anp.py b/uni-agent/adapters/anp.py new file mode 100644 index 0000000..2b46038 --- /dev/null +++ b/uni-agent/adapters/anp.py @@ -0,0 +1,191 @@ +""" +ANP (Agent Network Protocol) 适配器 +""" + +import json +from pathlib import Path +from typing import Any, Dict, List, Optional + +import aiohttp + +from .base import ProtocolAdapter, Connection, AgentInfo + +try: + from anp.anp_crawler import ANPCrawler + HAS_ANP = True +except ImportError: + HAS_ANP = False + + +class ANPAdapter(ProtocolAdapter): + """ANP 协议适配器""" + + protocol_name = "anp" + + def __init__(self, config_dir: Optional[Path] = None): + self.config_dir = config_dir or Path(__file__).parent.parent / "config" + self._crawler = None + self._ad_cache: Dict[str, dict] = {} + self._endpoint_cache: Dict[str, str] = {} + + def _get_crawler(self) -> "ANPCrawler": + """获取 ANP Crawler 实例""" + if not HAS_ANP: + raise ImportError("请安装 anp 库: pip install anp") + + if self._crawler is None: + did_path = self.config_dir / "did.json" + key_path = self.config_dir / "private-key.pem" + + if did_path.exists() and key_path.exists(): + self._crawler = ANPCrawler( + did_document_path=str(did_path), + private_key_path=str(key_path) + ) + else: + raise FileNotFoundError( + f"DID 配置文件不存在: {did_path} 或 {key_path}\n" + "请运行 setup.sh 生成本地身份" + ) + + return self._crawler + + async def _fetch_ad(self, ad_url: str) -> dict: + """获取 Agent Description 文档""" + if ad_url in self._ad_cache: + return self._ad_cache[ad_url] + + async with aiohttp.ClientSession() as session: + async with session.get(ad_url, timeout=aiohttp.ClientTimeout(total=15)) as resp: + if resp.status == 200: + ad = await resp.json() + self._ad_cache[ad_url] = ad + return ad + raise Exception(f"获取 AD 失败: HTTP {resp.status}") + + async def _get_endpoint(self, ad_url: str) -> str: + """从 AD 获取 RPC 端点""" + if ad_url in self._endpoint_cache: + return self._endpoint_cache[ad_url] + + ad = await self._fetch_ad(ad_url) + interfaces = ad.get("interfaces", []) + + if not interfaces: + raise ValueError(f"AD 中没有定义接口: {ad_url}") + + interface_url = interfaces[0].get("url") + if not interface_url: + raise ValueError(f"接口 URL 为空: {ad_url}") + + async with aiohttp.ClientSession() as session: + async with session.get(interface_url, timeout=aiohttp.ClientTimeout(total=15)) as resp: + if resp.status == 200: + interface_doc = await resp.json() + servers = interface_doc.get("servers", []) + if servers: + endpoint = servers[0].get("url") + self._endpoint_cache[ad_url] = endpoint + return endpoint + + raise ValueError(f"无法获取 RPC 端点: {ad_url}") + + async def connect(self, agent_config: dict) -> Connection: + """建立连接""" + ad_url = agent_config.get("ad_url") + if not ad_url: + raise ValueError("ANP Agent 配置必须包含 ad_url") + + ad = await self._fetch_ad(ad_url) + endpoint = await self._get_endpoint(ad_url) + + return Connection( + agent_id=agent_config.get("id", ""), + protocol=self.protocol_name, + endpoint=endpoint, + session=self._get_crawler(), + metadata={ + "ad_url": ad_url, + "ad": ad, + "name": ad.get("name", ""), + } + ) + + async def call( + self, + connection: Connection, + method: str, + params: dict, + timeout: float = 30.0 + ) -> dict: + """调用 Agent 方法""" + crawler = 
connection.session + endpoint = connection.endpoint + + result = await crawler.execute_json_rpc( + endpoint=endpoint, + method=method, + params=params + ) + + return result + + async def discover(self, capability: str = "") -> List[AgentInfo]: + """发现 Agent(从本地配置)""" + agents_file = self.config_dir / "agents.yaml" + if not agents_file.exists(): + return [] + + import yaml + with open(agents_file) as f: + config = yaml.safe_load(f) + + agents = [] + for agent in config.get("agents", []): + if agent.get("protocol") != "anp": + continue + + if capability and capability.lower() not in agent.get("id", "").lower(): + continue + + agents.append(AgentInfo( + id=f"{agent['id']}@anp", + protocol="anp", + name=agent.get("name", agent["id"]), + endpoint=agent.get("ad_url", ""), + metadata=agent + )) + + return agents + + async def close(self, connection: Connection): + """关闭连接""" + pass + + async def get_methods(self, connection: Connection) -> List[dict]: + """获取 Agent 支持的方法""" + ad_url = connection.metadata.get("ad_url") + if not ad_url: + return [] + + ad = await self._fetch_ad(ad_url) + interfaces = ad.get("interfaces", []) + + if not interfaces: + return [] + + interface_url = interfaces[0].get("url") + if not interface_url: + return [] + + async with aiohttp.ClientSession() as session: + async with session.get(interface_url, timeout=aiohttp.ClientTimeout(total=15)) as resp: + if resp.status == 200: + interface_doc = await resp.json() + return interface_doc.get("methods", []) + + return [] + + def validate_config(self, agent_config: dict) -> bool: + """验证配置""" + return "ad_url" in agent_config diff --git a/uni-agent/adapters/base.py b/uni-agent/adapters/base.py new file mode 100644 index 0000000..275959d --- /dev/null +++ b/uni-agent/adapters/base.py @@ -0,0 +1,120 @@ +""" +协议适配器基类 +""" + +from abc import ABC, abstractmethod +from dataclasses import dataclass, field +from typing import Any, Dict, List, Optional + + +@dataclass +class AgentInfo: + """Agent 信息""" + id: str + protocol: str + name: str = "" + description: str = "" + methods: List[str] = field(default_factory=list) + endpoint: str = "" + metadata: Dict[str, Any] = field(default_factory=dict) + + +@dataclass +class Connection: + """连接对象""" + agent_id: str + protocol: str + endpoint: str = "" + session: Any = None + metadata: Dict[str, Any] = field(default_factory=dict) + + def is_active(self) -> bool: + return self.session is not None + + +class ProtocolAdapter(ABC): + """协议适配器基类""" + + protocol_name: str = "base" + + @abstractmethod + async def connect(self, agent_config: dict) -> Connection: + """ + 建立与 Agent 的连接 + + Args: + agent_config: Agent 配置信息 + + Returns: + Connection 对象 + """ + pass + + @abstractmethod + async def call( + self, + connection: Connection, + method: str, + params: dict, + timeout: float = 30.0 + ) -> dict: + """ + 调用 Agent 方法 + + Args: + connection: 连接对象 + method: 方法名 + params: 参数 + timeout: 超时时间(秒) + + Returns: + 调用结果 + """ + pass + + @abstractmethod + async def discover(self, capability: str = "") -> List[AgentInfo]: + """ + 发现 Agent + + Args: + capability: 能力关键词(可选) + + Returns: + Agent 信息列表 + """ + pass + + @abstractmethod + async def close(self, connection: Connection): + """ + 关闭连接 + + Args: + connection: 连接对象 + """ + pass + + async def get_methods(self, connection: Connection) -> List[dict]: + """ + 获取 Agent 支持的方法列表 + + Args: + connection: 连接对象 + + Returns: + 方法列表 + """ + return [] + + def validate_config(self, agent_config: dict) -> bool: + """ + 验证 Agent 配置 + + Args: + agent_config: Agent 配置 + + 
Returns: + 是否有效 + """ + return True diff --git a/uni-agent/adapters/lmos.py b/uni-agent/adapters/lmos.py new file mode 100644 index 0000000..b272325 --- /dev/null +++ b/uni-agent/adapters/lmos.py @@ -0,0 +1,215 @@ +""" +LMOS (Language Model Operating System) 适配器 +Eclipse 基金会孵化的企业级多 Agent 平台 + +参考: https://eclipse.dev/lmos/ +""" + +import json +import uuid +from pathlib import Path +from typing import Any, Dict, List, Optional + +import aiohttp + +from .base import ProtocolAdapter, Connection, AgentInfo + + +class LMOSAdapter(ProtocolAdapter): + """LMOS 协议适配器""" + + protocol_name = "lmos" + + def __init__(self, config_dir: Optional[Path] = None): + self.config_dir = config_dir or Path(__file__).parent.parent / "config" + self._registry_cache: Dict[str, List[dict]] = {} + + async def _discover_via_mdns(self) -> List[dict]: + """通过 mDNS 发现本地 Agent(简化实现)""" + return [] + + async def _query_registry(self, registry_url: str, capability: str = "") -> List[dict]: + """查询 Agent 注册中心""" + if registry_url in self._registry_cache: + return self._registry_cache[registry_url] + + async with aiohttp.ClientSession() as session: + params = {} + if capability: + params["capability"] = capability + + async with session.get( + f"{registry_url}/agents", + params=params, + timeout=aiohttp.ClientTimeout(total=15) + ) as resp: + if resp.status == 200: + result = await resp.json() + agents = result.get("agents", []) + self._registry_cache[registry_url] = agents + return agents + return [] + + async def connect(self, agent_config: dict) -> Connection: + """建立连接""" + endpoint = agent_config.get("endpoint") + registry_url = agent_config.get("registry_url") + + if not endpoint and not registry_url: + raise ValueError("LMOS Agent 配置必须包含 endpoint 或 registry_url") + + if registry_url and not endpoint: + agent_id = agent_config.get("id") + agents = await self._query_registry(registry_url) + for agent in agents: + if agent.get("id") == agent_id: + endpoint = agent.get("endpoint") + break + + if not endpoint: + raise ValueError(f"在注册中心未找到 Agent: {agent_id}") + + return Connection( + agent_id=agent_config.get("id", ""), + protocol=self.protocol_name, + endpoint=endpoint, + session=None, + metadata={ + "registry_url": registry_url, + "group": agent_config.get("group"), + } + ) + + async def call( + self, + connection: Connection, + method: str, + params: dict, + timeout: float = 30.0 + ) -> dict: + """调用 LMOS Agent""" + endpoint = connection.endpoint + + headers = { + "Content-Type": "application/json", + } + + if method == "invoke": + payload = { + "capability": params.get("capability"), + "input": params.get("input", {}), + "context": params.get("context", {}), + } + elif method == "route": + payload = { + "query": params.get("query"), + "context": params.get("context", {}), + } + elif method == "describe": + async with aiohttp.ClientSession() as session: + async with session.get( + f"{endpoint}/capabilities", + headers=headers, + timeout=aiohttp.ClientTimeout(total=timeout) + ) as resp: + if resp.status == 200: + result = await resp.json() + return {"success": True, "result": result} + else: + return {"success": False, "error": f"HTTP {resp.status}"} + else: + payload = { + "method": method, + "params": params, + } + + async with aiohttp.ClientSession() as session: + async with session.post( + f"{endpoint}/invoke", + json=payload, + headers=headers, + timeout=aiohttp.ClientTimeout(total=timeout) + ) as resp: + if resp.status == 200: + result = await resp.json() + return {"success": True, "result": result} + else: + 
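+                    # 非 200 响应:读取响应体原文并随错误返回,便于定位服务端问题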
error_text = await resp.text() + return {"success": False, "error": f"HTTP {resp.status}: {error_text}"} + + async def discover(self, capability: str = "") -> List[AgentInfo]: + """发现 Agent""" + agents_file = self.config_dir / "agents.yaml" + if not agents_file.exists(): + return [] + + import yaml + with open(agents_file) as f: + config = yaml.safe_load(f) + + all_agents = [] + + for agent in config.get("agents", []): + if agent.get("protocol") != "lmos": + continue + + if capability and capability.lower() not in agent.get("id", "").lower(): + continue + + all_agents.append(AgentInfo( + id=f"{agent['id']}@lmos", + protocol="lmos", + name=agent.get("name", agent["id"]), + endpoint=agent.get("endpoint", ""), + metadata=agent + )) + + for agent in config.get("agents", []): + if agent.get("protocol") != "lmos": + continue + + registry_url = agent.get("registry_url") + if registry_url: + try: + remote_agents = await self._query_registry(registry_url, capability) + for ra in remote_agents: + all_agents.append(AgentInfo( + id=f"{ra['id']}@lmos", + protocol="lmos", + name=ra.get("name", ra["id"]), + endpoint=ra.get("endpoint", ""), + metadata=ra + )) + except Exception: + pass + + return all_agents + + async def close(self, connection: Connection): + """关闭连接""" + pass + + async def get_methods(self, connection: Connection) -> List[dict]: + """获取支持的方法""" + result = await self.call(connection, "describe", {}) + + if result.get("success"): + capabilities = result.get("result", {}).get("capabilities", []) + return [ + { + "name": cap.get("id", cap.get("name")), + "description": cap.get("description", ""), + "inputSchema": cap.get("inputSchema", {}), + } + for cap in capabilities + ] + + return [ + {"name": "invoke", "description": "调用 Agent 能力"}, + {"name": "route", "description": "智能路由到最佳 Agent"}, + {"name": "describe", "description": "获取 Agent 能力描述"}, + ] + + def validate_config(self, agent_config: dict) -> bool: + """验证配置""" + return "endpoint" in agent_config or "registry_url" in agent_config diff --git a/uni-agent/adapters/mcp.py b/uni-agent/adapters/mcp.py new file mode 100644 index 0000000..d60414f --- /dev/null +++ b/uni-agent/adapters/mcp.py @@ -0,0 +1,159 @@ +""" +MCP (Model Context Protocol) 适配器 +""" + +import asyncio +import json +import os +import subprocess +from pathlib import Path +from typing import Any, Dict, List, Optional + +from .base import ProtocolAdapter, Connection, AgentInfo + +try: + from mcp import ClientSession, StdioServerParameters + from mcp.client.stdio import stdio_client + HAS_MCP = True +except ImportError: + HAS_MCP = False + + +class MCPAdapter(ProtocolAdapter): + """MCP 协议适配器""" + + protocol_name = "mcp" + + def __init__(self, config_dir: Optional[Path] = None): + self.config_dir = config_dir or Path(__file__).parent.parent / "config" + self._sessions: Dict[str, Any] = {} + + async def connect(self, agent_config: dict) -> Connection: + """建立连接""" + if not HAS_MCP: + raise ImportError("请安装 mcp 库: pip install mcp") + + command = agent_config.get("command") + args = agent_config.get("args", []) + env = agent_config.get("env", {}) + + if not command: + raise ValueError("MCP Agent 配置必须包含 command") + + full_env = os.environ.copy() + for k, v in env.items(): + if v.startswith("${") and v.endswith("}"): + env_var = v[2:-1] + full_env[k] = os.environ.get(env_var, "") + else: + full_env[k] = v + + server_params = StdioServerParameters( + command=command, + args=args, + env=full_env + ) + + read, write = await stdio_client(server_params).__aenter__() + session = 
ClientSession(read, write) + await session.__aenter__() + await session.initialize() + + agent_id = agent_config.get("id", "") + self._sessions[agent_id] = { + "session": session, + "read": read, + "write": write, + } + + return Connection( + agent_id=agent_id, + protocol=self.protocol_name, + endpoint=f"{command} {' '.join(args)}", + session=session, + metadata=agent_config + ) + + async def call( + self, + connection: Connection, + method: str, + params: dict, + timeout: float = 30.0 + ) -> dict: + """调用 MCP 工具""" + session: ClientSession = connection.session + + result = await asyncio.wait_for( + session.call_tool(method, params), + timeout=timeout + ) + + if hasattr(result, "content"): + content = result.content + if isinstance(content, list) and len(content) > 0: + first = content[0] + if hasattr(first, "text"): + return {"success": True, "result": first.text} + return {"success": True, "result": str(first)} + return {"success": True, "result": content} + + return {"success": True, "result": result} + + async def discover(self, capability: str = "") -> List[AgentInfo]: + """发现 Agent(从本地配置)""" + agents_file = self.config_dir / "agents.yaml" + if not agents_file.exists(): + return [] + + import yaml + with open(agents_file) as f: + config = yaml.safe_load(f) + + agents = [] + for agent in config.get("agents", []): + if agent.get("protocol") != "mcp": + continue + + if capability and capability.lower() not in agent.get("id", "").lower(): + continue + + agents.append(AgentInfo( + id=f"{agent['id']}@mcp", + protocol="mcp", + name=agent.get("name", agent["id"]), + endpoint=f"{agent.get('command', '')} {' '.join(agent.get('args', []))}", + metadata=agent + )) + + return agents + + async def close(self, connection: Connection): + """关闭连接""" + agent_id = connection.agent_id + if agent_id in self._sessions: + session_info = self._sessions.pop(agent_id) + session = session_info.get("session") + if session: + await session.__aexit__(None, None, None) + + async def get_methods(self, connection: Connection) -> List[dict]: + """获取 MCP Server 支持的工具""" + session: ClientSession = connection.session + + result = await session.list_tools() + + tools = [] + if hasattr(result, "tools"): + for tool in result.tools: + tools.append({ + "name": tool.name, + "description": getattr(tool, "description", ""), + "inputSchema": getattr(tool, "inputSchema", {}), + }) + + return tools + + def validate_config(self, agent_config: dict) -> bool: + """验证配置""" + return "command" in agent_config diff --git a/uni-agent/config/agents.yaml b/uni-agent/config/agents.yaml new file mode 100644 index 0000000..24b3096 --- /dev/null +++ b/uni-agent/config/agents.yaml @@ -0,0 +1,121 @@ +# UniAgent 配置文件 +# Agent 注册表 + +agents: + # ==================== ANP Agents ==================== + - id: amap + protocol: anp + name: 高德地图 + ad_url: https://agent-connect.ai/mcp/agents/amap/ad.json + description: 地点搜索、路线规划、天气查询、周边搜索 + + - id: kuaidi + protocol: anp + name: 快递查询 + ad_url: https://agent-connect.ai/mcp/agents/kuaidi/ad.json + description: 快递单号追踪 + + - id: hotel + protocol: anp + name: 酒店预订 + ad_url: https://agent-connect.ai/agents/hotel-assistant/ad.json + description: 搜索酒店、查询房价 + + - id: juhe + protocol: anp + name: 聚合查询 + ad_url: https://agent-connect.ai/mcp/agents/juhe/ad.json + description: 多种生活服务查询 + + - id: navigation + protocol: anp + name: Agent导航 + ad_url: https://agent-search.ai/agents/navigation/ad.json + description: 发现更多 ANP Agent + + # ==================== MCP Servers ==================== + # - id: filesystem + # protocol: 
mcp + # name: 文件系统 + # command: npx + # args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"] + # description: 文件读写操作 + + # - id: github + # protocol: mcp + # name: GitHub + # command: npx + # args: ["-y", "@modelcontextprotocol/server-github"] + # env: + # GITHUB_TOKEN: "${GITHUB_TOKEN}" + # description: GitHub 仓库操作 + + # - id: sqlite + # protocol: mcp + # name: SQLite + # command: npx + # args: ["-y", "@modelcontextprotocol/server-sqlite", "/tmp/test.db"] + # description: SQLite 数据库操作 + + # ==================== A2A Agents ==================== + # Google Agent-to-Agent 协议 + # - id: gemini_assistant + # protocol: a2a + # name: Gemini Assistant + # endpoint: https://example.com/.well-known/agent.json + # auth: + # type: api_key + # api_key: "${A2A_API_KEY}" + + # - id: vertexai_agent + # protocol: a2a + # name: VertexAI Agent + # endpoint: https://your-project.cloudfunctions.net/agent + # auth: + # type: oauth2 + # token_url: https://oauth2.googleapis.com/token + # client_id: "${GOOGLE_CLIENT_ID}" + # client_secret: "${GOOGLE_CLIENT_SECRET}" + + # ==================== AITP Agents ==================== + # NEAR Agent Interaction & Transaction Protocol + # - id: near_shop + # protocol: aitp + # name: NEAR Shop + # endpoint: https://example.near.ai/api + # wallet: + # type: near + # account_id: "${NEAR_ACCOUNT_ID}" + + # - id: payment_agent + # protocol: aitp + # name: Payment Agent + # endpoint: https://pay.example.com/aitp + # description: 支持 NEAR/ETH 支付的 Agent + + # ==================== Agent Protocol ==================== + # AI Engineer Foundation REST API 标准 + # - id: autogpt + # protocol: agent_protocol + # name: AutoGPT + # endpoint: http://localhost:8000 + # api_key: "${AUTOGPT_API_KEY}" + + # - id: smol_developer + # protocol: ap + # name: Smol Developer + # endpoint: http://localhost:8080 + + # ==================== LMOS Agents ==================== + # Eclipse 企业级 Agent 平台 + # - id: customer_service + # protocol: lmos + # name: 客服 Agent + # registry_url: http://lmos-registry.internal:8080 + # group: customer-agents + + # - id: sales_agent + # protocol: lmos + # name: 销售 Agent + # endpoint: http://sales-agent.internal:8080 + # description: 处理销售咨询 diff --git a/uni-agent/requirements.txt b/uni-agent/requirements.txt new file mode 100644 index 0000000..6377b68 --- /dev/null +++ b/uni-agent/requirements.txt @@ -0,0 +1,14 @@ +# UniAgent 依赖 + +# 核心 +pyyaml>=6.0 +aiohttp>=3.8.0 + +# ANP 协议 +anp>=0.1.0 + +# MCP 协议 (可选) +# mcp>=0.1.0 + +# A2A 协议 (待实现) +# google-a2a>=0.1.0 diff --git a/uni-agent/scripts/test_adapters.py b/uni-agent/scripts/test_adapters.py new file mode 100644 index 0000000..63bfbae --- /dev/null +++ b/uni-agent/scripts/test_adapters.py @@ -0,0 +1,282 @@ +#!/usr/bin/env python3 +""" +UniAgent 适配器测试脚本 +测试所有协议适配器的基本功能 +""" + +import asyncio +import json +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).parent.parent)) + +from adapters import get_adapter, ADAPTERS, list_protocols + + +class TestResult: + def __init__(self, protocol: str): + self.protocol = protocol + self.passed = 0 + self.failed = 0 + self.skipped = 0 + self.errors = [] + + def pass_(self, msg: str): + self.passed += 1 + print(f" ✅ {msg}") + + def fail(self, msg: str, error: str = ""): + self.failed += 1 + self.errors.append(f"{msg}: {error}") + print(f" ❌ {msg}: {error[:100]}") + + def skip(self, msg: str): + self.skipped += 1 + print(f" ⏭️ {msg} (跳过)") + + def summary(self) -> str: + status = "✅" if self.failed == 0 else "❌" + return f"{status} {self.protocol}: {self.passed} 
passed, {self.failed} failed, {self.skipped} skipped" + + +async def test_anp() -> TestResult: + """测试 ANP 适配器""" + result = TestResult("ANP") + print("\n[ANP] 测试 Agent Network Protocol...\n") + + try: + adapter = get_adapter("anp") + result.pass_("获取适配器") + except Exception as e: + result.fail("获取适配器", str(e)) + return result + + agent_config = { + "id": "amap", + "protocol": "anp", + "ad_url": "https://agent-connect.ai/mcp/agents/amap/ad.json" + } + + try: + connection = await adapter.connect(agent_config) + result.pass_(f"建立连接: {connection.endpoint[:50]}...") + except Exception as e: + result.fail("建立连接", str(e)) + return result + + try: + methods = await adapter.get_methods(connection) + result.pass_(f"获取方法列表: {len(methods)} 个方法") + except Exception as e: + result.fail("获取方法列表", str(e)) + + try: + res = await adapter.call(connection, "maps_weather", {"city": "北京"}) + if res.get("success") or res.get("result"): + city = res.get("result", {}).get("city", "") + result.pass_(f"调用 maps_weather: {city}") + else: + result.fail("调用 maps_weather", str(res)) + except Exception as e: + result.fail("调用 maps_weather", str(e)) + + try: + res = await adapter.call(connection, "maps_text_search", {"keywords": "咖啡厅", "city": "上海"}) + if res.get("success") or res.get("result"): + pois = res.get("result", {}).get("pois", []) + result.pass_(f"调用 maps_text_search: 找到 {len(pois)} 个结果") + else: + result.fail("调用 maps_text_search", str(res)) + except Exception as e: + result.fail("调用 maps_text_search", str(e)) + + try: + agents = await adapter.discover() + result.pass_(f"发现 Agent: {len(agents)} 个") + except Exception as e: + result.fail("发现 Agent", str(e)) + + try: + await adapter.close(connection) + result.pass_("关闭连接") + except Exception as e: + result.fail("关闭连接", str(e)) + + return result + + +async def test_mcp() -> TestResult: + """测试 MCP 适配器""" + result = TestResult("MCP") + print("\n[MCP] 测试 Model Context Protocol...\n") + + try: + adapter = get_adapter("mcp") + result.pass_("获取适配器") + except Exception as e: + result.fail("获取适配器", str(e)) + return result + + result.skip("MCP 需要本地 npx 环境,跳过实际连接测试") + result.skip("如需测试,请配置 config/agents.yaml 中的 MCP Server") + + return result + + +async def test_a2a() -> TestResult: + """测试 A2A 适配器""" + result = TestResult("A2A") + print("\n[A2A] 测试 Agent-to-Agent Protocol...\n") + + try: + adapter = get_adapter("a2a") + result.pass_("获取适配器") + except Exception as e: + result.fail("获取适配器", str(e)) + return result + + result.skip("A2A 需要配置 Agent endpoint,跳过实际连接测试") + result.skip("如需测试,请配置 config/agents.yaml 中的 A2A Agent") + + try: + agents = await adapter.discover() + result.pass_(f"发现 Agent: {len(agents)} 个 (本地配置)") + except Exception as e: + result.fail("发现 Agent", str(e)) + + return result + + +async def test_aitp() -> TestResult: + """测试 AITP 适配器""" + result = TestResult("AITP") + print("\n[AITP] 测试 Agent Interaction & Transaction Protocol...\n") + + try: + adapter = get_adapter("aitp") + result.pass_("获取适配器") + except Exception as e: + result.fail("获取适配器", str(e)) + return result + + result.skip("AITP 需要配置 NEAR 钱包和 endpoint,跳过实际连接测试") + result.skip("如需测试,请配置 config/agents.yaml 中的 AITP Agent") + + try: + agents = await adapter.discover() + result.pass_(f"发现 Agent: {len(agents)} 个 (本地配置)") + except Exception as e: + result.fail("发现 Agent", str(e)) + + try: + methods = [ + {"name": "message", "desc": "发送消息"}, + {"name": "payment", "desc": "发起支付"}, + {"name": "decision", "desc": "请求决策"}, + ] + result.pass_(f"支持方法: {', '.join([m['name'] for m in methods])}") + except 
Exception as e: + result.fail("检查方法", str(e)) + + return result + + +async def test_agent_protocol() -> TestResult: + """测试 Agent Protocol 适配器""" + result = TestResult("Agent Protocol") + print("\n[AP] 测试 Agent Protocol...\n") + + try: + adapter = get_adapter("agent_protocol") + result.pass_("获取适配器 (agent_protocol)") + except Exception as e: + result.fail("获取适配器", str(e)) + return result + + try: + adapter2 = get_adapter("ap") + result.pass_("获取适配器 (别名 ap)") + except Exception as e: + result.fail("获取适配器别名", str(e)) + + result.skip("Agent Protocol 需要运行中的 Agent 服务,跳过实际连接测试") + result.skip("如需测试,请启动 AutoGPT 或其他兼容服务") + + try: + agents = await adapter.discover() + result.pass_(f"发现 Agent: {len(agents)} 个 (本地配置)") + except Exception as e: + result.fail("发现 Agent", str(e)) + + return result + + +async def test_lmos() -> TestResult: + """测试 LMOS 适配器""" + result = TestResult("LMOS") + print("\n[LMOS] 测试 Language Model Operating System...\n") + + try: + adapter = get_adapter("lmos") + result.pass_("获取适配器") + except Exception as e: + result.fail("获取适配器", str(e)) + return result + + result.skip("LMOS 需要配置注册中心或 Agent endpoint,跳过实际连接测试") + result.skip("如需测试,请配置 config/agents.yaml 中的 LMOS Agent") + + try: + agents = await adapter.discover() + result.pass_(f"发现 Agent: {len(agents)} 个 (本地配置)") + except Exception as e: + result.fail("发现 Agent", str(e)) + + return result + + +async def main(): + print("=" * 60) + print(" UniAgent 适配器测试") + print("=" * 60) + + print(f"\n支持的协议: {list_protocols()}\n") + + results = [] + + results.append(await test_anp()) + results.append(await test_mcp()) + results.append(await test_a2a()) + results.append(await test_aitp()) + results.append(await test_agent_protocol()) + results.append(await test_lmos()) + + print("\n" + "=" * 60) + print(" 测试汇总") + print("=" * 60 + "\n") + + total_passed = 0 + total_failed = 0 + total_skipped = 0 + + for r in results: + print(r.summary()) + total_passed += r.passed + total_failed += r.failed + total_skipped += r.skipped + + print(f"\n总计: {total_passed} passed, {total_failed} failed, {total_skipped} skipped") + + if total_failed > 0: + print("\n失败详情:") + for r in results: + for err in r.errors: + print(f" - [{r.protocol}] {err}") + sys.exit(1) + else: + print("\n🎉 所有测试通过!") + + +if __name__ == "__main__": + asyncio.run(main()) diff --git a/uni-agent/scripts/test_all.py b/uni-agent/scripts/test_all.py new file mode 100644 index 0000000..d5a75f5 --- /dev/null +++ b/uni-agent/scripts/test_all.py @@ -0,0 +1,368 @@ +#!/usr/bin/env python3 +""" +UniAgent 完整测试脚本 +启动测试服务器,测试所有协议的真实交互 +""" + +import asyncio +import json +import subprocess +import sys +import time +import signal +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).parent.parent)) + +from adapters import get_adapter + + +SERVERS = {} + + +def start_server(name: str, script: str, port: int) -> subprocess.Popen: + """启动测试服务器""" + script_path = Path(__file__).parent.parent / "test_servers" / script + proc = subprocess.Popen( + [sys.executable, str(script_path)], + stdout=subprocess.PIPE, + stderr=subprocess.PIPE, + ) + time.sleep(0.5) + if proc.poll() is not None: + stderr = proc.stderr.read().decode() + print(f" ❌ {name} 启动失败: {stderr}") + return None + print(f" ✅ {name} 启动成功 (port {port})") + return proc + + +def stop_servers(): + """停止所有服务器""" + for name, proc in SERVERS.items(): + if proc and proc.poll() is None: + proc.terminate() + proc.wait(timeout=2) + + +async def test_anp(): + """测试 ANP 适配器""" + print("\n" + "=" * 50) + print("[ANP] Agent Network Protocol") + 
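The `start_server` helper above waits a fixed 0.5 s before checking `proc.poll()`, which can flake on slow machines. A minimal sketch of a port-based readiness probe instead, assuming the test servers bind `localhost` (the helper name `wait_for_port` is illustrative, not part of the repo):

```python
import socket
import time

def wait_for_port(port: int, host: str = "localhost", timeout: float = 5.0) -> bool:
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=0.2):
                return True
        except OSError:
            time.sleep(0.1)  # server not listening yet; retry shortly
    return False
```

Calling `wait_for_port(8100)` after `Popen` would replace the fixed sleep with an actual liveness check.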
print("=" * 50) + + adapter = get_adapter("anp") + config = { + "id": "amap", + "protocol": "anp", + "ad_url": "https://agent-connect.ai/mcp/agents/amap/ad.json" + } + + try: + conn = await adapter.connect(config) + print(f"✅ 连接成功: {conn.endpoint[:50]}...") + + result = await adapter.call(conn, "maps_weather", {"city": "北京"}) + city = result.get("result", {}).get("city", "") + print(f"✅ maps_weather: {city}") + + result = await adapter.call(conn, "maps_text_search", {"keywords": "咖啡厅", "city": "上海"}) + pois = result.get("result", {}).get("pois", []) + print(f"✅ maps_text_search: 找到 {len(pois)} 个结果") + + await adapter.close(conn) + print(f"✅ 关闭连接") + return True + except Exception as e: + print(f"❌ 测试失败: {e}") + return False + + +async def test_a2a(): + """测试 A2A 适配器""" + print("\n" + "=" * 50) + print("[A2A] Agent-to-Agent Protocol") + print("=" * 50) + + adapter = get_adapter("a2a") + config = { + "id": "test_agent", + "protocol": "a2a", + "endpoint": "http://localhost:8100" + } + + try: + conn = await adapter.connect(config) + print(f"✅ 连接成功") + + methods = await adapter.get_methods(conn) + print(f"✅ 获取方法: {len(methods)} 个 (包含 skills)") + + result = await adapter.call(conn, "tasks/send", { + "message": { + "role": "user", + "parts": [{"type": "text", "text": "Hello A2A!"}] + } + }) + if result.get("success"): + task = result.get("result", {}) + history = task.get("history", []) + if len(history) >= 2: + response = history[-1].get("parts", [{}])[0].get("text", "") + print(f"✅ tasks/send: {response}") + else: + print(f"✅ tasks/send: 任务已创建") + else: + print(f"❌ tasks/send 失败: {result}") + return False + + await adapter.close(conn) + print(f"✅ 关闭连接") + return True + except Exception as e: + print(f"❌ 测试失败: {e}") + return False + + +async def test_aitp(): + """测试 AITP 适配器""" + print("\n" + "=" * 50) + print("[AITP] Agent Interaction & Transaction Protocol") + print("=" * 50) + + adapter = get_adapter("aitp") + config = { + "id": "test_shop", + "protocol": "aitp", + "endpoint": "http://localhost:8101" + } + + try: + conn = await adapter.connect(config) + print(f"✅ 连接成功 (Thread: {conn.session[:8]}...)") + + result = await adapter.call(conn, "message", {"content": "Hello AITP!"}) + if result.get("success"): + response = result.get("result", {}).get("content", "") + print(f"✅ message: {response}") + else: + print(f"❌ message 失败") + return False + + result = await adapter.call(conn, "payment", { + "amount": 10, + "currency": "NEAR", + "recipient": "shop.near" + }) + if result.get("success"): + payment = result.get("result", {}).get("payment_response", {}) + status = payment.get("status", "") + tx_id = payment.get("transaction_id", "")[:8] + print(f"✅ payment: {status} (tx: {tx_id}...)") + else: + print(f"❌ payment 失败") + return False + + result = await adapter.call(conn, "decision", { + "question": "选择颜色", + "options": ["红色", "蓝色", "绿色"] + }) + if result.get("success"): + decision = result.get("result", {}).get("decision_response", {}) + selected = decision.get("selected", "") + print(f"✅ decision: 选择了 {selected}") + else: + print(f"❌ decision 失败") + return False + + await adapter.close(conn) + print(f"✅ 关闭连接") + return True + except Exception as e: + print(f"❌ 测试失败: {e}") + return False + + +async def test_agent_protocol(): + """测试 Agent Protocol 适配器""" + print("\n" + "=" * 50) + print("[AP] Agent Protocol") + print("=" * 50) + + adapter = get_adapter("agent_protocol") + config = { + "id": "test_agent", + "protocol": "agent_protocol", + "endpoint": "http://localhost:8102" + } + + try: + conn = 
await adapter.connect(config) + print(f"✅ 连接成功") + + result = await adapter.call(conn, "create_task", {"input": "Hello Agent Protocol!"}) + if result.get("success"): + task_id = result.get("task_id", "") + print(f"✅ create_task: {task_id[:8]}...") + else: + print(f"❌ create_task 失败") + return False + + result = await adapter.call(conn, "execute_step", { + "task_id": task_id, + "input": "Process this" + }) + if result.get("success"): + step = result.get("result", {}) + output = step.get("output", "") + print(f"✅ execute_step: {output}") + else: + print(f"❌ execute_step 失败") + return False + + result = await adapter.call(conn, "get_task", {"task_id": task_id}) + if result.get("success"): + task = result.get("result", {}) + status = task.get("status", "") + print(f"✅ get_task: status={status}") + else: + print(f"❌ get_task 失败") + return False + + result = await adapter.call(conn, "get_artifacts", {"task_id": task_id}) + if result.get("success"): + artifacts = result.get("result", {}).get("artifacts", []) + print(f"✅ get_artifacts: {len(artifacts)} 个产物") + else: + print(f"❌ get_artifacts 失败") + return False + + await adapter.close(conn) + print(f"✅ 关闭连接") + return True + except Exception as e: + print(f"❌ 测试失败: {e}") + return False + + +async def test_lmos(): + """测试 LMOS 适配器""" + print("\n" + "=" * 50) + print("[LMOS] Language Model Operating System") + print("=" * 50) + + adapter = get_adapter("lmos") + config = { + "id": "calculator", + "protocol": "lmos", + "endpoint": "http://localhost:8103/agents/calculator" + } + + try: + conn = await adapter.connect(config) + print(f"✅ 连接成功") + + result = await adapter.call(conn, "invoke", { + "capability": "add", + "input": {"a": 10, "b": 20} + }) + if result.get("success"): + output = result.get("result", {}).get("output", {}) + calc_result = output.get("result", "") + print(f"✅ invoke add(10, 20): {calc_result}") + else: + print(f"❌ invoke add 失败") + return False + + result = await adapter.call(conn, "invoke", { + "capability": "multiply", + "input": {"a": 6, "b": 7} + }) + if result.get("success"): + output = result.get("result", {}).get("output", {}) + calc_result = output.get("result", "") + print(f"✅ invoke multiply(6, 7): {calc_result}") + else: + print(f"❌ invoke multiply 失败") + return False + + greeter_config = { + "id": "greeter", + "protocol": "lmos", + "endpoint": "http://localhost:8103/agents/greeter" + } + conn2 = await adapter.connect(greeter_config) + result = await adapter.call(conn2, "invoke", { + "capability": "greet", + "input": {"name": "test_user"} + }) + if result.get("success"): + output = result.get("result", {}).get("output", {}) + greeting = output.get("greeting", "") + print(f"✅ invoke greet: {greeting}") + else: + print(f"❌ invoke greet 失败") + return False + + await adapter.close(conn) + await adapter.close(conn2) + print(f"✅ 关闭连接") + return True + except Exception as e: + print(f"❌ 测试失败: {e}") + return False + + +async def main(): + print("=" * 60) + print(" UniAgent 完整交互测试") + print("=" * 60) + + print("\n[1] 启动测试服务器...") + SERVERS["A2A"] = start_server("A2A Server", "a2a_server.py", 8100) + SERVERS["AITP"] = start_server("AITP Server", "aitp_server.py", 8101) + SERVERS["AP"] = start_server("Agent Protocol Server", "agent_protocol_server.py", 8102) + SERVERS["LMOS"] = start_server("LMOS Server", "lmos_server.py", 8103) + + time.sleep(1) + + print("\n[2] 开始测试...") + + results = {} + + try: + results["ANP"] = await test_anp() + results["A2A"] = await test_a2a() + results["AITP"] = await test_aitp() + results["Agent 
Protocol"] = await test_agent_protocol() + results["LMOS"] = await test_lmos() + finally: + print("\n[3] 停止测试服务器...") + stop_servers() + print(" ✅ 所有服务器已停止") + + print("\n" + "=" * 60) + print(" 测试汇总") + print("=" * 60) + + all_passed = True + for name, passed in results.items(): + status = "✅" if passed else "❌" + print(f" {status} {name}") + if not passed: + all_passed = False + + print() + if all_passed: + print("🎉 所有协议测试通过!") + else: + print("⚠️ 部分测试失败") + sys.exit(1) + + +if __name__ == "__main__": + try: + asyncio.run(main()) + except KeyboardInterrupt: + stop_servers() + print("\n测试中断") diff --git a/uni-agent/scripts/uni_cli.py b/uni-agent/scripts/uni_cli.py new file mode 100644 index 0000000..b577f01 --- /dev/null +++ b/uni-agent/scripts/uni_cli.py @@ -0,0 +1,257 @@ +#!/usr/bin/env python3 +""" +UniAgent CLI - 统一智能体协议调用工具 + +用法: + # 调用 Agent + python uni_cli.py call amap@anp maps_weather '{"city":"北京"}' + python uni_cli.py call filesystem@mcp read_file '{"path":"/tmp/a.txt"}' + + # 发现 Agent + python uni_cli.py discover weather + + # 列出已注册 Agent + python uni_cli.py list + + # 查看 Agent 方法 + python uni_cli.py methods amap@anp +""" + +import asyncio +import json +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).parent.parent)) + +import yaml +from adapters import get_adapter, ADAPTERS + + +CONFIG_DIR = Path(__file__).parent.parent / "config" + + +def load_agents_config(): + """加载 Agent 配置""" + agents_file = CONFIG_DIR / "agents.yaml" + if not agents_file.exists(): + return {"agents": []} + + with open(agents_file) as f: + return yaml.safe_load(f) or {"agents": []} + + +def parse_agent_id(agent_id: str) -> tuple: + """解析 Agent ID,返回 (name, protocol)""" + if "@" not in agent_id: + return agent_id, "anp" + + parts = agent_id.rsplit("@", 1) + return parts[0], parts[1] + + +def get_agent_config(agent_name: str, protocol: str) -> dict: + """获取 Agent 配置""" + config = load_agents_config() + + for agent in config.get("agents", []): + if agent.get("id") == agent_name and agent.get("protocol") == protocol: + return agent + + raise ValueError(f"未找到 Agent: {agent_name}@{protocol}") + + +async def call_agent(agent_id: str, method: str, params: dict): + """调用 Agent""" + agent_name, protocol = parse_agent_id(agent_id) + + print(f"协议: {protocol}") + print(f"Agent: {agent_name}") + print(f"方法: {method}") + print(f"参数: {json.dumps(params, ensure_ascii=False)}") + print() + + agent_config = get_agent_config(agent_name, protocol) + adapter = get_adapter(protocol) + + connection = await adapter.connect(agent_config) + + try: + result = await adapter.call(connection, method, params) + print("=== 结果 ===") + print(json.dumps(result, indent=2, ensure_ascii=False)) + finally: + await adapter.close(connection) + + +async def discover_agents(capability: str = ""): + """发现 Agent""" + print(f"搜索能力: {capability or '全部'}\n") + + all_agents = [] + + for protocol, adapter_class in ADAPTERS.items(): + adapter = adapter_class() + agents = await adapter.discover(capability) + all_agents.extend(agents) + + if not all_agents: + print("未找到匹配的 Agent") + return + + print(f"找到 {len(all_agents)} 个 Agent:\n") + for agent in all_agents: + print(f" {agent.id}") + print(f" 名称: {agent.name}") + print(f" 协议: {agent.protocol}") + if agent.endpoint: + print(f" 端点: {agent.endpoint[:60]}...") + print() + + +async def list_agents(): + """列出所有已注册 Agent""" + config = load_agents_config() + agents = config.get("agents", []) + + if not agents: + print("暂无已注册的 Agent") + print("请编辑 config/agents.yaml 添加 Agent") + 
return + + print(f"\n已注册的 Agent ({len(agents)} 个):\n") + + by_protocol = {} + for agent in agents: + protocol = agent.get("protocol", "unknown") + if protocol not in by_protocol: + by_protocol[protocol] = [] + by_protocol[protocol].append(agent) + + for protocol, protocol_agents in by_protocol.items(): + print(f"[{protocol.upper()}]") + for agent in protocol_agents: + agent_id = f"{agent['id']}@{protocol}" + name = agent.get("name", agent["id"]) + print(f" {agent_id}: {name}") + print() + + +async def show_methods(agent_id: str): + """显示 Agent 支持的方法""" + agent_name, protocol = parse_agent_id(agent_id) + + print(f"获取 {agent_name}@{protocol} 的方法列表...\n") + + agent_config = get_agent_config(agent_name, protocol) + adapter = get_adapter(protocol) + + connection = await adapter.connect(agent_config) + + try: + methods = await adapter.get_methods(connection) + + if not methods: + print("未获取到方法列表") + return + + print(f"可用方法 ({len(methods)} 个):\n") + for m in methods[:30]: + name = m.get("name", "unknown") + desc = m.get("description", "")[:50] + print(f" - {name}: {desc}") + + if len(methods) > 30: + print(f" ... 还有 {len(methods) - 30} 个方法") + finally: + await adapter.close(connection) + + +def show_help(): + print(""" +UniAgent - 统一智能体协议调用工具 + +用法: + python uni_cli.py <命令> [参数...] + +命令: + call 调用 Agent 方法 + discover [capability] 发现 Agent + list 列出已注册 Agent + methods 查看 Agent 方法 + +Agent ID 格式: + @ + + 示例: + - amap@anp ANP 协议的高德地图 + - filesystem@mcp MCP 协议的文件系统 + +支持的协议: + - anp ANP (Agent Network Protocol) - 去中心化 Agent 网络 + - mcp MCP (Model Context Protocol) - LLM 工具调用 + - a2a A2A (Agent-to-Agent) - Google Agent 协作 + - aitp AITP (Agent Interaction & Transaction) - 交互交易 + - agent_protocol Agent Protocol - REST API 标准 (别名: ap) + - lmos LMOS (Language Model OS) - 企业级 Agent 平台 + +示例: + # ANP - 查天气 + python uni_cli.py call amap@anp maps_weather '{"city":"北京"}' + + # MCP - 读文件 + python uni_cli.py call filesystem@mcp read_file '{"path":"/tmp/a.txt"}' + + # A2A - 发送任务 + python uni_cli.py call assistant@a2a tasks/send '{"message":{"content":"hello"}}' + + # AITP - 对话 + python uni_cli.py call shop@aitp message '{"content":"我要买咖啡"}' + + # Agent Protocol - 创建任务 + python uni_cli.py call autogpt@ap create_task '{"input":"写代码"}' + + # 发现 Agent + python uni_cli.py discover weather +""") + + +async def main(): + if len(sys.argv) < 2: + show_help() + return + + cmd = sys.argv[1] + + if cmd in ["help", "-h", "--help"]: + show_help() + + elif cmd == "call": + if len(sys.argv) < 5: + print("用法: python uni_cli.py call ''") + return + agent_id = sys.argv[2] + method = sys.argv[3] + params = json.loads(sys.argv[4]) + await call_agent(agent_id, method, params) + + elif cmd == "discover": + capability = sys.argv[2] if len(sys.argv) > 2 else "" + await discover_agents(capability) + + elif cmd == "list": + await list_agents() + + elif cmd == "methods": + if len(sys.argv) < 3: + print("用法: python uni_cli.py methods ") + return + await show_methods(sys.argv[2]) + + else: + print(f"未知命令: {cmd}") + show_help() + + +if __name__ == "__main__": + asyncio.run(main()) diff --git a/uni-agent/setup.sh b/uni-agent/setup.sh new file mode 100644 index 0000000..b88ce9e --- /dev/null +++ b/uni-agent/setup.sh @@ -0,0 +1,107 @@ +#!/bin/bash +# +# UniAgent 一键安装脚本 +# + +set -e + +SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" +CONFIG_DIR="$SCRIPT_DIR/config" + +echo "==========================================" +echo " UniAgent - 统一智能体协议适配层" +echo " Connect Any Agent, Any Protocol" +echo "==========================================" +echo "" + +# 
1. 检查 Python +echo "[1/4] 检查 Python 环境..." +if ! command -v python3 &> /dev/null; then + echo "❌ 未找到 Python3,请先安装 Python 3.8+" + exit 1 +fi +PYTHON_VERSION=$(python3 -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")') +echo "✅ Python $PYTHON_VERSION" + +# 2. 安装依赖 +echo "" +echo "[2/4] 安装 Python 依赖..." +pip3 install -q pyyaml aiohttp anp --break-system-packages 2>/dev/null || pip3 install -q pyyaml aiohttp anp --user 2>/dev/null || pip3 install -q pyyaml aiohttp anp +echo "✅ 依赖安装完成" + +# 3. 检查/生成 DID 身份(用于 ANP) +echo "" +echo "[3/4] 配置 DID 身份 (ANP 协议)..." + +if [ -f "$CONFIG_DIR/did.json" ] && [ -f "$CONFIG_DIR/private-key.pem" ]; then + echo "✅ 已存在 DID 身份,跳过生成" + DID_ID=$(python3 -c "import json; print(json.load(open('$CONFIG_DIR/did.json'))['id'])" 2>/dev/null || echo "unknown") + echo " DID: $DID_ID" +else + echo "⚙️ 生成本地临时身份..." + + # 生成 secp256k1 私钥 + openssl ecparam -name secp256k1 -genkey -noout -out "$CONFIG_DIR/private-key.pem" 2>/dev/null + + # 生成随机 ID + RANDOM_ID=$(openssl rand -hex 8) + + # 创建 DID 文档 + cat > "$CONFIG_DIR/did.json" << EOF +{ + "@context": [ + "https://www.w3.org/ns/did/v1", + "https://w3id.org/security/suites/secp256k1-2019/v1" + ], + "id": "did:wba:local:user:$RANDOM_ID", + "verificationMethod": [ + { + "id": "did:wba:local:user:$RANDOM_ID#key-1", + "type": "EcdsaSecp256k1VerificationKey2019", + "controller": "did:wba:local:user:$RANDOM_ID" + } + ], + "authentication": [ + "did:wba:local:user:$RANDOM_ID#key-1" + ] +} +EOF + + echo "✅ 本地身份生成完成" + echo " DID: did:wba:local:user:$RANDOM_ID" +fi + +# 4. 验证安装 +echo "" +echo "[4/4] 验证安装..." +cd "$SCRIPT_DIR" +if python3 scripts/uni_cli.py list &> /dev/null; then + echo "✅ 安装成功!" +else + echo "⚠️ 安装可能有问题,请检查错误信息" +fi + +echo "" +echo "==========================================" +echo " 安装完成!" 
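Note that the `did.json` generated above carries no public key material in its `verificationMethod` entry. A sketch of deriving it from the private key, assuming the `cryptography` package is available (it is not listed in requirements.txt) and that a PEM-encoded key suits the ANP tooling:

```python
from cryptography.hazmat.primitives import serialization

# Load the secp256k1 key that setup.sh generated with openssl
with open("config/private-key.pem", "rb") as f:
    private_key = serialization.load_pem_private_key(f.read(), password=None)

public_pem = private_key.public_key().public_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PublicFormat.SubjectPublicKeyInfo,
).decode()
print(public_pem)  # could be added to verificationMethod, e.g. as publicKeyPem
```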
+echo "==========================================" +echo "" +echo "支持的协议:" +echo " - ANP (Agent Network Protocol) ✅ 已实现" +echo " - MCP (Model Context Protocol) ✅ 已实现" +echo " - A2A (Agent-to-Agent) ✅ 已实现" +echo " - AITP (Agent Interaction & Tx) ✅ 已实现" +echo " - AP (Agent Protocol) ✅ 已实现" +echo " - LMOS (Eclipse LMOS) ✅ 已实现" +echo "" +echo "快速开始:" +echo "" +echo " # 列出已注册 Agent" +echo " python scripts/uni_cli.py list" +echo "" +echo " # 调用 ANP Agent 查天气" +echo " python scripts/uni_cli.py call amap@anp maps_weather '{\"city\":\"北京\"}'" +echo "" +echo " # 查看 Agent 方法" +echo " python scripts/uni_cli.py methods amap@anp" +echo "" diff --git a/uni-agent/test_servers/a2a_server.py b/uni-agent/test_servers/a2a_server.py new file mode 100644 index 0000000..1330612 --- /dev/null +++ b/uni-agent/test_servers/a2a_server.py @@ -0,0 +1,165 @@ +#!/usr/bin/env python3 +""" +A2A 测试服务器 - 简单的 Echo Agent +HTTP 服务,提供 Agent Card 和 JSON-RPC 端点 +""" + +import json +import uuid +from http.server import HTTPServer, BaseHTTPRequestHandler +from urllib.parse import urlparse + +PORT = 8100 + + +class A2AHandler(BaseHTTPRequestHandler): + + tasks = {} + + def log_message(self, format, *args): + pass + + def send_json(self, data: dict, status: int = 200): + body = json.dumps(data, ensure_ascii=False).encode() + self.send_response(status) + self.send_header("Content-Type", "application/json") + self.send_header("Content-Length", len(body)) + self.end_headers() + self.wfile.write(body) + + def do_GET(self): + path = urlparse(self.path).path + + if path == "/.well-known/agent.json": + self.send_json({ + "name": "Test A2A Agent", + "description": "A simple echo agent for testing", + "url": f"http://localhost:{PORT}/rpc", + "version": "1.0.0", + "capabilities": { + "streaming": False, + "pushNotifications": False + }, + "skills": [ + { + "id": "echo", + "name": "Echo", + "description": "Echo back the message", + "inputSchema": { + "type": "object", + "properties": { + "message": {"type": "string"} + } + } + }, + { + "id": "greet", + "name": "Greet", + "description": "Greet the user", + "inputSchema": { + "type": "object", + "properties": { + "name": {"type": "string"} + } + } + } + ], + "authentication": { + "schemes": ["none"] + } + }) + else: + self.send_json({"error": "Not found"}, 404) + + def do_POST(self): + path = urlparse(self.path).path + + content_length = int(self.headers.get("Content-Length", 0)) + body = self.rfile.read(content_length) + + try: + request = json.loads(body) + except json.JSONDecodeError: + self.send_json({"error": "Invalid JSON"}, 400) + return + + if path == "/rpc": + self.handle_rpc(request) + else: + self.send_json({"error": "Not found"}, 404) + + def handle_rpc(self, request: dict): + method = request.get("method", "") + params = request.get("params", {}) + req_id = request.get("id", str(uuid.uuid4())) + + if method == "tasks/send": + task_id = params.get("id", str(uuid.uuid4())) + message = params.get("message", {}) + content = message.get("parts", [{}])[0].get("text", "") if "parts" in message else message.get("content", "") + + response_text = f"Echo: {content}" + + A2AHandler.tasks[task_id] = { + "id": task_id, + "status": {"state": "completed"}, + "history": [ + message, + {"role": "agent", "parts": [{"type": "text", "text": response_text}]} + ] + } + + self.send_json({ + "jsonrpc": "2.0", + "id": req_id, + "result": A2AHandler.tasks[task_id] + }) + + elif method == "tasks/get": + task_id = params.get("id", "") + if task_id in A2AHandler.tasks: + self.send_json({ + "jsonrpc": "2.0", + 
"id": req_id, + "result": A2AHandler.tasks[task_id] + }) + else: + self.send_json({ + "jsonrpc": "2.0", + "id": req_id, + "error": {"code": -32000, "message": "Task not found"} + }) + + elif method == "tasks/cancel": + task_id = params.get("id", "") + if task_id in A2AHandler.tasks: + A2AHandler.tasks[task_id]["status"]["state"] = "canceled" + self.send_json({ + "jsonrpc": "2.0", + "id": req_id, + "result": {"success": True} + }) + else: + self.send_json({ + "jsonrpc": "2.0", + "id": req_id, + "error": {"code": -32000, "message": "Task not found"} + }) + + else: + self.send_json({ + "jsonrpc": "2.0", + "id": req_id, + "error": {"code": -32601, "message": f"Unknown method: {method}"} + }) + + +def main(): + server = HTTPServer(("localhost", PORT), A2AHandler) + print(f"A2A Test Server running on http://localhost:{PORT}") + print(f"Agent Card: http://localhost:{PORT}/.well-known/agent.json") + server.serve_forever() + + +if __name__ == "__main__": + main() diff --git a/uni-agent/test_servers/agent_protocol_server.py b/uni-agent/test_servers/agent_protocol_server.py new file mode 100644 index 0000000..a021059 --- /dev/null +++ b/uni-agent/test_servers/agent_protocol_server.py @@ -0,0 +1,146 @@ +#!/usr/bin/env python3 +""" +Agent Protocol 测试服务器 +REST API 标准实现 +""" + +import json +import uuid +from http.server import HTTPServer, BaseHTTPRequestHandler +from urllib.parse import urlparse +from datetime import datetime + +PORT = 8102 + + +class APHandler(BaseHTTPRequestHandler): + + tasks = {} + + def log_message(self, format, *args): + pass + + def send_json(self, data: dict, status: int = 200): + body = json.dumps(data, ensure_ascii=False).encode() + self.send_response(status) + self.send_header("Content-Type", "application/json") + self.send_header("Content-Length", len(body)) + self.end_headers() + self.wfile.write(body) + + def do_GET(self): + path = urlparse(self.path).path + + if path == "/" or path == "/ap/v1": + self.send_json({ + "name": "Test Agent Protocol Server", + "version": "1.0.0", + "protocol_version": "v1" + }) + + elif path == "/ap/v1/agent/tasks": + self.send_json({ + "tasks": list(APHandler.tasks.values()) + }) + + elif path.startswith("/ap/v1/agent/tasks/"): + parts = path.split("/") + task_id = parts[5] if len(parts) > 5 else "" + + if "/artifacts" in path: + if task_id in APHandler.tasks: + self.send_json({ + "artifacts": APHandler.tasks[task_id].get("artifacts", []) + }) + else: + self.send_json({"error": "Task not found"}, 404) + + elif "/steps" in path: + if task_id in APHandler.tasks: + self.send_json({ + "steps": APHandler.tasks[task_id].get("steps", []) + }) + else: + self.send_json({"error": "Task not found"}, 404) + + else: + if task_id in APHandler.tasks: + self.send_json(APHandler.tasks[task_id]) + else: + self.send_json({"error": "Task not found"}, 404) + + else: + self.send_json({"error": "Not found"}, 404) + + def do_POST(self): + path = urlparse(self.path).path + + content_length = int(self.headers.get("Content-Length", 0)) + body = self.rfile.read(content_length) + + try: + request = json.loads(body) if body else {} + except json.JSONDecodeError: + self.send_json({"error": "Invalid JSON"}, 400) + return + + if path == "/ap/v1/agent/tasks": + task_id = str(uuid.uuid4()) + task_input = request.get("input", "") + + APHandler.tasks[task_id] = { + "task_id": task_id, + "input": task_input, + "status": "running", + "steps": [], + "artifacts": [], + "created_at": datetime.now().isoformat() + } + + self.send_json(APHandler.tasks[task_id], 201) + + elif 
path.startswith("/ap/v1/agent/tasks/") and path.endswith("/steps"): + parts = path.split("/") + task_id = parts[5] + + if task_id not in APHandler.tasks: + self.send_json({"error": "Task not found"}, 404) + return + + step_input = request.get("input", "") + step_id = str(uuid.uuid4()) + + step = { + "step_id": step_id, + "input": step_input, + "output": f"Processed: {step_input}" if step_input else "Step executed", + "status": "completed", + "is_last": True, + "created_at": datetime.now().isoformat() + } + + APHandler.tasks[task_id]["steps"].append(step) + APHandler.tasks[task_id]["status"] = "completed" + + APHandler.tasks[task_id]["artifacts"].append({ + "artifact_id": str(uuid.uuid4()), + "file_name": "output.txt", + "relative_path": "/output.txt", + "content": step["output"] + }) + + self.send_json(step, 201) + + else: + self.send_json({"error": "Not found"}, 404) + + +def main(): + server = HTTPServer(("localhost", PORT), APHandler) + print(f"Agent Protocol Test Server running on http://localhost:{PORT}") + print(f"API Base: http://localhost:{PORT}/ap/v1") + server.serve_forever() + + +if __name__ == "__main__": + main() diff --git a/uni-agent/test_servers/aitp_server.py b/uni-agent/test_servers/aitp_server.py new file mode 100644 index 0000000..5ed64c1 --- /dev/null +++ b/uni-agent/test_servers/aitp_server.py @@ -0,0 +1,160 @@ +#!/usr/bin/env python3 +""" +AITP 测试服务器 - 模拟交互与交易 +HTTP 服务,支持 Thread 会话 +""" + +import json +import uuid +from http.server import HTTPServer, BaseHTTPRequestHandler +from urllib.parse import urlparse +from datetime import datetime + +PORT = 8101 + + +class AITPHandler(BaseHTTPRequestHandler): + + threads = {} + + def log_message(self, format, *args): + pass + + def send_json(self, data: dict, status: int = 200): + body = json.dumps(data, ensure_ascii=False).encode() + self.send_response(status) + self.send_header("Content-Type", "application/json") + self.send_header("Content-Length", len(body)) + self.end_headers() + self.wfile.write(body) + + def do_GET(self): + path = urlparse(self.path).path + + if path == "/": + self.send_json({ + "name": "Test AITP Agent", + "description": "A simple AITP agent for testing", + "version": "1.0.0", + "capabilities": ["aitp-01", "aitp-02", "aitp-03"] + }) + + elif path.startswith("/threads/"): + parts = path.split("/") + if len(parts) >= 3: + thread_id = parts[2] + if thread_id in AITPHandler.threads: + self.send_json(AITPHandler.threads[thread_id]) + else: + self.send_json({"error": "Thread not found"}, 404) + else: + self.send_json({"error": "Not found"}, 404) + + def do_POST(self): + path = urlparse(self.path).path + + content_length = int(self.headers.get("Content-Length", 0)) + body = self.rfile.read(content_length) + + try: + request = json.loads(body) + except json.JSONDecodeError: + self.send_json({"error": "Invalid JSON"}, 400) + return + + if path == "/threads": + thread_id = str(uuid.uuid4()) + AITPHandler.threads[thread_id] = { + "id": thread_id, + "status": "open", + "messages": [], + "created_at": datetime.now().isoformat() + } + self.send_json({"thread_id": thread_id}) + + elif path.startswith("/threads/") and path.endswith("/messages"): + parts = path.split("/") + thread_id = parts[2] + + if thread_id not in AITPHandler.threads: + AITPHandler.threads[thread_id] = { + "id": thread_id, + "status": "open", + "messages": [], + "created_at": datetime.now().isoformat() + } + + thread = AITPHandler.threads[thread_id] + + if "capability" in request: + capability = request.get("capability") + + if capability == 
"aitp-01": + payment_req = request.get("payment_request", {}) + response = { + "role": "agent", + "capability": "aitp-01", + "payment_response": { + "status": "approved", + "transaction_id": str(uuid.uuid4()), + "amount": payment_req.get("amount"), + "currency": payment_req.get("currency", "NEAR"), + "timestamp": datetime.now().isoformat() + } + } + + elif capability == "aitp-02": + decision_req = request.get("decision_request", {}) + response = { + "role": "agent", + "capability": "aitp-02", + "decision_response": { + "question": decision_req.get("question"), + "selected": decision_req.get("options", ["Yes"])[0] if decision_req.get("options") else "Yes" + } + } + + elif capability == "aitp-03": + data_req = request.get("data_request", {}) + response = { + "role": "agent", + "capability": "aitp-03", + "data_response": { + "schema": data_req.get("schema", {}), + "data": {"sample": "test_data", "timestamp": datetime.now().isoformat()} + } + } + + else: + response = { + "role": "agent", + "error": f"Unknown capability: {capability}" + } + + else: + message = request.get("message", {}) + content = message.get("content", "") + + response = { + "role": "agent", + "content": f"AITP Echo: {content}", + "timestamp": datetime.now().isoformat() + } + + thread["messages"].append(request) + thread["messages"].append(response) + + self.send_json(response) + + else: + self.send_json({"error": "Not found"}, 404) + + +def main(): + server = HTTPServer(("localhost", PORT), AITPHandler) + print(f"AITP Test Server running on http://localhost:{PORT}") + server.serve_forever() + + +if __name__ == "__main__": + main() diff --git a/uni-agent/test_servers/lmos_server.py b/uni-agent/test_servers/lmos_server.py new file mode 100644 index 0000000..5f59c81 --- /dev/null +++ b/uni-agent/test_servers/lmos_server.py @@ -0,0 +1,169 @@ +#!/usr/bin/env python3 +""" +LMOS 测试服务器 - 模拟企业级 Agent 平台 +包含注册中心和 Agent 能力调用 +""" + +import json +import uuid +from http.server import HTTPServer, BaseHTTPRequestHandler +from urllib.parse import urlparse, parse_qs +from datetime import datetime + +PORT = 8103 + + +MOCK_AGENTS = [ + { + "id": "calculator", + "name": "Calculator Agent", + "description": "Performs calculations", + "endpoint": f"http://localhost:{PORT}/agents/calculator", + "capabilities": [ + {"id": "add", "description": "Add two numbers"}, + {"id": "multiply", "description": "Multiply two numbers"} + ] + }, + { + "id": "greeter", + "name": "Greeter Agent", + "description": "Greets users", + "endpoint": f"http://localhost:{PORT}/agents/greeter", + "capabilities": [ + {"id": "greet", "description": "Greet a user by name"} + ] + } +] + + +class LMOSHandler(BaseHTTPRequestHandler): + + def log_message(self, format, *args): + pass + + def send_json(self, data: dict, status: int = 200): + body = json.dumps(data, ensure_ascii=False).encode() + self.send_response(status) + self.send_header("Content-Type", "application/json") + self.send_header("Content-Length", len(body)) + self.end_headers() + self.wfile.write(body) + + def do_GET(self): + parsed = urlparse(self.path) + path = parsed.path + query = parse_qs(parsed.query) + + if path == "/": + self.send_json({ + "name": "Test LMOS Registry", + "version": "1.0.0", + "agents": len(MOCK_AGENTS) + }) + + elif path == "/agents": + capability = query.get("capability", [None])[0] + + if capability: + filtered = [ + a for a in MOCK_AGENTS + if any(c["id"] == capability for c in a["capabilities"]) + ] + self.send_json({"agents": filtered}) + else: + self.send_json({"agents": MOCK_AGENTS}) + 
+ elif path.startswith("/agents/") and path.endswith("/capabilities"): + agent_id = path.split("/")[2] + agent = next((a for a in MOCK_AGENTS if a["id"] == agent_id), None) + + if agent: + self.send_json({"capabilities": agent["capabilities"]}) + else: + self.send_json({"error": "Agent not found"}, 404) + + else: + self.send_json({"error": "Not found"}, 404) + + def do_POST(self): + parsed = urlparse(self.path) + path = parsed.path + + content_length = int(self.headers.get("Content-Length", 0)) + body = self.rfile.read(content_length) + + try: + request = json.loads(body) if body else {} + except json.JSONDecodeError: + self.send_json({"error": "Invalid JSON"}, 400) + return + + if path.startswith("/agents/") and path.endswith("/invoke"): + agent_id = path.split("/")[2] + agent = next((a for a in MOCK_AGENTS if a["id"] == agent_id), None) + + if not agent: + self.send_json({"error": "Agent not found"}, 404) + return + + capability = request.get("capability", "") + input_data = request.get("input", {}) + + if agent_id == "calculator": + if capability == "add": + a = input_data.get("a", 0) + b = input_data.get("b", 0) + result = {"result": a + b} + elif capability == "multiply": + a = input_data.get("a", 0) + b = input_data.get("b", 0) + result = {"result": a * b} + else: + result = {"error": f"Unknown capability: {capability}"} + + elif agent_id == "greeter": + if capability == "greet": + name = input_data.get("name", "World") + result = {"greeting": f"Hello, {name}!"} + else: + result = {"error": f"Unknown capability: {capability}"} + + else: + result = {"error": "Unknown agent"} + + self.send_json({ + "agent_id": agent_id, + "capability": capability, + "output": result, + "timestamp": datetime.now().isoformat() + }) + + elif path == "/route": + query_text = request.get("query", "") + + if "add" in query_text.lower() or "calculate" in query_text.lower(): + best_agent = MOCK_AGENTS[0] + elif "greet" in query_text.lower() or "hello" in query_text.lower(): + best_agent = MOCK_AGENTS[1] + else: + best_agent = MOCK_AGENTS[0] + + self.send_json({ + "recommended_agent": best_agent, + "confidence": 0.85, + "alternatives": [a for a in MOCK_AGENTS if a["id"] != best_agent["id"]] + }) + + else: + self.send_json({"error": "Not found"}, 404) + + +def main(): + server = HTTPServer(("localhost", PORT), LMOSHandler) + print(f"LMOS Test Server running on http://localhost:{PORT}") + print(f"Registry: http://localhost:{PORT}/agents") + server.serve_forever() + + +if __name__ == "__main__": + main() diff --git a/uni-agent/test_servers/mcp_server.py b/uni-agent/test_servers/mcp_server.py new file mode 100644 index 0000000..7163a04 --- /dev/null +++ b/uni-agent/test_servers/mcp_server.py @@ -0,0 +1,161 @@ +#!/usr/bin/env python3 +""" +MCP 测试服务器 - 简单的 Echo + 计算器 +通过 stdio 通信 +""" + +import json +import sys +from datetime import datetime + + +def send_response(id: str, result: dict): + """发送 JSON-RPC 响应""" + response = { + "jsonrpc": "2.0", + "id": id, + "result": result + } + msg = json.dumps(response) + sys.stdout.write(f"Content-Length: {len(msg)}\r\n\r\n{msg}") + sys.stdout.flush() + + +def send_error(id: str, code: int, message: str): + """发送错误响应""" + response = { + "jsonrpc": "2.0", + "id": id, + "error": {"code": code, "message": message} + } + msg = json.dumps(response) + sys.stdout.write(f"Content-Length: {len(msg)}\r\n\r\n{msg}") + sys.stdout.flush() + + +def handle_request(request: dict): + """处理请求""" + method = request.get("method", "") + params = request.get("params", {}) + req_id = request.get("id", 
"0") + + if method == "initialize": + send_response(req_id, { + "protocolVersion": "2024-11-05", + "capabilities": { + "tools": {"listChanged": True} + }, + "serverInfo": { + "name": "test-mcp-server", + "version": "1.0.0" + } + }) + + elif method == "notifications/initialized": + pass + + elif method == "tools/list": + send_response(req_id, { + "tools": [ + { + "name": "echo", + "description": "返回输入的消息", + "inputSchema": { + "type": "object", + "properties": { + "message": {"type": "string", "description": "要返回的消息"} + }, + "required": ["message"] + } + }, + { + "name": "add", + "description": "两数相加", + "inputSchema": { + "type": "object", + "properties": { + "a": {"type": "number"}, + "b": {"type": "number"} + }, + "required": ["a", "b"] + } + }, + { + "name": "get_time", + "description": "获取当前时间", + "inputSchema": {"type": "object", "properties": {}} + } + ] + }) + + elif method == "tools/call": + tool_name = params.get("name", "") + tool_args = params.get("arguments", {}) + + if tool_name == "echo": + msg = tool_args.get("message", "") + send_response(req_id, { + "content": [{"type": "text", "text": f"Echo: {msg}"}] + }) + + elif tool_name == "add": + a = tool_args.get("a", 0) + b = tool_args.get("b", 0) + send_response(req_id, { + "content": [{"type": "text", "text": str(a + b)}] + }) + + elif tool_name == "get_time": + now = datetime.now().strftime("%Y-%m-%d %H:%M:%S") + send_response(req_id, { + "content": [{"type": "text", "text": now}] + }) + + else: + send_error(req_id, -32601, f"Unknown tool: {tool_name}") + + else: + send_error(req_id, -32601, f"Unknown method: {method}") + + +def main(): + """主循环 - 读取 stdin,处理请求""" + buffer = "" + + while True: + try: + line = sys.stdin.readline() + if not line: + break + + buffer += line + + if "Content-Length:" in buffer: + parts = buffer.split("\r\n\r\n", 1) + if len(parts) == 2: + header, body = parts + length = int(header.split(":")[1].strip()) + + while len(body) < length: + body += sys.stdin.read(length - len(body)) + + request = json.loads(body[:length]) + handle_request(request) + + buffer = body[length:] + + elif buffer.strip().startswith("{"): + try: + request = json.loads(buffer.strip()) + handle_request(request) + buffer = "" + except json.JSONDecodeError: + pass + + except Exception as e: + sys.stderr.write(f"Error: {e}\n") + sys.stderr.flush() + + +if __name__ == "__main__": + main() diff --git a/video-creator/README.md b/video-creator/README.md new file mode 100644 index 0000000..58cb2a9 --- /dev/null +++ b/video-creator/README.md @@ -0,0 +1,24 @@ +# Video Creator + +视频生成技能,图片+音频合成视频。 + +## 依赖 + +```bash +brew install ffmpeg +pip install edge-tts pyyaml +``` + +## 功能 + +- 图片序列 + 音频 → 视频 +- 淡入淡出转场 +- 自动拼接片尾 +- 添加 BGM +- 烧录字幕 + +## 资源 + +- 片尾视频:支持 9 种比例(1x1/9x16/16x9 等) +- BGM:科技感/史诗感 +- Logo diff --git a/video-creator/SKILL.md b/video-creator/SKILL.md new file mode 100644 index 0000000..d4ebf4e --- /dev/null +++ b/video-creator/SKILL.md @@ -0,0 +1,316 @@ +--- +name: video-creator +description: 视频创作技能。图片+音频合成视频,支持淡入淡出转场、自动拼接片尾、添加BGM。当用户提到「生成视频」「图文转视频」「做视频号」时触发此技能。 +--- + +# Video Creator + +图片+音频合成视频工具。 + +## 核心流程(铁律) + +### 故事类视频生成流程(套娃流程) + +当用户提供故事/剧情/剧本时,**必须严格按以下套娃流程执行**: + +``` +┌─────────────────────────────────────────────────────────────┐ +│ 第一层:故事 → 拆分场景 → 并发生成场景主图(文生图) │ +│ │ +│ 大闹天宫 → 场景1:弼马温受辱 │ +│ 场景2:筋斗云回花果山 │ +│ 场景3:玉帝派兵 │ +│ ... 
│ +│ → 并发调用 text_to_image.py 生成每个场景主图 │ +└─────────────────────────────────────────────────────────────┘ + ↓ +┌─────────────────────────────────────────────────────────────┐ +│ 第二层:每个场景主图 → 图生图拆出细镜头(保持角色一致) │ +│ │ +│ 场景1主图 → 细镜头1:悟空看官印疑惑 │ +│ 细镜头2:悟空踢翻马槽 │ +│ 场景2主图 → 细镜头1:踏筋斗云腾空 │ +│ 细镜头2:花果山自封大圣 │ +│ → 并发调用 image_to_image.py,以主图为参考 │ +└─────────────────────────────────────────────────────────────┘ + ↓ +┌─────────────────────────────────────────────────────────────┐ +│ 第三层:生成配音 + 字幕 + 合成视频 │ +│ │ +│ 1. tts_generator.py 生成配音 + 时间戳 │ +│ 2. 【铁律】根据时间戳精确计算每张图的duration(见下方规范) │ +│ 3. 生成 SRT 字幕 │ +│ 4. 生成 video_config.yaml 前必须校验总时长 │ +│ 5. video_maker.py 合成: │ +│ → 图片合成(带转场) │ +│ → 合并音频 │ +│ → 烧录字幕(ASS格式,底部居中固定) │ +│ → 自动拼接片尾(二维码+"点关注不迷路") │ +│ → 添加BGM │ +└─────────────────────────────────────────────────────────────┘ + +**铁律:所有视频必须自动拼接片尾!** +``` + +### 目录结构规范 + +``` +assets/generated/{project_name}/ +├── scene1/ +│ ├── main.png # 场景1主图(文生图) +│ ├── shot_01.png # 细镜头1(图生图) +│ └── shot_02.png # 细镜头2(图生图) +├── scene2/ +│ ├── main.png +│ ├── shot_01.png +│ └── shot_02.png +├── ... +├── narration.mp3 # 配音 +├── narration.json # 时间戳 +├── subtitles.srt # 字幕 +├── video_config.yaml # 视频配置 +└── {project_name}.mp4 # 最终视频 +``` + +### 执行命令示例 + +```bash +# 第一层:并发生成场景主图 +python .opencode/skills/image-service/scripts/text_to_image.py "风格描述,场景1内容" -r 9:16 -o scene1/main.png & +python .opencode/skills/image-service/scripts/text_to_image.py "风格描述,场景2内容" -r 9:16 -o scene2/main.png & +wait + +# 第二层:并发图生图生成细镜头 +python .opencode/skills/image-service/scripts/image_to_image.py scene1/main.png "保持角色风格,细镜头描述" -r 9:16 -o scene1/shot_01.png & +python .opencode/skills/image-service/scripts/image_to_image.py scene1/main.png "保持角色风格,细镜头描述" -r 9:16 -o scene1/shot_02.png & +wait + +# 第三层:生成配音+合成视频 +python .opencode/skills/video-creator/scripts/tts_generator.py --text "完整旁白" --output narration.mp3 --timestamps +python .opencode/skills/video-creator/scripts/video_maker.py video_config.yaml --srt subtitles.srt --bgm epic +``` + +--- + +## 视频配置文件格式 + +```yaml +# video_config.yaml +ratio: "9:16" # 必须加引号!避免YAML解析错误 +bgm_volume: 0.12 +outro: true + +scenes: + - audio: narration.mp3 + images: + # 按场景顺序排列所有细镜头 + - file: scene1/shot_01.png + duration: 4.34 + - file: scene1/shot_02.png + duration: 4.88 + - file: scene2/shot_01.png + duration: 2.15 + # ... +``` + +**注意**:`ratio` 必须用引号包裹,如 `"9:16"`,否则 YAML 会解析成时间格式。 + +--- + +## 时长分配规范(铁律!) + +**生成 video_config.yaml 前,必须严格按以下流程计算 duration:** + +### 步骤1:读取时间戳文件 + +```python +import json +with open("narration.json", "r") as f: + timestamps = json.load(f) +audio_duration = timestamps[-1]["end"] +print(f"音频总时长: {audio_duration:.1f}s") +``` + +### 步骤2:按内容语义划分场景 + +根据解说词内容,确定每张图对应的时间段: + +```python +# 示例:根据解说词内容划分 +# 找到每个主题切换点的时间戳 +scenes = [ + ("cover.png", 0, 12.5), # 开场到第一个主题切换 + ("scene01.png", 12.5, 26), # 第二段内容 + # ...根据 narration.json 中的句子边界精确划分 +] +``` + +### 步骤3:计算每张图的 duration + +```python +for file, start, end in scenes: + duration = end - start + print(f"{file}: {duration:.1f}s") +``` + +### 步骤4:校验总时长 + +```python +total_duration = sum(duration for _, _, duration in scenes) +assert abs(total_duration - audio_duration) < 1.0, \ + f"时长不匹配!图片总时长{total_duration}s vs 音频{audio_duration}s" +``` + +### 铁律 + +1. **必须先读取 narration.json 时间戳**,不能凭感觉估算 +2. **按句子语义边界划分**,不能平均分配 +3. **生成配置前必须校验**,确保图片总时长 ≈ 音频总时长(误差<1秒) +4. 
**禁止让脚本自动拉伸**,音画不同步的视频不合格 + +### 时长分配表模板 + +生成配置前,先输出分配表让用户确认: + +```markdown +| 场景图 | 对应内容 | 开始 | 结束 | 时长 | +|--------|----------|------|------|------| +| cover.png | 开场引入 | 0s | 12.5s | 12.5s | +| scene01.png | AI Agent时代 | 12.5s | 26s | 13.5s | +| ... | ... | ... | ... | ... | +| **合计** | | | | **{total}s** | + +音频总时长:{audio_duration}s +差值:{diff}s ✅/❌ +``` + +--- + +## 字幕规范 + +字幕使用 ASS 格式,**强制底部居中固定位置**: + +- 位置:底部居中(Alignment=2) +- 字体:PingFang SC +- 大小:屏幕高度 / 40 +- 描边:2px 黑色描边 + 1px 阴影 +- 底边距:屏幕高度 / 20 + +**禁止**:字幕乱跑、大小不一、位置不固定 + +--- + +## 脚本参数说明 + +### video_maker.py + +```bash +python video_maker.py config.yaml [options] +``` + +| 参数 | 说明 | 默认值 | +|------|------|--------| +| `--no-outro` | 不添加片尾 | 添加 | +| `--no-bgm` | 不添加BGM | 添加 | +| `--fade` | 转场时长(秒) | 0.5 | +| `--bgm-volume` | BGM音量 | 0.08 | +| `--bgm` | 自定义BGM(可选: epic) | 默认科技风 | +| `--ratio` | 视频比例 | 16:9(会被配置文件覆盖) | +| `--srt` | 字幕文件路径 | 无 | + +### tts_generator.py + +```bash +python tts_generator.py --text "文本" --output audio.mp3 [options] +``` + +| 参数 | 说明 | 默认值 | +|------|------|--------| +| `--voice` | 音色 | zh-CN-YunxiNeural | +| `--rate` | 语速 | +0% | +| `--timestamps` | 输出时间戳JSON | 否 | + +--- + +## 支持的视频比例 + +与 `image-service` 生图服务保持一致,支持 **10 种比例**: + +| 比例 | 分辨率 | 适用场景 | +|------|--------|----------| +| 1:1 | 1024×1024 | 正方形,朋友圈 | +| 2:3 | 832×1248 | 竖版海报 | +| 3:2 | 1248×832 | 横版海报 | +| 3:4 | 1080×1440 | 小红书、朋友圈 | +| 4:3 | 1440×1080 | 传统显示器 | +| 4:5 | 864×1080 | Instagram | +| 5:4 | 1080×864 | 横版照片 | +| 9:16 | 1080×1920 | 抖音、视频号、竖屏 | +| 16:9 | 1920×1080 | B站、YouTube、横屏 | +| 21:9 | 1536×672 | 超宽屏电影 | + +--- + +## 片尾规范 + +**铁律:所有视频必须自动拼接对应尺寸的片尾!** + +片尾匹配顺序: +1. 精确匹配:`outro_{ratio}.mp4` +2. 方向匹配:竖版→`outro_9x16.mp4`,横版→`outro_16x9.mp4` +3. 兜底:`outro.mp4` + +--- + +## BGM 资源 + +| 文件 | 风格 | 适用场景 | +|------|------|----------| +| `bgm_technology.mp3` | 科技感 | 技术教程、产品介绍 | +| `bgm_epic.mp3` | 热血史诗 | 故事、战斗、励志 | + +使用:`--bgm epic` 或 `--bgm /path/to/bgm.mp3` + +--- + +## 常用音色 + +| 音色 ID | 风格 | +|---------|------| +| zh-CN-YunyangNeural | 男声,新闻播报 | +| zh-CN-YunxiNeural | 男声,阳光活泼 | +| zh-CN-XiaoxiaoNeural | 女声,温暖自然 | +| zh-CN-XiaoyiNeural | 女声,活泼可爱 | + +--- + +## 目录结构 + +``` +video-creator/ +├── SKILL.md +├── scripts/ +│ ├── video_maker.py # 主脚本:图片+音频→视频 +│ ├── tts_generator.py # TTS 语音生成 +│ └── scene_splitter.py # 场景拆分器(可选) +├── assets/ +│ ├── outro.mp4 # 通用片尾(16:9) +│ ├── outro_9x16.mp4 # 竖版片尾 +│ ├── outro_3x4.mp4 # 3:4片尾 +│ ├── bgm_technology.mp3 # 默认BGM +│ └── bgm_epic.mp3 # 热血BGM +└── references/ + └── edge_tts_voices.md +``` + +--- + +## 依赖 + +```bash +# 系统依赖 +brew install ffmpeg # Mac + +# Python 依赖 +pip install edge-tts pyyaml +``` diff --git a/video-creator/assets/.gitkeep b/video-creator/assets/.gitkeep new file mode 100644 index 0000000..e69de29 diff --git a/video-creator/assets/bgm_epic.mp3 b/video-creator/assets/bgm_epic.mp3 new file mode 100644 index 0000000..fca7813 Binary files /dev/null and b/video-creator/assets/bgm_epic.mp3 differ diff --git a/video-creator/assets/bgm_technology.mp3 b/video-creator/assets/bgm_technology.mp3 new file mode 100644 index 0000000..044a977 Binary files /dev/null and b/video-creator/assets/bgm_technology.mp3 differ diff --git a/video-creator/assets/default_config.yaml b/video-creator/assets/default_config.yaml new file mode 100644 index 0000000..193a7b0 --- /dev/null +++ b/video-creator/assets/default_config.yaml @@ -0,0 +1,73 @@ +# 默认配置 - 所有项目继承此配置 +# 项目配置只需写差异部分即可 + +# 视频基础配置 +resolution: [1080, 1920] # 默认竖版 +fps: 30 + +# 语音默认配置(项目可覆盖) +voice: + name: "zh-CN-YunxiNeural" # 默认男声 + rate: 
"+0%" + pitch: "+0Hz" + +# 可选音色参考: +# 男声: zh-CN-YunxiNeural, zh-CN-YunyangNeural, zh-CN-YunjianNeural +# 女声: zh-CN-XiaoxiaoNeural, zh-CN-XiaoyiNeural, zh-CN-XiaohanNeural +# 方言: zh-CN-liaoning-XiaobeiNeural, zh-CN-shaanxi-XiaoniNeural + +# 样式默认配置 +style: + font: "PingFang SC" + font_size: 42 + font_color: "#FFFFFF" + stroke_color: "#000000" + stroke_width: 2 + subtitle_position: "bottom" # bottom/top/center + subtitle_bg: true + subtitle_bg_color: "#000000" + subtitle_bg_opacity: 0.7 + highlight_color: "#FFD700" + +# 动画默认配置 +animation: + ken_burns: true + default_animation: "zoom_in" + animation_intensity: 0.15 # 动画幅度 0.1~0.3 + +# 可选动画效果: +# 缩放: zoom_in, zoom_out +# 平移: pan_left, pan_right, pan_up, pan_down +# 组合: zoom_pan_left, zoom_pan_right, zoom_pan_up, zoom_pan_down + +# 转场默认配置 +transition: + type: "fade" + duration: 0.5 + +# 可选转场效果: +# 基础: fade, dissolve +# 滑动: slide_left, slide_right, slide_up, slide_down +# 特效: zoom_blur, wipe_left, wipe_right + +# 片头默认配置 +intro: + enabled: true + duration: 3 + title_animation: "scale_in" # scale_in/fade_in/typewriter + background_color: "#0d1117" + title_color: "#FFFFFF" + +# 片尾默认配置 +outro: + enabled: true + duration: 4 + logo: "assets/logo.jpg" + logo_animation: "scale_in" + background_color: "#0d1117" + text: "点点关注,学习更多的AI知识" + text_color: "#CCCCCC" + +# 背景音乐(项目可配置) +# background_music: "path/to/music.mp3" +# music_volume: 0.1 # 0.0~1.0 diff --git a/video-creator/assets/example_config.yaml b/video-creator/assets/example_config.yaml new file mode 100644 index 0000000..46972d3 --- /dev/null +++ b/video-creator/assets/example_config.yaml @@ -0,0 +1,73 @@ +# 视频创作器 - 示例配置 +# 使用方法: python scripts/video_generator.py assets/example_config.yaml + +# 视频基础配置 +title: "AI 前沿速递" +output: "output/ai_news.mp4" +resolution: [1920, 1080] # 横版 16:9 + +# 语音配置 +voice: + name: "zh-CN-YunxiNeural" # 男声新闻风 + rate: "+0%" + pitch: "+0Hz" + +# 样式配置 +style: + font: "PingFang SC" + font_size: 48 + font_color: "#FFFFFF" + stroke_color: "#000000" + stroke_width: 2 + subtitle_position: "bottom" + subtitle_bg: true + subtitle_bg_color: "#000000" + subtitle_bg_opacity: 0.6 + +# 动画配置 +animation: + ken_burns: true + default_animation: "zoom_in" + animation_intensity: 0.1 + +# 转场配置 +transition: + type: "fade" + duration: 0.5 + +# 片头配置 +intro: + enabled: true + duration: 3 + title_animation: "fade_up" + background_color: "#1a1a2e" + title_color: "#FFFFFF" + +# 片尾配置 +outro: + enabled: true + duration: 3 + logo: "assets/logo.jpg" # 微信公众号 Logo + logo_animation: "bounce_in" + background_color: "#1a1a2e" + text: "点点关注,学习更多的AI知识" + text_color: "#CCCCCC" + +# 场景列表 +scenes: + - image: "assets/scene_01.png" + text: "今天我们来聊聊 AI Agent 的最新进展" + animation: "zoom_in" + + - image: "assets/scene_02.png" + text: "首先是 OpenAI 发布的 Operator" + animation: "pan_right" + highlight: ["Operator"] + + - image: "assets/scene_03.png" + text: "它能自动操作浏览器完成复杂任务" + animation: "zoom_out" + + - image: "assets/scene_04.png" + text: "这意味着 AI 正在从对话走向行动" + animation: "pan_left" diff --git a/video-creator/assets/logo.jpg b/video-creator/assets/logo.jpg new file mode 100644 index 0000000..da4c0ae Binary files /dev/null and b/video-creator/assets/logo.jpg differ diff --git a/video-creator/assets/media/texts/25a546dbcb230f7f.svg b/video-creator/assets/media/texts/25a546dbcb230f7f.svg new file mode 100644 index 0000000..1e1b5a7 --- /dev/null +++ b/video-creator/assets/media/texts/25a546dbcb230f7f.svg @@ -0,0 +1,54 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + 
diff --git a/video-creator/assets/media/texts/60350aa7fd09283e.svg b/video-creator/assets/media/texts/60350aa7fd09283e.svg new file mode 100644 index 0000000..e4ef69a --- /dev/null +++ b/video-creator/assets/media/texts/60350aa7fd09283e.svg @@ -0,0 +1,54 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/video-creator/assets/media/texts/6bc75dd1367e9f40.svg b/video-creator/assets/media/texts/6bc75dd1367e9f40.svg new file mode 100644 index 0000000..209d0cf --- /dev/null +++ b/video-creator/assets/media/texts/6bc75dd1367e9f40.svg @@ -0,0 +1,54 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/video-creator/assets/media/texts/94ee77f0d4c0bc26.svg b/video-creator/assets/media/texts/94ee77f0d4c0bc26.svg new file mode 100644 index 0000000..f0b672e --- /dev/null +++ b/video-creator/assets/media/texts/94ee77f0d4c0bc26.svg @@ -0,0 +1,54 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/video-creator/assets/media/texts/9fded5d4bc94afcd.svg b/video-creator/assets/media/texts/9fded5d4bc94afcd.svg new file mode 100644 index 0000000..2a57a67 --- /dev/null +++ b/video-creator/assets/media/texts/9fded5d4bc94afcd.svg @@ -0,0 +1,54 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/video-creator/assets/media/texts/a47357e721583a39.svg b/video-creator/assets/media/texts/a47357e721583a39.svg new file mode 100644 index 0000000..b7f70a8 --- /dev/null +++ b/video-creator/assets/media/texts/a47357e721583a39.svg @@ -0,0 +1,54 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation1x1/2933822066_4135568850_2738303351.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation1x1/2933822066_4135568850_2738303351.mp4 new file mode 100644 index 0000000..bf8a05d Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation1x1/2933822066_4135568850_2738303351.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation1x1/2933822066_4277570474_1431002730.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation1x1/2933822066_4277570474_1431002730.mp4 new file mode 100644 index 0000000..7ebfc25 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation1x1/2933822066_4277570474_1431002730.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation1x1/2933822066_887646692_2952798786.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation1x1/2933822066_887646692_2952798786.mp4 new file mode 100644 index 0000000..47a2838 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation1x1/2933822066_887646692_2952798786.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation1x1/498540077_3316948809_223132457.mp4 
b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation1x1/498540077_3316948809_223132457.mp4 new file mode 100644 index 0000000..a8b0aeb Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation1x1/498540077_3316948809_223132457.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation21x9/2933822066_2031271638_2967863776.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation21x9/2933822066_2031271638_2967863776.mp4 new file mode 100644 index 0000000..2468327 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation21x9/2933822066_2031271638_2967863776.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation21x9/2933822066_3179084452_3934815855.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation21x9/2933822066_3179084452_3934815855.mp4 new file mode 100644 index 0000000..337a80e Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation21x9/2933822066_3179084452_3934815855.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation21x9/2933822066_4277570474_2501621613.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation21x9/2933822066_4277570474_2501621613.mp4 new file mode 100644 index 0000000..2e4bdbc Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation21x9/2933822066_4277570474_2501621613.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation21x9/498540077_2060609914_223132457.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation21x9/498540077_2060609914_223132457.mp4 new file mode 100644 index 0000000..c8e4ee6 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation21x9/498540077_2060609914_223132457.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation2x3/2933822066_1666630342_2001822528.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation2x3/2933822066_1666630342_2001822528.mp4 new file mode 100644 index 0000000..229ed13 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation2x3/2933822066_1666630342_2001822528.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation2x3/2933822066_4277570474_1204793078.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation2x3/2933822066_4277570474_1204793078.mp4 new file mode 100644 index 0000000..9215f27 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation2x3/2933822066_4277570474_1204793078.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation2x3/2933822066_7843185_1641679825.mp4 
b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation2x3/2933822066_7843185_1641679825.mp4 new file mode 100644 index 0000000..2dc1a29 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation2x3/2933822066_7843185_1641679825.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation2x3/498540077_2134406039_223132457.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation2x3/498540077_2134406039_223132457.mp4 new file mode 100644 index 0000000..bdafaeb Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation2x3/498540077_2134406039_223132457.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation3x2/2933822066_286234973_900026302.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation3x2/2933822066_286234973_900026302.mp4 new file mode 100644 index 0000000..387d071 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation3x2/2933822066_286234973_900026302.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation3x2/2933822066_4277570474_3976733403.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation3x2/2933822066_4277570474_3976733403.mp4 new file mode 100644 index 0000000..abd5b61 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation3x2/2933822066_4277570474_3976733403.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation3x2/2933822066_857220353_3777748651.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation3x2/2933822066_857220353_3777748651.mp4 new file mode 100644 index 0000000..71bc486 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation3x2/2933822066_857220353_3777748651.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation3x2/498540077_1073456939_223132457.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation3x2/498540077_1073456939_223132457.mp4 new file mode 100644 index 0000000..aef78f9 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation3x2/498540077_1073456939_223132457.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x3/2933822066_1716291363_1647168702.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x3/2933822066_1716291363_1647168702.mp4 new file mode 100644 index 0000000..259eae6 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x3/2933822066_1716291363_1647168702.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x3/2933822066_2630333845_4169855342.mp4 
b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x3/2933822066_2630333845_4169855342.mp4 new file mode 100644 index 0000000..c042a8f Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x3/2933822066_2630333845_4169855342.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x3/2933822066_4277570474_2839993380.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x3/2933822066_4277570474_2839993380.mp4 new file mode 100644 index 0000000..4fe8272 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x3/2933822066_4277570474_2839993380.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x3/498540077_118822790_223132457.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x3/498540077_118822790_223132457.mp4 new file mode 100644 index 0000000..5d136ff Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x3/498540077_118822790_223132457.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x5/2933822066_1557795140_374128077.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x5/2933822066_1557795140_374128077.mp4 new file mode 100644 index 0000000..277170e Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x5/2933822066_1557795140_374128077.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x5/2933822066_4277570474_3514208436.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x5/2933822066_4277570474_3514208436.mp4 new file mode 100644 index 0000000..d097488 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x5/2933822066_4277570474_3514208436.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x5/2933822066_973063406_3560967737.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x5/2933822066_973063406_3560967737.mp4 new file mode 100644 index 0000000..20acf3f Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x5/2933822066_973063406_3560967737.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x5/498540077_2948670807_223132457.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x5/498540077_2948670807_223132457.mp4 new file mode 100644 index 0000000..7cf0f7c Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation4x5/498540077_2948670807_223132457.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation5x4/2933822066_1842399710_1291134229.mp4 
b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation5x4/2933822066_1842399710_1291134229.mp4 new file mode 100644 index 0000000..c769c23 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation5x4/2933822066_1842399710_1291134229.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation5x4/2933822066_2270210369_508999682.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation5x4/2933822066_2270210369_508999682.mp4 new file mode 100644 index 0000000..f4fa0e6 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation5x4/2933822066_2270210369_508999682.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation5x4/2933822066_4277570474_2967280566.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation5x4/2933822066_4277570474_2967280566.mp4 new file mode 100644 index 0000000..4da6a0d Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation5x4/2933822066_4277570474_2967280566.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation5x4/498540077_2122747169_223132457.mp4 b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation5x4/498540077_2122747169_223132457.mp4 new file mode 100644 index 0000000..27565d9 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1080p30/partial_movie_files/OutroAnimation5x4/498540077_2122747169_223132457.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1920p30/outro_9x16.mp4 b/video-creator/assets/media/videos/outro_generator/1920p30/outro_9x16.mp4 new file mode 100644 index 0000000..3ec2e8c Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1920p30/outro_9x16.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1920p30/partial_movie_files/OutroAnimation9x16/1194513873_1624633994_2233391360.mp4 b/video-creator/assets/media/videos/outro_generator/1920p30/partial_movie_files/OutroAnimation9x16/1194513873_1624633994_2233391360.mp4 new file mode 100644 index 0000000..54f6cb4 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1920p30/partial_movie_files/OutroAnimation9x16/1194513873_1624633994_2233391360.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1920p30/partial_movie_files/OutroAnimation9x16/1194513873_3197887656_697976996.mp4 b/video-creator/assets/media/videos/outro_generator/1920p30/partial_movie_files/OutroAnimation9x16/1194513873_3197887656_697976996.mp4 new file mode 100644 index 0000000..584e786 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1920p30/partial_movie_files/OutroAnimation9x16/1194513873_3197887656_697976996.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1920p30/partial_movie_files/OutroAnimation9x16/1194513873_4277570474_1063621336.mp4 b/video-creator/assets/media/videos/outro_generator/1920p30/partial_movie_files/OutroAnimation9x16/1194513873_4277570474_1063621336.mp4 new file mode 100644 index 0000000..54f508b Binary files /dev/null and 
b/video-creator/assets/media/videos/outro_generator/1920p30/partial_movie_files/OutroAnimation9x16/1194513873_4277570474_1063621336.mp4 differ diff --git a/video-creator/assets/media/videos/outro_generator/1920p30/partial_movie_files/OutroAnimation9x16/1469256112_2213673558_223132457.mp4 b/video-creator/assets/media/videos/outro_generator/1920p30/partial_movie_files/OutroAnimation9x16/1469256112_2213673558_223132457.mp4 new file mode 100644 index 0000000..7e7f508 Binary files /dev/null and b/video-creator/assets/media/videos/outro_generator/1920p30/partial_movie_files/OutroAnimation9x16/1469256112_2213673558_223132457.mp4 differ diff --git a/video-creator/assets/outro.mp4 b/video-creator/assets/outro.mp4 new file mode 100644 index 0000000..6814e4c Binary files /dev/null and b/video-creator/assets/outro.mp4 differ diff --git a/video-creator/assets/outro_1x1.mp4 b/video-creator/assets/outro_1x1.mp4 new file mode 100644 index 0000000..aa322e5 Binary files /dev/null and b/video-creator/assets/outro_1x1.mp4 differ diff --git a/video-creator/assets/outro_21x9.mp4 b/video-creator/assets/outro_21x9.mp4 new file mode 100644 index 0000000..c5a8705 Binary files /dev/null and b/video-creator/assets/outro_21x9.mp4 differ diff --git a/video-creator/assets/outro_2x3.mp4 b/video-creator/assets/outro_2x3.mp4 new file mode 100644 index 0000000..510f180 Binary files /dev/null and b/video-creator/assets/outro_2x3.mp4 differ diff --git a/video-creator/assets/outro_3x2.mp4 b/video-creator/assets/outro_3x2.mp4 new file mode 100644 index 0000000..13b3bdb Binary files /dev/null and b/video-creator/assets/outro_3x2.mp4 differ diff --git a/video-creator/assets/outro_3x4.mp4 b/video-creator/assets/outro_3x4.mp4 new file mode 100644 index 0000000..53daa4d Binary files /dev/null and b/video-creator/assets/outro_3x4.mp4 differ diff --git a/video-creator/assets/outro_4x3.mp4 b/video-creator/assets/outro_4x3.mp4 new file mode 100644 index 0000000..f6eae54 Binary files /dev/null and b/video-creator/assets/outro_4x3.mp4 differ diff --git a/video-creator/assets/outro_4x5.mp4 b/video-creator/assets/outro_4x5.mp4 new file mode 100644 index 0000000..c660899 Binary files /dev/null and b/video-creator/assets/outro_4x5.mp4 differ diff --git a/video-creator/assets/outro_5x4.mp4 b/video-creator/assets/outro_5x4.mp4 new file mode 100644 index 0000000..fcce801 Binary files /dev/null and b/video-creator/assets/outro_5x4.mp4 differ diff --git a/video-creator/assets/outro_9x16.mp4 b/video-creator/assets/outro_9x16.mp4 new file mode 100644 index 0000000..d434e76 Binary files /dev/null and b/video-creator/assets/outro_9x16.mp4 differ diff --git a/video-creator/assets/outro_generator.py b/video-creator/assets/outro_generator.py new file mode 100644 index 0000000..51c4ec2 --- /dev/null +++ b/video-creator/assets/outro_generator.py @@ -0,0 +1,260 @@ +#!/usr/bin/env python3 +""" +生成通用片尾动画 +支持三种尺寸:16:9、3:4、9:16 + +用法: + # 16:9 横版(默认) + manim -qh --format=mp4 --fps=30 -o outro.mp4 outro_generator.py OutroAnimation + + # 3:4 竖版 + manim -qh --format=mp4 --fps=30 -o outro_3x4.mp4 outro_generator.py OutroAnimation3x4 + + # 9:16 竖版(手机全屏) + manim -qh --format=mp4 --fps=30 -o outro_9x16.mp4 outro_generator.py OutroAnimation9x16 + +加语音(每种尺寸都要加): + edge-tts --text "点关注,不迷路!" 
--voice zh-CN-YunxiNeural --rate="+10%" --write-media outro_voice.mp3 + ffmpeg -y -i outro.mp4 -i outro_voice.mp3 -filter_complex "[1:a]adelay=1000|1000,apad=whole_dur=5.2[aout]" -map 0:v -map "[aout]" -c:v copy -c:a aac outro_with_voice.mp4 + mv outro_with_voice.mp4 outro.mp4 +""" +from manim import * +import os + +SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__)) +LOGO_PATH = os.path.join(SCRIPT_DIR, "logo.jpg") + + +class OutroAnimation(Scene): + """16:9 横版片尾""" + def construct(self): + self.camera.background_color = "#1a1a2e" + + qr_code = ImageMobject(LOGO_PATH) + qr_code.scale(1.8) + qr_code.shift(UP * 0.8) + + title = Text("点点关注 一起学 AI", font="PingFang SC", font_size=48, color=WHITE) + title.next_to(qr_code, DOWN, buff=0.6) + + self.play(GrowFromCenter(qr_code), run_time=0.8) + self.play(Write(title), run_time=1.0) + self.play(qr_code.animate.scale(1.05), rate_func=there_and_back, run_time=0.4) + self.wait(3) + + +class OutroAnimation3x4(Scene): + """3:4 竖版片尾""" + def construct(self): + config.pixel_width = 1080 + config.pixel_height = 1440 + config.frame_width = 8 + config.frame_height = 10.67 + + self.camera.background_color = "#1a1a2e" + + qr_code = ImageMobject(LOGO_PATH) + qr_code.scale(2.2) + qr_code.shift(UP * 1.5) + + title = Text("点点关注 一起学 AI", font="PingFang SC", font_size=42, color=WHITE) + title.next_to(qr_code, DOWN, buff=0.8) + + self.play(GrowFromCenter(qr_code), run_time=0.8) + self.play(Write(title), run_time=1.0) + self.play(qr_code.animate.scale(1.05), rate_func=there_and_back, run_time=0.4) + self.wait(3) + + +class OutroAnimation9x16(Scene): + """9:16 竖版片尾(手机全屏)""" + def construct(self): + config.pixel_width = 1080 + config.pixel_height = 1920 + config.frame_width = 8 + config.frame_height = 14.22 + + self.camera.background_color = "#1a1a2e" + + qr_code = ImageMobject(LOGO_PATH) + qr_code.scale(2.5) + qr_code.shift(UP * 2) + + title = Text("点点关注 一起学 AI", font="PingFang SC", font_size=40, color=WHITE) + title.next_to(qr_code, DOWN, buff=1.0) + + self.play(GrowFromCenter(qr_code), run_time=0.8) + self.play(Write(title), run_time=1.0) + self.play(qr_code.animate.scale(1.05), rate_func=there_and_back, run_time=0.4) + self.wait(3) + + +class OutroAnimation1x1(Scene): + """1:1 正方形片尾""" + def construct(self): + config.pixel_width = 1024 + config.pixel_height = 1024 + config.frame_width = 8 + config.frame_height = 8 + + self.camera.background_color = "#1a1a2e" + + qr_code = ImageMobject(LOGO_PATH) + qr_code.scale(2.0) + qr_code.shift(UP * 0.5) + + title = Text("点点关注 一起学 AI", font="PingFang SC", font_size=44, color=WHITE) + title.next_to(qr_code, DOWN, buff=0.6) + + self.play(GrowFromCenter(qr_code), run_time=0.8) + self.play(Write(title), run_time=1.0) + self.play(qr_code.animate.scale(1.05), rate_func=there_and_back, run_time=0.4) + self.wait(3) + + +class OutroAnimation2x3(Scene): + """2:3 竖版片尾""" + def construct(self): + config.pixel_width = 832 + config.pixel_height = 1248 + config.frame_width = 8 + config.frame_height = 12 + + self.camera.background_color = "#1a1a2e" + + qr_code = ImageMobject(LOGO_PATH) + qr_code.scale(2.0) + qr_code.shift(UP * 1.2) + + title = Text("点点关注 一起学 AI", font="PingFang SC", font_size=38, color=WHITE) + title.next_to(qr_code, DOWN, buff=0.8) + + self.play(GrowFromCenter(qr_code), run_time=0.8) + self.play(Write(title), run_time=1.0) + self.play(qr_code.animate.scale(1.05), rate_func=there_and_back, run_time=0.4) + self.wait(3) + + +class OutroAnimation3x2(Scene): + """3:2 横版片尾""" + def construct(self): + config.pixel_width = 1248 + 
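        # Editorial note: every ratio class in this file pins the short side of
        # the Manim frame to 8 scene units and scales the long side by the
        # pixel aspect (here 1248/832 * 8 = 12), so titles render at a
        # comparable visual size across all ratios.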
config.pixel_height = 832 + config.frame_width = 12 + config.frame_height = 8 + + self.camera.background_color = "#1a1a2e" + + qr_code = ImageMobject(LOGO_PATH) + qr_code.scale(1.6) + qr_code.shift(UP * 0.6) + + title = Text("点点关注 一起学 AI", font="PingFang SC", font_size=46, color=WHITE) + title.next_to(qr_code, DOWN, buff=0.5) + + self.play(GrowFromCenter(qr_code), run_time=0.8) + self.play(Write(title), run_time=1.0) + self.play(qr_code.animate.scale(1.05), rate_func=there_and_back, run_time=0.4) + self.wait(3) + + +class OutroAnimation4x3(Scene): + """4:3 横版片尾""" + def construct(self): + config.pixel_width = 1440 + config.pixel_height = 1080 + config.frame_width = 10.67 + config.frame_height = 8 + + self.camera.background_color = "#1a1a2e" + + qr_code = ImageMobject(LOGO_PATH) + qr_code.scale(1.8) + qr_code.shift(UP * 0.7) + + title = Text("点点关注 一起学 AI", font="PingFang SC", font_size=46, color=WHITE) + title.next_to(qr_code, DOWN, buff=0.6) + + self.play(GrowFromCenter(qr_code), run_time=0.8) + self.play(Write(title), run_time=1.0) + self.play(qr_code.animate.scale(1.05), rate_func=there_and_back, run_time=0.4) + self.wait(3) + + +class OutroAnimation4x5(Scene): + """4:5 竖版片尾(Instagram)""" + def construct(self): + config.pixel_width = 864 + config.pixel_height = 1080 + config.frame_width = 8 + config.frame_height = 10 + + self.camera.background_color = "#1a1a2e" + + qr_code = ImageMobject(LOGO_PATH) + qr_code.scale(2.0) + qr_code.shift(UP * 1.0) + + title = Text("点点关注 一起学 AI", font="PingFang SC", font_size=36, color=WHITE) + title.next_to(qr_code, DOWN, buff=0.7) + + self.play(GrowFromCenter(qr_code), run_time=0.8) + self.play(Write(title), run_time=1.0) + self.play(qr_code.animate.scale(1.05), rate_func=there_and_back, run_time=0.4) + self.wait(3) + + +class OutroAnimation5x4(Scene): + """5:4 横版片尾""" + def construct(self): + config.pixel_width = 1080 + config.pixel_height = 864 + config.frame_width = 10 + config.frame_height = 8 + + self.camera.background_color = "#1a1a2e" + + qr_code = ImageMobject(LOGO_PATH) + qr_code.scale(1.7) + qr_code.shift(UP * 0.6) + + title = Text("点点关注 一起学 AI", font="PingFang SC", font_size=44, color=WHITE) + title.next_to(qr_code, DOWN, buff=0.5) + + self.play(GrowFromCenter(qr_code), run_time=0.8) + self.play(Write(title), run_time=1.0) + self.play(qr_code.animate.scale(1.05), rate_func=there_and_back, run_time=0.4) + self.wait(3) + + +class OutroAnimation21x9(Scene): + """21:9 超宽屏片尾""" + def construct(self): + config.pixel_width = 1536 + config.pixel_height = 672 + config.frame_width = 18.29 + config.frame_height = 8 + + self.camera.background_color = "#1a1a2e" + + qr_code = ImageMobject(LOGO_PATH) + qr_code.scale(1.3) + qr_code.shift(LEFT * 4) + + title = Text("点点关注 一起学 AI", font="PingFang SC", font_size=52, color=WHITE) + title.shift(RIGHT * 2) + + self.play(GrowFromCenter(qr_code), run_time=0.8) + self.play(Write(title), run_time=1.0) + self.play(qr_code.animate.scale(1.05), rate_func=there_and_back, run_time=0.4) + self.wait(3) + + +if __name__ == "__main__": + print("生成片尾动画...") + print("\n16:9 横版:") + print(" manim -qh --format=mp4 --fps=30 -o outro.mp4 outro_generator.py OutroAnimation") + print("\n3:4 竖版:") + print(" manim -qh --format=mp4 --fps=30 -o outro_3x4.mp4 outro_generator.py OutroAnimation3x4") + print("\n9:16 竖版:") + print(" manim -qh --format=mp4 --fps=30 -o outro_9x16.mp4 outro_generator.py OutroAnimation9x16") diff --git a/video-creator/assets/outro_voice.mp3 b/video-creator/assets/outro_voice.mp3 new file mode 100644 index 
0000000..7e9fb75 Binary files /dev/null and b/video-creator/assets/outro_voice.mp3 differ diff --git a/video-creator/references/edge_tts_voices.md b/video-creator/references/edge_tts_voices.md new file mode 100644 index 0000000..0bae598 --- /dev/null +++ b/video-creator/references/edge_tts_voices.md @@ -0,0 +1,72 @@ +# Edge-TTS 中文音色参考 + +## 常用音色推荐 + +| 音色 ID | 性别 | 风格 | 适用场景 | +|---------|------|------|----------| +| zh-CN-YunxiNeural | 男 | 新闻播报风 | 资讯类、技术分享 | +| zh-CN-YunyangNeural | 男 | 温和亲切 | 教程、讲解 | +| zh-CN-XiaoxiaoNeural | 女 | 活泼自然 | 日常分享、生活类 | +| zh-CN-XiaoyiNeural | 女 | 温柔知性 | 文艺、情感类 | +| zh-CN-YunjianNeural | 男 | 浑厚大气 | 宣传片、正式场合 | +| zh-CN-XiaochenNeural | 女 | 甜美可爱 | 儿童内容、轻松话题 | + +## 全部中文音色列表 + +### 普通话(大陆) +- zh-CN-XiaoxiaoNeural (女) +- zh-CN-XiaoyiNeural (女) +- zh-CN-YunjianNeural (男) +- zh-CN-YunxiNeural (男) +- zh-CN-YunyangNeural (男) +- zh-CN-XiaochenNeural (女) +- zh-CN-XiaohanNeural (女) +- zh-CN-XiaomengNeural (女) +- zh-CN-XiaomoNeural (女) +- zh-CN-XiaoqiuNeural (女) +- zh-CN-XiaoruiNeural (女) +- zh-CN-XiaoshuangNeural (女) +- zh-CN-XiaoxuanNeural (女) +- zh-CN-XiaoyanNeural (女) +- zh-CN-XiaoyouNeural (女) +- zh-CN-XiaozhenNeural (女) +- zh-CN-YunfengNeural (男) +- zh-CN-YunhaoNeural (男) +- zh-CN-YunzeNeural (男) + +### 粤语(香港) +- zh-HK-HiuGaaiNeural (女) +- zh-HK-HiuMaanNeural (女) +- zh-HK-WanLungNeural (男) + +### 台湾 +- zh-TW-HsiaoChenNeural (女) +- zh-TW-HsiaoYuNeural (女) +- zh-TW-YunJheNeural (男) + +## 语速和音调调整 + +### 语速 (rate) +- 加快: `+10%`, `+20%`, `+50%` +- 减慢: `-10%`, `-20%`, `-30%` +- 默认: `+0%` + +### 音调 (pitch) +- 升高: `+5Hz`, `+10Hz`, `+20Hz` +- 降低: `-5Hz`, `-10Hz`, `-20Hz` +- 默认: `+0Hz` + +### 配置示例 + +```yaml +voice: + name: "zh-CN-YunxiNeural" + rate: "+10%" # 稍快一点 + pitch: "-5Hz" # 稍低沉一点 +``` + +## 查看所有可用音色 + +```bash +python scripts/tts_generator.py --list-voices +``` diff --git a/video-creator/scripts/scene_splitter.py b/video-creator/scripts/scene_splitter.py new file mode 100644 index 0000000..15e64ba --- /dev/null +++ b/video-creator/scripts/scene_splitter.py @@ -0,0 +1,221 @@ +#!/usr/bin/env python3 +""" +场景拆分器 - 将口播文本拆分成细镜头 +基于时间戳对齐图片和字幕 +""" + +import json +import re +import argparse +from pathlib import Path +from typing import List, Dict + + +def split_by_sentence_timestamps(timestamps: List[Dict]) -> List[Dict]: + """ + 直接使用 TTS 的 SentenceBoundary 时间戳作为镜头分割 + + Args: + timestamps: TTS 输出的时间戳(包含 sentence 类型) + + Returns: + 每个镜头的信息:text, start, end, duration + """ + shots = [] + for ts in timestamps: + if ts.get("type") == "sentence": + shots.append({ + "text": ts["text"], + "start": ts["start"], + "end": ts["end"], + "duration": round(ts["end"] - ts["start"], 2) + }) + + if not shots and timestamps: + total_start = timestamps[0]["start"] + total_end = timestamps[-1]["end"] + full_text = "".join(ts["text"] for ts in timestamps) + shots.append({ + "text": full_text, + "start": total_start, + "end": total_end, + "duration": round(total_end - total_start, 2) + }) + + return shots + + +def split_long_shots(shots: List[Dict], max_duration: float = 6.0) -> List[Dict]: + """ + 将过长的镜头按标点符号进一步拆分 + + Args: + shots: 镜头列表 + max_duration: 最大镜头时长 + + Returns: + 拆分后的镜头列表 + """ + result = [] + + for shot in shots: + if shot["duration"] <= max_duration: + result.append(shot) + continue + + text = shot["text"] + splits = re.split(r'([,。!?,!?])', text) + + sub_texts = [] + current = "" + for i, part in enumerate(splits): + current += part + if i % 2 == 1 and current.strip(): + sub_texts.append(current.strip()) + current = "" + if current.strip(): + sub_texts.append(current.strip()) + 
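        # Editorial note: re.split with a capturing group alternates text and
        # delimiter, so the odd indices above are the punctuation marks that
        # close a clause. If no punctuation was found, the shot cannot be
        # split any further and is kept whole below.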
+ if len(sub_texts) <= 1: + result.append(shot) + continue + + total_chars = sum(len(t) for t in sub_texts) + current_time = shot["start"] + + for sub_text in sub_texts: + ratio = len(sub_text) / total_chars + sub_duration = shot["duration"] * ratio + result.append({ + "text": sub_text, + "start": round(current_time, 2), + "end": round(current_time + sub_duration, 2), + "duration": round(sub_duration, 2) + }) + current_time += sub_duration + + return result + + +def merge_short_shots(shots: List[Dict], min_duration: float = 2.5) -> List[Dict]: + """合并过短的镜头""" + if not shots: + return shots + + merged = [] + current = shots[0].copy() + + for shot in shots[1:]: + if current["duration"] < min_duration: + current["text"] += shot["text"] + current["end"] = shot["end"] + current["duration"] = round(current["end"] - current["start"], 2) + else: + merged.append(current) + current = shot.copy() + + merged.append(current) + return merged + + +def generate_shot_prompts(shots: List[Dict], style: str, context: str = "") -> List[Dict]: + """ + 为每个镜头生成图片提示词 + + Args: + shots: 镜头列表 + style: 画风描述 + context: 上下文(如角色描述) + + Returns: + 带有图片提示词的镜头列表 + """ + for i, shot in enumerate(shots): + shot["image_prompt"] = f"{style},{context},画面:{shot['text']}。禁止出现任何文字" + shot["index"] = i + 1 + + return shots + + +def generate_srt(shots: List[Dict], output_path: str): + """生成 SRT 字幕文件""" + def format_time(seconds: float) -> str: + hours = int(seconds // 3600) + minutes = int((seconds % 3600) // 60) + secs = int(seconds % 60) + millis = int((seconds % 1) * 1000) + return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}" + + with open(output_path, "w", encoding="utf-8") as f: + for i, shot in enumerate(shots, 1): + f.write(f"{i}\n") + f.write(f"{format_time(shot['start'])} --> {format_time(shot['end'])}\n") + f.write(f"{shot['text']}\n\n") + + print(f" ✓ 字幕: {output_path}") + + +def process_scene(text: str, timestamps_path: str, style: str, context: str = "", output_dir: str = ".") -> Dict: + """ + 处理单个场景,输出镜头配置 + + Args: + text: 场景口播文本 + timestamps_path: TTS 时间戳 JSON 文件 + style: 画风 + context: 上下文 + output_dir: 输出目录 + + Returns: + 场景配置字典 + """ + with open(timestamps_path, "r", encoding="utf-8") as f: + timestamps = json.load(f) + + shots = split_by_sentence_timestamps(timestamps) + + shots = split_long_shots(shots, max_duration=6.0) + + shots = merge_short_shots(shots, min_duration=2.5) + + shots = generate_shot_prompts(shots, style, context) + + output_path = Path(output_dir) + output_path.mkdir(parents=True, exist_ok=True) + + srt_path = output_path / "subtitles.srt" + generate_srt(shots, str(srt_path)) + + config_path = output_path / "shots.json" + with open(config_path, "w", encoding="utf-8") as f: + json.dump(shots, f, ensure_ascii=False, indent=2) + print(f" ✓ 镜头配置: {config_path}") + + return {"shots": shots, "srt_path": str(srt_path)} + + +def main(): + parser = argparse.ArgumentParser(description='场景拆分器') + parser.add_argument('--text', type=str, required=True, help='口播文本') + parser.add_argument('--timestamps', type=str, required=True, help='TTS时间戳JSON文件') + parser.add_argument('--style', type=str, default='', help='画风描述') + parser.add_argument('--context', type=str, default='', help='上下文(角色等)') + parser.add_argument('--output-dir', type=str, default='.', help='输出目录') + + args = parser.parse_args() + + result = process_scene( + text=args.text, + timestamps_path=args.timestamps, + style=args.style, + context=args.context, + output_dir=args.output_dir + ) + + print(f"\n拆分完成,共 {len(result['shots'])} 个镜头:") + for 
shot in result["shots"]: + print(f" [{shot['duration']:.1f}s] {shot['text']}") + + +if __name__ == "__main__": + main() diff --git a/video-creator/scripts/tts_generator.py b/video-creator/scripts/tts_generator.py new file mode 100644 index 0000000..7e78573 --- /dev/null +++ b/video-creator/scripts/tts_generator.py @@ -0,0 +1,123 @@ +#!/usr/bin/env python3 +""" +TTS 语音生成器 - 使用 edge-tts +支持时间戳输出,用于字幕同步和镜头切换 +""" + +import asyncio +import argparse +import os +import json +import yaml +import edge_tts + + +async def generate_tts(text: str, voice: str, output_path: str, rate: str = "+0%", pitch: str = "+0Hz", with_timestamps: bool = False): + """生成单条语音,可选输出时间戳""" + communicate = edge_tts.Communicate(text, voice, rate=rate, pitch=pitch) + + if with_timestamps: + timestamps = [] + audio_chunks = [] + + async for chunk in communicate.stream(): + chunk_type = chunk.get("type", "") + if chunk_type == "audio": + audio_chunks.append(chunk.get("data", b"")) + elif chunk_type == "WordBoundary": + timestamps.append({ + "text": chunk.get("text", ""), + "start": chunk.get("offset", 0) / 10000000, + "end": (chunk.get("offset", 0) + chunk.get("duration", 0)) / 10000000 + }) + elif chunk_type == "SentenceBoundary": + timestamps.append({ + "text": chunk.get("text", ""), + "start": chunk.get("offset", 0) / 10000000, + "end": (chunk.get("offset", 0) + chunk.get("duration", 0)) / 10000000, + "type": "sentence" + }) + + with open(output_path, "wb") as f: + for data in audio_chunks: + f.write(data) + + ts_path = output_path.rsplit(".", 1)[0] + ".json" + with open(ts_path, "w", encoding="utf-8") as f: + json.dump(timestamps, f, ensure_ascii=False, indent=2) + + print(f" ✓ 生成: {output_path} + 时间戳") + return timestamps + else: + await communicate.save(output_path) + print(f" ✓ 生成: {output_path}") + return None + + +async def generate_batch(config_path: str, output_dir: str): + """批量生成语音""" + with open(config_path, 'r', encoding='utf-8') as f: + config = yaml.safe_load(f) + + os.makedirs(output_dir, exist_ok=True) + + voice_config = config.get('voice', {}) + voice_name = voice_config.get('name', 'zh-CN-YunxiNeural') + rate = voice_config.get('rate', '+0%') + pitch = voice_config.get('pitch', '+0Hz') + + scenes = config.get('scenes', []) + tasks = [] + + for i, scene in enumerate(scenes): + text = scene.get('text', '') + if not text: + continue + output_path = os.path.join(output_dir, f"{i:03d}.mp3") + tasks.append(generate_tts(text, voice_name, output_path, rate, pitch)) + + print(f"开始生成 {len(tasks)} 条语音...") + await asyncio.gather(*tasks) + print(f"✓ 完成!语音文件保存在: {output_dir}") + + +async def list_voices(): + """列出所有可用音色""" + voices = await edge_tts.list_voices() + zh_voices = [v for v in voices if v['Locale'].startswith('zh')] + + print("\n中文可用音色:") + print("-" * 60) + for v in zh_voices: + gender = "♂" if v['Gender'] == 'Male' else "♀" + print(f"{gender} {v['ShortName']:<30} {v['Locale']}") + print("-" * 60) + print(f"共 {len(zh_voices)} 个中文音色") + + +def main(): + parser = argparse.ArgumentParser(description='Edge-TTS 语音生成器') + parser.add_argument('--text', type=str, help='要转换的文本') + parser.add_argument('--voice', type=str, default='zh-CN-YunxiNeural', help='音色名称') + parser.add_argument('--rate', type=str, default='+0%', help='语速调整') + parser.add_argument('--pitch', type=str, default='+0Hz', help='音调调整') + parser.add_argument('--output', type=str, help='输出文件路径') + parser.add_argument('--timestamps', action='store_true', help='输出时间戳JSON文件') + parser.add_argument('--config', type=str, help='配置文件路径(批量生成)') + 
parser.add_argument('--output-dir', type=str, default='temp/audio', help='批量输出目录') + parser.add_argument('--list-voices', action='store_true', help='列出可用音色') + + args = parser.parse_args() + + if args.list_voices: + asyncio.run(list_voices()) + elif args.config: + asyncio.run(generate_batch(args.config, args.output_dir)) + elif args.text and args.output: + asyncio.run(generate_tts(args.text, args.voice, args.output, args.rate, args.pitch, args.timestamps)) + else: + parser.print_help() + + +if __name__ == '__main__': + main() diff --git a/video-creator/scripts/video_maker.py b/video-creator/scripts/video_maker.py new file mode 100644 index 0000000..6968399 --- /dev/null +++ b/video-creator/scripts/video_maker.py @@ -0,0 +1,530 @@ +#!/usr/bin/env python3 +""" +视频生成器 - 图片+音频合成视频 +支持:淡入淡出转场、自动拼接片尾、添加BGM + +用法: + python video_maker.py config.yaml + python video_maker.py config.yaml --no-outro # 不加片尾 + python video_maker.py config.yaml --no-bgm # 不加BGM +""" +import argparse +import os +import subprocess +import sys +import yaml +from pathlib import Path + +SCRIPT_DIR = Path(__file__).parent +SKILL_DIR = SCRIPT_DIR.parent +ASSETS_DIR = SKILL_DIR / "assets" +BGM_DEFAULT = ASSETS_DIR / "bgm_technology.mp3" +BGM_EPIC = ASSETS_DIR / "bgm_epic.mp3" + +VALID_ASPECT_RATIOS = [ + "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9" +] + +RATIO_TO_SIZE = { + "1:1": (1024, 1024), + "2:3": (832, 1248), + "3:2": (1248, 832), + "3:4": (1080, 1440), + "4:3": (1440, 1080), + "4:5": (864, 1080), + "5:4": (1080, 864), + "9:16": (1080, 1920), + "16:9": (1920, 1080), + "21:9": (1536, 672), +} + +def get_outro_path(ratio): + """根据比例获取片尾路径,优先精确匹配,否则按方向匹配,最后兜底""" + ratio_file = ASSETS_DIR / f"outro_{ratio.replace(':', 'x')}.mp4" + if ratio_file.exists(): + return ratio_file + + w, h = RATIO_TO_SIZE.get(ratio, (1920, 1080)) + if h > w: + candidates = ["outro_9x16.mp4", "outro_3x4.mp4"] + elif w > h: + candidates = ["outro.mp4", "outro_3x4.mp4"] + else: + candidates = ["outro_1x1.mp4", "outro.mp4"] + + for name in candidates: + fallback = ASSETS_DIR / name + if fallback.exists(): + return fallback + + return ASSETS_DIR / "outro.mp4" + + +def run_cmd(cmd, desc=""): + """执行命令并返回结果""" + if desc: + print(f" {desc}...") + result = subprocess.run(cmd, capture_output=True, text=True) + if result.returncode != 0: + print(f"错误: {result.stderr[-1000:]}") + sys.exit(1) + return result + + +def get_duration(file_path): + """获取音视频时长""" + result = subprocess.run([ + 'ffprobe', '-v', 'error', '-show_entries', 'format=duration', + '-of', 'csv=p=0', str(file_path) + ], capture_output=True, text=True) + return float(result.stdout.strip()) + + +def generate_video_with_transitions(images, durations, output_path, fade_duration=0.5, ratio="16:9"): + """生成带转场的视频""" + print(f"\n[1/4] 生成主视频 ({len(images)}张图片, {fade_duration}秒转场)") + + width, height = RATIO_TO_SIZE.get(ratio, (1920, 1080)) + + display_durations = [] + for i, dur in enumerate(durations): + if i < len(durations) - 1: + display_durations.append(dur + fade_duration) + else: + display_durations.append(dur) + + inputs = [] + for img, dur in zip(images, display_durations): + inputs.extend(['-loop', '1', '-t', str(dur), '-i', str(img)]) + + filter_parts = [] + for i in range(len(images)): + filter_parts.append( + f"[{i}:v]scale={width}:{height}:force_original_aspect_ratio=decrease," + f"pad={width}:{height}:(ow-iw)/2:(oh-ih)/2,setsar=1,fps=30[v{i}];" + ) + + offset = 0 + for i in range(len(images) - 1): + if i == 0: + offset = display_durations[0] - fade_duration + 
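            # Editorial note: xfade starts the cross-fade `offset` seconds into
            # the first input and overlaps the two clips for fade_duration.
            # That overlap is why every image except the last was padded with
            # +fade_duration above: each picture keeps its full scripted time
            # on screen despite the transitions.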
filter_parts.append( + f"[v0][v1]xfade=transition=fade:duration={fade_duration}:offset={offset}[xf1];" + ) + else: + offset += display_durations[i] - fade_duration + filter_parts.append( + f"[xf{i}][v{i+1}]xfade=transition=fade:duration={fade_duration}:offset={offset}[xf{i+1}];" + ) + + last_xf = f"xf{len(images)-1}" + filter_complex = ''.join(filter_parts).rstrip(';') + + cmd = ['ffmpeg', '-y'] + inputs + [ + '-filter_complex', filter_complex, + '-map', f'[{last_xf}]', + '-c:v', 'libx264', '-preset', 'fast', '-crf', '20', '-pix_fmt', 'yuv420p', + str(output_path) + ] + + run_cmd(cmd, f"合成{len(images)}张图片") + print(f" ✓ 主视频: {get_duration(output_path):.1f}秒") + + +def merge_audio(audio_files, output_path): + """合并音频文件""" + print(f"\n[2/4] 合并音频 ({len(audio_files)}个文件)") + + concat_file = output_path.parent / "audio_concat.txt" + with open(concat_file, 'w') as f: + for audio in audio_files: + f.write(f"file '{audio.absolute()}'\n") + + cmd = [ + 'ffmpeg', '-y', '-f', 'concat', '-safe', '0', '-i', str(concat_file), + '-af', 'aresample=44100', '-c:a', 'aac', '-b:a', '192k', str(output_path) + ] + run_cmd(cmd, "合并音频") + concat_file.unlink() + print(f" ✓ 音频: {get_duration(output_path):.1f}秒") + + +def combine_video_audio(video_path, audio_path, output_path): + """合并视频和音频""" + cmd = [ + 'ffmpeg', '-y', '-i', str(video_path), '-i', str(audio_path), + '-c:v', 'copy', '-c:a', 'copy', '-shortest', str(output_path) + ] + run_cmd(cmd, "合并视频音频") + + +def append_outro(video_path, output_path, fade_duration=0.5, ratio="16:9"): + """拼接片尾,自动缩放片尾到主视频分辨率""" + print(f"\n[3/4] 拼接片尾") + + outro_file = get_outro_path(ratio) + if not outro_file.exists(): + print(f" ⚠ 片尾文件不存在: {outro_file}") + return video_path + + width, height = RATIO_TO_SIZE.get(ratio, (1920, 1080)) + + outro_ready = output_path.parent / "outro_ready.mp4" + cmd = [ + 'ffmpeg', '-y', '-i', str(outro_file), + '-vf', f'scale={width}:{height}:force_original_aspect_ratio=decrease,pad={width}:{height}:(ow-iw)/2:(oh-ih)/2,setsar=1', + '-c:v', 'libx264', '-preset', 'fast', '-crf', '20', + '-c:a', 'aac', '-ar', '44100', str(outro_ready) + ] + run_cmd(cmd, "准备片尾") + + video_duration = get_duration(video_path) + fade_start = video_duration - fade_duration + + cmd = [ + 'ffmpeg', '-y', '-i', str(video_path), '-i', str(outro_ready), + '-filter_complex', + f"[0:v]fade=t=out:st={fade_start}:d={fade_duration}[v0];" + f"[1:v]fade=t=in:st=0:d={fade_duration}[v1];" + f"[v0][v1]concat=n=2:v=1:a=0[vout];" + f"[0:a][1:a]concat=n=2:v=0:a=1[aout]", + '-map', '[vout]', '-map', '[aout]', + '-c:v', 'libx264', '-preset', 'fast', '-crf', '20', + '-c:a', 'aac', '-b:a', '192k', str(output_path) + ] + run_cmd(cmd, "拼接片尾") + outro_ready.unlink() + print(f" ✓ 含片尾: {get_duration(output_path):.1f}秒") + return output_path + + +def burn_subtitles(video_path, srt_path, output_path, ratio="16:9"): + """烧录字幕到视频:底部居中固定位置""" + print(f"\n[字幕] 烧录字幕") + + if not Path(srt_path).exists(): + print(f" ⚠ 字幕文件不存在: {srt_path}") + return video_path + + width, height = RATIO_TO_SIZE.get(ratio, (1920, 1080)) + # 字体大小:高度/25,16:9时约43px,9:16时约77px + font_size = max(36, int(height / 25)) + margin_bottom = int(height / 15) + + ass_path = Path(srt_path).with_suffix('.ass') + srt_to_ass(srt_path, ass_path, width, height, font_size, margin_bottom) + + ass_escaped = str(ass_path).replace(":", r"\:").replace("'", r"\'") + + cmd = [ + 'ffmpeg', '-y', '-i', str(video_path), + '-vf', f"ass='{ass_escaped}'", + '-c:v', 'libx264', '-preset', 'fast', '-crf', '20', + '-c:a', 'copy', str(output_path) + ] + run_cmd(cmd, 
"烧录字幕") + print(f" ✓ 含字幕: {get_duration(output_path):.1f}秒") + return output_path + + +def srt_to_ass(srt_path, ass_path, width, height, font_size, margin_bottom): + """将 SRT 转换为 ASS 格式,固定底部居中,自动换行""" + import re + + with open(srt_path, 'r', encoding='utf-8') as f: + srt_content = f.read() + + # 每行字数规则表(按分辨率宽度固定) + CHARS_PER_LINE_MAP = { + 1024: 20, # 1:1 + 832: 14, # 2:3 + 1248: 32, # 3:2 + 1080: 16, # 3:4, 4:5, 5:4, 9:16 (竖版统一16字) + 1440: 28, # 4:3 + 864: 17, # 4:5 + 1920: 38, # 16:9 + 1536: 48, # 21:9 + } + # 查表,找不到则按公式计算 + MAX_CHARS = CHARS_PER_LINE_MAP.get(width) + if MAX_CHARS is None: + # 兜底:按宽度和字体大小估算 + MAX_CHARS = max(12, int(width / (font_size * 1.2))) + + # 标点符号(不能放行首) + PUNCTUATION = ',。、:;?!,.:;?!)】」》\'\"' + + def find_break_point(text, max_pos): + """找到合适的断点位置,优先在空格处断开""" + if max_pos >= len(text): + return len(text) + + # 从max_pos往前找空格断点 + for i in range(max_pos, max(max_pos // 2, 1), -1): + if text[i] == ' ': + return i + # 没找到空格就直接断 + return max_pos + + def wrap_text_2lines(text): + """换行:严格2行,返回单个2行字幕块""" + text = text.strip() + if len(text) <= MAX_CHARS: + return text + r'\N ' + + # 找第一行断点 + break1 = find_break_point(text, MAX_CHARS) + line1 = text[:break1].strip() + line2 = text[break1:].strip() + + # 第二行也限制长度 + if len(line2) > MAX_CHARS: + break2 = find_break_point(line2, MAX_CHARS) + line2 = line2[:break2].strip() + + return line1 + r'\N' + line2 + + def split_long_text(text, start_sec, end_sec): + """长文本拆成多条字幕,每条严格2行,时间均分""" + text = text.strip() + + # 先模拟换行,计算实际需要几条字幕 + blocks = [] + remaining = text + while remaining: + # 第一行 + if len(remaining) <= MAX_CHARS: + blocks.append(remaining) + break + break1 = find_break_point(remaining, MAX_CHARS) + line1 = remaining[:break1].strip() + rest = remaining[break1:].strip() + + # 第二行 + if len(rest) <= MAX_CHARS: + blocks.append(line1 + ' ' + rest) + break + break2 = find_break_point(rest, MAX_CHARS) + line2 = rest[:break2].strip() + blocks.append(line1 + ' ' + line2) + remaining = rest[break2:].strip() + + # 时间均分 + duration = end_sec - start_sec + time_per_block = duration / len(blocks) + + result = [] + for i, block in enumerate(blocks): + block_start = start_sec + i * time_per_block + block_end = start_sec + (i + 1) * time_per_block + result.append((block, block_start, block_end)) + + return result + + ass_header = f"""[Script Info] +Title: Subtitles +ScriptType: v4.00+ +PlayResX: {width} +PlayResY: {height} +WrapStyle: 0 + +[V4+ Styles] +Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding +Style: Default,PingFang SC,{font_size},&H00FFFFFF,&H000000FF,&H00000000,&H80000000,0,0,0,0,100,100,0,0,1,2,1,2,10,10,{margin_bottom},1 + +[Events] +Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text +""" + + def sec_to_ass_time(sec): + """秒数转ASS时间格式""" + h = int(sec // 3600) + m = int((sec % 3600) // 60) + s = int(sec % 60) + cs = int((sec % 1) * 100) + return f"{h}:{m:02d}:{s:02d}.{cs:02d}" + + events = [] + blocks = re.split(r'\n\n+', srt_content.strip()) + + for block in blocks: + lines = block.strip().split('\n') + if len(lines) >= 3: + time_line = lines[1] + text = ' '.join(lines[2:]).replace('\n', ' ') + # 标点符号替换为空格,便于换行分割 + text = re.sub(r'[,。、:;?!,.:;?!""''「」『』【】()()《》]', ' ', text) + # 合并多个空格为一个 + text = re.sub(r'\s+', ' ', text).strip() + + match = re.match(r'(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> 
(\d{2}):(\d{2}):(\d{2}),(\d{3})', time_line) + if match: + sh, sm, ss, sms = match.groups()[:4] + eh, em, es, ems = match.groups()[4:] + start_sec = int(sh) * 3600 + int(sm) * 60 + int(ss) + int(sms) / 1000 + end_sec = int(eh) * 3600 + int(em) * 60 + int(es) + int(ems) / 1000 + + # 长文本拆成多条字幕 + sub_blocks = split_long_text(text, start_sec, end_sec) + for sub_text, sub_start, sub_end in sub_blocks: + formatted_text = wrap_text_2lines(sub_text) + start = sec_to_ass_time(sub_start) + end = sec_to_ass_time(sub_end) + events.append(f"Dialogue: 0,{start},{end},Default,,0,0,0,,{formatted_text}") + + with open(ass_path, 'w', encoding='utf-8') as f: + f.write(ass_header + '\n'.join(events)) + + +def add_bgm(video_path, output_path, volume=0.08, bgm_path=None): + """添加背景音乐""" + print(f"\n[4/4] 添加BGM") + + if bgm_path is None: + bgm_path = BGM_DEFAULT + bgm_path = Path(bgm_path) + + if not bgm_path.exists(): + print(f" ⚠ BGM文件不存在: {bgm_path}") + return video_path + + cmd = [ + 'ffmpeg', '-y', '-i', str(video_path), + '-stream_loop', '-1', '-i', str(bgm_path), + '-filter_complex', + f"[1:a]volume={volume}[bgm];[0:a][bgm]amix=inputs=2:duration=first[aout]", + '-map', '0:v', '-map', '[aout]', + '-c:v', 'copy', '-c:a', 'aac', '-b:a', '192k', str(output_path) + ] + run_cmd(cmd, "添加BGM") + print(f" ✓ 最终视频: {get_duration(output_path):.1f}秒") + return output_path + + +def main(): + parser = argparse.ArgumentParser(description='视频生成器') + parser.add_argument('config', help='配置文件路径 (YAML)') + parser.add_argument('--no-outro', action='store_true', help='不添加片尾') + parser.add_argument('--no-bgm', action='store_true', help='不添加BGM') + parser.add_argument('--fade', type=float, default=0.5, help='转场时长(秒)') + parser.add_argument('--bgm-volume', type=float, default=0.08, help='BGM音量') + parser.add_argument('--bgm', type=str, default=None, help='自定义BGM路径,可选: epic') + parser.add_argument('--ratio', type=str, default='16:9', + help=f'视频比例,支持: {", ".join(VALID_ASPECT_RATIOS)}') + parser.add_argument('--srt', type=str, default=None, help='字幕文件路径(SRT格式)') + args = parser.parse_args() + + config_path = Path(args.config) + if not config_path.exists(): + print(f"配置文件不存在: {config_path}") + sys.exit(1) + + with open(config_path) as f: + config = yaml.safe_load(f) + + work_dir = config_path.parent + output_path = work_dir / config.get('output', 'output.mp4') + + if args.ratio == '16:9' and 'ratio' in config: + args.ratio = config['ratio'] + + if 'bgm_volume' in config and args.bgm_volume == 0.08: + args.bgm_volume = config['bgm_volume'] + + if args.ratio not in VALID_ASPECT_RATIOS: + print(f"错误: 不支持的比例 '{args.ratio}'") + print(f"支持的比例: {', '.join(VALID_ASPECT_RATIOS)}") + sys.exit(1) + + scenes = config.get('scenes', []) + if not scenes: + print("配置文件中没有 scenes") + sys.exit(1) + + images = [] + durations = [] + audio_files = [] + + for scene in scenes: + audio = work_dir / scene['audio'] + if not audio.exists(): + print(f"音频不存在: {audio}") + sys.exit(1) + audio_files.append(audio) + + if 'images' in scene: + for img_cfg in scene['images']: + img = work_dir / img_cfg['file'] + if not img.exists(): + print(f"图片不存在: {img}") + sys.exit(1) + images.append(img) + durations.append(img_cfg['duration']) + else: + img = work_dir / scene['image'] + if not img.exists(): + print(f"图片不存在: {img}") + sys.exit(1) + images.append(img) + durations.append(get_duration(audio)) + + total_audio_duration = sum(get_duration(af) for af in audio_files) + total_image_duration = sum(durations) + + if total_image_duration < total_audio_duration: + gap = 
total_audio_duration - total_image_duration + 0.5 + durations[-1] += gap + print(f"\n⚠ 图片时长({total_image_duration:.1f}s) < 音频时长({total_audio_duration:.1f}s)") + print(f" 自动拉伸最后一张图片 +{gap:.1f}s") + + print(f"\n{'='*50}") + print(f"视频生成器") + print(f"{'='*50}") + print(f"场景数: {len(scenes)}") + print(f"音频时长: {total_audio_duration:.1f}秒") + print(f"视频时长: {sum(durations):.1f}秒") + print(f"转场: {args.fade}秒 淡入淡出") + print(f"片尾: {'是' if not args.no_outro else '否'}") + print(f"BGM: {'是' if not args.no_bgm else '否'}") + + temp_dir = work_dir / "temp" + temp_dir.mkdir(exist_ok=True) + + video_only = temp_dir / "video_only.mp4" + generate_video_with_transitions(images, durations, video_only, args.fade, args.ratio) + + audio_merged = temp_dir / "audio_merged.m4a" + merge_audio(audio_files, audio_merged) + + video_with_audio = temp_dir / "video_with_audio.mp4" + combine_video_audio(video_only, audio_merged, video_with_audio) + + current_video = video_with_audio + + if args.srt: + srt_path = work_dir / args.srt if not Path(args.srt).is_absolute() else Path(args.srt) + video_with_subs = temp_dir / "video_with_subs.mp4" + current_video = burn_subtitles(current_video, srt_path, video_with_subs, args.ratio) + + if not args.no_outro: + video_with_outro = temp_dir / "video_with_outro.mp4" + current_video = append_outro(current_video, video_with_outro, args.fade, args.ratio) + + if not args.no_bgm: + bgm_path = None + if args.bgm: + if args.bgm == 'epic': + bgm_path = BGM_EPIC + else: + bgm_path = Path(args.bgm) + add_bgm(current_video, output_path, args.bgm_volume, bgm_path) + else: + subprocess.run(['cp', str(current_video), str(output_path)]) + + print(f"\n{'='*50}") + print(f"✅ 完成: {output_path}") + print(f"{'='*50}\n") + + +if __name__ == "__main__": + main() diff --git a/videocut-clip-oral/README.md b/videocut-clip-oral/README.md new file mode 100644 index 0000000..8a398f3 --- /dev/null +++ b/videocut-clip-oral/README.md @@ -0,0 +1,30 @@ + + +# videocut:剪口播 + +> 口播视频剪辑助手 + +## 文件清单 + +| 文件 | 功能 | +|------|------| +| `SKILL.md` | 入口:流程、触发词、核心规则 | +| `tips/口误识别方法论.md` | 核心:口误识别 + 时间戳对齐 | +| `tips/转录最佳实践.md` | 辅助:FunASR 转录参数 | + +## 输入输出 + +``` +输入: 01-xxx-v1.mp4 (原始视频) + ↓ +输出: 01-xxx-v1_transcript.json 转录(含时间戳) + 01-xxx-v1_审查稿.md 审查稿(时间戳驱动) +``` + +## 核心原则 + +1. **时间戳驱动**:删除任务直接标注 `(start-end)` +2. **逐token分析**:口误需逐token查时间戳 +3. **删前面保后面**:重复时删前面版本 diff --git a/videocut-clip-oral/SKILL.md b/videocut-clip-oral/SKILL.md new file mode 100644 index 0000000..e0fd817 --- /dev/null +++ b/videocut-clip-oral/SKILL.md @@ -0,0 +1,131 @@ +--- +name: videocut-clip-oral +description: 口播视频转录和口误识别。生成审查稿和删除任务清单。触发词:剪口播、处理视频、识别口误 +metadata: + version: "1.0.0" + alias: "videocut:剪口播" +--- + + + +# 剪口播 + +> 转录 + 口误/静音识别 → 生成审查稿 + +## 快速使用 + +``` +用户: 帮我剪这个口播视频 +用户: 处理一下这个视频 +``` + +## 流程 + +``` +1. FunASR 30s 分段转录(字符级时间戳) + ↓ +2. 识别口误(逐句检查) + ↓ +3. 识别微口误(VAD 检测短片段) + ↓ +4. 识别语气词(嗯/哎/诶 等) + ↓ +5. 识别静音(≥1s) + ↓ +6. 生成审查稿(时间戳驱动) + ↓ +7. 输出删除任务 TodoList + ↓ +【等待用户确认】→ 用户确认后,执行 /videocut:剪辑 +``` + +### ⚠️ 为什么用 30s 分段 + +FunASR 长视频有时间戳漂移,30s 分段可避免。 + +## 进度 TodoList + +启动时创建: + +``` +- [ ] 读取「转录最佳实践」→ 转录视频 +- [ ] 读取「口误识别方法论」→ 识别口误 +- [ ] VAD 检测微口误(短片段 < 0.5s) +- [ ] 扫描语气词(嗯/哎/诶 等) +- [ ] 识别静音(≥1s) +- [ ] 生成审查稿 +- [ ] 输出删除任务清单 +``` + +### ⚠️ 必须先读方法论再执行 + +| 阶段 | 先读 | 再执行 | +|------|------|--------| +| 转录 | `tips/转录最佳实践.md` | 调用ASR | +| 识别口误 | `tips/口误识别方法论.md` | 逐句分析 | + +--- + +## 核心:时间戳驱动 + +### 删除任务格式 + +每项**必须标注精确时间戳** `(start-end)`: + +``` +口误(N处): +- [ ] 1. `(start-end)` 删"错误文本" → 保留"正确文本" + +语气词(N处): +- [ ] 1. 
`(前字end-后字start)` 删"嗯" 上下文: XX【嗯】YY + +静音(N处): +- [ ] 1. `(start-end)` 静音Xs +``` + +### 口误类型 + +| 类型 | 示例 | 删除策略 | +|------|------|----------| +| 重复型 | `拉满新拉满` | 只删差异("新") | +| 替换型 | `AI就是AI就会` | 删第一个完整版本("AI就是") | +| 卡顿型 | `听会会` | 删第一个重复字 | + +### ⚠️ 关键规则 + +1. **时间戳驱动**:审查稿直接标注时间戳,剪辑不再搜索文本 +2. **逐token分析**:对于"删前面保后面"的口误,必须逐token查时间戳 +3. **检查时间跨度**:如果口误时间跨度 > 2秒,必有静音,需拆分 + +--- + +## 输出文件 + +``` +01-xxx-v1_transcript.json # 转录结果(含时间戳) +01-xxx-v1_审查稿.md # 口误审查稿 +``` + +### 展示要求 + +生成审查稿后,**必须展示给用户**: +1. 写入文件 `01-xxx-v1_审查稿.md` +2. 读取并展示内容 +3. 等待用户确认要删除哪些项目 + +--- + +## 方法论 + +详见 `tips/口误识别方法论.md`: +- 口误识别方法(逐句检查) +- "删前面保后面"的精确处理 +- FunASR 时间戳对齐规则 diff --git a/videocut-clip-oral/tips/口误识别方法论.md b/videocut-clip-oral/tips/口误识别方法论.md new file mode 100644 index 0000000..8f70128 --- /dev/null +++ b/videocut-clip-oral/tips/口误识别方法论.md @@ -0,0 +1,270 @@ + + +# 口误识别方法论 + +## 一、识别方法 + +### ❌ 不要用正则匹配 + +### ✅ 逐段阅读,逐句检查 + +每句话问自己: +1. **句子完整吗?** - 残句 = 口误 +2. **有重复吗?** - 词语/短语/句子重复 +3. **通顺吗?** - 不通顺 = 说错了 + +### 口误特征 + +| 特征 | 判断 | +|------|------| +| 残句(没说完) | 一定是口误 | +| 同一内容说两遍 | 删前面保后面 | +| 语义不通顺 | 说错了重说 | +| 多余的字词 | 卡顿/口吃 | + +--- + +## 二、口误类型与删除策略 + +| 类型 | 示例 | 删除策略 | +|------|------|----------| +| 重复型 | `拉满新拉满` | 只删差异部分("新") | +| 替换型 | `AI就是AI就会` | 删第一个完整版本("AI就是") | +| 卡顿型 | `听会会` | 删第一个重复字 | + +### ⚠️ 精确处理步骤 + +当遇到 `🔴~~错误版本~~正确版本` 时: + +**步骤1**:识别口误类型(重复/替换/卡顿) + +**步骤2**:逐token查时间戳 +```python +for i in range(start_idx, end_idx): + gap = timestamps[i][0] - timestamps[i-1][1] if i > 0 else 0 + print(f"[{i}] {tokens[i]} @ {timestamps[i][0]/1000:.2f}s" + + (f" ⚠️ gap={gap/1000:.1f}s" if gap > 500 else "")) +``` + +**步骤3**:确定精确删除边界 +- 时间跨度 > 2秒 → 必有静音,需拆分 +- 替换型:删第一个完整版本 +- 重复型:只删差异token + +### 案例库 + +| 口误 | 错误处理 | 正确处理 | +|------|----------|----------| +| `拉满新拉满` | 删40.84-47.69s | 删41.35-47.69s(静音+新) | +| `AI就是AI就会` | 只删"就是" | 删"AI就是"(85.78-86.50s) | +| `听会会` | 删整个"听会会" | 只删第一个"会" | + +--- + +## 三、微口误检测(2026-01-15 新增) + +### 什么是微口误 + +不成词的卡顿音,转录检测不到,但听起来不舒服。 + +| 类型 | 示例 | 特点 | +|------|------|------| +| 起音卡顿 | "呃...你以为" | 开头有杂音 | +| 吞字 | "钉钉A...钉钉AI" | 说到一半重来 | +| 气口声 | 吸气/呼气声 | 换气太重 | + +### VAD 检测方法 + +用 FunASR 的 VAD(语音活动检测)找出所有发声片段: + +```python +from funasr import AutoModel +import subprocess + +# 提取音频 +subprocess.run(['ffmpeg', '-y', '-i', 'video.mp4', '-ss', '0', '-t', '5', + '-vn', '-acodec', 'pcm_s16le', '-ar', '16000', '-ac', '1', + '/tmp/check.wav'], capture_output=True) + +# VAD 检测 +vad_model = AutoModel(model="fsmn-vad", disable_update=True) +result = vad_model.generate(input="/tmp/check.wav") + +for item in result: + if 'value' in item: + for start_ms, end_ms in item['value']: + print(f"[{start_ms/1000:.3f}s - {end_ms/1000:.3f}s] 语音活动") +``` + +### 输出示例 + +``` +[0.000s - 0.430s] 语音活动 ← 微口误!(0.71s才是正式语音) +[0.710s - 4.980s] 语音活动 ← 正式语音 +``` + +### 判断规则 + +| 模式 | 判断 | +|------|------| +| 短语音 + 长静音 + 长语音 | 第一个短语音是微口误 | +| 语音时长 < 0.5s 且后面有静音 | 可能是微口误 | + +--- + +## 三.5、语气词检测(2026-01-15 新增) + +### 什么是语气词 + +转录能识别,但属于多余的过渡音,删除后更流畅。 + +| 类型 | 示例 | 特点 | +|------|------|------| +| 过渡词 | "嗯"、"啊" | 句子之间的填充 | +| 感叹词 | "哎"、"诶"、"唉" | 无意义感叹 | +| 犹豫音 | "呃"、"额" | 思考时发出 | + +### 检测方法 + +```python +# 语气词列表 +filler_words = ['嗯', '啊', '哎', '诶', '呃', '额', '唉', '哦', '噢', '呀', '欸'] + +# 扫描转录结果 +for i, item in enumerate(all_chars): + if item['char'] in filler_words: + # 获取上下文 + prev_char = all_chars[i-1] if i > 0 else None + next_char = all_chars[i+1] if i < len(all_chars)-1 else None + + print(f"[{item['start']:.2f}s] \"{item['char']}\"") + print(f" 上下文: 
{prev_char['char'] if prev_char else ''}【{item['char']}】{next_char['char'] if next_char else ''}")  # 首尾字符的 prev/next 可能为 None,兜底为空串
+```
+
+### ⚠️ 删除边界要精确
+
+```
+错误:删语气词的时间戳 (语气词.start - 语气词.end)
+  → 可能删掉前面字的尾音
+
+正确:从前一个字的 end 到后一个字的 start
+  → (前字.end - 后字.start) 包含静音+语气词
+```
+
+### 案例模式
+
+| 上下文 | 错误删除 | 正确删除 |
+|--------|----------|----------|
+| "A [静音] 语气词 B" | 只删语气词时间戳 | 删 "A.end - B.start" |
+| "A 语气词 B" | 删语气词时间戳 | 删 "A.end - B.start" |
+
+---
+
+## 四、静音段
+
+### 阈值规则
+
+| 静音长度 | 处理 |
+|---------|------|
+| < 1秒 | **忽略** - 自然停顿 |
+| ≥ 1秒 | 建议删除 |
+| 开头 > 1秒 | 建议删除 |
+
+### 识别方法
+
+```python
+# 开头静音
+if timestamps[0][0] > 1000:
+    print(f"开头静音 {timestamps[0][0]/1000:.1f}s")
+
+# 句间静音
+for i in range(len(timestamps) - 1):
+    gap = timestamps[i+1][0] - timestamps[i][1]
+    if gap >= 1000:
+        print(f"静音 {gap/1000:.1f}s @ {timestamps[i][1]/1000:.1f}s")
+```
+
+---
+
+## 五、审查稿格式(时间戳驱动)
+
+```markdown
+# 口误审查稿
+
+视频:01-xxx-v1.mp4
+
+---
+
+【正文】
+
+⏸️[1.36s @ 0.00-1.36]
+
+🔴~~你以为今天a~~(1.36-2.54)你以为钉钉AI录音卡...
+
+---
+
+【删除任务清单】
+
+**口误(3处)**:
+- [ ] 1. `(1.36-2.54)` 删"你以为今天a" → 保留"你以为钉钉"
+- [ ] 2. `(47.55-47.69)` 删"新" → 保留"拉满"
+- [ ] 3. `(85.78-86.50)` 删"AI就是" → 保留"AI就会"
+
+**静音(2处)**:
+- [ ] 1. `(41.35-47.55)` 静音6.2s
+- [ ] 2. `(0.00-1.36)` 开头静音
+```
+
+**关键**:
+- 删除项用 `(start-end)` 格式
+- 剪辑脚本直接用时间戳,不搜索文本
+
+---
+
+## 六、FunASR 时间戳对齐
+
+### Token规则
+
+| 元素 | 时间戳 |
+|------|--------|
+| 中文字符 | 每字1个 |
+| 英文单词 | 整词1个(`agent` 算1个) |
+| 标点/空格 | 无时间戳 |
+
+### Token化函数
+
+```python
+def tokenize(text):
+    tokens = []
+    i = 0
+    while i < len(text):
+        char = text[i]
+        if '\u4e00' <= char <= '\u9fff':  # 中文
+            tokens.append(char)
+            i += 1
+        elif char.isascii() and char.isalpha():  # 英文
+            word = ''
+            while i < len(text) and text[i].isascii() and text[i].isalpha():
+                word += text[i]
+                i += 1
+            tokens.append(word)
+        else:
+            i += 1
+    return tokens
+```
+
+---
+
+## 七、验证清单
+
+- [ ] 删的是前面版本?
+- [ ] 保留的文本完整通顺?
+- [ ] 删除后不会产生新的重复?
+- [ ] 时间戳精确到小数点后两位?
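+
+配合上面的清单,可以先对删除区间做一次机械校验(示意草稿:假设区间已从审查稿解析为 `(start, end)` 秒数元组,`validate_deletions` 为示意函数名):
+
+```python
+def validate_deletions(spans, duration):
+    """校验删除区间:越界/颠倒、两位小数、重叠(示意实现)"""
+    issues = []
+    spans = sorted(spans)
+    for i, (start, end) in enumerate(spans, 1):
+        if not (0 <= start < end <= duration):
+            issues.append(f"区间{i} ({start}-{end}) 越界或起止颠倒")
+        if round(start, 2) != start or round(end, 2) != end:
+            issues.append(f"区间{i} 时间戳超过两位小数")
+        if i > 1 and start < spans[i-2][1]:
+            issues.append(f"区间{i} 与上一区间重叠,需先合并")
+    return issues
+
+# 用法:validate_deletions([(1.36, 2.54), (41.35, 47.55)], 217.97) → 期望返回 []
+```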
diff --git a/videocut-clip-oral/tips/转录最佳实践.md b/videocut-clip-oral/tips/转录最佳实践.md
new file mode 100644
index 0000000..3302610
--- /dev/null
+++ b/videocut-clip-oral/tips/转录最佳实践.md
@@ -0,0 +1,348 @@
+
+
+# 转录最佳实践
+
+---
+
+## 零、环境准备
+
+### 0.1 安装依赖
+
+```bash
+pip install funasr
+pip install modelscope  # 模型下载
+```
+
+### 0.2 模型下载
+
+首次运行会自动下载模型到 `~/.cache/modelscope/`(约 2GB):
+
+| 模型 | 大小 | 用途 |
+|------|------|------|
+| paraformer-zh | 953MB | 语音识别(带时间戳) |
+| punc_ct | 1.1GB | 标点预测 |
+| fsmn-vad | 4MB | 语音活动检测 |
+
+**手动预下载**(可选,避免首次运行等待):
+
+```python
+from funasr import AutoModel
+
+# 运行一次即可触发下载
+model = AutoModel(
+    model="paraformer-zh",
+    vad_model="fsmn-vad",
+    punc_model="ct-punc",
+)
+print("模型下载完成")
+```
+
+### 0.3 验证安装
+
+```python
+from funasr import AutoModel
+model = AutoModel(model="paraformer-zh", disable_update=True)
+result = model.generate(input="test.wav")
+print(result)  # 应该输出转录结果
+```
+
+---
+
+## 一、技术选型
+
+### FunASR Paraformer
+
+阿里开源,中文识别效果领先,支持字符级时间戳。
+
+### ⚠️ 关键发现(2026-01-15)
+
+| 方案 | 问题 |
+|------|------|
+| FunASR 全视频 | 长视频时间戳漂移(~10s/3分钟)→ 剪辑不准 |
+| **FunASR 30s分段** | ✅ 无漂移 + 精确时间戳 |
+
+**结论**:口播剪辑用 **FunASR 30s 分段转录**
+
+---
+
+## 二、音频预处理
+
+### 2.1 从视频提取音频
+
+```bash
+# 提取 16kHz 单声道 16-bit PCM 音频(bash 行尾注释不能跟在续行符后,故写成单行;参数含义见下表)
+ffmpeg -i video.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 output.wav
+```
+
+### 2.2 参数说明
+
+| 参数 | 值 | 原因 |
+|------|-----|------|
+| 采样率 | 16000 Hz | FunASR 模型训练采样率 |
+| 声道 | 单声道 | 语音识别不需要立体声 |
+| 格式 | WAV | 无损,兼容性好 |
+| 位深 | 16-bit | 足够,文件更小 |
+
+---
+
+## 三、FunASR 使用
+
+### 3.1 ⭐ 推荐:30s 分段转录(口播剪辑用这个)
+
+```python
+from funasr import AutoModel
+import math
+import os
+import subprocess
+
+video = "video.mp4"
+segment_len = 30     # 30秒一段
+duration = 217.97    # 视频时长(用 ffprobe 获取)
+
+model = AutoModel(model="paraformer-zh", disable_update=True)
+all_chars = []
+
+num_segments = math.ceil(duration / segment_len)  # 向上取整,避免时长恰为整数倍时多出空段
+for i in range(num_segments):
+    start = i * segment_len
+    dur = min(segment_len, duration - start)
+    wav = f'/tmp/seg_{i}.wav'
+
+    # 提取音频段
+    subprocess.run(['ffmpeg', '-y', '-i', video, '-ss', str(start), '-t', str(dur),
+                    '-vn', '-acodec', 'pcm_s16le', '-ar', '16000', '-ac', '1', wav],
+                   capture_output=True)
+
+    # FunASR 转录(字符级时间戳)
+    result = model.generate(input=wav, return_raw_text=True,
+                            timestamp_granularity="character")
+
+    for item in result:
+        if 'timestamp' in item and 'text' in item:
+            text = item['text'].replace(' ', '')
+            for char, ts in zip(text, item['timestamp']):
+                all_chars.append({
+                    'char': char,
+                    'start': round(start + ts[0] / 1000, 2),  # 加偏移!
+                    'end': round(start + ts[1] / 1000, 2)
+                })
+    os.remove(wav)
+```
+
+**关键点**:
+- 30s 分段避免时间戳漂移
+- `timestamp_granularity="character"` 获取字符级时间戳
+- 每段结果要 **加上段起始偏移**
+
+### 3.2 基础用法(短视频可用)
+
+```python
+from funasr import AutoModel
+
+model = AutoModel(
+    model="paraformer-zh",   # 中文模型
+    vad_model="fsmn-vad",    # 语音活动检测
+    punc_model="ct-punc",    # 标点预测
+)
+
+result = model.generate(
+    input="audio.wav",
+    batch_size_s=300,  # 批处理时长(秒)
+)
+```
+
+### 3.3 输出格式
+
+```json
+[{
+    "key": "audio",
+    "text": "大家好,我是陈峰。",
+    "timestamp": [
+        [880, 1120],   // 第1个字的时间范围 (ms)
+        [1120, 1360],  // 第2个字
+        ...
+    ]
+}]
+```
+
+### 3.4 模型说明
+
+| 模型 | 用途 |
+|------|------|
+| `paraformer-zh` | 中文语音识别主模型 |
+| `fsmn-vad` | 检测哪里有人说话 |
+| `ct-punc` | 自动添加标点符号 |
+
+---
+
+## 四、输出格式设计
+
+### 4.1 详细 JSON 格式
+
+```json
+{
+    "audio_file": "/path/to/audio.wav",
+    "full_text": "完整转录文本...",
+    "duration_ms": 935455,
+    "segments": [
+        {
+            "char": "大",
+            "start_ms": 880,
+            "end_ms": 1120
+        },
+        ...
+    ],
+    "raw_result": { /* FunASR 原始输出 */ }
+}
+```
+
+### 4.2 可读 TXT 格式
+
+```
+======================================
+视频转录结果 - video.mp4
+======================================
+
+总时长: 15:35 (15分35秒)
+字符数: 2006
+
+======================================
+完整文本
+======================================
+
+大家好,我是陈峰。一直有同学问我...
+
+======================================
+带时间戳的句子记录
+======================================
+
+[00:01 - 00:02]
+大家好,我是陈峰。
+
+[00:05 - 00:17]
+一直有同学问我能不能做一期企业级PPT模板的教程?
+```
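+
+### 4.3 落盘草稿
+
+一个把 3.1 的 `all_chars` 写成 4.1 详细 JSON 的最小草稿(`save_transcript` 为示意函数名;注意 `all_chars` 里存的是秒,4.1 用毫秒,需换算):
+
+```python
+import json
+
+def save_transcript(all_chars, audio_file, raw_result, out_path):
+    """把字符级时间戳结果序列化为 4.1 的详细 JSON(示意,假设 all_chars 非空)"""
+    data = {
+        "audio_file": audio_file,
+        "full_text": ''.join(c['char'] for c in all_chars),
+        "duration_ms": int(all_chars[-1]['end'] * 1000),  # 秒 → 毫秒
+        "segments": [{"char": c['char'],
+                      "start_ms": int(c['start'] * 1000),
+                      "end_ms": int(c['end'] * 1000)} for c in all_chars],
+        "raw_result": raw_result,
+    }
+    with open(out_path, 'w', encoding='utf-8') as f:
+        json.dump(data, f, ensure_ascii=False, indent=2)
+```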
+
+---
+
+## 五、常见问题
+
+### Q0: 调用方式错误(2026-01-13)
+
+**错误**:尝试用命令行 `funasr --input video.mp4` 调用
+**正确**:只能用 Python API
+
+```python
+# ❌ 错误 - 没有 funasr CLI
+subprocess.run(['funasr', '--input', 'video.mp4'])
+
+# ✅ 正确 - 用 Python API
+from funasr import AutoModel
+model = AutoModel(model="paraformer-zh", ...)
+result = model.generate(input="video.mp4")
+```
+
+### Q0.5: 模型选错没有时间戳(2026-01-13)
+
+**错误**:用 `SenseVoiceSmall` 模型,只输出文本没有时间戳
+**正确**:必须用 `paraformer-zh` 才有字符级时间戳
+
+```python
+# ❌ 错误 - 没有时间戳
+model = AutoModel(model="iic/SenseVoiceSmall", ...)
+
+# ✅ 正确 - 有时间戳
+model = AutoModel(
+    model="paraformer-zh",  # 这个才有时间戳!
+    vad_model="fsmn-vad",
+    punc_model="ct-punc",
+)
+```
+
+### Q1: 模型下载慢
+
+首次运行会下载约 2GB 模型到 `~/.cache/modelscope/`
+
+**解决**:
+- 使用国内镜像
+- 或手动下载后放到缓存目录
+
+### Q2: 时间戳和文字对不上
+
+**原因**:标点符号没有时间戳,需要特殊处理
+
+**解决**:
+```python
+# 去掉标点后再对齐
+import re
+text_no_punc = re.sub(r'[,。!?、;:]', '', text)
+```
+
+### Q2.5: 时间戳数量少于字符数(2026-01-13)
+
+**现象**:纯字符数828,时间戳数763,末尾65个字符没有时间戳
+
+**原因**:FunASR 对视频末尾部分可能丢失时间戳
+
+**解决**:
+```python
+# 访问时间戳要兜底
+if idx < len(timestamps):
+    ts = timestamps[idx]
+else:
+    ts = timestamps[-1]  # 用最后一个时间戳兜底
+```
+
+### Q2.6: 正则表达式漏掉英文标点(2026-01-13)
+
+**现象**:搜索文本时位置偏移,因为 clean_text 里还有英文标点
+
+**原因**:正则只移除中文标点,没处理英文 `,` `.` 等
+
+**解决**:
+```python
+# ❌ 错误 - 只有中文标点
+clean = re.sub(r'[,。!?、;:]', '', text)
+
+# ✅ 正确 - 包含英文标点
+clean = re.sub(r'[,。?!、:;""''()《》【】\s\.,!?;:\'"()]', '', text)
+```
+
+### Q3: 长视频处理慢
+
+**解决**:
+- 增大 `batch_size_s` 参数
+- 使用 GPU 加速(需要 PyTorch CUDA)
+
+### Q4: 识别准确率低
+
+**可能原因**:
+- 背景噪音太大
+- 说话人口音重
+- 音频采样率不对
+
+**解决**:
+- 预处理降噪
+- 确保 16kHz 采样率
+
+---
+
+## 六、性能参考
+
+| 指标 | 值 |
+|------|-----|
+| RTF (Real-Time Factor) | ~0.16 |
+| 含义 | 1秒音频只需0.16秒处理 |
+| 15分钟视频 | 约2.5分钟处理完 |
+
+*测试环境:M1 Mac,CPU 推理*
diff --git a/videocut-clip/README.md b/videocut-clip/README.md
new file mode 100644
index 0000000..16e1337
--- /dev/null
+++ b/videocut-clip/README.md
@@ -0,0 +1,23 @@
+
+
+# videocut:剪辑
+
+> FFmpeg 视频剪辑执行 skill
+
+## 文件清单
+
+| 文件 | 用途 |
+|------|------|
+| `SKILL.md` | skill 定义(剪辑流程) |
+| `README.md` | 本文件 |
+
+## 依赖
+
+- FFmpeg(`brew install ffmpeg`)
+
+## 输入输出
+
+- **输入**:审查稿 + 原始视频
+- **输出**:剪辑后视频 (v2.mp4)
diff --git a/videocut-clip/SKILL.md b/videocut-clip/SKILL.md
new file mode 100644
index 0000000..629d01c
--- /dev/null
+++ b/videocut-clip/SKILL.md
@@ -0,0 +1,154 @@
+---
+name: videocut-clip
+description: 执行视频剪辑。根据确认的删除任务执行FFmpeg剪辑,循环直到零口误,生成字幕。触发词:执行剪辑、开始剪、确认剪辑
+metadata:
+  version: "1.0.0"
+  alias: "videocut:剪辑"
+---
+
+
+
+# 剪辑
+
+> 执行删除 → 重新审查 → 循环直到零口误 → 生成字幕
+
+## 快速使用
+
+```
+用户: 确认,执行剪辑
+用户: 全删
+用户: 保留静音3和5,其他都删
+```
+
+## 前置条件
+
+需要先执行 `/videocut:剪口播` 生成删除任务 TodoList
+
+## 流程
+
+```
+1. 读取用户确认的删除任务
+   ↓
+2. 计算保留时间段
+   ↓
+3. 生成 FFmpeg filter_complex
+   ↓
+4. 执行剪辑
+   ↓
+5. 重新转录 + 审查 ←───┐
+   ↓                  │
+   有口误? ──是────────┘
+   ↓ 否
+6. 生成字幕(SRT)
+   ↓
+7. 完成
+```
+
+## 进度 TodoList
+
+启动时创建:
+
+```
+- [ ] 确认删除任务
+- [ ] 执行 FFmpeg 剪辑
+- [ ] 重新转录审查
+- [ ] 生成字幕
+```
+
+循环时更新版本号(v2→v3→...),步骤 2-3 的时间段计算见下方草稿。
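+
+### 保留段计算草稿
+
+下面一、二两节的"时间戳 → filter_complex"可以用这样一个最小草稿串起来(假设删除区间已确认、按开始时间排序且互不重叠;`build_filter` 为示意函数名,数值仅为举例):
+
+```python
+def build_filter(deletions, total):
+    """由删除区间推出保留区间,生成 filter.txt 内容(示意)"""
+    keeps, cursor = [], 0.0
+    for start, end in deletions:   # 相邻删除区间之间的部分就是保留段
+        if start > cursor:
+            keeps.append((cursor, start))
+        cursor = max(cursor, end)
+    if cursor < total:             # 最后一个删除区间之后的尾段
+        keeps.append((cursor, total))
+
+    lines = []
+    for i, (s, e) in enumerate(keeps):
+        lines.append(f"[0:v]trim=start={s}:end={e},setpts=PTS-STARTPTS[v{i}];")
+        lines.append(f"[0:a]atrim=start={s}:end={e},asetpts=PTS-STARTPTS[a{i}];")
+    pads = ''.join(f"[v{i}][a{i}]" for i in range(len(keeps)))
+    lines.append(f"{pads}concat=n={len(keeps)}:v=1:a=1[outv][outa]")
+    return '\n'.join(lines)
+
+# 用法:配合第二节的 FFmpeg 命令
+# open('filter.txt', 'w').write(build_filter([(1.36, 2.54), (41.35, 47.55)], 217.97))
+```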
+ +--- + +## 一、读取删除任务(时间戳驱动) + +从 `/videocut:剪口播` 输出的 TodoList 读取。**直接使用时间戳,不要搜索文本**: + +``` +口误(N处): +- [x] 1. `(start-end)` 删"错误文本" → 保留"正确文本" ← 勾选=删除 + +语气词(N处): +- [x] 1. `(前字end-后字start)` 删"嗯" ← 勾选=删除 + +静音(N处): +- [x] 1. `(start-end)` 静音Xs ← 勾选=删除 +- [ ] 2. `(start-end)` 静音Xs ← 未勾选=保留 +``` + +### ⚠️ 关键规则 + +1. **直接用时间戳**:从 `(start-end)` 解析,不要搜索文本 +2. **不要重新搜索**:审查稿已经计算好精确时间戳 +3. 勾选 = 删除,未勾选 = 保留 + +--- + +## 二、FFmpeg 命令 + +```bash +ffmpeg -y -i input.mp4 \ + -filter_complex_script filter.txt \ + -map "[outv]" -map "[outa]" \ + -c:v libx264 -crf 18 -c:a aac \ + output.mp4 +``` + +### filter.txt 格式 + +``` +[0:v]trim=start=0:end=1.36,setpts=PTS-STARTPTS[v0]; +[0:a]atrim=start=0:end=1.36,asetpts=PTS-STARTPTS[a0]; +[0:v]trim=start=2.54:end=10.5,setpts=PTS-STARTPTS[v1]; +... +[v0][a0][v1][a1]...concat=n=N:v=1:a=1[outv][outa] +``` + +--- + +## 三、重新转录审查 + +剪辑后必须: +1. 用 FunASR 重新转录 +2. 检查是否还有口误 +3. 有 → 回到 `/videocut:剪口播` 重新识别 +4. 无 → 生成字幕 + +--- + +## 四、输出文件 + +``` +01-xxx-v2.mp4 # 剪辑后视频 +01-xxx-v2_transcript.json # 重新转录(验证用) +01-xxx-v2.srt # 字幕文件 +``` + +版本递增:v1→v2→v3... + +--- + +## 五、反馈记录 + +### 2026-01-15 +- **语气词删除边界不精确**:删语气词时把前面的字也删了 + - 原因:直接用语气词的时间戳删除 + - 正确:从前一字 end 到后一字 start +- **语气词 + 静音要一起删**:`A [静音] 语气词 B` 要删整段 (A.end - B.start) +- **教训**:删除语气词时,边界是 `前一字.end` 到 `后一字.start` + +### 2026-01-14 +- 口误文字没删干净,只删了静音段 +- 教训:直接从 TodoList 读取时间戳,不要重新查找 +- **"拉满新"删成了"会的时候"**:搜索"拉满新"时间戳跨度7秒(含6秒静音),把"拉满"也删了 + - 教训:对于"删前面保后面"的口误,只删差异部分 +- **"AI就是AI"出现两次AI**:只删了"就是",没删第一个"AI" + - 教训:替换型口误必须删完整的第一个版本 +- **系统性解决**:时间戳驱动,审查稿直接标注 `(start-end)`,剪辑脚本不再搜索文本 diff --git a/videocut-install/README.md b/videocut-install/README.md new file mode 100644 index 0000000..09f0a3c --- /dev/null +++ b/videocut-install/README.md @@ -0,0 +1,20 @@ + + +# videocut:安装 + +> 环境准备 skill + +## 文件清单 + +| 文件 | 用途 | +|------|------| +| `SKILL.md` | skill 定义(安装流程) | +| `README.md` | 本文件 | + +## 依赖 + +- Python 3.8+ +- pip +- brew (macOS) / apt (Linux) diff --git a/videocut-install/SKILL.md b/videocut-install/SKILL.md new file mode 100644 index 0000000..356bdf2 --- /dev/null +++ b/videocut-install/SKILL.md @@ -0,0 +1,161 @@ +--- +name: videocut-install +description: 环境准备。安装依赖、下载模型、验证环境。触发词:安装、环境准备、初始化 +metadata: + version: "1.0.0" + alias: "videocut:安装" +--- + + + +# 安装 + +> 首次使用前的环境准备 + +## 快速使用 + +``` +用户: 安装环境 +用户: 初始化 +用户: 下载模型 +``` + +## 依赖清单 + +| 依赖 | 用途 | 安装命令 | +|------|------|----------| +| funasr | 口误识别 | `pip install funasr` | +| modelscope | 模型下载 | `pip install modelscope` | +| openai-whisper | 字幕生成 | `pip install openai-whisper` | +| ffmpeg | 视频剪辑 | `brew install ffmpeg` | + +## 模型清单 + +### FunASR 模型(口误识别用) + +首次运行自动下载到 `~/.cache/modelscope/`: + +| 模型 | 大小 | 用途 | +|------|------|------| +| paraformer-zh | 953MB | 语音识别(带时间戳) | +| punc_ct | 1.1GB | 标点预测 | +| fsmn-vad | 4MB | 语音活动检测 | +| **小计** | **~2GB** | | + +### Whisper 模型(字幕生成用) + +首次运行自动下载到 `~/.cache/whisper/`: + +| 模型 | 大小 | 用途 | +|------|------|------| +| large-v3 | 2.9GB | 字幕转录(质量最好) | + +### 总计 + +约 **5GB** 模型文件 + +## 安装流程 + +``` +1. 安装 Python 依赖 + ↓ +2. 安装 FFmpeg + ↓ +3. 下载 FunASR 模型(口误识别) + ↓ +4. 下载 Whisper 模型(字幕生成) + ↓ +5. 验证环境 +``` + +## 执行步骤 + +### 1. 安装 Python 依赖 + +```bash +pip install funasr modelscope openai-whisper +``` + +### 2. 安装 FFmpeg + +```bash +# macOS +brew install ffmpeg + +# Ubuntu +sudo apt install ffmpeg + +# 验证 +ffmpeg -version +``` + +### 3. 下载 FunASR 模型(约2GB) + +```python +from funasr import AutoModel + +model = AutoModel( + model="paraformer-zh", + vad_model="fsmn-vad", + punc_model="ct-punc", +) +print("FunASR 模型下载完成") +``` + +### 4. 
下载 Whisper 模型(约3GB) + +```python +import whisper + +model = whisper.load_model("large-v3") +print("Whisper 模型下载完成") +``` + +### 5. 验证环境 + +```python +from funasr import AutoModel + +model = AutoModel( + model="paraformer-zh", + vad_model="fsmn-vad", + punc_model="ct-punc", + disable_update=True +) + +# 测试转录(用任意音频/视频) +result = model.generate(input="test.mp4") +print("文本:", result[0]['text'][:50]) +print("时间戳数量:", len(result[0]['timestamp'])) +print("✅ 环境就绪") +``` + +## 常见问题 + +### Q1: 模型下载慢 + +**解决**:使用国内镜像或手动下载 + +### Q2: ffmpeg 命令找不到 + +**解决**:确认已安装并添加到 PATH + +```bash +which ffmpeg # 应该输出路径 +``` + +### Q3: funasr 导入报错 + +**解决**:检查 Python 版本(需要 3.8+) + +```bash +python3 --version +``` diff --git a/videocut-self-update/README.md b/videocut-self-update/README.md new file mode 100644 index 0000000..5189a98 --- /dev/null +++ b/videocut-self-update/README.md @@ -0,0 +1,23 @@ + + +# videocut:自更新 + +> Agent 自进化机制 + +## 文件清单 + +| 文件 | 地位 | 功能 | +|------|------|------| +| `SKILL.md` | 入口 | 流程说明、触发词 | +| `README.md` | 索引 | 本文件 | + +## 作用范围 + +``` +videocut:自更新 可以修改: + +├── /CLAUDE.md # 用户画像 +└── /*/tips/*.md # 各 skill 的方法论 +``` diff --git a/videocut-self-update/SKILL.md b/videocut-self-update/SKILL.md new file mode 100644 index 0000000..c1c0504 --- /dev/null +++ b/videocut-self-update/SKILL.md @@ -0,0 +1,118 @@ +--- +name: videocut-self-update +description: 自更新 skills。记录用户反馈,更新方法论和规则。触发词:更新规则、记录反馈、改进skill +metadata: + version: "1.0.0" + alias: "videocut:自更新" +--- + + + +# 自更新 + +> 让 Agent 从错误中学习,持续改进 + +## 快速使用 + +``` +用户: 记录一下刚才的问题 +用户: 更新口误识别的规则 +用户: 这个教训要记下来 +``` + +## 更新位置 + +| 内容类型 | 目标文件 | 示例 | +|---------|---------|------| +| 用户画像 | `CLAUDE.md` | 偏好、习惯 | +| 方法论 + 反馈 | `*/tips/*.md` | 规则、教训 | + +## 流程 + +``` +用户触发("刚才失败了"、"记录一下") + ↓ +【自动】回溯上下文,找出问题点 + ↓ +【自动】读目标文件全文,理解现有结构 + ↓ +【自动】整合到正文相应位置(不是只往末尾加!) + ↓ +【自动】反馈记录只记事件,不重复规则 + ↓ +汇报更新结果 +``` + +**关键**:不要问"什么问题",直接从上下文分析! + +## 更新原则 + +### ❌ 错误做法:往末尾加 + +```markdown +## 反馈记录 +### 2026-01-14 +- 教训:审查稿末尾必须生成删除任务清单 +- 教训:用户确认时要分别确认口误和静音 +``` + +只加到反馈记录 = 规则散落在末尾,下次还会犯错 + +### ✅ 正确做法:整合到正文 + +1. **读全文**,理解章节结构 +2. **找到相应位置**,把规则整合进去 +3. **反馈记录只记事件**:`- 审查稿标记了静音,但剪辑时漏删` + +```markdown +## 四、审查稿格式 +(新增删除任务清单模板) + +## 五、确认与执行流程 ← 缺这个章节就新增 +(新增分别确认口误和静音的流程) + +## 反馈记录 +### 2026-01-14 +- 审查稿标记了静音,但剪辑时漏删(只删了口误) +``` + +## 触发条件 + +- 用户纠正 AI 错误 +- 用户说"记住这个"、"以后注意" +- 发现新的通用规律 + +## 反例 + +### 2026-01-13 +``` +❌ 错误: +用户: 刚才失败了,更新到skills +AI: 请告诉我你发现了什么问题? ← 不该问! + +✅ 正确: +AI: [自动回溯上下文,找到失败点] +AI: [执行更新] +``` + +### 2026-01-14 +``` +❌ 错误: +AI: 已更新,在反馈记录新增3条教训 ← 只加末尾! + +✅ 正确: +AI: [读全文,理解结构] +AI: [整合到正文相应位置] +AI: [反馈记录只记事件] +AI: 已更新:新增第五章"确认与执行流程",更新第四章模板 +``` + +**原则**:规则要整合到正文,反馈记录只是事件日志 diff --git a/videocut-subtitle/README.md b/videocut-subtitle/README.md new file mode 100644 index 0000000..2e5a549 --- /dev/null +++ b/videocut-subtitle/README.md @@ -0,0 +1,26 @@ +# videocut:字幕 + +> 字幕生成与烧录 + +## 文件 + +| 文件 | 作用 | +|------|------| +| `SKILL.md` | 流程定义 | +| `词典.txt` | 正确写法列表(每行一个) | + +## 词典格式 + +``` +skills +Claude +钉钉AI录音卡 +``` + +用户只写正确的词,我识别所有错误变体。 + +## 流程 + +``` +转录 → 词典纠错 → 用户审核 → 烧录 +``` diff --git a/videocut-subtitle/SKILL.md b/videocut-subtitle/SKILL.md new file mode 100644 index 0000000..e07d5c1 --- /dev/null +++ b/videocut-subtitle/SKILL.md @@ -0,0 +1,113 @@ +--- +name: videocut-subtitle +description: 字幕生成与烧录。转录→词典纠错→审核→烧录。触发词:加字幕、生成字幕、字幕 +metadata: + version: "1.0.0" + alias: "videocut:字幕" +--- + +# 字幕 + +> 转录 → 纠错 → 审核 → 匹配 → 烧录 + +## 流程 + +``` +1. 转录视频(Whisper) + ↓ +2. 词典纠错 + 分句 + ↓ +3. 输出字幕稿(纯文本,一句一行) + ↓ +【用户审核修改】 + ↓ +4. 
用户给回修改后的文本 + ↓ +5. 我匹配时间戳 → 生成 SRT + ↓ +6. 烧录字幕(FFmpeg) +``` + +## 转录 + +使用 OpenAI Whisper 模型进行语音转文字: + +```bash +whisper video.mp4 --model medium --language zh --output_format json +``` + +| 模型 | 用途 | +|------|------| +| `medium` | 默认,平衡速度与准确率 | +| `large-v3` | 高精度,较慢 | + +输出 JSON 包含逐词时间戳,用于后续 SRT 生成。 + +--- + +## 字幕规范 + +| 规则 | 说明 | +|------|------| +| 一屏一行 | 不换行,不堆叠 | +| ≤15字/行 | 超过15字必须拆分(4:3竖屏) | +| 句尾无标点 | `你好` 不是 `你好。` | +| 句中保留标点 | `先点这里,再点那里` | + +--- + +## 词典纠错 + +读取 `词典.txt`,每行一个正确写法: + +``` +skills +Claude +iPhone +``` + +我自动识别变体:`claude` → `Claude` + +--- + +## 字幕稿格式 + +**我给用户的**(纯文本,≤15字/行): + +``` +今天给大家分享一个技巧 +很多人可能不知道 +其实这个功能 +藏在设置里面 +你只要点击这里 +就能看到了 +``` + +**用户修改后给回我**,我再匹配时间戳生成 SRT。 + +--- + +## 样式 + +默认:24号白字、黑色描边、底部居中 + +**可选样式:** +| 样式 | 说明 | +|------|------| +| 默认 | 白字黑边 | +| 黄字 | 黄字黑边(醒目) | + +用户可说: +- "字大一点" → 32号 +- "放顶部" → 顶部居中 +- "黄色字幕" → 黄字黑边 + +--- + +## 输出 + +``` +01-xxx_字幕稿.txt # 纯文本,用户编辑 +01-xxx.srt # 字幕文件 +01-xxx-字幕.mp4 # 带字幕视频 +``` diff --git a/videocut-subtitle/词典.txt b/videocut-subtitle/词典.txt new file mode 100644 index 0000000..ef29145 --- /dev/null +++ b/videocut-subtitle/词典.txt @@ -0,0 +1,8 @@ +skills +Claude +iPhone +GitHub +API +APP +NO.1 +钉钉