Initial commit: skills library

- 70 skills with code and documentation
- Add .gitignore (ignore __pycache__, output/, temp/, venv/)
- Clean up test intermediates and caches
2026-04-26 19:27:40 +08:00
commit 04db423416
861 changed files with 210414 additions and 0 deletions
@@ -0,0 +1,160 @@
+---
+name: musicXML-ocr
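+description: Sheet-music OCR skill. Recognizes staff-notation images (PNG/JPG/PDF) and converts them to MusicXML, using Audiveris for optical music recognition (OMR). Triggered when the user mentions "recognize sheet music" (识别乐谱), "convert an image to MusicXML" (图片转 MusicXML), "transcribe a score" (打谱), or "digitize a handwritten score" (手抄谱转数字).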
+---
+
+# MusicXML OCR Skill
+
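+## Overview
+
+Converts staff-notation images (PNG/JPG) or PDFs to MusicXML via optical recognition. Supports:
+- Scanned or photographed sheet music → MusicXML
+- Score pages inside a PDF → MusicXML
+- Recognition of notes, time signatures, and key signatures
+- Piano grand staff structure
+- Output that opens in MuseScore, Finale, and similar software
+
+## Trigger Phrases
+
+- "Recognize this sheet music" (识别这张乐谱)
+- "Convert the image to MusicXML" (把图片转成 MusicXML)
+- "Transcribe this score for me" (帮我打谱)
+- "Digitize a handwritten score" (手抄谱数字化)
+- "Recognize staff notation" (识别五线谱)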
+
+---
+
+## Recommended Workflow
+
+```
+PDF → image extraction → Audiveris recognition → merge and correct in MuseScore
+```
+
+### Step 1: PDF → Images
+
+If Audiveris cannot process the PDF directly, extract the embedded images first:
+
+```python
+import os
+
+import fitz  # PyMuPDF
+
+pdf_path = r"input.pdf"
+output_dir = r"temp\sheets"
+os.makedirs(output_dir, exist_ok=True)
+
+doc = fitz.open(pdf_path)
+for page_num in range(len(doc)):
+    page = doc[page_num]
+    images = page.get_images()
+    for img_idx, img in enumerate(images):
+        xref = img[0]
+        pix = fitz.Pixmap(doc, xref)
+        if pix.n - pix.alpha < 4:
+            pix.save(f'{output_dir}/page{page_num+1}_img{img_idx+1}.png')
+        else:
+            pix1 = fitz.Pixmap(fitz.csRGB, pix)
+            pix1.save(f'{output_dir}/page{page_num+1}_img{img_idx+1}.png')
+doc.close()
+```
+
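+### Step 2: Audiveris Recognition
+
+```powershell
+# Recognize a single image
+& "C:\Program Files\Audiveris\Audiveris.exe" -batch -export -output "D:\output" "D:\input\sheet.png"
+
+# Process a PDF directly (if supported)
+& "C:\Program Files\Audiveris\Audiveris.exe" -batch -export -output "D:\output" "D:\input\score.pdf"
+```
+
+**Output files:**
+- `.omr` - Audiveris project file (can be reopened and edited)
+- `.mxl` / `.xml` - MusicXML
+
+### Step 3: Merging and Correcting in MuseScore
+
+**Manual processing in MuseScore is recommended:**
+1. Open the first page's .mxl file
+2. Menu → File → Import → select the second page's .xml
+3. Or copy and paste the sections directly
+
+**Corrections in MuseScore:**
+- Delete misrecognized symbols (select → Delete)
+- Fix measure numbers (double-click a measure number to edit it)
+- Save the merged file as .mscz or .mxl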
+
+---
+
+## Installing Audiveris
+
+**Windows:**
+```powershell
+winget install --id Audiveris.Audiveris --accept-package-agreements --accept-source-agreements
+```
+
+**Manual download:**
+- https://github.com/Audiveris/audiveris/releases
+
+**Install path:**
+```
+C:\Program Files\Audiveris\Audiveris.exe
+```
+
+---
+
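+## Comparison of Approaches
+
+| Approach | Accuracy | Reliability | Recommendation |
+|----------|----------|-------------|----------------|
+| **Audiveris → manual merge in MuseScore** | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ✅ **Recommended** |
+| **Audiveris processing the PDF directly** | ⭐⭐⭐⭐ | ⭐⭐⭐ | Fallback |
+| **homr** | ⭐⭐⭐ | ⭐⭐⭐ | ⚠️ Fallback |
+
+**Lessons learned:**
+- Audiveris recognized accurately in testing (the first measure of the test score came out fully correct)
+- homr has dewarping problems, and its output can be entirely wrong
+- Merging at the XML level is unreliable; merge manually in MuseScore instead
+- When pages are processed separately, each page becomes an independent score, so measure numbers need manual correction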
+
+---
+
+## Scripts
+
+| Script | Purpose | Dependency |
+|--------|---------|------------|
+| `extract_pdf_images.py` | Extract sheet-music images from a PDF | `fitz` (PyMuPDF) |
+| `audiveris_to_musescore.py` | Full Audiveris pipeline | Audiveris installed |
+
+### Usage
+
+```powershell
+# 1. Extract images from the PDF
+python scripts/extract_pdf_images.py "D:\scores\sheet.pdf" "D:\output\sheets"
+
+# 2. Recognize with Audiveris
+& "C:\Program Files\Audiveris\Audiveris.exe" -batch -export -output "D:\output" "D:\output\sheets\page1.png"
+
+# 3. Merge and correct in MuseScore
+# Open the exported .mxl file, correct it by hand, then export
+# (the .omr project file opens in Audiveris, not MuseScore)
+```
+
+---
+
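+## Verification Checklist
+
+After processing, check the result in MuseScore:
+- [ ] Note positions match the original image
+- [ ] Measure numbers are continuous and correct
+- [ ] The two staves of a piano score are joined by a brace
+- [ ] No extra or duplicated symbols
+
+---
+
+## Troubleshooting
+
+| Problem | Solution |
+|---------|----------|
+| Audiveris cannot process the PDF | Extract the images first with `extract_pdf_images.py` |
+| Extra symbols in the recognition result | Select and delete them in Audiveris, or correct them in MuseScore |
+| Measure numbers are not continuous | Fix the measure numbers manually in MuseScore |
+| XML merge fails | Merge by manual copy and paste in MuseScore |
+| homr output is entirely wrong | Switch to Audiveris |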
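The troubleshooting table above fixes non-continuous measure numbers by hand in MuseScore. For scores with many pages, a rough alternative is to renumber each page's MusicXML before opening it. The sketch below is only an illustration: `renumber_measures` is a hypothetical helper that is not part of this skill, it assumes an uncompressed `score-partwise` file, and ElementTree does not preserve the DOCTYPE line. It renumbers measures only; it does not merge files.

```python
import xml.etree.ElementTree as ET


def renumber_measures(xml_text: str, first_number: int) -> str:
    """Renumber every <measure> of each part in document order.

    Hypothetical helper: numbering restarts at first_number for each <part>.
    """
    root = ET.fromstring(xml_text)
    for part in root.findall("part"):
        for offset, measure in enumerate(part.findall("measure")):
            measure.set("number", str(first_number + offset))
    return ET.tostring(root, encoding="unicode")


# A tiny stand-in for the MusicXML of a second page
page2 = """<score-partwise version="3.1">
  <part-list><score-part id="P1"><part-name>Piano</part-name></score-part></part-list>
  <part id="P1">
    <measure number="1"/>
    <measure number="2"/>
  </part>
</score-partwise>"""

# If page 1 ended at measure 8, page 2 should start at measure 9
renumbered = renumber_measures(page2, first_number=9)
numbers = [m.get("number") for m in ET.fromstring(renumbered).iter("measure")]
print(numbers)  # ['9', '10']
```

The result should still be checked against the verification checklist, since pickup measures and repeats can make a simple running count wrong.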
@@ -0,0 +1,3 @@
+# Example Reference
+
+This is an example reference file. Delete if not needed.
@@ -0,0 +1,88 @@
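+# homr Usage Guide
+
+## Installation Status
+
+✅ `homr` and `verovio` are preinstalled in the current environment.
+
+The first run automatically downloads the detection models (about 500MB), so expect a wait.
+
+## Basic Usage
+
+### Command Line
+
+```bash
+# Basic recognition
+homr path/to/image.png --output result.musicxml
+
+# Batch-process a directory
+homr path/to/folder/
+
+# Specify the output file name
+homr image.png -o my_song.musicxml
+```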
+
+### Python API
+
+```python
+from homr.main import process_image, ProcessingConfig, XmlGeneratorArguments
+
+# Note: homr has trouble with Chinese paths; use an ASCII path
+image_path = r"path/to/sheet_music.png"
+
+config = ProcessingConfig(False, False, False, False, -1)
+xml_args = XmlGeneratorArguments(False, None, None)
+
+result = process_image(image_path, config, xml_args)
+print("Recognition complete!")
+# The output file is generated automatically: a .musicxml file in the same directory
+```
+
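+## Parameters
+
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| `model_type` | str | Model type: `ctc` (default) or `transformer` |
+| `language` | str | Language: `english`, `german`, etc. |
+| `staffline_height` | float | Staff line spacing in pixels; detected automatically by default |
+
+## Output Format
+
+The output MusicXML contains:
+- `<part>` - instrument or voice part
+- `<measure>` - measure
+- `<note>` - note
+- `<attributes>` - key signature and time signature
+- `<direction>` - dynamics and expression marks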
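+## Error Handling
+
+```python
+from homr.main import process_image, ProcessingConfig, XmlGeneratorArguments
+
+try:
+    # Same call as in the Python API section
+    config = ProcessingConfig(False, False, False, False, -1)
+    xml_args = XmlGeneratorArguments(False, None, None)
+    result = process_image("image.png", config, xml_args)
+except Exception as e:
+    print(f"Recognition failed: {e}")
+```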
+
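+## Known Limitations
+
+1. Handwritten scores are recognized poorly
+2. Complex harmony may be recognized inaccurately
+3. Support for ornaments is limited
+4. The first run has to download the models (about 500MB)
+5. **Chinese path problem**: homr has trouble with Chinese paths; use paths made of English letters and digits
+
+## Known Bug Fix
+
+If you hit a `numpy` compatibility error, edit `autocrop.py`:
+```python
+# Before:
+hist = cv2.calcHist([img], [0], None, [256], [0, 256])
+dominant_color_gray_scale = max(enumerate(hist), ...)[0]
+
+# After:
+hist = cv2.calcHist([img], [0], None, [256], [0, 256]).flatten()
+dominant_color_gray_scale = max(enumerate(hist), ...)[0]
+```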
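The Chinese-path limitation above is usually worked around by copying the image to an ASCII-only location before recognition. The following is a minimal sketch of that idea using only the standard library. `copy_to_ascii_path` is a hypothetical helper name, and the sketch assumes the system temp directory itself has an ASCII path.

```python
import os
import shutil
import tempfile


def copy_to_ascii_path(image_path: str) -> str:
    """Copy image_path to a temp file with a plain ASCII name and return the new path.

    Hypothetical helper; the caller should delete the temp directory afterwards.
    """
    ext = os.path.splitext(image_path)[1]
    if not ext.isascii():
        ext = ""  # drop an extension that is itself non-ASCII
    tmp_dir = tempfile.mkdtemp(prefix="homr_")
    safe_path = os.path.join(tmp_dir, "input" + ext)
    shutil.copy(image_path, safe_path)
    return safe_path


# Demo with a throwaway file whose name contains Chinese characters
src_dir = tempfile.mkdtemp()
src = os.path.join(src_dir, "乐谱.png")
with open(src, "wb") as f:
    f.write(b"placeholder bytes, not a real image")

safe = copy_to_ascii_path(src)
print(os.path.basename(safe))  # input.png
```

Any output written next to the copied image lives in the temp directory, so it has to be copied back before that directory is removed.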
@@ -0,0 +1,102 @@
+# Tips for Improving Recognition Accuracy
+
+## Image Quality Requirements
+
+### ✅ Best
+- Scans at 300+ DPI
+- Even lighting, no shadows
+- Flat paper without creases
+- Clear notes and unbroken lines
+
+### ⚠️ Acceptable
+- Scans at 150 DPI
+- Phone photos taken in good light
+- Slight shadows are tolerable
+
+### ❌ Avoid
+- Blurry images below 50 DPI
+- Heavy skew (can be corrected in preprocessing)
+- Uneven lighting that leaves areas too dark or too bright
+- Interfering pencil marks
+
+## Preprocessing Suggestions
+
+### Enhancement with Python + PIL
+
+```python
+from PIL import Image, ImageFilter, ImageOps
+
+def preprocess_sheet_music(image_path):
+    img = Image.open(image_path).convert('L')  # grayscale
+    
+    # Increase contrast
+    img = ImageOps.autocontrast(img)
+    
+    # Denoise
+    img = img.filter(ImageFilter.MedianFilter(size=3))
+    
+    # Binarize (sometimes helps)
+    # img = img.point(lambda x: 0 if x < 128 else 255)
+    
+    return img
+
+# Usage
+processed = preprocess_sheet_music("original.jpg")
+processed.save("processed.png")
+```
+
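+### Deskewing
+
+```python
+import numpy as np
+from PIL import Image
+from deskew import determine_skew
+
+# Detect the skew angle, then rotate the image to correct it
+image = Image.open("tilted.jpg").convert("L")
+angle = determine_skew(np.array(image))  # degrees; None if no angle is found
+corrected = image.rotate(float(angle or 0), expand=True, fillcolor=255)
+corrected.save("corrected.jpg")
+```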
+
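+## Choosing a Format
+
+| Format | Rating | Notes |
+|--------|--------|-------|
+| PNG | ⭐⭐⭐⭐⭐ | Lossless compression, best choice |
+| TIFF | ⭐⭐⭐⭐ | Print-grade lossless |
+| JPG (high quality) | ⭐⭐⭐ | Acceptable |
+| JPG (low quality) | ⭐ | Severe compression artifacts |
+
+## Recommended Resolution
+
+| Score complexity | Minimum width | Suggested DPI |
+|------------------|---------------|---------------|
+| Simple melody | 1000px | 150 |
+| Piano score (two hands) | 2500px | 300 |
+| Orchestral full score | 4000px | 400 |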
+
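+## Multi-page Processing
+
+```python
+import glob
+import subprocess
+
+from homr.main import process_image, ProcessingConfig, XmlGeneratorArguments
+
+# Split a multi-page PDF into one PNG per page
+subprocess.run([
+    "pdftoppm", 
+    "-r", "300",      # DPI
+    "-png",           # output PNG
+    "score.pdf",      # input
+    "page"            # output prefix
+], check=True)
+
+# Then recognize page by page; pdftoppm may zero-pad page numbers, so glob and sort
+config = ProcessingConfig(False, False, False, False, -1)
+xml_args = XmlGeneratorArguments(False, None, None)
+for page_png in sorted(glob.glob("page-*.png")):
+    process_image(page_png, config, xml_args)
+```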
+
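+## Proofreading After Recognition
+
+Open the recognition result in MuseScore and check:
+1. Whether the meter is correct
+2. Whether any accidentals are missing
+3. Whether chords were split correctly
+4. Rest positions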
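The minimum widths in the resolution table above can be checked before an image is sent to recognition. The sketch below encodes the table's three rows. `MIN_WIDTH_PX`, `width_ok`, and `rescan_factor` are hypothetical names introduced here for illustration, and the thresholds are exactly the table's values rather than anything measured.

```python
# Minimum widths copied from the "Recommended Resolution" table
MIN_WIDTH_PX = {"melody": 1000, "piano": 2500, "orchestral": 4000}


def width_ok(width_px: int, kind: str) -> bool:
    """Return True if an image this wide meets the table's minimum for the score kind."""
    return width_px >= MIN_WIDTH_PX[kind]


def rescan_factor(width_px: int, kind: str) -> float:
    """Factor by which the scan DPI would need to grow to reach the minimum width."""
    return max(1.0, MIN_WIDTH_PX[kind] / width_px)


# A piano score scanned at 150 DPI that came out 1250px wide
print(width_ok(1800, "piano"), rescan_factor(1250, "piano"))  # False 2.0
```

With PIL, the width would come from `Image.open(path).width`. A factor of 2.0 for a 150 DPI scan suggests rescanning at 300 DPI, which matches the table's row for piano scores.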
@@ -0,0 +1,70 @@
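+# Rendering Previews with Verovio
+
+## Installation
+
+```bash
+pip install verovio
+```
+
+## Rendering MusicXML to an Image
+
+```python
+import verovio
+
+toolkit = verovio.toolkit()
+
+# Load the MusicXML file
+toolkit.loadFile("score.musicxml")
+
+# Render to SVG
+svg = toolkit.renderToSVG()
+
+# Save the SVG
+with open("output.svg", "w", encoding="utf-8") as f:
+    f.write(svg)
+
+# The toolkit renders SVG only; convert the SVG to get PNG or PDF.
+# Assumption: the third-party cairosvg package is installed.
+import cairosvg
+cairosvg.svg2png(bytestring=svg.encode("utf-8"), write_to="output.png", dpi=300)
+cairosvg.svg2pdf(bytestring=svg.encode("utf-8"), write_to="output.pdf")
+```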
+
+## Command-line Tool
+
+```bash
+# The verovio command-line binary is built or packaged separately from the Python module
+
+# Convert (the CLI writes SVG by default)
+verovio -o output.svg input.musicxml
+```
+
+## Previewing in Jupyter
+
+```python
+import verovio
+from IPython.display import SVG, display
+
+toolkit = verovio.toolkit()
+toolkit.loadFile("score.musicxml")
+svg = toolkit.renderToSVG()
+display(SVG(svg))
+```
+
+## Adjusting Page Layout
+
+```python
+toolkit.setOptions({
+    "pageWidth": 2100,    # page width
+    "pageHeight": 2970,   # page height
+    "scale": 50,          # scale, in percent
+    "spacingSystem": 50,  # spacing between systems
+})
+```
+
+## Exporting MIDI
+
+```python
+import base64
+
+toolkit.loadFile("score.musicxml")
+midi = toolkit.renderToMIDI()  # base64-encoded string
+with open("output.mid", "wb") as f:
+    f.write(base64.b64decode(midi))
+```
@@ -0,0 +1,3 @@
+# musicXML-ocr - dependencies
+# (xml.etree.ElementTree is part of the standard library and is not installed via pip)
+homr>=0.0.1
+pymupdf
@@ -0,0 +1,48 @@
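+#!/usr/bin/env python3
+"""Add brace grouping to a MusicXML piano score."""
+import os
+import sys
+import xml.etree.ElementTree as ET
+
+if len(sys.argv) < 2:
+    print('Usage: python add_brace.py input.musicxml [output.musicxml]')
+    sys.exit(1)
+
+input_file = sys.argv[1]
+# Build the default output name from the stem so that an input without
+# the .musicxml extension is never overwritten in place
+stem, ext = os.path.splitext(input_file)
+output_file = sys.argv[2] if len(sys.argv) > 2 else f'{stem}_braced{ext}'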
+
+ET.register_namespace('m', 'http://www.musescore.org/ns/mscore')
+ET.register_namespace('xlink', 'http://www.w3.org/1999/xlink')
+
+tree = ET.parse(input_file)
+root = tree.getroot()
+part_list = root.find('part-list')
+
+if part_list is None:
+    print('Error: no <part-list> found in file')
+    sys.exit(1)
+
+# Open a brace group at the very start of part-list
+group_start = ET.Element('part-group')
+group_start.set('number', '1')
+group_start.set('type', 'start')
+ET.SubElement(group_start, 'group-symbol').text = 'brace'
+# group-barline accepts yes, no, or Mensurstrich; 'bracket' is not a valid value
+ET.SubElement(group_start, 'group-barline').text = 'yes'
+ET.SubElement(group_start, 'group-time')
+part_list.insert(0, group_start)
+
+# Add <group>1</group> to every score-part
+for sp in part_list.findall('score-part'):
+    g = ET.SubElement(sp, 'group')
+    g.text = '1'
+
+# Close the group
+part_list.append(ET.Element('part-group', number='1', type='stop'))
+
+try:
+    ET.indent(root)
+except AttributeError:
+    pass
+
+tree.write(output_file, encoding='UTF-8', xml_declaration=True)
+print(f'Added brace grouping')
+print(f'Output: {output_file}')
@@ -0,0 +1,202 @@
+#!/usr/bin/env python3
+"""
+audiveris_to_musescore.py
+
+Full pipeline for sheet-music recognition (OMR) with Audiveris.
+
+What it does:
+1. Calls Audiveris to recognize a sheet-music image or PDF
+2. Outputs a MusicXML file in .mxl format
+
+Usage:
+    python audiveris_to_musescore.py <input_image_or_pdf> [output_dir]
+
+Examples:
+    python audiveris_to_musescore.py "D:/scores/sheet.png" "D:/output"
+
+    # Process a PDF
+    python audiveris_to_musescore.py "D:/scores/piece.pdf" "D:/output"
+
+Dependencies:
+    - Audiveris (https://github.com/Audiveris/audiveris)
+    - Python 3.8+
+
+Installation:
+    # Windows
+    winget install --id Audiveris.Audiveris --accept-package-agreements --accept-source-agreements
+
+    # Or download the MSI manually: https://github.com/Audiveris/audiveris/releases
+"""
+
+import sys
+import os
+import subprocess
+import tempfile
+import shutil
+import zipfile
+import argparse
+
+
+# Default Audiveris install path
+AUDIVERIS_PATH = r"C:\Program Files\Audiveris\Audiveris.exe"
+
+
+def find_audiveris() -> str:
+    """Locate the Audiveris executable."""
+    # Check the default path first, then other common install locations
+    possible_paths = [
+        AUDIVERIS_PATH,
+        r"C:\Program Files (x86)\Audiveris\Audiveris.exe",
+        os.path.expanduser(r"~\AppData\Local\Programs\Audiveris\Audiveris.exe"),
+    ]
+
+    for path in possible_paths:
+        if os.path.exists(path):
+            return path
+
+    raise FileNotFoundError(
+        "Audiveris not found. Please install from:\n"
+        "  winget install --id Audiveris.Audiveris\n"
+        "  or download from: https://github.com/Audiveris/audiveris/releases"
+    )
+
+
+def extract_musicxml_from_mxl(mxl_path: str, output_dir: str) -> "str | None":
+    """Extract the MusicXML score from an .mxl archive; return None if none is found."""
+    with zipfile.ZipFile(mxl_path, 'r') as z:
+        for name in z.namelist():
+            # Skip META-INF/container.xml: it is the archive manifest, not the score
+            if name.endswith('.xml') and not name.startswith('META-INF/'):
+                xml_path = os.path.join(output_dir, os.path.basename(name))
+                with z.open(name) as src:
+                    content = src.read()
+                with open(xml_path, 'wb') as dst:
+                    dst.write(content)
+                return xml_path
+    return None
+
+
+def process_with_audiveris(input_path: str, output_dir: str) -> dict:
+    """
+    Process the input file with Audiveris.
+
+    Returns:
+        dict with keys:
+            - mxl_path: path to .mxl output
+            - xml_path: path to extracted .xml (if available)
+            - omr_path: path to .omr project file
+    """
+    if not os.path.exists(input_path):
+        raise FileNotFoundError(f"Input file not found: {input_path}")
+
+    os.makedirs(output_dir, exist_ok=True)
+
+    audiveris_exe = find_audiveris()
+    print(f"[Audiveris] Found at: {audiveris_exe}")
+
+    # Build the command: -batch -export -output <dir> <input>
+    cmd = [
+        audiveris_exe,
+        '-batch',
+        '-export',
+        '-output', output_dir,
+        input_path
+    ]
+
+    print(f"[Audiveris] Processing: {input_path}")
+    print(f"[Audiveris] Command: {' '.join(cmd)}")
+
+    # Run Audiveris
+    result = subprocess.run(
+        cmd,
+        capture_output=True,
+        text=True,
+        encoding='utf-8',
+        errors='replace'
+    )
+
+    if result.returncode != 0:
+        print(f"[Audiveris] STDERR:\n{result.stderr}")
+        raise RuntimeError(f"Audiveris failed with code {result.returncode}")
+
+    print(f"[Audiveris] Processing complete!")
+    print(f"[Audiveris] Output directory: {output_dir}")
+
+    # Locate the generated output files
+    base_name = os.path.splitext(os.path.basename(input_path))[0]
+
+    # Expected output files
+    mxl_path = os.path.join(output_dir, f"{base_name}.mxl")
+    omr_path = os.path.join(output_dir, f"{base_name}.omr")
+
+    # List the contents of the output directory
+    print(f"[Audiveris] Files in output directory:")
+    for f in os.listdir(output_dir):
+        print(f"  - {f}")
+
+    return {
+        'mxl_path': mxl_path if os.path.exists(mxl_path) else None,
+        'xml_path': None,  # extracted later
+        'omr_path': omr_path if os.path.exists(omr_path) else None,
+        'output_dir': output_dir,
+        'base_name': base_name
+    }
+
+
+def main():
+    parser = argparse.ArgumentParser(
+        description="OMR using Audiveris - Convert sheet music images to MusicXML",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+    python audiveris_to_musescore.py "D:\\scores\\sheet.png" "D:\\output"
+    python audiveris_to_musescore.py "D:\\scores\\piece.pdf"
+        """
+    )
+    parser.add_argument('input', help="Input image (PNG/JPG) or PDF file")
+    parser.add_argument('output_dir', nargs='?', default=None,
+                        help="Output directory (default: audiveris_output next to the input)")
+
+    args = parser.parse_args()
+
+    input_path = os.path.abspath(args.input)
+
+    if args.output_dir:
+        output_dir = os.path.abspath(args.output_dir)
+    else:
+        # Default: an audiveris_output folder next to the input file
+        output_dir = os.path.join(os.path.dirname(input_path), 'audiveris_output')
+
+    try:
+        result = process_with_audiveris(input_path, output_dir)
+
+        print()
+        print("=" * 50)
+        print("Processing complete!")
+        print("=" * 50)
+
+        if result['mxl_path']:
+            print(f"MusicXML (.mxl): {result['mxl_path']}")
+            # Extract the plain XML for inspection
+            xml_path = extract_musicxml_from_mxl(
+                result['mxl_path'],
+                output_dir
+            )
+            if xml_path:
+                print(f"MusicXML (.xml):  {xml_path}")
+
+        if result['omr_path']:
+            print(f"Project (.omr):    {result['omr_path']}")
+
+        print()
+        print("Open .mxl file in MuseScore to verify and edit.")
+
+    except Exception as e:
+        print(f"Error: {e}")
+        sys.exit(1)
+
+
+if __name__ == '__main__':
+    main()
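The pipeline script above looks for `<base_name>.mxl` and `<base_name>.omr` directly inside the output directory and reports nothing if they are elsewhere. Assuming that some Audiveris versions or settings may write results into a subfolder (an assumption, not something the script confirms), a more tolerant lookup could walk the whole output tree. `find_outputs` below is a hypothetical helper sketched for illustration only.

```python
import os
import tempfile


def find_outputs(output_dir: str, extensions=(".mxl", ".omr")) -> dict:
    """Map each extension to the matching files found anywhere under output_dir."""
    found = {ext: [] for ext in extensions}
    for folder, _subdirs, files in os.walk(output_dir):
        for name in sorted(files):
            ext = os.path.splitext(name)[1].lower()
            if ext in found:
                found[ext].append(os.path.join(folder, name))
    return found


# Demo on a throwaway directory tree with one nested result
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sheet"))
open(os.path.join(root, "sheet", "sheet.mxl"), "wb").close()
open(os.path.join(root, "sheet.omr"), "wb").close()

outputs = find_outputs(root)
print(len(outputs[".mxl"]), len(outputs[".omr"]))  # 1 1
```

The script already prints the directory listing after each run, so that listing is the quickest way to see whether a recursive lookup is needed at all.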
@@ -0,0 +1,4 @@
+#!/usr/bin/env python3
+"""Example script - delete if not needed."""
+
+print("Hello from skill!")
@@ -0,0 +1,133 @@
+#!/usr/bin/env python3
+"""
+extract_pdf_images.py
+
+Extract sheet-music images from a PDF for later OMR processing.
+
+What it does:
+1. Opens the PDF file
+2. Extracts the images on each page
+3. Saves them as PNG files
+
+Usage:
+    python extract_pdf_images.py <pdf_path> [output_dir]
+
+Examples:
+    python extract_pdf_images.py "D:/scores/sheet.pdf"
+
+    # Extract to a specific directory
+    python extract_pdf_images.py "D:/scores/sheet.pdf" "D:/output/sheets"
+
+Dependencies:
+    pip install pymupdf
+"""
+
+import os
+import sys
+import argparse
+
+try:
+    import fitz  # PyMuPDF
+except ImportError:
+    print("Error: PyMuPDF not installed.")
+    print("Install with: pip install pymupdf")
+    sys.exit(1)
+
+
+def extract_images_from_pdf(pdf_path: str, output_dir: str = None) -> list:
+    """
+    Extract all images from a PDF.
+
+    Args:
+        pdf_path: path to the PDF file
+        output_dir: output directory; defaults to temp/pdf_sheets next to the PDF
+
+    Returns:
+        List of paths to the extracted images
+    """
+    if not os.path.exists(pdf_path):
+        raise FileNotFoundError(f"PDF not found: {pdf_path}")
+
+    if output_dir is None:
+        output_dir = os.path.join(os.path.dirname(pdf_path), 'temp', 'pdf_sheets')
+
+    os.makedirs(output_dir, exist_ok=True)
+
+    doc = fitz.open(pdf_path)
+    print(f"[PDF] Opened: {pdf_path}")
+    print(f"[PDF] Pages: {len(doc)}")
+
+    extracted = []
+
+    for page_num in range(len(doc)):
+        page = doc[page_num]
+        images = page.get_images()
+        print(f"[PDF] Page {page_num + 1}: {len(images)} image(s)")
+
+        for img_idx, img in enumerate(images):
+            xref = img[0]
+            pix = fitz.Pixmap(doc, xref)
+
+            # Handle the colour mode: convert CMYK (4+ colour channels) to RGB;
+            # RGB and grayscale pixmaps are saved as they are
+            if pix.n - pix.alpha >= 4:
+                pix = fitz.Pixmap(fitz.csRGB, pix)
+
+            out_path = os.path.join(
+                output_dir,
+                f"page{page_num+1:03d}_img{img_idx+1:02d}.png"
+            )
+            pix.save(out_path)
+
+            print(f"  -> {out_path}")
+            extracted.append(out_path)
+
+    doc.close()
+    return extracted
+
+
+def main():
+    parser = argparse.ArgumentParser(
+        description="Extract images from PDF for OMR processing",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+    python extract_pdf_images.py "D:\\scores\\sheet.pdf"
+    python extract_pdf_images.py "D:\\scores\\sheet.pdf" "D:\\output\\sheets"
+        """
+    )
+    parser.add_argument('pdf', help="Input PDF file")
+    parser.add_argument('output_dir', nargs='?', default=None,
+                        help="Output directory (default: temp/pdf_sheets in PDF dir)")
+
+    args = parser.parse_args()
+
+    pdf_path = os.path.abspath(args.pdf)
+    output_dir = os.path.abspath(args.output_dir) if args.output_dir else None
+
+    try:
+        images = extract_images_from_pdf(pdf_path, output_dir)
+
+        print()
+        print("=" * 50)
+        print(f"Extracted {len(images)} image(s)")
+        print("=" * 50)
+
+        if images:
+            print("\nNext steps:")
+            print(f"  1. Audiveris: & 'C:\\Program Files\\Audiveris\\Audiveris.exe' -batch -export -output <dir> {' '.join(images)}")
+            print(f"  2. Or use the audiveris_to_musescore.py script for each image")
+
+    except Exception as e:
+        print(f"Error: {e}")
+        sys.exit(1)
+
+
+if __name__ == '__main__':
+    main()
@@ -0,0 +1,334 @@
+#!/usr/bin/env python3
+"""
+grand_staff_merge.py
+
+Post-process homr's dual-part MusicXML output into a proper piano grand staff.
+
+What it does:
+1. Merges P1 (treble/G clef) and P2 (bass/F clef) into a single Part
+2. Adds <staves>2</staves> and <bracket type="brace"/> for piano grand staff
+3. Treble goes on staff=1/voice=1 (upper), bass goes on staff=2/voice=2 (lower)
+4. Removes ALL <print> elements (MuseScore ignores them and causes extra systems)
+5. Preserves note durations, pitches, dynamics — only structural changes
+
+Usage:
+    python grand_staff_merge.py input.musicxml [output.musicxml]
+"""
+
+import sys
+import xml.etree.ElementTree as ET
+
+# Register namespaces to preserve them in output
+ET.register_namespace("m", "http://www.musescore.org/ns/mscore")
+ET.register_namespace("xlink", "http://www.w3.org/1999/xlink")
+ET.register_namespace(
+    "mx", "http://schemas.openxmlformats.org/markup-compatibility/2006"
+)
+
+
+def remove_print_elements(part_elem):
+    """Remove all <print> elements from a part (they cause extra system breaks in MuseScore)."""
+    removed = 0
+    for print_elem in part_elem.findall(".//print"):
+        for parent in part_elem.iter():
+            if print_elem in list(parent):
+                parent.remove(print_elem)
+                removed += 1
+                break
+    return removed
+
+
+def make_attributes_grand_staff(attrs_elem):
+    """Modify attributes element for grand staff: add staves, bracket, correct clefs."""
+    # Remove existing clef (will add both below)
+    existing_clef = attrs_elem.find("clef")
+    if existing_clef is not None:
+        attrs_elem.remove(existing_clef)
+
+    # Add <staves>2</staves> after <divisions>, <key> and <time>:
+    # MusicXML fixes the child order of <attributes>
+    staves_elem = ET.Element("staves")
+    staves_elem.text = "2"
+    idx = 0
+    for i, child in enumerate(list(attrs_elem)):
+        if child.tag in ("divisions", "key", "time"):
+            idx = i + 1
+    attrs_elem.insert(idx, staves_elem)
+
+    # MusicXML clef order = visual top-to-bottom order
+    # Grand staff: G clef (treble) is visually at the TOP, so it comes FIRST in XML
+    # F clef (bass) is visually at the BOTTOM, so it comes SECOND
+    #
+    # CRITICAL: Staff numbering in MuseScore grand staff:
+    #   staff=1 = upper staff = treble (G clef)
+    #   staff=2 = lower staff = bass (F clef)
+    treble_clef = ET.SubElement(attrs_elem, "clef")
+    treble_sign = ET.SubElement(treble_clef, "sign")
+    treble_sign.text = "G"
+    treble_line = ET.SubElement(treble_clef, "line")
+    treble_line.text = "2"
+    treble_staff_elem = ET.SubElement(treble_clef, "staff")
+    treble_staff_elem.text = "1"  # Upper staff = staff=1
+
+    bass_clef = ET.SubElement(attrs_elem, "clef")
+    bass_sign = ET.SubElement(bass_clef, "sign")
+    bass_sign.text = "F"
+    bass_line = ET.SubElement(bass_clef, "line")
+    bass_line.text = "4"
+    bass_staff_elem = ET.SubElement(bass_clef, "staff")
+    bass_staff_elem.text = "2"  # Lower staff = staff=2
+
+    # Add bracket for piano brace (at end of attributes, after clefs)
+    bracket_elem = ET.SubElement(attrs_elem, "bracket")
+    bracket_elem.set("type", "brace")
+    bracket_elem.set("number", "1")
+    bracket_elem.text = ""
+
+
+def make_attributes_no_print(attrs_elem):
+    """Keep existing attributes but ensure no print element sibling."""
+    pass  # print elements handled at part level
+
+
+def process_part_measure(measure_elem, voice_map):
+    """
+    Process a measure from P1 or P2.
+    voice_map: {'staff': 1 or 2, 'voice': 1 or 2}
+    """
+    # Remove <print> elements from this measure
+    print_elem = measure_elem.find("print")
+    if print_elem is not None:
+        measure_elem.remove(print_elem)
+
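+    for note in measure_elem.findall("note"):
+        # Update <voice> when the note has one
+        voice_elem = note.find("voice")
+        if voice_elem is not None:
+            voice_elem.text = str(voice_map["voice"])
+
+        # Update <staff>, creating it when missing (single-staff parts may
+        # omit it). Schema order puts <staff> before <beam>, <notations>,
+        # <lyric> and <play>.
+        staff_elem = note.find("staff")
+        if staff_elem is None:
+            staff_elem = ET.Element("staff")
+            later = [c for c in note if c.tag in ("beam", "notations", "lyric", "play")]
+            idx = list(note).index(later[0]) if later else len(note)
+            note.insert(idx, staff_elem)
+        staff_elem.text = str(voice_map["staff"])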
+
+
+def merge_parts_to_grand_staff(xml_path, output_path=None):
+    """Main function: merge homr dual-part output into grand staff."""
+    if output_path is None:
+        output_path = xml_path.replace(".musicxml", "_grandstaff.musicxml")
+
+    tree = ET.parse(xml_path)
+    root = tree.getroot()
+
+    # Find part-list and part elements
+    part_list = root.find("part-list")
+    parts = root.findall("part")
+
+    if len(parts) < 2:
+        print(f"Warning: Expected 2 parts, found {len(parts)}. Nothing to merge.")
+        return
+
+    p1 = parts[0]  # treble (G clef)
+    p2 = parts[1]  # bass (F clef)
+
+    # Determine which is which by clef sign
+    def get_clef(part):
+        for m in part.findall("measure"):
+            clef = m.find(".//clef")
+            if clef is not None:
+                sign = clef.find("sign")
+                if sign is not None:
+                    return sign.text
+        return None
+
+    p1_clef = get_clef(p1)
+    p2_clef = get_clef(p2)
+
+    print(f"P1 clef: {p1_clef}, P2 clef: {p2_clef}")
+
+    # Identify treble and bass parts
+    if p1_clef == "G" and p2_clef == "F":
+        treble_part = p1
+        bass_part = p2
+        print("Detected: P1=treble, P2=bass")
+    elif p1_clef == "F" and p2_clef == "G":
+        treble_part = p2
+        bass_part = p1
+        print("Detected: P1=bass, P2=treble")
+    else:
+        # Default: P1=treble, P2=bass (homr's typical output)
+        treble_part = p1
+        bass_part = p2
+        print(f"Could not determine clefs definitively, defaulting: P1=treble, P2=bass")
+
+    # === Step 1: Update part-list to single Part ===
+    # Replace with single score-part
+    new_part_list = ET.Element("part-list")
+    score_part = ET.SubElement(new_part_list, "score-part")
+    score_part.set("id", "P1")
+    part_name = ET.SubElement(score_part, "part-name")
+    part_name.text = "Piano"
+    part_name.set("print-object", "yes")
+
+    # Add midi-instrument
+    midi_inst = ET.SubElement(score_part, "midi-instrument")
+    midi_inst.set("id", "P1-I1")
+    midi_ch = ET.SubElement(midi_inst, "midi-channel")
+    midi_ch.text = "1"
+    midi_prog = ET.SubElement(midi_inst, "midi-program")
+    midi_prog.text = "1"
+    vol = ET.SubElement(midi_inst, "volume")
+    vol.text = "100"
+    pan = ET.SubElement(midi_inst, "pan")
+    pan.text = "0"
+
+    # Replace old part-list
+    if part_list is not None:
+        root.remove(part_list)
+    root.insert(list(root).index(p1), new_part_list)
+
+    # === Step 2: Update bass_part measure 1 attributes ===
+    measures_bass = bass_part.findall("measure")
+    if measures_bass:
+        first_measure_bass = measures_bass[0]
+        attrs = first_measure_bass.find("attributes")
+        if attrs is not None:
+            make_attributes_grand_staff(attrs)
+
+    # Remove <print> from ALL measures in both parts
+    for p in [bass_part, treble_part]:
+        for m in p.findall("measure"):
+            print_e = m.find("print")
+            if print_e is not None:
+                m.remove(print_e)
+
+    # === Step 3: Change treble notes to staff=1 (upper), voice=1 ===
+    for m in treble_part.findall("measure"):
+        process_part_measure(m, {"staff": 1, "voice": 1})
+
+    # === Step 4: Change bass notes to staff=2 (lower), voice=2 ===
+    for m in bass_part.findall("measure"):
+        process_part_measure(m, {"staff": 2, "voice": 2})
+
+    # === Step 5: Interleave measures from both parts ===
+    # Create a new combined part
+    combined = ET.Element("part")
+    combined.set("id", "P1")
+
+    bass_measures = bass_part.findall("measure")
+    treble_measures = treble_part.findall("measure")
+
+    # Also remove print from treble first measure if it wasn't already removed
+    if treble_measures:
+        print_e = treble_measures[0].find("print")
+        if print_e is not None:
+            treble_measures[0].remove(print_e)
+
+    # Add first measure of treble (with attributes) to combined first
+    # But we already added grand staff attributes to bass first measure
+    # So treble first measure's attributes should be removed
+    treble_attrs_removed = 0
+    if treble_measures:
+        treble_first_attrs = treble_measures[0].find("attributes")
+        if treble_first_attrs is not None:
+            treble_measures[0].remove(treble_first_attrs)
+            treble_attrs_removed += 1
+
+    # For measures that appear in both, interleave notes
+    # Strategy: for each measure number, combine notes from both parts
+    bass_by_num = {m.get("number"): m for m in bass_measures}
+    treble_by_num = {m.get("number"): m for m in treble_measures}
+
+    all_measure_nums = sorted(
+        set(list(bass_by_num.keys()) + list(treble_by_num.keys())),
+        key=lambda x: int(x) if x is not None and x.isdigit() else 0,
+    )
+
+    for mnum in all_measure_nums:
+        bass_m = bass_by_num.get(mnum)
+        treble_m = treble_by_num.get(mnum)
+
+        # Create new combined measure
+        new_measure = ET.Element("measure")
+        if mnum is not None:
+            new_measure.set("number", mnum)
+
+        # Copy attributes from bass (which has grand staff setup)
+        if bass_m is not None:
+            attrs = bass_m.find("attributes")
+            if attrs is not None:
+                new_measure.append(ET.fromstring(ET.tostring(attrs)))
+
+        # Copy direction/direction-type from treble first measure (for tempo, etc.)
+        if treble_m is not None and mnum == "1":
+            for child in list(treble_m):
+                if child.tag not in (
+                    "note",
+                    "backup",
+                    "forward",
+                    "attributes",
+                    "barline",
+                    "print",
+                ):
+                    new_measure.append(ET.fromstring(ET.tostring(child)))
+
+        # Add treble content (staff=1, upper) in document order, tracking
+        # how far the time cursor advances (assumes both parts share <divisions>)
+        elapsed = 0
+        if treble_m is not None:
+            for child in treble_m:
+                if child.tag not in ("note", "backup", "forward"):
+                    continue
+                new_measure.append(ET.fromstring(ET.tostring(child)))
+                dur = round(float(child.findtext("duration") or 0))
+                if child.tag == "backup":
+                    elapsed -= dur
+                elif child.find("chord") is None:
+                    elapsed += dur  # chord notes do not advance the cursor
+
+        # Rewind to the start of the measure so the bass sounds simultaneously
+        if bass_m is not None and elapsed > 0:
+            rewind = ET.SubElement(new_measure, "backup")
+            ET.SubElement(rewind, "duration").text = str(elapsed)
+
+        # Add bass content (staff=2, lower) after the rewind, in document order
+        if bass_m is not None:
+            for child in bass_m:
+                if child.tag in ("note", "backup", "forward", "barline"):
+                    new_measure.append(ET.fromstring(ET.tostring(child)))
+
+        combined.append(new_measure)
+
+    # === Step 6: Replace original parts with combined ===
+    root.remove(p1)
+    if p2 != p1:
+        try:
+            root.remove(p2)
+        except ValueError:
+            pass
+    root.append(combined)
+
+    # === Step 7: Write output ===
+    # ET.indent for pretty printing (Python 3.9+)
+    try:
+        ET.indent(root)
+    except AttributeError:
+        pass  # Python < 3.9
+
+    tree.write(output_path, encoding="UTF-8", xml_declaration=True)
+
+    print(f"\nGrand staff conversion complete!")
+    print(f"   Input:  {xml_path}")
+    print(f"   Output: {output_path}")
+    print(f"   Removed <print> elements to prevent extra system breaks")
+    print(f"   Added: <staves>2</staves>, bracket type=brace")
+    print(f"   Treble: staff=1 (upper), voice=1 | Bass: staff=2 (lower), voice=2")
+    print(f"   Measures: {len(all_measure_nums)}")
+
+    return output_path
+
+
+if __name__ == "__main__":
+    if len(sys.argv) < 2:
+        print(__doc__)
+        sys.exit(1)
+
+    input_file = sys.argv[1]
+    output_file = sys.argv[2] if len(sys.argv) > 2 else None
+
+    merge_parts_to_grand_staff(input_file, output_file)
@@ -0,0 +1,183 @@
+#!/usr/bin/env python3
+"""
+homr_to_musescore.py
+
+Full pipeline: homr → MusicXML → a format MuseScore can use.
+
+What it does:
+1. Calls homr to recognize a sheet-music image
+2. Removes <print> elements (fixes wrong system breaks)
+3. Adds brace grouping (piano grand staff)
+
+Usage:
+    python homr_to_musescore.py image.png [output.musicxml]
+
+Dependencies:
+    - homr (pip install homr)
+    - Python 3.8+ (standard-library xml.etree.ElementTree)
+
+Environment:
+    Relies on the base conda environment, which already includes homr:
+    conda activate base
+"""
+
+import sys
+import os
+import tempfile
+import shutil
+import xml.etree.ElementTree as ET
+
+# Check that the homr dependency is available
+try:
+    from homr.main import process_image, ProcessingConfig, XmlGeneratorArguments
+    HAS_HOMR = True
+except ImportError:
+    HAS_HOMR = False
+
+
+def step1_homr(image_path: str) -> str:
+    """Step 1: Run homr on the image and return the path of the MusicXML output."""
+    print(f"[Step 1] Running homr OCR on: {image_path}")
+
+    if not HAS_HOMR:
+        print("Error: homr not installed. Run: pip install homr")
+        sys.exit(1)
+
+    # homr cannot handle non-ASCII (e.g. Chinese) paths, so run it on an ASCII-named temp copy
+    tmp_dir = tempfile.mkdtemp(prefix='homr_')
+    tmp_image = os.path.join(tmp_dir, 'input' + os.path.splitext(image_path)[1])
+    shutil.copy(image_path, tmp_image)
+    try:
+        config = ProcessingConfig(False, False, False, False, -1)
+        process_image(tmp_image, config, XmlGeneratorArguments(False, None, None))
+        # homr writes the .musicxml beside its input; copy it out before cleanup
+        musicxml_path = os.path.splitext(image_path)[0] + '.musicxml'
+        shutil.copy(os.path.join(tmp_dir, 'input.musicxml'), musicxml_path)
+        print(f"[Step 1] Done: {musicxml_path}")
+        return musicxml_path
+    finally:
+        shutil.rmtree(tmp_dir, ignore_errors=True)
+
+
+def step2_remove_print(input_path: str, output_path: str) -> None:
+    """Step 2: Remove all <print> elements."""
+    print(f"[Step 2] Removing <print> elements from: {input_path}")
+
+    ET.register_namespace('m', 'http://www.musescore.org/ns/mscore')
+    ET.register_namespace('xlink', 'http://www.w3.org/1999/xlink')
+
+    tree = ET.parse(input_path)
+    root = tree.getroot()
+    removed_count = 0
+
+    for elem in root.iter():
+        to_remove = [c for c in list(elem) if c.tag == 'print']
+        for c in to_remove:
+            elem.remove(c)
+            removed_count += 1
+
+    try:
+        ET.indent(root)
+    except AttributeError:
+        pass
+
+    tree.write(output_path, encoding='UTF-8', xml_declaration=True)
+    print(f"[Step 2] Removed {removed_count} <print> element(s)")
+
+
+def step3_add_brace(input_path: str, output_path: str) -> None:
+    """Step 3: Add brace grouping."""
+    print(f"[Step 3] Adding brace grouping to: {input_path}")
+
+    ET.register_namespace('m', 'http://www.musescore.org/ns/mscore')
+    ET.register_namespace('xlink', 'http://www.w3.org/1999/xlink')
+
+    tree = ET.parse(input_path)
+    root = tree.getroot()
+    part_list = root.find('part-list')
+
+    if part_list is None:
+        print("[Step 3] Warning: no <part-list> found, skipping brace")
+        shutil.copy(input_path, output_path)
+        return
+
+    group_start = ET.Element('part-group')
+    group_start.set('number', '1')
+    group_start.set('type', 'start')
+    ET.SubElement(group_start, 'group-symbol').text = 'brace'
+    ET.SubElement(group_start, 'group-barline').text = 'yes'  # valid values: yes, no, Mensurstrich
+    ET.SubElement(group_start, 'group-time')
+    part_list.insert(0, group_start)
+
+    for sp in part_list.findall('score-part'):
+        g = ET.SubElement(sp, 'group')
+        g.text = '1'
+
+    part_list.append(ET.Element('part-group', number='1', type='stop'))
+
+    try:
+        ET.indent(root)
+    except AttributeError:
+        pass
+
+    tree.write(output_path, encoding='UTF-8', xml_declaration=True)
+    print(f"[Step 3] Done")
+
+
+def main():
+    if len(sys.argv) < 2:
+        print(__doc__)
+        sys.exit(1)
+
+    image_path = sys.argv[1]
+    if not os.path.exists(image_path):
+        print(f"Error: file not found: {image_path}")
+        sys.exit(1)
+
+    # Determine the output file name
+    if len(sys.argv) > 2:
+        output_path = sys.argv[2]
+    else:
+        base = os.path.splitext(os.path.basename(image_path))[0]
+        output_path = os.path.join(os.path.dirname(image_path), f"{base}_final.musicxml")
+
+    # Create a temp directory for intermediate files
+    tmp_dir = tempfile.mkdtemp(prefix='homr_pipeline_')
+
+    try:
+        # Step 1: homr recognition
+        raw_xml = os.path.join(tmp_dir, 'raw.musicxml')
+        step1_homr(image_path)
+
+        # homr writes a .musicxml file next to the image
+        expected = os.path.splitext(image_path)[0] + '.musicxml'
+        if os.path.exists(expected):
+            raw_xml = expected
+        else:
+            # Otherwise look for it in tmp_dir
+            candidates = [f for f in os.listdir(tmp_dir) if f.endswith('.musicxml')]
+            if candidates:
+                raw_xml = os.path.join(tmp_dir, candidates[0])
+
+        if not os.path.exists(raw_xml):
+            print("Error: homr did not produce an output file")
+            sys.exit(1)
+
+        # Step 2: remove <print> elements
+        no_print_xml = os.path.join(tmp_dir, 'no_print.musicxml')
+        step2_remove_print(raw_xml, no_print_xml)
+
+        # Step 3: add the brace
+        step3_add_brace(no_print_xml, output_path)
+
+        print()
+        print(f"Pipeline complete!")
+        print(f"Final output: {output_path}")
+        print(f"Open in MuseScore to verify: 5 measures per line, brace on grand staff")
+
+    finally:
+        shutil.rmtree(tmp_dir, ignore_errors=True)
+
+
+if __name__ == '__main__':
+    main()
@@ -0,0 +1,32 @@
+#!/usr/bin/env python3
+"""Remove all <print> elements from a MusicXML file."""
+import os
+import sys
+import xml.etree.ElementTree as ET
+
+if len(sys.argv) < 2:
+    print('Usage: python remove_print.py input.musicxml [output.musicxml]')
+    sys.exit(1)
+input_file = sys.argv[1]
+output_file = sys.argv[2] if len(sys.argv) > 2 else os.path.splitext(input_file)[0] + '_no_print.musicxml'
+
+ET.register_namespace('m', 'http://www.musescore.org/ns/mscore')
+ET.register_namespace('xlink', 'http://www.w3.org/1999/xlink')
+
+tree = ET.parse(input_file)
+root = tree.getroot()
+removed_count = 0
+for elem in root.iter():
+    to_remove = [c for c in list(elem) if c.tag == 'print']
+    for c in to_remove:
+        elem.remove(c)
+        removed_count += 1
+
+try:
+    ET.indent(root)
+except AttributeError:
+    pass
+
+tree.write(output_file, encoding='UTF-8', xml_declaration=True)
+print(f'Removed {removed_count} <print> element(s)')
+print(f'Output: {output_file}')