Refactor: rewrite GUI, add run.bat, update docs
- New GUI (gui.py) calls same core functions as CLI
- Add run.bat for parameterized CLI usage
- Simplify run_lesson1.bat to just call run.bat
- Update README and ARCHITECTURE docs
- Add LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
@@ -1,105 +1,101 @@
# 🎹 Piano Highlight Generator
# Lesson Highlights Generator

A tool for generating highlight videos from piano lessons. It automatically extracts highlight clips from a full lesson video, transcribes them, corrects errors, generates subtitles, and burns the subtitles into the video in batch.
A tool for generating highlight clips from teaching videos. Given a lesson video and its PPT, it automatically extracts highlight clips, transcribes them, corrects errors, generates subtitles, and burns the subtitles into the video in batch.

## ✨ Features
## Features

- **Smart extraction**: automatically detects highlight segments in the video
- **Speech transcription**: supports multiple Whisper models (tiny/base/small/medium/large)
- **AI correction**: an LLM automatically corrects transcription errors and refines titles
- **Bilingual subtitles**: supports dual-track subtitles (title track + content track)
- **State persistence**: supports pause/resume, so an interrupted run can be continued
- **Manual editing**: titles and subtitle content can be reviewed and edited by hand before generation
- **PPT-driven extraction**: locates the explanation segments in the video based on the knowledge points in the PPT
- **Transcription + correction**: Whisper transcription + batch LLM correction
- **Dual-track subtitles**: title track + content track
- **CLI / GUI dual entry points**: both share the same underlying logic

## 📋 System Requirements
## Quick Start

- Windows 10/11 or macOS 10.15+
- Python 3.10+
- FFmpeg (required; must be on PATH)
### 1. Configuration

## 🚀 Quick Start

### 1. Installation
Copy the configuration file and fill in the API key:

```bash
# Clone the project
git clone <repo-url>
cd piano-highlight-app
cp config.ini.example config.ini
# Edit config.ini and fill in api_key
```

# Create a virtual environment (recommended)
python -m venv venv
.\venv\Scripts\activate # Windows
source venv/bin/activate # Linux/macOS
### 2. Install Dependencies

# Install dependencies
```bash
pip install -r requirements.txt

# Install FFmpeg (Windows, using winget)
winget install Gyan.FFmpeg

# Or on macOS
brew install ffmpeg
```

### 2. Run
### 3. Run

**GUI (recommended):**
```bash
python src/main.py
.\start.bat
```

### 3. Configuration
**CLI:**
```bash
python src/cli.py --video video.mp4 --ppt presentation.pptx --output ./output
```

The first run requires configuration:
1. **API settings**: choose an API provider (DeepSeek / SiliconFlow (硅基流动)) and enter the API key
2. **Video settings**: choose the input video and the output directory
3. **Transcription settings**: choose a Whisper model (medium is recommended)
Or use the example script:
```bash
.\run_lesson1.bat
```

### 4. Generate
## Project Structure

1. Click "Start Processing"
2. Wait for each step to finish
3. **Title confirmation**: after the LLM generates the titles, review and edit them
4. **Subtitle confirmation**: review the subtitle content and edit it further if needed
5. Wait for the subtitle burn-in to finish
```
lesson-highlights/
├── src/
│   ├── main.py              # GUI entry point
│   ├── gui.py               # GUI (parameter input, calls the core)
│   ├── cli.py               # CLI entry point
│   └── core/                # shared core
│       ├── ppt_parser.py    # PPT parsing + clips generation
│       ├── pipeline.py      # video processing pipeline
│       ├── subtitle.py      # subtitle generation
│       └── ...
├── config.ini               # API configuration (not committed to git)
├── config.ini.example       # configuration template
├── start.bat                # launches the GUI
└── run_lesson1.bat          # CLI example
```

## 📁 Output Files
## Workflow

1. **PPT parsing**: extract the PPT text and knowledge points
2. **Whisper transcription**: convert the video's speech to text
3. **LLM correction**: correct transcription errors in batch
4. **Clip extraction**: locate video segments based on the PPT knowledge points
5. **Subtitle burn-in**: generate dual-track subtitles and burn them into the video
6. **Merged output**: concatenate all clips into the final video

## API Configuration

Edit `config.ini`:

```ini
[api]
api_host = "https://ark.cn-beijing.volces.com/api/coding/v3"
api_key = "your_api_key_here"
```

Supports Volcano Engine Ark (火山方舟, doubao-seed-2.0-lite) or any backend compatible with the OpenAI API.
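
A minimal sketch of how Python code might read these two values, assuming the `[api]` section and the quoted values shown in the snippet above. `load_api_config` is a hypothetical helper for illustration, not a function from this project. `configparser` returns values with their surrounding quotes intact, so the sketch strips them by hand.

```python
import configparser


def load_api_config(text: str) -> dict:
    """Parse config.ini content and return api_host / api_key without quotes."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["api"]
    # configparser does not unquote values, so strip the double quotes manually.
    return {key: section[key].strip().strip('"') for key in ("api_host", "api_key")}


# Sample content mirroring the config.ini snippet above.
SAMPLE = '''
[api]
api_host = "https://ark.cn-beijing.volces.com/api/coding/v3"
api_key = "your_api_key_here"
'''

config = load_api_config(SAMPLE)
print(config["api_host"])
```

If the real file omits the quotes, the same helper still works, since stripping is a no-op on unquoted values.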

## Output

```
output/
├── state.json               # processing state
├── clips/                   # extracted clips
│   └── clip_001.mp4
├── generated_config.yaml    # generated clips config
├── clips/                   # extracted clip videos
├── subtitles/               # subtitle files
│   ├── clip_001_title.srt   # title track
│   └── clip_001_content.srt # content track
└── final/                   # final output
    └── clip_001_final.mp4
└── final.mp4                # final output
```

## 🔧 Pipeline Steps
## System Requirements

1. **extract** - clip extraction
2. **transcribe** - speech transcription
3. **title_correct** - title generation and correction
4. **generate_subtitles** - subtitle generation
5. **merge** - clip merging
6. **burn** - subtitle burn-in

## ⚠️ FAQ

### Q: "FFmpeg not found" is reported
A: Make sure FFmpeg is installed and added to the system PATH. Restart the terminal and try again.

### Q: API calls fail
A: Check that the API key is correct and the network is working, or switch API providers.

### Q: Not enough disk space
A: Clean up the output directory or move to a disk with more space.

## 📄 License

MIT License

## 🤝 Contributing

Issues and pull requests are welcome!
- Python 3.10+
- FFmpeg (bundled in the `ffmpeg/` directory)
- PySide6 (GUI)
- faster-whisper (transcription, optional)

+65
-63
@@ -1,90 +1,92 @@
# Architecture Design

## 1. Tech Stack
## 1. Core Principles

| Layer | Technology | Rationale |
|------|------|----------|
| GUI framework | PySide6 (Qt for Python) | LGPL license, full-featured; the signal/slot mechanism suits asynchronous updates |
| Packaging tool | Nuitka | Compiles to C; good performance, small size |
| State persistence | JSON files | Simple; no database dependency |
| Core modules | Reuse of existing scripts | video.py, subtitle.py, llm.py, corrections.py |
| Config format | YAML/JSON | User-friendly and readable |
**The CLI and the GUI share the same underlying library**; they differ only in the presentation layer:
- **CLI**: parameters come from command-line arguments; logs go to the terminal
- **GUI**: a PySide6 interface; parameters are entered through the UI, and logs go to a text area

## 2. Project Structure

```
piano-highlight-app/
lesson-highlights/
├── src/
│   ├── main.py                      # GUI entry point
│   ├── gui.py                       # GUI (parameter input → calls the core)
│   ├── cli.py                       # CLI entry point
│   └── core/                        # shared core
│   ├── __init__.py
│   ├── main.py                      # application entry point
│   ├── app.py                       # QMainWindow main window
│   ├── gui/                         # GUI components
│   │   ├── __init__.py
│   │   ├── config_panel.py          # configuration panel
│   │   ├── progress_view.py         # progress monitoring
│   │   ├── title_editor.py          # title editor
│   │   └── log_view.py              # log window
│   ├── logic/                       # business logic
│   │   ├── __init__.py
│   │   ├── config_manager.py        # configuration management
│   │   ├── pipeline_controller.py   # pipeline control
│   │   ├── state_manager.py         # state management
│   │   └── worker.py                # background worker thread
│   └── core/                        # core modules (reused)
│       ├── __init__.py
│       ├── constants.py             # constants
│       ├── utils.py                 # utility functions
│       ├── video.py                 # video processing
│       ├── subtitle.py              # subtitle processing
│       ├── ppt_parser.py            # PPT parsing + LLM clips extraction
│       ├── pipeline.py              # video processing pipeline
│       ├── subtitle.py              # subtitle generation
│       ├── video.py                 # video processing (extract/merge/burn)
│       ├── llm.py                   # LLM calls
│       └── corrections.py           # correction rules
├── assets/                          # resource files
│   └── icons/
├── requirements.txt                 # dependencies
├── pyproject.toml                   # project configuration
├── nuitka_options.py                # Nuitka packaging configuration
└── README.md
│       ├── corrections.py           # terminology correction
│       ├── constants.py             # constants and configuration
│       └── errors.py                # error handling
├── config.ini                       # API configuration (not committed to git)
├── config.ini.example               # configuration template
├── run.bat                          # generic CLI launcher
├── run_lesson1.bat                  # preset lesson example
└── start.bat                        # GUI launcher
```

## 3. Core Class Design
## 3. Core Modules

### StateManager (state management)
### `parse_ppt_to_config()`

Handles state persistence and supports pause/resume.
Runs the complete PPT → clips config flow in one call:

### PipelineController (pipeline control)
1. **PPT parsing**: extract text and knowledge points
2. **Whisper transcription**: video → `full_transcript.json`
3. **LLM correction**: batch correction → `corrected_transcript.json`
4. **LLM clip extraction**: locate video segments based on the knowledge points → clips
5. **Overlap merging**: merge overlapping clips
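
The overlap-merging step can be sketched as a standard interval merge. This is only an illustration, not the project's implementation: `merge_overlapping` is a hypothetical name, and the sketch assumes that `start` and `end` are numbers in seconds, that clips which touch count as overlapping, and that the earlier clip's title is kept.

```python
def merge_overlapping(clips: list[dict]) -> list[dict]:
    """Merge clips whose [start, end] ranges overlap; times are in seconds."""
    merged: list[dict] = []
    for clip in sorted(clips, key=lambda c: c["start"]):
        if merged and clip["start"] <= merged[-1]["end"]:
            # Overlaps the previous clip: extend it and keep the earlier title.
            merged[-1]["end"] = max(merged[-1]["end"], clip["end"])
        else:
            # Copy the dict so the caller's input is not mutated.
            merged.append(dict(clip))
    return merged


clips = [
    {"title": "A", "start": 0.0, "end": 10.0},
    {"title": "B", "start": 8.0, "end": 15.0},
    {"title": "C", "start": 20.0, "end": 25.0},
]
result = merge_overlapping(clips)
print(result)
```

Here clips A and B collapse into one clip spanning 0 to 15 seconds, while C stays separate.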

Manages the 6 steps of the processing flow:
1. extract - clip extraction
2. transcribe - speech transcription
3. title_correct - title generation and correction
4. generate_subtitles - subtitle generation
5. merge - clip merging
6. burn - subtitle burn-in
### `Pipeline`

### Worker (background worker thread)
The video processing pipeline:

Runs the pipeline in a separate thread and communicates with the UI through signals.
1. **extract**: extract clips by timestamp
2. **transcribe**: run Whisper transcription clip by clip
3. **correct_titles**: correct titles with the LLM
4. **generate_subtitles**: generate dual-track subtitles
5. **merge**: merge the clips
6. **burn**: burn in the subtitles
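
A minimal sketch of running the six steps in the listed order. The step names come from the list above; everything else (`run_steps`, the `handlers` mapping, the shared `context` dict) is hypothetical and only illustrates the sequencing, not how `Pipeline` is actually written.

```python
from typing import Callable

# Step names in the order listed above.
STEPS = ("extract", "transcribe", "correct_titles", "generate_subtitles", "merge", "burn")


def run_steps(handlers: dict[str, Callable[[dict], None]], context: dict) -> list[str]:
    """Run each step's handler in order and return the names that ran."""
    done: list[str] = []
    for name in STEPS:
        handlers[name](context)
        done.append(name)
    return done


# Stand-in handlers that only record their name; real steps would call FFmpeg, Whisper, and so on.
context: dict = {"log": []}
handlers = {name: (lambda ctx, n=name: ctx["log"].append(n)) for name in STEPS}
completed = run_steps(handlers, context)
print(completed)
```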

## 4. Pipeline State Machine
## 4. Data Flow

```
Ready → Extracting → Transcribing → Title Correcting → Generating Subtitles → Merging → Burning → Completed
↑                                                              ↓
└───────────── the user can pause and edit titles ─────────────┘
Video + PPT
↓
parse_ppt_to_config()
↓
config = {
    "video_src": ...,
    "clips": [{"title": ..., "start": ..., "end": ...}, ...],
    "output_dir": ...,
    "term_corrections": {...}
}
↓
Pipeline(config).run()
↓
final.mp4
```

## 5. Signal Flow
## 5. Configuration Sources

| Signal | Direction | Description |
|------|------|------|
| config_changed | UI → Controller | Configuration changed |
| progress_signal | Worker → UI | Progress update |
| titles_ready_signal | Worker → UI | Title list is ready |
| titles_confirmed_signal | UI → Controller | Titles confirmed by the user |
API configuration is read from `config.ini` in all cases and is not hard-coded in the source.

## 6. State File Format
The CLI supports overriding it through arguments:
- `--api-key`
- `--api-host`
- `--verbose`
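
A minimal sketch of the override order, assuming that a flag given on the command line wins over the value from `config.ini`. The flag names are the ones listed above; `resolve_api_settings` and the `file_config` dict are hypothetical. `--verbose` is modelled as a bare switch because `run.bat` passes it without a value.

```python
import argparse


def resolve_api_settings(argv: list[str], file_config: dict) -> dict:
    """Return API settings where CLI flags, when given, win over config.ini values."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--api-key", default=None)
    parser.add_argument("--api-host", default=None)
    parser.add_argument("--verbose", action="store_true")
    args = parser.parse_args(argv)
    return {
        # Fall back to the file value when the flag is absent.
        "api_key": args.api_key or file_config.get("api_key"),
        "api_host": args.api_host or file_config.get("api_host"),
        "verbose": args.verbose,
    }


# Placeholder values standing in for what config.ini would provide.
file_config = {"api_key": "from_file", "api_host": "https://example.invalid/v1"}
settings = resolve_api_settings(["--api-key", "from_cli", "--verbose"], file_config)
print(settings)
```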

JSON format, containing the configuration, pipeline state, clips list, and so on.
## 6. State Persistence

See design.md for details.
Intermediate results are saved in `output/intermediates/`:
- `full_transcript.json` - raw transcript
- `corrected_transcript.json` - after LLM correction
- `ppt_knowledge_and_cleaned.json` - PPT knowledge points and cleaned text

On re-runs, existing checkpoints are detected and reused, which avoids repeating LLM calls.
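
The checkpoint check can be sketched as "load the file if it exists, otherwise compute and save". `load_or_compute` and `fake_llm_correction` are hypothetical names used only for this illustration; the file name is taken from the list above, and the JSON layout inside it is an assumption.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def load_or_compute(path: Path, compute: Callable[[], dict]) -> dict:
    """Return the saved JSON at path if present; otherwise compute, save, and return it."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    result = compute()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(result, ensure_ascii=False), encoding="utf-8")
    return result


calls = []


def fake_llm_correction() -> dict:
    calls.append(1)  # count how often the expensive step actually runs
    return {"segments": ["corrected text"]}


with tempfile.TemporaryDirectory() as tmp:
    checkpoint = Path(tmp) / "intermediates" / "corrected_transcript.json"
    first = load_or_compute(checkpoint, fake_llm_correction)
    second = load_or_compute(checkpoint, fake_llm_correction)
```

The second call reads the saved file, so the stand-in for the LLM step runs only once.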
@@ -0,0 +1,91 @@
@echo off
setlocal EnableDelayedExpansion

:: ============================================
:: Lesson Highlights Generator - CLI Runner
:: ============================================
:: Usage:
::   run.bat "video.mp4" "presentation.pptx" "output_dir"
::   run.bat    <- show help
:: ============================================

set "PROJECT_DIR=%~dp0"
set "SRC_DIR=%PROJECT_DIR%src"
set "VENV_DIR=%PROJECT_DIR%venv"
set "PYTHON_SRC=D:\ProgramData\anaconda3\envs\py312_cuda\python.exe"
set "LOG_FILE=%PROJECT_DIR%temp\cli_run_log.txt"

:: Argument check
if "%~1"=="" (
    echo Usage:
    echo   run.bat "video path" "PPT path" "output directory"
    echo.
    echo Example:
    echo   run.bat "D:\Videos\lesson.mp4" "D:\PPT\lesson.pptx" "D:\Output"
    exit /b 1
)

set "VIDEO=%~f1"
set "PPT=%~f2"
set "OUTPUT=%~f3"

:: Verify that the input files exist
if not exist "%VIDEO%" (
    echo Error: video file not found: %VIDEO%
    exit /b 1
)
if not exist "%PPT%" (
    echo Error: PPT file not found: %PPT%
    exit /b 1
)

:: Create the output directory
if not exist "%OUTPUT%" mkdir "%OUTPUT%"

:: Load the API configuration from config.ini
set "API_KEY="
set "API_HOST="
for /f "usebackq tokens=1,2 delims== " %%a in ("%PROJECT_DIR%config.ini") do (
    if "%%a"=="api_key" set "API_KEY=%%b"
    if "%%a"=="api_host" set "API_HOST=%%b"
)

:: Fall back to default values when nothing is configured
if not defined API_KEY (
    echo Warning: api_key not found in config.ini, using default values
set "API_KEY=b0359bed-09f2-49e2-a53c-32ba057412e3"
    set "API_HOST=https://ark.cn-beijing.volces.com/api/coding/v3"
)

:: Put the bundled FFmpeg on PATH
set "FFMPEG_BIN=%PROJECT_DIR%ffmpeg\ffmpeg-8.1-full_build\bin"
set "PATH=%FFMPEG_BIN%;%PATH%"

echo ============================================
echo Lesson Highlights Generator - CLI
echo ============================================
echo Video: %VIDEO%
echo PPT: %PPT%
echo Output: %OUTPUT%
echo ============================================
echo.

:: Clear the previous log
del /f /q "%LOG_FILE%" 2>nul

:: Run the CLI
"%VENV_DIR%\Scripts\python.exe" "%SRC_DIR%\cli.py" ^
    --video "%VIDEO%" ^
    --ppt "%PPT%" ^
    --output "%OUTPUT%" ^
    --api-key "%API_KEY%" ^
    --api-host "%API_HOST%" ^
    --verbose

echo.
echo Exit: %errorlevel%
if errorlevel 1 (
    echo Run failed, see the log for details: %LOG_FILE%
) else (
    echo Done!
)
+5
-12
@@ -1,13 +1,6 @@
@echo off
chcp 65001 >nul
echo Cleaning pycache...
rmdir /s /q "D:\F\NewI\opencode\daily-workspace\projects\piano-highlight-app\src\__pycache__" 2>nul
rmdir /s /q "D:\F\NewI\opencode\daily-workspace\projects\piano-highlight-app\src\core\__pycache__" 2>nul
echo Cache cleaned.
echo.
echo Running CLI...
del "D:\F\NewI\opencode\daily-workspace\temp\cli_run_log.txt" 2>nul
"D:\ProgramData\anaconda3\envs\py312_cuda\python.exe" "D:\F\NewI\opencode\daily-workspace\projects\piano-highlight-app\run_lesson1.py"
echo.
echo Exit: %errorlevel%
pause
:: Preset script: run lesson 1 of the Futian night school (福田夜校) course
call "%~dp0run.bat" ^
    "D:\F\yc\课程上架\福田商圈夜校\课程视频\直播回放-03月18日.mp4" ^
    "D:\F\yc\课程上架\福田商圈夜校\课程视频\钢琴演奏入门第一课.pptx" ^
    "D:\F\NewI\opencode\daily-workspace\projects\piano-lesson-highlights\cases\lesson1\output_cli_full"