Initial: multi-agent XMPP communication system with dashboard

- Platform-based architecture (Windows/Linux/Mac)
- Agent instance registry (agents.yaml)
- Management dashboard with cross-platform monitoring
- xmpp_bot with HTTP bridge + health endpoints
- wechat_agent with WeChat-Hermes bridging
- Platform services: ProcessGuardian, HealthProbe, APIRouter, ChannelBridge
- Deployment: systemd (Linux) + PowerShell (Windows)
- Monitoring: SSH+ejabberdctl for cross-platform presence

# AgentsMeeting Operations Runbook

> Version: v2.0 | Date: 2026-06-12

---

## Routine Checks

### Dashboard

Open `http://192.168.1.246:5803` to see the status of every Agent and platform service.

- Green = online
- Yellow = degraded (the process is alive but its XMPP connection is unstable)
- Red = offline
- Gray = unknown (a remote Agent whose state cannot be probed)

Expand an Agent card to view its live log.
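
The color legend above can be read as a small decision rule. The following is a minimal sketch of that rule, not the dashboard's actual code: the `Status` enum, the `classify` function, and its two inputs are hypothetical names introduced for illustration, and `None` is assumed to mean "the probe could not run".

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    GREEN = "online"
    YELLOW = "degraded"
    RED = "offline"
    GRAY = "unknown"


def classify(process_alive: Optional[bool], xmpp_connected: Optional[bool]) -> Status:
    """Map probe results to a legend color (hypothetical helper).

    process_alive is None when the Agent is remote and cannot be probed.
    """
    if process_alive is None:
        return Status.GRAY      # remote Agent, state cannot be detected
    if not process_alive:
        return Status.RED       # process is down
    if xmpp_connected:
        return Status.GREEN     # process alive and XMPP connected
    return Status.YELLOW        # process alive but XMPP not (reliably) connected
```

Under these assumptions, a live process with a dropped XMPP link is reported as yellow rather than red, which matches the "degraded" entry in the legend.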

### Command-line checks

```powershell
# Quick status on Windows
powershell -File deploy\windows\check.ps1
```

```bash
# All systemd services on Linux.
# {profile} and {name} are placeholders: substitute real instance names (e.g. main, mohe).
systemctl status agentsmeeting-dashboard hermes-gateway@{profile} xmpp-bot-{name}
```

---

## Monitoring Architecture

```
Dashboard (:5803, Linux)
  │
  ├── Docker exec ejabberdctl         → online JID list (authoritative across platforms)
  ├── GET 192.168.1.16:5802/health    → xmpp_bot XMPP connection state
  ├── GET 192.168.1.16:5801/health    → wechat_agent hermes connection state
  └── TCP connect 192.168.1.16:8787   → api_proxy port reachability
```
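
The last branch of the diagram is a plain TCP reachability probe. As a rough illustration of what such a probe involves, here is a sketch using only the Python standard library. The function name `port_reachable` and the 2-second default timeout are assumptions made for this example; the dashboard's real probe may differ.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

With the address from the diagram, `port_reachable("192.168.1.16", 8787)` would correspond to the api_proxy check. Note that a successful connect only shows the port is open, not that the service behind it is healthy.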

---

## systemd Services (Linux)

| Service | Command |
|---------|---------|
| agentsmeeting-dashboard | `systemctl status agentsmeeting-dashboard` or `systemctl restart agentsmeeting-dashboard` |
| hermes-gateway@main | `systemctl status hermes-gateway@main` |
| hermes-gateway@zhiwei | `systemctl status hermes-gateway@zhiwei` |
| hermes-gateway@xiaoguo | `systemctl status hermes-gateway@xiaoguo` |
| xmpp-bot-mohe | `systemctl status xmpp-bot-mohe` |
| xmpp-bot-zhiwei | `systemctl status xmpp-bot-zhiwei` |
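
Checking six units one at a time is tedious, so a small loop can help. The sketch below calls `systemctl is-active` for each unit from the table and reports those that are not `active`. The helper names (`systemctl_is_active`, `inactive_units`) are hypothetical, and the probe is injectable so the logic can be exercised without systemd.

```python
import subprocess
from typing import Callable, Dict, Iterable

# Unit names copied from the table above.
UNITS = [
    "agentsmeeting-dashboard",
    "hermes-gateway@main",
    "hermes-gateway@zhiwei",
    "hermes-gateway@xiaoguo",
    "xmpp-bot-mohe",
    "xmpp-bot-zhiwei",
]


def systemctl_is_active(unit: str) -> str:
    """Return the state printed by `systemctl is-active <unit>` (e.g. active, inactive, failed)."""
    result = subprocess.run(
        ["systemctl", "is-active", unit],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip() or "unknown"


def inactive_units(
    units: Iterable[str],
    probe: Callable[[str], str] = systemctl_is_active,
) -> Dict[str, str]:
    """Return {unit: state} for every unit whose state is not 'active'."""
    states = {unit: probe(unit) for unit in units}
    return {unit: state for unit, state in states.items() if state != "active"}
```

On the Linux host, `inactive_units(UNITS)` returns an empty dict when everything is up; any entry it does return is a candidate for `systemctl status <unit>`.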

---

## Health Endpoints

| Service | URL | Meaning |
|---------|-----|---------|
| xmpp_bot | `GET :5802/health` | `xmpp_connected` = whether the XMPP connection is up |
| wechat_agent | `GET :5801/health` | `hermes_connected` = whether the link to the 莫荷 (Mohe) gateway is up |
| Dashboard | `GET :5803/api/health` | Whether the Dashboard itself is healthy |
| Dashboard | `GET :5803/api/ejabberd` | List of users online in ejabberd |
| Dashboard | `GET :5803/api/platform` | Platform service status |
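
To script a check against these endpoints, something like the following could be used. It is a sketch under one important assumption: that `/health` returns a JSON object in which `xmpp_connected` and `hermes_connected` are booleans. The runbook names the fields but does not specify the response format, and the function names here are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict


def fetch_health(url: str, timeout: float = 3.0) -> Dict[str, Any]:
    """GET a health endpoint and decode the body as JSON (assumed format)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def is_connected(payload: Dict[str, Any], field: str) -> bool:
    """True only if the named field is present and exactly True."""
    return payload.get(field) is True
```

For example, `is_connected(fetch_health("http://192.168.1.16:5802/health"), "xmpp_connected")` would report the xmpp_bot connection state. Treating a missing field as "not connected" is a deliberately conservative choice.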

---

## Log Locations

| Log | Windows path | Purpose |
|-----|--------------|---------|
| xmpp_bot.log | `gateway\logs\` | Bot connections, messages, HTTP bridge |
| bridge.log | `gateway\logs\` | LLM API calls |
| watchdog.log | `gateway\logs\` | Watchdog starts and stops |
| health_check.log | `gateway\logs\` | 5-minute health check |
| dashboard.log | `gateway\logs\` | Dashboard runtime log |
| mohe_inbox.log | `gateway\logs\` | 莫荷 (Mohe) message record |

Linux Dashboard log: `sudo journalctl -u agentsmeeting-dashboard -f`

---

## Common Faults

### Bot disconnects frequently

**Symptom**: the log shows `disconnected, reconnecting...` roughly every 50 seconds.

**Root cause**: ejabberd was configured with `mod_ping: timeout_action: kill`; pings timed out under the latency of the frp tunnel, and ejabberd then closed the session.

**Fixed**: set `timeout_action: none`.
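
To confirm the "every ~50 seconds" pattern, or to verify that it is gone after the fix, the gaps between disconnect lines can be measured. This sketch assumes each log line begins with a `YYYY-MM-DD HH:MM:SS` timestamp. That format is an assumption, since the runbook does not show a full log line, so adjust the regex to the real format. The function names are hypothetical.

```python
import re
from datetime import datetime
from statistics import median
from typing import Iterable, List, Optional

# ASSUMPTION: lines start with "YYYY-MM-DD HH:MM:SS"; the real format may differ.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")
MARKER = "disconnected, reconnecting..."  # message text quoted in the symptom above


def disconnect_times(lines: Iterable[str]) -> List[datetime]:
    """Collect the timestamps of all lines containing the disconnect marker."""
    times = []
    for line in lines:
        if MARKER not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    return times


def median_gap_seconds(times: List[datetime]) -> Optional[float]:
    """Median number of seconds between consecutive disconnects, or None if fewer than two."""
    gaps = [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
    return median(gaps) if gaps else None
```

A median gap close to 50 seconds would point back at the ping timeout, while irregular gaps suggest a different cause.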

### MUC join fails

**Symptom**: `MUC join timeout (1/3) ... MUC setup failed`

**Root cause**: the ejabberd TLS certificate does not cover `conference.yoin.fun`.

**Fixed**: generated a self-signed certificate `conference.pem` and added it to `certfiles`; cross-platform monitoring is worked around via SSH + `ejabberdctl`.

### API key quota exceeded

**Symptom**: `bridge.log` shows `HTTP 429` and the bot stops replying.

**Action**: wait for the quota to reset (火山 (Volcano Engine): 00:00 CST on the 15th of every month), or switch to another provider.
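
When deciding whether to wait or switch provider, it helps to know how far away the reset is. The sketch below computes the next reset instant from the schedule stated above. It assumes "CST" means China Standard Time (UTC+8), and `next_quota_reset` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# ASSUMPTION: "CST" in this runbook is China Standard Time, UTC+8.
CST = timezone(timedelta(hours=8))


def next_quota_reset(now: datetime) -> datetime:
    """Return the next 15th-of-month 00:00 CST strictly after `now`.

    `now` must be timezone-aware.
    """
    local = now.astimezone(CST)
    candidate = local.replace(day=15, hour=0, minute=0, second=0, microsecond=0)
    if candidate <= local:
        # This month's reset has already happened; roll over to next month.
        year = local.year + (1 if local.month == 12 else 0)
        month = 1 if local.month == 12 else local.month + 1
        candidate = candidate.replace(year=year, month=month)
    return candidate
```

For instance, `next_quota_reset(datetime.now(timezone.utc)) - datetime.now(timezone.utc)` gives the remaining wait as a `timedelta`.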

### Two bot instances running at once

**Symptom**: messages are answered twice.

**Root cause**: the watchdog started a new process without killing the old one.

**Fixed**: the watchdog's `start_bot()` now kills the old process first, and `proc_guard` adds a PID lock.
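
For readers unfamiliar with PID locks, the idea is that a process atomically creates a lock file containing its PID, and a second instance that finds the file refuses to start. The sketch below illustrates the general technique only; it is not the `proc_guard` implementation, and the `PidLock` class is a hypothetical name. Stale-lock recovery (the owner crashed without releasing) is deliberately left out.

```python
import os
from typing import Optional


class PidLock:
    """Single-instance guard based on exclusive creation of a lock file (illustrative)."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.held = False

    def acquire(self) -> bool:
        """Try to take the lock; return False if another instance already holds it."""
        try:
            # O_EXCL makes creation fail if the file already exists, atomically.
            fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            return False
        with os.fdopen(fd, "w") as handle:
            handle.write(str(os.getpid()))
        self.held = True
        return True

    def owner_pid(self) -> Optional[int]:
        """Return the PID recorded in the lock file, or None if absent or unreadable."""
        try:
            with open(self.path) as handle:
                return int(handle.read().strip())
        except (FileNotFoundError, ValueError):
            return None

    def release(self) -> None:
        """Remove the lock file, but only if this object created it."""
        if self.held:
            os.remove(self.path)
            self.held = False
```

A watchdog following this pattern could read `owner_pid()` to find the old process to kill before starting a new one, which mirrors the two-part fix described above.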

---

## Data Management

```bash
# Linux side: archive Hermes sessions
cd ~/.hermes/profiles/main/
# Keep a dated copy of the state database before pruning
cp state.db state.db.$(date +%Y%m%d)
# Remove sessions older than 30 days
hermes session prune --older-than 30d
```