Metadata-Version: 2.4
Name: alphasift
Version: 0.2.0
Summary: Automated stock-picking Skill: filter, score, and rank candidate stocks across the whole market by strategy
License-Expression: Apache-2.0
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas>=2.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: litellm>=1.0
Requires-Dist: efinance>=0.4
Requires-Dist: akshare>=1.10
Requires-Dist: baostock>=0.8.9
Requires-Dist: tushare>=1.4
Requires-Dist: yfinance>=0.2
Requires-Dist: requests>=2.28
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Dynamic: license-file

# AlphaSift

AlphaSift is an agent-friendly stock discovery and ranking engine. It scans a broad market universe, applies auditable YAML strategies, enriches candidates with optional market context, ranks them with deterministic factors and optional LLM judgment, and saves runs for later T+N evaluation.

> This README is the default English version. A Chinese version is available at [README.zh-CN.md](README.zh-CN.md).

## Disclaimer

- This project is for learning, research, and engineering experiments only.
- It is not investment advice, a return guarantee, or a buy/sell instruction.
- Outputs depend on third-party market data, optional LLM providers, local configuration, and strategy parameters. They can be delayed, incomplete, wrong, or unsuitable for real trading.
- Users are responsible for independent research, compliance checks, transaction costs, liquidity risks, announcement timing, and all resulting decisions.

## What AlphaSift does

- **L1 deterministic screening**: hard filters and factor scoring over the full market snapshot.
- **L2 optional LLM ranking**: structured cross-candidate reasoning, theses, catalysts, risks, confidence, and portfolio risk buckets.
- **L3 pluggable post-analysis**: local scorecard by default, with optional DSA or external HTTP analyzers.
- **Hotspot discovery**: topic/sector heat ranking, hotspot detail resolution, leader stock fallbacks, cache quality metadata, and history sidecars.
- **Daily feature enrichment**: optional candidate-level daily K-line features such as moving averages, MACD/RSI, breakout strength, volume ratio, pullback distance, and platform duration.
- **Evaluation loop**: save runs, evaluate later using newer snapshots, deduct transaction cost, tag follow-through / failed-breakout outcomes, and optionally fetch price paths for max drawdown / max favorable excursion.
- **Agent-native interface**: `SKILL.md` describes capabilities and callable interfaces for AI agents.

## Quick start

```bash
# Install in editable mode
pip install -e .

# Copy configuration template
cp .env.example .env
# Edit .env if you want LLM ranking:
# GEMINI_API_KEY / OPENAI_API_KEY / DEEPSEEK_API_KEY
# or LITELLM_MODEL / LLM_CHANNELS / LITELLM_CONFIG

# List built-in strategies
alphasift strategies

# Run the no-key demo
alphasift quickstart

# Screen without LLM ranking
alphasift screen dual_low --no-llm

# Screen with LLM ranking, if a provider key is configured
alphasift screen dual_low

# Reuse another project's environment file
alphasift --env-file /home/ubuntu/daily_ai_assistant/.env screen balanced_alpha

# Add market or theme context to the LLM prompt
alphasift screen balanced_alpha --context "Brokerage names are seeing volume expansion today."

# Add candidate-level news / announcement / fund-flow context
alphasift screen balanced_alpha --candidate-context-file candidate_context.csv

# Show local L3 scorecard explanations
alphasift screen balanced_alpha --explain

# Add DSA as an optional L3 analyzer; requires DSA_API_URL
alphasift screen dual_low --post-analyzer dsa

# Disable L3 post-analysis explicitly
alphasift screen dual_low --no-post-analysis

# Audit project and strategy configuration
alphasift audit
```

Example output shape:

```text
$ alphasift screen dual_low --no-llm
Universe 5190 -> filtered 337 -> output Top 5
rank  code    name      score  price  change  pe     pb
1     002039  黔源电力  72.7   20.72  -2.49%  14.76  1.99
2     002444  巨星科技  71.0   30.82  +0.29%  14.59  1.95
3     002128  电投能源  70.9   31.60  -2.41%  14.00  1.90
```

## Screening examples

The following recorded examples were run on April 12, 2026, using the previous trading day's A-share close data from April 10, 2026. LLM ranking was disabled with `--no-llm`; these rows are examples of engine output, not recommendations.

### Dual Low

Full market 5190 stocks -> 337 after hard filters -> Top 5 output.

| Rank | Code | Name | Score | Price | Change | PE | PB |
|---:|---|---|---:|---:|---:|---:|---:|
| 1 | 002039 | 黔源电力 | 72.7 | 20.72 | -2.49% | 14.76 | 1.99 |
| 2 | 002444 | 巨星科技 | 71.0 | 30.82 | +0.29% | 14.59 | 1.95 |
| 3 | 002128 | 电投能源 | 70.9 | 31.60 | -2.41% | 14.00 | 1.90 |
| 4 | 002236 | 大华股份 | 70.8 | 17.43 | +1.04% | 14.86 | 1.50 |
| 5 | 600583 | 海油工程 | 68.9 | 7.02 | +4.15% | 14.89 | 1.17 |

### Volume Breakout

Full market 5190 stocks -> 126 after hard filters -> Top 5 output.

| Rank | Code | Name | Score | Price | Change |
|---:|---|---|---:|---:|---:|
| 1 | 002837 | 英维克 | 74.0 | 99.05 | +6.40% |
| 2 | 688183 | 生益电子 | 73.8 | 95.30 | +7.09% |
| 3 | 300803 | 指南针 | 73.3 | 101.68 | +3.07% |
| 4 | 002384 | 东山精密 | 73.0 | 143.55 | +8.83% |
| 5 | 300277 | 汽轮科技 | 73.0 | 19.74 | +5.73% |

## Hotspot workflow

AlphaSift can discover current market hotspots and resolve a specific topic into a detail payload with raw timeline evidence, compact display-ready route events, leader stocks, source confidence, stale/fallback metadata, and quality diagnostics.

```bash
# Discover hotspot topics and write schema_version=2 cache/history sidecars
alphasift hotspots --provider akshare --top 12 --output data/hotspots.json --history data/hotspot.history.jsonl --explain

# Inspect a single hotspot topic
alphasift hotspot "AI compute" --top-stocks 10 --timeline --fallback-cache data/hotspots.json --explain

# Safe offline/no-network check
alphasift hotspots --provider none --explain
```

Hotspot cache files include:

- `schema_version`: currently `2`
- `generated_at`
- `metadata`: provider, row count, source errors, stale/fallback state
- `hotspots`: normalized topic rows
- sidecars such as `*.meta.json` and JSONL history when requested

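A consumer of the cache may want to sanity-check a payload before using it. The following is a minimal sketch that assumes only the top-level keys listed above; the helper name `check_hotspot_cache` and the sample values are hypothetical and are not part of AlphaSift's API.

```python
import json

EXPECTED_SCHEMA_VERSION = 2
REQUIRED_KEYS = ("schema_version", "generated_at", "metadata", "hotspots")


def check_hotspot_cache(payload: dict) -> list[str]:
    """Return a list of problems found in a hotspot cache payload (empty if none)."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in payload]
    version = payload.get("schema_version")
    if version is not None and version != EXPECTED_SCHEMA_VERSION:
        problems.append(f"unexpected schema_version: {version}")
    if not isinstance(payload.get("hotspots", []), list):
        problems.append("hotspots is not a list")
    return problems


# Hypothetical sample payload; real files are written by `alphasift hotspots --output`.
sample = json.loads(
    '{"schema_version": 2, "generated_at": "2026-04-12T09:00:00", '
    '"metadata": {"provider": "akshare"}, "hotspots": []}'
)
print(check_hotspot_cache(sample))  # []
```

Returning a list of problems rather than raising lets a caller decide whether a partially valid cache is still usable as a fallback.
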
Leader stock fallbacks are intentionally explicit. When live constituent APIs fail and AlphaSift uses last-good/cache leaders, returned stocks carry fields such as `source="last_good_cache.leader_stocks"`, `source_confidence`, and `fallback_used=true` instead of pretending to be live provider data.

Hotspot details keep the raw `timeline` for auditability and also expose a compact `route` list for applications. `route` is grouped by day, newest first, trimmed for UI display, and falls back to a short current heat/stage/leader summary when no timeline evidence is available.

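The grouping described for `route` can be illustrated with a short sketch. It assumes each timeline event is a dict with an ISO-8601 `time` string and a `title`; these field names, the `build_route` helper, and the trim limits are hypothetical, and the summary fallback for an empty timeline is not modeled here.

```python
from collections import defaultdict


def build_route(timeline: list[dict], max_days: int = 3, max_events_per_day: int = 2) -> list[dict]:
    """Group timeline events by calendar day, newest day first, trimmed for display."""
    by_day: dict[str, list[dict]] = defaultdict(list)
    for event in timeline:
        day = event["time"][:10]  # assumes ISO-8601 timestamps such as "2026-04-10T09:35:00"
        by_day[day].append(event)

    route = []
    for day in sorted(by_day, reverse=True)[:max_days]:
        # Newest events first within each day, keeping only a few per day.
        events = sorted(by_day[day], key=lambda e: e["time"], reverse=True)
        route.append({"date": day, "events": [e["title"] for e in events[:max_events_per_day]]})
    return route


timeline = [
    {"time": "2026-04-09T10:00:00", "title": "Sector volume expands"},
    {"time": "2026-04-10T09:35:00", "title": "Leader hits limit-up"},
    {"time": "2026-04-10T14:20:00", "title": "Follow-through in second-tier names"},
]
for entry in build_route(timeline):
    print(entry["date"], entry["events"])
```

Sorting ISO-8601 strings lexicographically gives chronological order, which is why no date parsing is needed in this sketch.
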
## Python API

```python
from alphasift import screen

result = screen("dual_low", use_llm=False)
for pick in result.picks:
    print(f"{pick.rank}. {pick.code} {pick.name} score={pick.final_score:.1f}")
```

Saved-run evaluation helpers are also exported:

```python
from alphasift import evaluate_saved_run, evaluate_saved_runs
```

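The signatures of these helpers are not shown here, so the sketch below does not call them. It only illustrates, under stated assumptions, the kind of arithmetic a T+N evaluation involves: a simple return net of transaction cost and a max drawdown over a price path. The function names and the `cost_rate` default of 0.2% are hypothetical.

```python
def net_return(entry_price: float, later_price: float, cost_rate: float = 0.002) -> float:
    """Simple T+N return after deducting a round-trip transaction cost rate."""
    if entry_price <= 0:
        raise ValueError("entry_price must be positive")
    gross = later_price / entry_price - 1.0
    return gross - cost_rate


def max_drawdown(prices: list[float]) -> float:
    """Largest peak-to-trough decline along a price path, as a negative fraction (0.0 if none)."""
    peak = prices[0]
    worst = 0.0
    for price in prices:
        peak = max(peak, price)
        worst = min(worst, price / peak - 1.0)
    return worst


# Hypothetical price path from the saved run date to T+3.
path = [20.0, 21.0, 18.9, 22.0]
print(round(net_return(path[0], path[-1]), 4))
print(round(max_drawdown(path), 4))
```

As noted under Known limitations, a comparison of this kind ignores dividends, suspensions, and slippage.
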
## Configuration

AlphaSift is designed to reuse LiteLLM-style configuration used by `daily_stock_analysis` and similar projects.

| Variable | Required | Description | Default |
|---|---:|---|---|
| `LITELLM_MODEL` | Recommended | Main model in `provider/model` format | `gemini/gemini-2.5-flash` |
| `LITELLM_FALLBACK_MODELS` | No | Comma-separated fallback models | - |
| `LLM_CHANNELS` | No | Multi-channel provider config using `LLM_{NAME}_*` | - |
| `LITELLM_CONFIG` | No | LiteLLM Router YAML file | - |
| `GEMINI_API_KEY` / `OPENAI_API_KEY` / `DEEPSEEK_API_KEY` | For LLM ranking | Provider API key | - |
| `OPENAI_BASE_URL` / `OLLAMA_API_BASE` | No | OpenAI-compatible or Ollama endpoint | - |
| `LLM_MAX_TOKENS` | No | Max tokens requested from LLM ranking; keeps local servers from generating unbounded output after client timeout | `2048` |
| `LLM_CONTEXT` | No | Extra market/theme context for LLM ranking | - |
| `LLM_CANDIDATE_CONTEXT_ENABLED` | No | Fetch candidate news/announcements/fund-flow context by default | `false` |
| `INDUSTRY_MAP_FILES` | No | Local code-to-industry/concepts/board-heat files | - |
| `INDUSTRY_PROVIDER` | No | Optional board/industry provider such as `akshare` | `none` |
| `SNAPSHOT_SOURCE_PRIORITY` | No | Snapshot source order | Depends on Tushare token |
| `SNAPSHOT_FALLBACK_MAX_AGE_HOURS` | No | Max acceptable age for last-good snapshot fallback; empty disables the guard | - |
| `TUSHARE_TOKEN` / `TUSHARE_API_TOKEN` | For Tushare | Tushare Pro token | - |
| `POST_ANALYZERS` | No | L3 analyzers; set `none` to disable | `scorecard` |
| `DSA_API_URL` | For DSA analyzer | DSA service URL or full analysis endpoint | - |
| `DAILY_ENRICH_ENABLED` | No | Enable candidate-level daily K-line enrichment | `false` |
| `DAILY_SOURCE` | No | Daily K-line source: `auto`, `tencent`, `sina`, `akshare`, `baostock`, or `tushare` | `auto` |
| `ALPHASIFT_DATA_DIR` | No | Run records, caches, and evaluation results | `./data` |
| `STRATEGIES_DIR` | No | Custom strategy directory | auto-detect |

Example multi-channel LiteLLM config:

```env
LLM_CHANNELS=primary
LLM_PRIMARY_PROTOCOL=openai
LLM_PRIMARY_BASE_URL=https://api.deepseek.com/v1
LLM_PRIMARY_API_KEYS=sk-xxx,sk-yyy
LLM_PRIMARY_MODELS=deepseek-chat,deepseek-reasoner
LITELLM_MODEL=openai/deepseek-chat
LITELLM_FALLBACK_MODELS=openai/gpt-4o-mini,anthropic/claude-3-5-sonnet
```

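To make the `LLM_{NAME}_*` convention concrete, here is a minimal sketch of how such variables could be collected per channel. It assumes only the four suffixes used in the example above (`PROTOCOL`, `BASE_URL`, `API_KEYS`, `MODELS`); the `parse_llm_channels` helper is hypothetical and is not AlphaSift's actual loader.

```python
def _split_csv(value: str) -> list[str]:
    """Split a comma-separated value, dropping blanks and surrounding whitespace."""
    return [item.strip() for item in value.split(",") if item.strip()]


def parse_llm_channels(env: dict[str, str]) -> dict[str, dict]:
    """Collect LLM_{NAME}_* settings for each channel named in LLM_CHANNELS."""
    channels = {}
    for name in _split_csv(env.get("LLM_CHANNELS", "")):
        prefix = f"LLM_{name.upper()}_"
        channels[name] = {
            "protocol": env.get(prefix + "PROTOCOL", ""),
            "base_url": env.get(prefix + "BASE_URL", ""),
            "api_keys": _split_csv(env.get(prefix + "API_KEYS", "")),
            "models": _split_csv(env.get(prefix + "MODELS", "")),
        }
    return channels


example_env = {
    "LLM_CHANNELS": "primary",
    "LLM_PRIMARY_PROTOCOL": "openai",
    "LLM_PRIMARY_BASE_URL": "https://api.deepseek.com/v1",
    "LLM_PRIMARY_API_KEYS": "sk-xxx,sk-yyy",
    "LLM_PRIMARY_MODELS": "deepseek-chat,deepseek-reasoner",
}
print(parse_llm_channels(example_env)["primary"]["models"])  # ['deepseek-chat', 'deepseek-reasoner']
```

In practice the input would be `os.environ`; a plain dict is used here so the sketch runs without touching the process environment.
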
Example single-provider config:

```env
GEMINI_API_KEY=...
LITELLM_MODEL=gemini/gemini-2.5-flash
```

You can pass `--env-file` more than once to load several external `.env` files:

```bash
alphasift --env-file /path/to/daily_stock_analysis/.env \
  --env-file /path/to/daily_ai_assistant/.env \
  screen balanced_alpha
```

For the full configuration reference, see [docs/configuration.md](docs/configuration.md).

## Data sources

AlphaSift supports multiple A-share market snapshot sources and automatically falls back by priority.

Default without Tushare token:

```text
sina -> efinance -> akshare_em -> em_datacenter
```

Default with `TUSHARE_TOKEN` / `TUSHARE_API_TOKEN` and no manual priority override:

```text
tushare -> sina -> efinance -> akshare_em -> em_datacenter
```

| Source | Backend | Notes |
|---|---|---|
| `sina` | Sina Finance Market Center | Direct HTTP full-market source with PE/PB/turnover/market-cap fields |
| `efinance` | Eastmoney push2 | Fast during live sessions |
| `akshare_em` | Eastmoney push endpoint via AkShare-style access | Backup live source |
| `em_datacenter` | Eastmoney Data Center | Often available outside trading hours |
| `tushare` | Tushare Pro `daily` + `daily_basic` | Requires token; previous/nearest trading day data |

Daily K-line enrichment defaults to `DAILY_SOURCE=auto`. The auto chain uses `tushare -> tencent -> sina -> akshare -> baostock` when a Tushare token is configured, otherwise `tencent -> sina -> akshare -> baostock`. Tencent is a direct HTTP K-line source with no wrapper dependency and is preferred over Eastmoney-heavy wrapper paths for candidate-level history enrichment; Sina provides a second direct HTTP fallback before wrapper sources. Repeatedly failing sources are temporarily skipped, and expired daily cache can be used as a marked stale fallback when every live daily source fails.

Source support matrix:

| Capability | Primary chain | Fields |
|---|---|---|
| Daily K-line enrichment | `tushare` when token exists, then `tencent`, `sina`, `akshare`, `baostock` with health-aware auto reordering | OHLCV, qfq where supported, technical factors, 20d volatility/ATR/drawdown controls, per-row `daily_source` provenance, `daily_quality_score`/flags, source-health stats |
| Full-market snapshot | `sina`, then `efinance`, `akshare_em`, `em_datacenter`; `tushare` first when token exists | price, change, amount, market cap, PE/PB, turnover |
| Candidate context | `news`, `fund_flow`, `announcement`, `quote` | news, announcements, fund flow, Tencent quote valuation/turnover |
| Last-good fallback | daily history cache and snapshot cache | marked with stale/fallback attrs when live sources fail |

If a source is unavailable or lacks fields required by a strategy, AlphaSift skips it and tries the next source. Eastmoney-only HTTP fallbacks use a shared throttled session to reduce connection churn and bursty access. If all live sources fail, the last-good snapshot fallback is explicitly marked as stale/fallback data; `SNAPSHOT_FALLBACK_MAX_AGE_HOURS` can reject overly old fallback cache to avoid repeating stale selections.

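The fallback rules in the paragraph above can be sketched as a single loop with an age guard at the end. This is an illustration under assumptions, not the code in `snapshot.py`: the `load_snapshot` helper, the snapshot dict shape (`rows`, `fields`, `fetched_at`), and the `stale` marker are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Callable, Optional

Snapshot = dict  # hypothetical shape: {"rows": [...], "fields": set[str], "fetched_at": datetime}


def load_snapshot(
    sources: list[tuple[str, Callable[[], Snapshot]]],
    required_fields: set[str],
    last_good: Optional[Snapshot] = None,
    max_age_hours: Optional[float] = None,
    now: Optional[datetime] = None,
) -> tuple[str, Snapshot]:
    """Try sources in priority order; fall back to a marked last-good snapshot."""
    for name, fetch in sources:
        try:
            snapshot = fetch()
        except Exception:
            continue  # source unavailable: try the next one
        if required_fields <= snapshot["fields"]:
            return name, snapshot
        # Source lacks fields the strategy needs: fall through to the next one.

    if last_good is None:
        raise RuntimeError("all snapshot sources failed and no last-good cache exists")
    age = (now or datetime.now()) - last_good["fetched_at"]
    if max_age_hours is not None and age > timedelta(hours=max_age_hours):
        raise RuntimeError("last-good snapshot is older than the allowed maximum age")
    return "last_good_cache", {**last_good, "stale": True, "fallback_used": True}


def failing() -> Snapshot:
    raise ConnectionError("source down")


def no_pe() -> Snapshot:
    return {"rows": [], "fields": {"price"}, "fetched_at": datetime(2026, 4, 10, 15)}


def full() -> Snapshot:
    return {"rows": [], "fields": {"price", "pe", "pb"}, "fetched_at": datetime(2026, 4, 10, 15)}


name, _ = load_snapshot([("sina", failing), ("efinance", no_pe), ("akshare_em", full)], {"pe", "pb"})
print(name)  # akshare_em
```

Leaving `max_age_hours` as `None` mirrors the documented behavior that an empty `SNAPSHOT_FALLBACK_MAX_AGE_HOURS` disables the guard.
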
## Built-in strategies

| Strategy | Type | Description |
|---|---|---|
| `dual_low` | Value | Low PE + low PB defensive value screen |
| `volume_breakout` | Trend | Volume expansion and resistance breakout |
| `quality_value` | Value | Reasonable valuation, liquidity, and controlled volatility |
| `capital_heat` | Momentum | Active capital flow without extreme overheating |
| `oversold_reversal` | Reversal | Repair candidates with controlled drawdown and still-valid liquidity |
| `balanced_alpha` | Framework | General multi-factor discovery strategy |
| `momentum_quality` | Framework | Trend confirmation plus quality filters |
| `shrink_pullback` | Trend | Pullback into support during a broader uptrend; uses daily enrichment |

Add custom YAML strategies under `strategies/`. See [docs/strategy-guide.md](docs/strategy-guide.md).

## Project layout

```text
alphasift/
├── SKILL.md                     # Agent skill description and callable interface
├── README.zh-CN.md              # Chinese README
├── strategies/                  # Strategy YAML files
├── docs/
│   ├── configuration.md         # Configuration reference
│   ├── design.md                # Design principles
│   ├── positioning.md           # Product positioning
│   ├── reference.md             # Structure, boundaries, observed runs
│   ├── scoring.md               # Scoring details
│   ├── strategy-guide.md        # Strategy authoring guide
│   └── usage.md                 # Usage guide
└── alphasift/                   # Python package
    ├── cli.py                   # CLI entry point
    ├── config.py                # Environment configuration
    ├── context.py               # LLM context assembly
    ├── candidate_context.py     # Candidate news/announcement/fund-flow context
    ├── daily.py                 # Daily K-line feature enrichment
    ├── hotspot.py               # Hotspot discovery/detail/cache contract
    ├── industry.py              # Industry/concept/board heat mapping
    ├── models.py                # Data models
    ├── snapshot.py              # Market snapshot loading and fallback
    ├── filter.py                # L1 hard filters
    ├── scorer.py                # Factor scoring
    ├── ranker.py                # L2 LLM ranking
    ├── risk.py                  # Independent risk layer
    ├── post_analysis.py         # L3 post-analysis plugins
    ├── dsa.py                   # Optional DSA integration
    ├── store.py                 # Run persistence
    ├── evaluate.py              # T+N evaluation
    ├── pipeline.py              # Main orchestration
    └── strategy.py              # Strategy YAML loader
```

## Relationship with daily_stock_analysis

`daily_stock_analysis` (DSA) is an external single-stock deep-analysis service. AlphaSift is upstream: it discovers and ranks candidates across the market. DSA is downstream: it can analyze a small final shortlist in depth.

- AlphaSift does broad discovery, deterministic scoring, LLM ranking, hotspot analysis, and saved-run evaluation.
- DSA does individual stock deep analysis through its own API, usually `POST /api/v1/analysis/analyze`.
- The integration is optional and configured through `DSA_API_URL`.
- To control cost and latency, AlphaSift only calls DSA for final selected candidates.
- The default L3 analyzer is local `scorecard`; DSA and external HTTP analyzers are optional.

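Because `DSA_API_URL` may hold either a service base URL or the full analysis endpoint, a client has to normalize it before posting. The sketch below shows one plausible normalization; the `resolve_dsa_endpoint` helper and the `localhost` URLs are hypothetical, and the logic in `dsa.py` may differ.

```python
ANALYZE_PATH = "/api/v1/analysis/analyze"


def resolve_dsa_endpoint(dsa_api_url: str) -> str:
    """Accept either a service base URL or a full analysis endpoint."""
    url = dsa_api_url.strip().rstrip("/")
    if url.endswith(ANALYZE_PATH):
        return url
    return url + ANALYZE_PATH


# Both forms resolve to the same endpoint.
print(resolve_dsa_endpoint("http://localhost:8000"))
print(resolve_dsa_endpoint("http://localhost:8000/api/v1/analysis/analyze/"))
```
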
## Known limitations

- Strategies that depend on daily K-line features enrich only the L1 top candidates, not the entire historical market.
- AlphaSift is not a full backtesting engine or portfolio execution system.
- DSA post-analysis is synchronous and better suited to low-frequency final-candidate review.
- Tushare fallback depends on the user's own token, point balance, and permissions.
- T+N evaluation compares saved run prices with later snapshots; it is not a rigorous event-study backtest and does not model dividends, suspensions, slippage, or rebalancing constraints.
- The repository keeps both `strategies/` and `alphasift/strategies/` mirrors for development and packaged usage; built-in strategy files should stay in sync.

## Verification

Last recorded full-suite check:

```text
$ python -m pytest -q
176 passed, 1 skipped in 1.56s
```

## Documentation

- [SKILL.md](SKILL.md): agent skill description and function interface
- [README.zh-CN.md](README.zh-CN.md): Chinese README
- [docs/usage.md](docs/usage.md): usage guide
- [docs/configuration.md](docs/configuration.md): configuration reference
- [docs/positioning.md](docs/positioning.md): positioning and relative advantages
- [docs/comparison.md](docs/comparison.md): comparison, gaps, and priorities
- [docs/design.md](docs/design.md): design principles
- [docs/reference.md](docs/reference.md): structure, data-source boundaries, observed runs
- [docs/scoring.md](docs/scoring.md): scoring system details
- [docs/strategy-guide.md](docs/strategy-guide.md): custom strategy guide

## License

Apache License 2.0