Metadata-Version: 2.4
Name: alphasift
Version: 0.2.0
Summary: Automated stock-picking Skill that screens, scores, and ranks candidate stocks from the full market by strategy
License-Expression: Apache-2.0
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas>=2.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: litellm>=1.0
Requires-Dist: efinance>=0.4
Requires-Dist: akshare>=1.10
Requires-Dist: baostock>=0.8.9
Requires-Dist: tushare>=1.4
Requires-Dist: yfinance>=0.2
Requires-Dist: requests>=2.28
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Dynamic: license-file

# AlphaSift

AlphaSift is an agent-friendly stock discovery and ranking engine. It scans a broad market universe, applies auditable YAML strategies, enriches candidates with optional market context, ranks them with deterministic factors and optional LLM judgment, and saves runs for later T+N evaluation.

> This README is the default English version. A Chinese version is available at [README.zh-CN.md](README.zh-CN.md).

## Disclaimer

- This project is for learning, research, and engineering experiments only.
- It is not investment advice, a return guarantee, or a buy/sell instruction.
- Outputs depend on third-party market data, optional LLM providers, local configuration, and strategy parameters. They can be delayed, incomplete, wrong, or unsuitable for real trading.
- Users are responsible for independent research, compliance checks, transaction costs, liquidity risks, announcement timing, and all resulting decisions.

## What AlphaSift does

- **L1 deterministic screening**: hard filters and factor scoring over the full market snapshot.
- **L2 optional LLM ranking**: structured cross-candidate reasoning, theses, catalysts, risks, confidence, and portfolio risk buckets.
- **L3 pluggable post-analysis**: local scorecard by default, with optional DSA or external HTTP analyzers.
- **Hotspot discovery**: topic/sector heat ranking, hotspot detail resolution, leader stock fallbacks, cache quality metadata, and history sidecars.
- **Daily feature enrichment**: optional candidate-level daily K-line features such as moving averages, MACD/RSI, breakout strength, volume ratio, pullback distance, and platform duration.
- **Evaluation loop**: save runs, evaluate later using newer snapshots, deduct transaction cost, tag follow-through / failed-breakout outcomes, and optionally fetch price paths for max drawdown / max favorable excursion.
- **Agent-native interface**: `SKILL.md` describes capabilities and callable interfaces for AI agents.

## Quick start

```bash
# Install in editable mode
pip install -e .

# Copy configuration template
cp .env.example .env
# Edit .env if you want LLM ranking:
#   GEMINI_API_KEY / OPENAI_API_KEY / DEEPSEEK_API_KEY
#   or LITELLM_MODEL / LLM_CHANNELS / LITELLM_CONFIG

# List built-in strategies
alphasift strategies

# Run the no-key demo
alphasift quickstart

# Screen without LLM ranking
alphasift screen dual_low --no-llm

# Screen with LLM ranking, if a provider key is configured
alphasift screen dual_low

# Reuse another project's environment file
alphasift --env-file /home/ubuntu/daily_ai_assistant/.env screen balanced_alpha

# Add market or theme context to the LLM prompt
alphasift screen balanced_alpha --context "Brokerage names are seeing volume expansion today."

# Add candidate-level news / announcement / fund-flow context
alphasift screen balanced_alpha --candidate-context-file candidate_context.csv

# Show local L3 scorecard explanations
alphasift screen balanced_alpha --explain

# Add DSA as an optional L3 analyzer; requires DSA_API_URL
alphasift screen dual_low --post-analyzer dsa

# Disable L3 post-analysis explicitly
alphasift screen dual_low --no-post-analysis

# Audit project and strategy configuration
alphasift audit
```

Example output shape:

```text
$ alphasift screen dual_low --no-llm
Universe 5190 -> filtered 337 -> output Top 5
 rank  code    name      score  price  change  pe     pb
 1     002039  黔源电力  72.7   20.72  -2.49%  14.76  1.99
 2     002444  巨星科技  71.0   30.82  +0.29%  14.59  1.95
 3     002128  电投能源  70.9   31.60  -2.41%  14.00  1.90
```

## Screening examples

The following recorded examples were run on April 12, 2026, using the previous trading day's A-share close data from April 10, 2026. LLM ranking was disabled with `--no-llm`; these rows are examples of engine output, not recommendations.

### Dual Low

Full market 5190 stocks -> 337 after hard filters -> Top 5 output.

| Rank | Code | Name | Score | Price | Change | PE | PB |
|---:|---|---|---:|---:|---:|---:|---:|
| 1 | 002039 | 黔源电力 | 72.7 | 20.72 | -2.49% | 14.76 | 1.99 |
| 2 | 002444 | 巨星科技 | 71.0 | 30.82 | +0.29% | 14.59 | 1.95 |
| 3 | 002128 | 电投能源 | 70.9 | 31.60 | -2.41% | 14.00 | 1.90 |
| 4 | 002236 | 大华股份 | 70.8 | 17.43 | +1.04% | 14.86 | 1.50 |
| 5 | 600583 | 海油工程 | 68.9 | 7.02 | +4.15% | 14.89 | 1.17 |

### Volume Breakout

Full market 5190 stocks -> 126 after hard filters -> Top 5 output.

| Rank | Code | Name | Score | Price | Change |
|---:|---|---|---:|---:|---:|
| 1 | 002837 | 英维克 | 74.0 | 99.05 | +6.40% |
| 2 | 688183 | 生益电子 | 73.8 | 95.30 | +7.09% |
| 3 | 300803 | 指南针 | 73.3 | 101.68 | +3.07% |
| 4 | 002384 | 东山精密 | 73.0 | 143.55 | +8.83% |
| 5 | 300277 | 汽轮科技 | 73.0 | 19.74 | +5.73% |

## Hotspot workflow

AlphaSift can discover current market hotspots and resolve a specific topic into a detail payload with raw timeline evidence, compact display-ready route events, leader stocks, source confidence, stale/fallback metadata, and quality diagnostics.

```bash
# Discover hotspot topics and write schema_version=2 cache/history sidecars
alphasift hotspots --provider akshare --top 12 --output data/hotspots.json --history data/hotspot.history.jsonl --explain

# Inspect a single hotspot topic
alphasift hotspot "AI compute" --top-stocks 10 --timeline --fallback-cache data/hotspots.json --explain

# Safe offline/no-network check
alphasift hotspots --provider none --explain
```

Hotspot cache files include:

- `schema_version`: currently `2`
- `generated_at`
- `metadata`: provider, row count, source errors, stale/fallback state
- `hotspots`: normalized topic rows
- sidecars such as `*.meta.json` and JSONL history when requested

Leader stock fallbacks are intentionally explicit. When live constituent APIs fail and AlphaSift uses last-good/cache leaders, returned stocks carry fields such as `source="last_good_cache.leader_stocks"`, `source_confidence`, and `fallback_used=true` instead of pretending to be live provider data.

Hotspot details keep the raw `timeline` for auditability and also expose a compact `route` list for applications. `route` is grouped by day, newest first, trimmed for UI display, and falls back to a short current heat/stage/leader summary when no timeline evidence is available.

## Python API

```python
from alphasift import screen

result = screen("dual_low", use_llm=False)
for pick in result.picks:
    print(f"{pick.rank}. {pick.code} {pick.name} score={pick.final_score:.1f}")
```

Saved-run evaluation helpers are also exported:

```python
from alphasift import evaluate_saved_run, evaluate_saved_runs
```

## Configuration

AlphaSift is designed to reuse LiteLLM-style configuration used by `daily_stock_analysis` and similar projects.

| Variable | Required | Description | Default |
|---|---:|---|---|
| `LITELLM_MODEL` | Recommended | Main model in `provider/model` format | `gemini/gemini-2.5-flash` |
| `LITELLM_FALLBACK_MODELS` | No | Comma-separated fallback models | - |
| `LLM_CHANNELS` | No | Multi-channel provider config using `LLM_{NAME}_*` | - |
| `LITELLM_CONFIG` | No | LiteLLM Router YAML file | - |
| `GEMINI_API_KEY` / `OPENAI_API_KEY` / `DEEPSEEK_API_KEY` | For LLM ranking | Provider API key | - |
| `OPENAI_BASE_URL` / `OLLAMA_API_BASE` | No | OpenAI-compatible or Ollama endpoint | - |
| `LLM_MAX_TOKENS` | No | Max tokens requested from LLM ranking; keeps local servers from generating unbounded output after client timeout | `2048` |
| `LLM_CONTEXT` | No | Extra market/theme context for LLM ranking | - |
| `LLM_CANDIDATE_CONTEXT_ENABLED` | No | Fetch candidate news/announcements/fund-flow context by default | `false` |
| `INDUSTRY_MAP_FILES` | No | Local code-to-industry/concepts/board-heat files | - |
| `INDUSTRY_PROVIDER` | No | Optional board/industry provider such as `akshare` | `none` |
| `SNAPSHOT_SOURCE_PRIORITY` | No | Snapshot source order | Depends on Tushare token |
| `SNAPSHOT_FALLBACK_MAX_AGE_HOURS` | No | Max acceptable age for last-good snapshot fallback; empty disables the guard | - |
| `TUSHARE_TOKEN` / `TUSHARE_API_TOKEN` | For Tushare | Tushare Pro token | - |
| `POST_ANALYZERS` | No | L3 analyzers; set `none` to disable | `scorecard` |
| `DSA_API_URL` | For DSA analyzer | DSA service URL or full analysis endpoint | - |
| `DAILY_ENRICH_ENABLED` | No | Enable candidate-level daily K-line enrichment | `false` |
| `DAILY_SOURCE` | No | Daily K-line source: `auto`, `tencent`, `sina`, `akshare`, `baostock`, or `tushare` | `auto` |
| `ALPHASIFT_DATA_DIR` | No | Run records, caches, and evaluation results | `./data` |
| `STRATEGIES_DIR` | No | Custom strategy directory | auto-detect |

Example multi-channel LiteLLM config:

```env
LLM_CHANNELS=primary
LLM_PRIMARY_PROTOCOL=openai
LLM_PRIMARY_BASE_URL=https://api.deepseek.com/v1
LLM_PRIMARY_API_KEYS=sk-xxx,sk-yyy
LLM_PRIMARY_MODELS=deepseek-chat,deepseek-reasoner
LITELLM_MODEL=openai/deepseek-chat
LITELLM_FALLBACK_MODELS=openai/gpt-4o-mini,anthropic/claude-3-5-sonnet
```

Example single-provider config:

```env
GEMINI_API_KEY=...
LITELLM_MODEL=gemini/gemini-2.5-flash
```

You can load external `.env` files repeatedly:

```bash
alphasift --env-file /path/to/daily_stock_analysis/.env \
  --env-file /path/to/daily_ai_assistant/.env \
  screen balanced_alpha
```

For the full configuration reference, see [docs/configuration.md](docs/configuration.md).

## Data sources

AlphaSift supports multiple A-share market snapshot sources and automatically falls back by priority.

Default without Tushare token:

```text
sina -> efinance -> akshare_em -> em_datacenter
```

Default with `TUSHARE_TOKEN` / `TUSHARE_API_TOKEN` and no manual priority override:

```text
tushare -> sina -> efinance -> akshare_em -> em_datacenter
```

| Source | Backend | Notes |
|---|---|---|
| `sina` | Sina Finance Market Center | Direct HTTP full-market source with PE/PB/turnover/market-cap fields |
| `efinance` | Eastmoney push2 | Fast during live sessions |
| `akshare_em` | Eastmoney push endpoint via AkShare-style access | Backup live source |
| `em_datacenter` | Eastmoney Data Center | Often available outside trading hours |
| `tushare` | Tushare Pro `daily` + `daily_basic` | Requires token; previous/nearest trading day data |

Daily K-line enrichment defaults to `DAILY_SOURCE=auto`.
The auto chain uses `tushare -> tencent -> sina -> akshare -> baostock` when a Tushare token is configured, otherwise `tencent -> sina -> akshare -> baostock`. Tencent is a direct HTTP K-line source with no wrapper dependency and is preferred over Eastmoney-heavy wrapper paths for candidate-level history enrichment; Sina provides a second direct HTTP fallback before wrapper sources. Repeatedly failing sources are temporarily skipped, and expired daily cache can be used as a marked stale fallback when every live daily source fails.

Source support matrix:

| Capability | Primary chain | Fields |
|---|---|---|
| Daily K-line enrichment | `tushare` when token exists, then `tencent`, `sina`, `akshare`, `baostock` with health-aware auto reordering | OHLCV, qfq where supported, technical factors, 20d volatility/ATR/drawdown controls, per-row `daily_source` provenance, `daily_quality_score`/flags, source-health stats |
| Full-market snapshot | `sina`, then `efinance`, `akshare_em`, `em_datacenter`; `tushare` first when token exists | price, change, amount, market cap, PE/PB, turnover |
| Candidate context | `news`, `fund_flow`, `announcement`, `quote` | news, announcements, fund flow, Tencent quote valuation/turnover |
| Last-good fallback | daily history cache and snapshot cache | marked with stale/fallback attrs when live sources fail |

If a source is unavailable or lacks fields required by a strategy, AlphaSift skips it and tries the next source. Eastmoney-only HTTP fallbacks use a shared throttled session to reduce connection churn and bursty access. If all live sources fail, the last-good snapshot fallback is explicitly marked as stale/fallback data; `SNAPSHOT_FALLBACK_MAX_AGE_HOURS` can reject overly old fallback cache to avoid repeating stale selections.

## Built-in strategies

| Strategy | Type | Description |
|---|---|---|
| `dual_low` | Value | Low PE + low PB defensive value screen |
| `volume_breakout` | Trend | Volume expansion and resistance breakout |
| `quality_value` | Value | Reasonable valuation, liquidity, and controlled volatility |
| `capital_heat` | Momentum | Active capital flow without extreme overheating |
| `oversold_reversal` | Reversal | Repair candidates with controlled drawdown and still-valid liquidity |
| `balanced_alpha` | Framework | General multi-factor discovery strategy |
| `momentum_quality` | Framework | Trend confirmation plus quality filters |
| `shrink_pullback` | Trend | Pullback into support during a broader uptrend; uses daily enrichment |

Add custom YAML strategies under `strategies/`. See [docs/strategy-guide.md](docs/strategy-guide.md).

## Project layout

```text
alphasift/
├── SKILL.md                  # Agent skill description and callable interface
├── README.zh-CN.md           # Chinese README
├── strategies/               # Strategy YAML files
├── docs/
│   ├── configuration.md      # Configuration reference
│   ├── design.md             # Design principles
│   ├── positioning.md        # Product positioning
│   ├── reference.md          # Structure, boundaries, observed runs
│   ├── scoring.md            # Scoring details
│   ├── strategy-guide.md     # Strategy authoring guide
│   └── usage.md              # Usage guide
└── alphasift/                # Python package
    ├── cli.py                # CLI entry point
    ├── config.py             # Environment configuration
    ├── context.py            # LLM context assembly
    ├── candidate_context.py  # Candidate news/announcement/fund-flow context
    ├── daily.py              # Daily K-line feature enrichment
    ├── hotspot.py            # Hotspot discovery/detail/cache contract
    ├── industry.py           # Industry/concept/board heat mapping
    ├── models.py             # Data models
    ├── snapshot.py           # Market snapshot loading and fallback
    ├── filter.py             # L1 hard filters
    ├── scorer.py             # Factor scoring
    ├── ranker.py             # L2 LLM ranking
    ├── risk.py               # Independent risk layer
    ├── post_analysis.py      # L3 post-analysis plugins
    ├── dsa.py                # Optional DSA integration
    ├── store.py              # Run persistence
    ├── evaluate.py           # T+N evaluation
    ├── pipeline.py           # Main orchestration
    └── strategy.py           # Strategy YAML loader
```

## Relationship with daily_stock_analysis

`daily_stock_analysis` (DSA) is an external single-stock deep-analysis service. AlphaSift is upstream: it discovers and ranks candidates across the market. DSA is downstream: it can analyze a small final shortlist in depth.

- AlphaSift does broad discovery, deterministic scoring, LLM ranking, hotspot analysis, and saved-run evaluation.
- DSA does individual stock deep analysis through its own API, usually `POST /api/v1/analysis/analyze`.
- The integration is optional and configured through `DSA_API_URL`.
- To control cost and latency, AlphaSift only calls DSA for final selected candidates.
- The default L3 analyzer is local `scorecard`; DSA and external HTTP analyzers are optional.

## Known limitations

- Strategies that depend on daily K-line features enrich only the L1 top candidates, not the entire historical market.
- AlphaSift is not a full backtesting engine or portfolio execution system.
- DSA post-analysis is synchronous and better suited to low-frequency final-candidate review.
- Tushare fallback depends on the user's own token, point balance, and permissions.
- T+N evaluation compares saved run prices with later snapshots; it is not a rigorous event-study backtest and does not model dividends, suspensions, slippage, or rebalancing constraints.
- The repository keeps both `strategies/` and `alphasift/strategies/` mirrors for development and packaged usage; built-in strategy files should stay in sync.

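To make the T+N limitation above concrete, here is a rough sketch of a price-comparison evaluation: the saved price against a later snapshot price, minus a flat transaction cost. It is an illustration only, not the logic behind `evaluate_saved_run`. `PickOutcome`, `evaluate_picks`, the `cost_rate` default, the `DEMO*` codes, and all prices are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PickOutcome:
    """Hypothetical result row for one saved pick."""
    code: str
    entry_price: float
    exit_price: Optional[float]
    net_return: Optional[float]  # None when no later price is available


def evaluate_picks(
    saved_prices: dict[str, float],
    later_prices: dict[str, float],
    cost_rate: float = 0.002,  # assumed flat round-trip cost, illustrative only
) -> list[PickOutcome]:
    """Compare saved prices with a later snapshot, net of a flat cost."""
    outcomes = []
    for code, entry in saved_prices.items():
        exit_price = later_prices.get(code)
        if exit_price is None or entry <= 0:
            # Missing from the later snapshot (for example, suspended):
            # this simple comparison cannot score the pick at all.
            outcomes.append(PickOutcome(code, entry, exit_price, None))
            continue
        gross = exit_price / entry - 1.0
        outcomes.append(PickOutcome(code, entry, exit_price, gross - cost_rate))
    return outcomes


# Made-up codes and prices, for illustration only.
saved = {"DEMO1": 10.0, "DEMO2": 5.0}
later = {"DEMO1": 11.0}  # DEMO2 is absent from the later snapshot
for outcome in evaluate_picks(saved, later):
    print(outcome)
```

Because the sketch only sees two price points per stock, dividends, intraday slippage, and rebalancing effects are invisible to it, which is why a comparison of this kind is not a substitute for an event-study backtest.
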
## Verification

Last recorded full-suite check:

```text
$ python -m pytest -q
176 passed, 1 skipped in 1.56s
```

## Documentation

- [SKILL.md](SKILL.md): agent skill description and function interface
- [README.zh-CN.md](README.zh-CN.md): Chinese README
- [docs/usage.md](docs/usage.md): usage guide
- [docs/configuration.md](docs/configuration.md): configuration reference
- [docs/positioning.md](docs/positioning.md): positioning and relative advantages
- [docs/comparison.md](docs/comparison.md): comparison, gaps, and priorities
- [docs/design.md](docs/design.md): design principles
- [docs/reference.md](docs/reference.md): structure, data-source boundaries, observed runs
- [docs/scoring.md](docs/scoring.md): scoring system details
- [docs/strategy-guide.md](docs/strategy-guide.md): custom strategy guide

## License

Apache License 2.0