chore(benchmark): translate finance e2e case prompts to English

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Jiayuan Zhang 2026-02-16 02:39:35 +08:00
parent c38a576b8f
commit 486b090577
10 changed files with 121 additions and 121 deletions

@@ -1,14 +1,14 @@
请完成一个高复杂度投研任务(中文输出):
Complete a high-complexity investment research task:
目标:分析美股市值排名前十公司在最近三个完整财年的财报与经营质量,并给出 2026 年(2026-01-01 到 2026-12-31)的投资建议。
Objective: Analyze the top 10 US stocks by market capitalization across their most recent three complete fiscal years and provide investment recommendations for 2026 (2026-01-01 to 2026-12-31).
硬性要求:
1. 以“截至 2026-02-01 的美股市值前十”为样本;如果个别公司数据窗口不完整,请注明并用“最近可得完整财年”替代。
2. 每家公司生成 1 份详细分析,至少包含:收入与利润结构、毛利率/营业利润率趋势、现金流质量、资本开支与回购分红、估值区间、主要风险。
3. 生成一个 Excel 文件(.xlsx),至少包含:`raw_data`、`company_scorecard`、`valuation`、`risk_matrix` 四个工作表。
4. 生成一份综合报告,对十家公司做横向比较与分层(核心持仓/观察/回避),并给出 2026 年组合建议(含仓位区间和触发条件)。
5. 单独输出 `sources.md`,列出关键数据来源链接与抓取时间。
6. 所有结果统一放到新目录:`~/Desktop/multica-fin-bench/case-01/`。
7. 如果无法直接生成 xlsx,请说明原因并提供结构等价的 CSV 文件集。
Requirements:
1. Use "top 10 US stocks by market cap as of 2026-02-01" as the sample; if data windows are incomplete for certain companies, note this and substitute with the most recent available complete fiscal year.
2. Generate 1 detailed analysis per company, covering at minimum: revenue and profit structure, gross/operating margin trends, cash flow quality, capex and buybacks/dividends, valuation range, and key risks.
3. Generate an Excel file (.xlsx) with at least 4 sheets: `raw_data`, `company_scorecard`, `valuation`, `risk_matrix`.
4. Generate a comprehensive report with cross-company comparison and tiering (core holding / watchlist / avoid), along with 2026 portfolio recommendations (including position ranges and trigger conditions).
5. Output a separate `sources.md` listing key data source links and retrieval timestamps.
6. Place all output in a new directory: `~/Desktop/multica-fin-bench/case-01/`.
7. If unable to generate xlsx directly, explain why and provide structurally equivalent CSV files.
执行要求:先给出 8-12 步执行计划,再执行。最后做自检清单,确认文件齐全。
Execution requirements: First present an 8-12 step execution plan, then execute. Conclude with a self-check checklist confirming all files are complete.

@@ -1,15 +1,15 @@
请构建“AI 产业链基本面与估值评分”项目(中文输出)。
Build an "AI Value Chain Fundamentals & Valuation Scorecard" project.
股票池:NVDA、AMD、AVGO、MSFT、GOOGL、AMZN、META、TSM、ASML、ANET。
时间范围:2023-01-01 到 2025-12-31(不足处用最近可得数据补齐并标记)。
Stock universe: NVDA, AMD, AVGO, MSFT, GOOGL, AMZN, META, TSM, ASML, ANET.
Time range: 2023-01-01 to 2025-12-31 (fill gaps with most recent available data and flag accordingly).
任务要求:
1. 构建一个 100 分制评分模型,至少包含 6 个维度:增长、盈利能力、资本效率、研发强度、现金流质量、估值安全边际。
2. 给出每个维度的权重与打分逻辑,必须可复现。
3. 生成 Excel(.xlsx):`input_data`、`factor_scores`、`weighted_rank`、`scenario_2026`。
4. 在 `scenario_2026` 中给出三种情景(乐观/基准/保守)下的目标区间与触发信号。
5. 产出 `investment_memo.md`(含前 3 名的建仓逻辑和后 3 名的回避逻辑)。
6. 产出 `sources.md`(来源链接 + 日期)。
7. 输出目录:`~/Desktop/multica-fin-bench/case-02/`。
Requirements:
1. Construct a 100-point scoring model with at least 6 dimensions: growth, profitability, capital efficiency, R&D intensity, cash flow quality, and valuation margin of safety.
2. Provide weights and scoring logic for each dimension; must be reproducible.
3. Generate an Excel file (.xlsx) with sheets: `input_data`, `factor_scores`, `weighted_rank`, `scenario_2026`.
4. In `scenario_2026`, provide target ranges and trigger signals under three scenarios (optimistic / base / conservative).
5. Produce `investment_memo.md` (including entry logic for the top 3 and avoidance logic for the bottom 3).
6. Produce `sources.md` (source links + dates).
7. Output directory: `~/Desktop/multica-fin-bench/case-02/`.
执行要求:先计划后执行;最后输出“可复现性检查”(别人按你的步骤是否能复现)。
Execution requirements: Plan before executing; conclude with a "reproducibility check" (can someone else reproduce your results following your steps).
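
Requirements 1 and 2 of this prompt ask for a 100-point model whose weights and scoring logic are reproducible. As a rough illustration only, one way to satisfy that is to min-max normalize each dimension across the stock universe and take a weighted sum. Everything below is an assumption made for the sketch: the weight values, the `min_max` and `score_universe` helpers, the "higher raw value is better" convention, and the placeholder tickers `AAA`, `BBB`, `CCC` with made-up inputs. The prompt itself fixes neither the weights nor the normalization method.

```python
# Hypothetical weights in points; they must sum to 100. The prompt does not specify them.
WEIGHTS = {
    "growth": 20,
    "profitability": 20,
    "capital_efficiency": 15,
    "rd_intensity": 15,
    "cash_flow_quality": 15,
    "valuation_margin_of_safety": 15,
}


def min_max(values):
    """Scale one dimension to [0, 1] across tickers (assumes higher is better)."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # No dispersion in this dimension: give every ticker a neutral score.
        return {name: 0.5 for name in values}
    return {name: (v - lo) / (hi - lo) for name, v in values.items()}


def score_universe(raw):
    """raw maps dimension -> {ticker: raw value}; returns (ticker, score) sorted best first."""
    if sum(WEIGHTS.values()) != 100:
        raise ValueError("weights must sum to 100 points")
    normalized = {dim: min_max(raw[dim]) for dim in WEIGHTS}
    tickers = list(raw[next(iter(WEIGHTS))])
    totals = {
        t: sum(WEIGHTS[dim] * normalized[dim][t] for dim in WEIGHTS)
        for t in tickers
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up inputs for three placeholder tickers; not real market data.
demo = {dim: {"AAA": 3.0, "BBB": 2.0, "CCC": 1.0} for dim in WEIGHTS}
ranked = score_universe(demo)
```

Because the weights and the normalization are both stated in code, a second person running the same inputs gets the same ranking, which is the property the "reproducibility check" is meant to confirm.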

@@ -1,16 +1,16 @@
请做“美国大型银行 2026 压力测试”任务(中文输出)。
Perform a "US Major Bank 2026 Stress Test" task.
样本:JPM、BAC、C、WFC、GS、MS。
Sample: JPM, BAC, C, WFC, GS, MS.
任务要求:
1. 基于最近三个完整财年(优先 2023-2025)整理关键指标:净息差(NIM)、CET1、贷款损失准备、商业地产敞口、存款成本变化、未实现损失等。
2. 构建两套压力情景:
- Mild Recession:失业率上行 +150bp,联邦基金利率下行 100bp
- Severe Recession:失业率上行 +300bp,联邦基金利率下行 200bp,CRE 违约率显著上行
3. 估算各银行在两种情景下的利润与资本充足率变化方向,并给出脆弱点排序。
4. 生成 Excel(.xlsx):`bank_raw`、`stress_assumptions`、`impact_estimate`、`ranking`。
5. 生成 `risk_brief.md`,包含“最需要警惕的 5 个风险信号”。
6. 生成 `sources.md`。
7. 输出到:`~/Desktop/multica-fin-bench/case-03/`。
Requirements:
1. Compile key metrics from the most recent three complete fiscal years (preferably 2023-2025): net interest margin (NIM), CET1, loan loss provisions, commercial real estate (CRE) exposure, deposit cost changes, unrealized losses, etc.
2. Construct two stress scenarios:
- Mild Recession: unemployment +150bp, federal funds rate -100bp
- Severe Recession: unemployment +300bp, federal funds rate -200bp, CRE default rate significantly higher
3. Estimate directional changes in profit and capital adequacy for each bank under both scenarios, and rank vulnerability.
4. Generate an Excel file (.xlsx) with sheets: `bank_raw`, `stress_assumptions`, `impact_estimate`, `ranking`.
5. Generate `risk_brief.md` containing "top 5 risk signals to watch."
6. Generate `sources.md`.
7. Output to: `~/Desktop/multica-fin-bench/case-03/`.
执行要求:先给方法论,再给结果;最后列出你最不确定的 3 个假设。
Execution requirements: Present methodology first, then results; conclude by listing the 3 assumptions you are least confident about.

@@ -1,15 +1,15 @@
请做“美国消费板块与宏观变量联动分析”(中文输出)。
Perform a "US Consumer Sector & Macro Variable Linkage Analysis."
样本公司:WMT、COST、TGT、HD、LOW、MCD、SBUX、NKE、DIS、AMZN。
时间范围:2023-01-01 到 2025-12-31。
Sample companies: WMT, COST, TGT, HD, LOW, MCD, SBUX, NKE, DIS, AMZN.
Time range: 2023-01-01 to 2025-12-31.
任务要求:
1. 将公司分为“必需消费/可选消费”两组,比较收入增速、利润率、库存变化、同店销售(如可得)与现金流质量。
2. 结合宏观变量(CPI、实际工资、失业率、利率)分析各组盈利弹性。
3. 构建“2026 三情景”盈利弹性矩阵:软着陆/再通胀/衰退。
4. 生成 Excel(.xlsx):`company_metrics`、`macro_series`、`elasticity_matrix`、`portfolio_actions`。
5. 生成 `strategy_note.md`,给出 2026 年行业配置建议与再平衡触发条件。
6. 生成 `sources.md`。
7. 输出目录:`~/Desktop/multica-fin-bench/case-04/`。
Requirements:
1. Split companies into "consumer staples" and "consumer discretionary" groups; compare revenue growth, margins, inventory changes, same-store sales (if available), and cash flow quality.
2. Analyze each group's earnings elasticity relative to macro variables (CPI, real wages, unemployment, interest rates).
3. Build a "2026 three-scenario" earnings elasticity matrix: soft landing / reflation / recession.
4. Generate an Excel file (.xlsx) with sheets: `company_metrics`, `macro_series`, `elasticity_matrix`, `portfolio_actions`.
5. Generate `strategy_note.md` with 2026 sector allocation recommendations and rebalancing trigger conditions.
6. Generate `sources.md`.
7. Output directory: `~/Desktop/multica-fin-bench/case-04/`.
执行要求:必须明确写出每个配置建议背后的可验证指标。
Execution requirements: Each allocation recommendation must explicitly state the verifiable metrics behind it.

@@ -1,18 +1,18 @@
请完成“能源价格冲击下的能源与运输行业敏感性分析”(中文输出)。
Complete an "Energy Price Shock Sensitivity Analysis for Energy & Transport Sectors."
样本:XOM、CVX、COP、SLB、DAL、UAL、FDX、UPS。
时间范围:2023-01-01 到 2025-12-31。
Sample: XOM, CVX, COP, SLB, DAL, UAL, FDX, UPS.
Time range: 2023-01-01 to 2025-12-31.
任务要求:
1. 归纳各公司对油价/燃料成本的敏感方向与经营杠杆来源。
2. 构建三种油价路径(2026 年):
- 情景A:WTI 均价 60 美元
- 情景B:WTI 均价 80 美元
- 情景C:WTI 均价 100 美元
3. 估计不同情景下各公司盈利和估值的相对变化方向(可用区间而非点估计,但要说明依据)。
4. 生成 Excel(.xlsx):`raw_financials`、`oil_scenarios`、`sensitivity_map`、`trade_ideas`。
5. 生成 `hedge_plan.md`,提出至少 2 套对冲或配对交易思路,并说明失效条件。
6. 生成 `sources.md`。
7. 输出到:`~/Desktop/multica-fin-bench/case-05/`。
Requirements:
1. Summarize each company's sensitivity direction to oil/fuel costs and sources of operating leverage.
2. Construct three oil price paths for 2026:
- Scenario A: WTI average $60
- Scenario B: WTI average $80
- Scenario C: WTI average $100
3. Estimate directional changes in earnings and valuation for each company under different scenarios (ranges are acceptable over point estimates, but rationale must be provided).
4. Generate an Excel file (.xlsx) with sheets: `raw_financials`, `oil_scenarios`, `sensitivity_map`, `trade_ideas`.
5. Generate `hedge_plan.md` proposing at least 2 hedging or paired trade strategies, including conditions under which they would fail.
6. Generate `sources.md`.
7. Output to: `~/Desktop/multica-fin-bench/case-05/`.
执行要求:结论必须包含“基准仓位 + 对冲仓位 + 触发阈值”。
Execution requirements: Conclusions must include "base position + hedge position + trigger thresholds."

@@ -1,17 +1,17 @@
请做“跨资产战术配置(2026)”项目(中文输出)。
Build a "Cross-Asset Tactical Allocation (2026)" project.
资产池:SPY、QQQ、IWM、TLT、IEF、HYG、GLD、DBC、BTC-USD。
历史区间:2021-01-01 到 2025-12-31(月频即可)。
Asset universe: SPY, QQQ, IWM, TLT, IEF, HYG, GLD, DBC, BTC-USD.
Historical period: 2021-01-01 to 2025-12-31 (monthly frequency is sufficient).
任务要求:
1. 计算并比较关键指标:年化收益、波动率、最大回撤、Sharpe、相关性矩阵。
2. 设计两种组合:
- 防守型(目标最大回撤尽量低)
- 进攻型(目标风险调整后收益更高)
3. 对两种组合做 2026 三情景压力测试(增长放缓/通胀反复/流动性宽松),给出调仓规则。
4. 生成 Excel(.xlsx):`price_returns`、`risk_metrics`、`corr_matrix`、`portfolio_defensive`、`portfolio_offensive`、`scenario_test`。
5. 生成 `allocation_memo.md`,说明为什么这两种组合在 2026 年有可执行性。
6. 生成 `sources.md`。
7. 输出目录:`~/Desktop/multica-fin-bench/case-06/`。
Requirements:
1. Calculate and compare key metrics: annualized return, volatility, maximum drawdown, Sharpe ratio, and correlation matrix.
2. Design two portfolios:
- Defensive (target: minimize maximum drawdown)
- Offensive (target: higher risk-adjusted returns)
3. Stress test both portfolios under three 2026 scenarios (growth slowdown / inflation resurgence / liquidity easing), and provide rebalancing rules.
4. Generate an Excel file (.xlsx) with sheets: `price_returns`, `risk_metrics`, `corr_matrix`, `portfolio_defensive`, `portfolio_offensive`, `scenario_test`.
5. Generate `allocation_memo.md` explaining why these two portfolios are actionable in 2026.
6. Generate `sources.md`.
7. Output directory: `~/Desktop/multica-fin-bench/case-06/`.
执行要求:明确列出每个组合的再平衡频率、止损规则和再入场条件。
Execution requirements: Explicitly state rebalancing frequency, stop-loss rules, and re-entry conditions for each portfolio.
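
Requirement 1 of this prompt names annualized return, volatility, maximum drawdown, and Sharpe ratio at monthly frequency. The sketch below shows one common set of conventions for those four numbers; the correlation matrix is left out for brevity. The `risk_metrics` helper, the simple `risk_free_annual / 12` monthly risk-free approximation, and the sample returns are assumptions for illustration, not something the prompt prescribes.

```python
import math
import statistics


def risk_metrics(monthly_returns, risk_free_annual=0.0):
    """Compute basic risk metrics from a list of simple monthly returns."""
    n = len(monthly_returns)
    if n < 2:
        raise ValueError("need at least two monthly returns")

    # Geometric annualization: compound all months, then rescale to 12 months.
    growth = math.prod(1.0 + r for r in monthly_returns)
    annualized_return = growth ** (12.0 / n) - 1.0

    # Sample standard deviation of monthly returns, scaled by sqrt(12).
    monthly_vol = statistics.stdev(monthly_returns)
    annualized_vol = monthly_vol * math.sqrt(12.0)

    # Maximum drawdown: worst peak-to-trough decline of the wealth curve.
    wealth = peak = 1.0
    max_drawdown = 0.0
    for r in monthly_returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        max_drawdown = min(max_drawdown, wealth / peak - 1.0)

    # Sharpe ratio from mean monthly excess return (assumed rf/12 per month).
    excess = statistics.mean(monthly_returns) - risk_free_annual / 12.0
    sharpe = excess / monthly_vol * math.sqrt(12.0) if monthly_vol > 0 else float("nan")

    return {
        "annualized_return": annualized_return,
        "annualized_vol": annualized_vol,
        "max_drawdown": max_drawdown,
        "sharpe": sharpe,
    }


# Made-up returns, not real market data: +10%, -50%, +20%.
sample = risk_metrics([0.10, -0.50, 0.20])
```

Here the wealth curve peaks after the first month and halves in the second, so `sample["max_drawdown"]` is approximately -0.5. Other Sharpe conventions exist (for example, using the geometric annualized return in the numerator), so whichever one is used should be written down in the `risk_metrics` sheet.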

@@ -1,14 +1,14 @@
请做“高利率环境下 REIT 投资筛选”任务(中文输出)。
Perform a "REIT Investment Screening in a High-Rate Environment" task.
样本:VNQ、PLD、AMT、EQIX、O、SPG、PSA、DLR。
Sample: VNQ, PLD, AMT, EQIX, O, SPG, PSA, DLR.
任务要求:
1. 整理最近三个完整财年的关键指标:FFO/AFFO 增速、杠杆、利息覆盖、债务到期结构、分红覆盖。
2. 设计 2026 年三种利率情景(10Y 美债收益率 3.5% / 4.5% / 5.5%),分析估值压力与分红可持续性。
3. 给出“持有/观察/回避”分类,并解释最关键的 2-3 个驱动因子。
4. 生成 Excel(.xlsx):`reit_raw`、`debt_profile`、`rate_scenarios`、`selection_result`。
5. 生成 `reit_investment_note.md`。
6. 生成 `sources.md`。
7. 输出到:`~/Desktop/multica-fin-bench/case-07/`。
Requirements:
1. Compile key metrics from the most recent three complete fiscal years: FFO/AFFO growth, leverage, interest coverage, debt maturity profile, and dividend coverage.
2. Design three 2026 interest rate scenarios (10Y Treasury yield at 3.5% / 4.5% / 5.5%) and analyze valuation pressure and dividend sustainability.
3. Classify each as "hold / watchlist / avoid" and explain the 2-3 most critical driving factors.
4. Generate an Excel file (.xlsx) with sheets: `reit_raw`, `debt_profile`, `rate_scenarios`, `selection_result`.
5. Generate `reit_investment_note.md`.
6. Generate `sources.md`.
7. Output to: `~/Desktop/multica-fin-bench/case-07/`.
执行要求:如果数据缺失,必须在表中显式标注 NA,不允许静默跳过。
Execution requirements: If data is missing, it must be explicitly marked as NA in the tables; silent omission is not allowed.
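
The execution requirement above forbids silently dropping missing values. A minimal sketch of how a table writer could enforce that is shown here, using CSV from the Python standard library rather than xlsx so that it stays self-contained. The `write_table_with_na` helper, the column names, and the placeholder tickers `XXX` and `YYY` are hypothetical and the values are made up.

```python
import csv
import io


def write_table_with_na(rows, fieldnames, out):
    """Write rows as CSV, emitting a literal NA for any missing or None field."""
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    for row in rows:
        # Every declared column is written; absent keys and None both become "NA".
        writer.writerow(
            {name: ("NA" if row.get(name) is None else row[name]) for name in fieldnames}
        )


# Made-up rows: one has an explicit None, the other omits the column entirely.
columns = ["ticker", "ffo_growth", "interest_coverage"]
rows = [
    {"ticker": "XXX", "ffo_growth": 0.05, "interest_coverage": None},
    {"ticker": "YYY", "ffo_growth": 0.02},
]
buffer = io.StringIO()
write_table_with_na(rows, columns, buffer)
lines = buffer.getvalue().splitlines()
```

Iterating over the declared column list, rather than over each row's own keys, is what keeps a missing field from disappearing: every row ends up with the same columns, and gaps show up as `NA`.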

@@ -1,14 +1,14 @@
请做“财报质量法证分析”(中文输出)。
Perform an "Earnings Quality Forensic Analysis."
样本:AAPL、MSFT、GOOGL、AMZN、META、NVDA、TSLA、BRK.B、UNH、JPM。
时间范围:2023-01-01 到 2025-12-31。
Sample: AAPL, MSFT, GOOGL, AMZN, META, NVDA, TSLA, BRK.B, UNH, JPM.
Time range: 2023-01-01 to 2025-12-31.
任务要求:
1. 建立财报质量检查框架,至少包含:应计项目质量、经营现金流与净利润匹配度、股权激励稀释、回购与债务关系、一次性项目影响。
2. 对每家公司给出 Red/Yellow/Green 评级,并给出可追踪的证据。
3. 生成 Excel(.xlsx):`quality_raw`、`forensic_flags`、`rating_summary`、`watchlist_2026`。
4. 生成 `forensic_report.md`,总结最值得警惕的 5 个红旗。
5. 生成 `sources.md`。
6. 输出目录:`~/Desktop/multica-fin-bench/case-08/`。
Requirements:
1. Establish an earnings quality inspection framework covering at minimum: accruals quality, operating cash flow to net income matching, stock-based compensation dilution, buyback-to-debt relationship, and one-time item impact.
2. Assign each company a Red / Yellow / Green rating with traceable supporting evidence.
3. Generate an Excel file (.xlsx) with sheets: `quality_raw`, `forensic_flags`, `rating_summary`, `watchlist_2026`.
4. Generate `forensic_report.md` summarizing the 5 most concerning red flags.
5. Generate `sources.md`.
6. Output directory: `~/Desktop/multica-fin-bench/case-08/`.
执行要求:必须在报告中明确“哪些结论是事实、哪些是推断”。
Execution requirements: The report must clearly distinguish "which conclusions are factual vs. which are inferred."

@@ -1,15 +1,15 @@
请做“财报后漂移(PEAD)策略可行性研究”(中文输出)。
Perform a "Post-Earnings Announcement Drift (PEAD) Strategy Feasibility Study."
研究区间:2023-01-01 到 2025-12-31。
样本:请选取至少 30 只美股大中盘(给出选择标准)。
Research period: 2023-01-01 to 2025-12-31.
Sample: Select at least 30 US large/mid-cap stocks (provide selection criteria).
任务要求:
1. 定义一个可执行的 PEAD 信号(例如财报后 1-3 天信息、盈利超预期代理变量或公告后动量代理变量),并说明局限性。
2. 对样本进行分组比较(高信号/低信号),分析随后 1 个月与 3 个月表现差异。
3. 加入基础风控规则(仓位上限、止损、行业暴露限制),评估策略在 2026 年是否值得小规模试运行。
4. 生成 Excel(.xlsx):`universe`、`signal_definition`、`group_performance`、`risk_controls`、`pilot_plan_2026`。
5. 生成 `pead_study.md`(包含方法、结果、偏差来源、落地建议)。
6. 生成 `sources.md`。
7. 输出到:`~/Desktop/multica-fin-bench/case-09/`。
Requirements:
1. Define an executable PEAD signal (e.g., post-earnings 1-3 day information, earnings surprise proxy, or post-announcement momentum proxy) and explain its limitations.
2. Group the sample (high signal / low signal) and analyze performance differences at 1-month and 3-month horizons.
3. Add basic risk controls (position limits, stop-loss, sector exposure limits) and evaluate whether the strategy warrants a small-scale pilot in 2026.
4. Generate an Excel file (.xlsx) with sheets: `universe`, `signal_definition`, `group_performance`, `risk_controls`, `pilot_plan_2026`.
5. Generate `pead_study.md` (covering methodology, results, sources of bias, and implementation recommendations).
6. Generate `sources.md`.
7. Output to: `~/Desktop/multica-fin-bench/case-09/`.
执行要求:必须给出“失败场景”与“停止试运行”的客观条件。
Execution requirements: Must provide "failure scenarios" and objective conditions for "stopping the pilot."

@@ -1,13 +1,13 @@
请制作“2026 年 Q2 投资委员会材料包”(中文输出)。
Produce a "Q2 2026 Investment Committee Materials Pack."
目标:面向一个美元多资产组合,形成可直接开会使用的投委会文件。
Objective: Create meeting-ready investment committee documents for a USD multi-asset portfolio.
任务要求:
1. 输出一个总览文件 `committee_pack.md`,结构至少包括:宏观判断、权益、利率、信用、商品、组合风险、执行清单。
2. 输出一个 Excel(.xlsx)工作簿,至少包含:`macro_dashboard`、`equity_watchlist`、`rates_credit`、`commodity_view`、`portfolio_risk`、`action_tracker`。
3. 在 `action_tracker` 中给出 2026 年 Q2 可执行动作,每条动作要有:触发条件、目标仓位变化、风控阈值、复盘时间点。
4. 额外输出 `devil_advocate.md`,专门反驳你自己的核心投资观点,至少给 5 条反方论据。
5. 额外输出 `sources.md`,列明关键数据来源与日期。
6. 所有文件放到新目录:`~/Desktop/multica-fin-bench/case-10/`。
Requirements:
1. Output a summary document `committee_pack.md` with at least the following sections: macro outlook, equities, rates, credit, commodities, portfolio risk, and action list.
2. Output an Excel workbook (.xlsx) with at least these sheets: `macro_dashboard`, `equity_watchlist`, `rates_credit`, `commodity_view`, `portfolio_risk`, `action_tracker`.
3. In `action_tracker`, provide actionable items for Q2 2026, each with: trigger condition, target position change, risk control threshold, and review date.
4. Additionally output `devil_advocate.md`, specifically rebutting your own core investment views with at least 5 counter-arguments.
5. Additionally output `sources.md` listing key data sources and dates.
6. Place all files in a new directory: `~/Desktop/multica-fin-bench/case-10/`.
执行要求:先列计划再执行;最后给“投委会 10 分钟口头汇报提纲”。
Execution requirements: Plan first, then execute; conclude with a "10-minute oral briefing outline for the investment committee."