chore(benchmark): remove hardcoded output directories from case prompts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Jiayuan Zhang 2026-02-16 11:17:24 +08:00
parent 357bf326e0
commit 63e7541149
10 changed files with 1 additions and 11 deletions

@@ -8,7 +8,6 @@ Requirements:
3. Generate an Excel file (.xlsx) with at least 4 sheets: `raw_data`, `company_scorecard`, `valuation`, `risk_matrix`.
4. Generate a comprehensive report with cross-company comparison and tiering (core holding / watchlist / avoid), along with 2026 portfolio recommendations (including position ranges and trigger conditions).
5. Output a separate `sources.md` listing key data source links and retrieval timestamps.
-6. Place all output in a new directory: `~/Desktop/multica-fin-bench/case-01/`.
-7. If unable to generate xlsx directly, explain why and provide structurally equivalent CSV files.
+6. If unable to generate xlsx directly, explain why and provide structurally equivalent CSV files.
Execution requirements: First present an 8-12 step execution plan, then execute. Conclude with a self-check checklist confirming all files are complete.

@@ -10,6 +10,5 @@ Requirements:
4. In `scenario_2026`, provide target ranges and trigger signals under three scenarios (optimistic / base / conservative).
5. Produce `investment_memo.md` (including entry logic for the top 3 and avoidance logic for the bottom 3).
6. Produce `sources.md` (source links + dates).
-7. Output directory: `~/Desktop/multica-fin-bench/case-02/`.
Execution requirements: Plan before executing; conclude with a "reproducibility check" (can someone else reproduce your results following your steps).

@@ -11,6 +11,5 @@ Requirements:
4. Generate an Excel file (.xlsx) with sheets: `bank_raw`, `stress_assumptions`, `impact_estimate`, `ranking`.
5. Generate `risk_brief.md` containing "top 5 risk signals to watch."
6. Generate `sources.md`.
-7. Output to: `~/Desktop/multica-fin-bench/case-03/`.
Execution requirements: Present methodology first, then results; conclude by listing the 3 assumptions you are least confident about.

@@ -10,6 +10,5 @@ Requirements:
4. Generate an Excel file (.xlsx) with sheets: `company_metrics`, `macro_series`, `elasticity_matrix`, `portfolio_actions`.
5. Generate `strategy_note.md` with 2026 sector allocation recommendations and rebalancing trigger conditions.
6. Generate `sources.md`.
-7. Output directory: `~/Desktop/multica-fin-bench/case-04/`.
Execution requirements: Each allocation recommendation must explicitly state the verifiable metrics behind it.

@@ -13,6 +13,5 @@ Requirements:
4. Generate an Excel file (.xlsx) with sheets: `raw_financials`, `oil_scenarios`, `sensitivity_map`, `trade_ideas`.
5. Generate `hedge_plan.md` proposing at least 2 hedging or paired trade strategies, including conditions under which they would fail.
6. Generate `sources.md`.
-7. Output to: `~/Desktop/multica-fin-bench/case-05/`.
Execution requirements: Conclusions must include "base position + hedge position + trigger thresholds."

@@ -12,6 +12,5 @@ Requirements:
4. Generate an Excel file (.xlsx) with sheets: `price_returns`, `risk_metrics`, `corr_matrix`, `portfolio_defensive`, `portfolio_offensive`, `scenario_test`.
5. Generate `allocation_memo.md` explaining why these two portfolios are actionable in 2026.
6. Generate `sources.md`.
-7. Output directory: `~/Desktop/multica-fin-bench/case-06/`.
Execution requirements: Explicitly state rebalancing frequency, stop-loss rules, and re-entry conditions for each portfolio.

@@ -9,6 +9,5 @@ Requirements:
4. Generate an Excel file (.xlsx) with sheets: `reit_raw`, `debt_profile`, `rate_scenarios`, `selection_result`.
5. Generate `reit_investment_note.md`.
6. Generate `sources.md`.
-7. Output to: `~/Desktop/multica-fin-bench/case-07/`.
Execution requirements: If data is missing, it must be explicitly marked as NA in the tables; silent omission is not allowed.

@@ -9,6 +9,5 @@ Requirements:
3. Generate an Excel file (.xlsx) with sheets: `quality_raw`, `forensic_flags`, `rating_summary`, `watchlist_2026`.
4. Generate `forensic_report.md` summarizing the 5 most concerning red flags.
5. Generate `sources.md`.
-6. Output directory: `~/Desktop/multica-fin-bench/case-08/`.
Execution requirements: The report must clearly distinguish "which conclusions are factual vs. which are inferred."

@@ -10,6 +10,5 @@ Requirements:
4. Generate an Excel file (.xlsx) with sheets: `universe`, `signal_definition`, `group_performance`, `risk_controls`, `pilot_plan_2026`.
5. Generate `pead_study.md` (covering methodology, results, sources of bias, and implementation recommendations).
6. Generate `sources.md`.
-7. Output to: `~/Desktop/multica-fin-bench/case-09/`.
Execution requirements: Must provide "failure scenarios" and objective conditions for "stopping the pilot."

@@ -8,6 +8,5 @@ Requirements:
3. In `action_tracker`, provide actionable items for Q2 2026, each with: trigger condition, target position change, risk control threshold, and review date.
4. Additionally output `devil_advocate.md`, specifically rebutting your own core investment views with at least 5 counter-arguments.
5. Additionally output `sources.md` listing key data sources and dates.
-6. Place all files in a new directory: `~/Desktop/multica-fin-bench/case-10/`.
Execution requirements: Plan first, then execute; conclude with a "10-minute oral briefing outline for the investment committee."
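
The edit made across the ten case prompts (dropping the numbered requirement that names a `~/Desktop/multica-fin-bench/case-NN/` directory and, as in the first file, renumbering any items that follow) can be sketched as a small script. This is only an illustration of the transformation, not how the commit was produced; the function name `strip_output_dir` and both regular expressions are assumptions introduced here.

```python
import re

# Assumed pattern: a numbered requirement that mentions the hardcoded
# benchmark directory seen in the removed lines of this diff.
OUTPUT_DIR_ITEM = re.compile(r"^\d+\.\s.*~/Desktop/multica-fin-bench/case-\d+/")
# Any numbered list item, capturing its number.
NUMBERED_ITEM = re.compile(r"^(\d+)\.\s")


def strip_output_dir(prompt: str) -> str:
    """Drop output-directory requirements and renumber the items after them."""
    kept = []
    shift = 0  # number of numbered items removed so far
    for line in prompt.splitlines():
        if OUTPUT_DIR_ITEM.match(line):
            shift += 1
            continue
        match = NUMBERED_ITEM.match(line)
        if match and shift:
            # Close the gap left by the removed item(s).
            number = int(match.group(1)) - shift
            line = f"{number}. " + line[match.end():]
        kept.append(line)
    return "\n".join(kept)
```

Applied to the first prompt, this would remove item 6 and turn the old item 7 into the new item 6, matching the one addition in the diff; in the other nine prompts the directory line is the last numbered item, so nothing needs renumbering.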