diff --git a/scripts/e2e-finance-benchmark/cases/case-01-top10-financial-reports.txt b/scripts/e2e-finance-benchmark/cases/case-01-top10-financial-reports.txt
index 9c75bcce..4ace6554 100644
--- a/scripts/e2e-finance-benchmark/cases/case-01-top10-financial-reports.txt
+++ b/scripts/e2e-finance-benchmark/cases/case-01-top10-financial-reports.txt
@@ -8,7 +8,6 @@ Requirements:
 3. Generate an Excel file (.xlsx) with at least 4 sheets: `raw_data`, `company_scorecard`, `valuation`, `risk_matrix`.
 4. Generate a comprehensive report with cross-company comparison and tiering (core holding / watchlist / avoid), along with 2026 portfolio recommendations (including position ranges and trigger conditions).
 5. Output a separate `sources.md` listing key data source links and retrieval timestamps.
-6. Place all output in a new directory: `~/Desktop/multica-fin-bench/case-01/`.
-7. If unable to generate xlsx directly, explain why and provide structurally equivalent CSV files.
+6. If unable to generate xlsx directly, explain why and provide structurally equivalent CSV files.
 
 Execution requirements: First present an 8-12 step execution plan, then execute. Conclude with a self-check checklist confirming all files are complete.
diff --git a/scripts/e2e-finance-benchmark/cases/case-02-ai-value-chain-scorecard.txt b/scripts/e2e-finance-benchmark/cases/case-02-ai-value-chain-scorecard.txt
index e3b9cc34..c4ab9df8 100644
--- a/scripts/e2e-finance-benchmark/cases/case-02-ai-value-chain-scorecard.txt
+++ b/scripts/e2e-finance-benchmark/cases/case-02-ai-value-chain-scorecard.txt
@@ -10,6 +10,5 @@ Requirements:
 4. In `scenario_2026`, provide target ranges and trigger signals under three scenarios (optimistic / base / conservative).
 5. Produce `investment_memo.md` (including entry logic for the top 3 and avoidance logic for the bottom 3).
 6. Produce `sources.md` (source links + dates).
-7. Output directory: `~/Desktop/multica-fin-bench/case-02/`.
 
 Execution requirements: Plan before executing; conclude with a "reproducibility check" (can someone else reproduce your results following your steps).
diff --git a/scripts/e2e-finance-benchmark/cases/case-03-us-bank-stress-test.txt b/scripts/e2e-finance-benchmark/cases/case-03-us-bank-stress-test.txt
index a9439320..d7960d62 100644
--- a/scripts/e2e-finance-benchmark/cases/case-03-us-bank-stress-test.txt
+++ b/scripts/e2e-finance-benchmark/cases/case-03-us-bank-stress-test.txt
@@ -11,6 +11,5 @@ Requirements:
 4. Generate an Excel file (.xlsx) with sheets: `bank_raw`, `stress_assumptions`, `impact_estimate`, `ranking`.
 5. Generate `risk_brief.md` containing "top 5 risk signals to watch."
 6. Generate `sources.md`.
-7. Output to: `~/Desktop/multica-fin-bench/case-03/`.
 
 Execution requirements: Present methodology first, then results; conclude by listing the 3 assumptions you are least confident about.
diff --git a/scripts/e2e-finance-benchmark/cases/case-04-consumer-sector-macro-linkage.txt b/scripts/e2e-finance-benchmark/cases/case-04-consumer-sector-macro-linkage.txt
index c0b3c9dc..8034fb7d 100644
--- a/scripts/e2e-finance-benchmark/cases/case-04-consumer-sector-macro-linkage.txt
+++ b/scripts/e2e-finance-benchmark/cases/case-04-consumer-sector-macro-linkage.txt
@@ -10,6 +10,5 @@ Requirements:
 4. Generate an Excel file (.xlsx) with sheets: `company_metrics`, `macro_series`, `elasticity_matrix`, `portfolio_actions`.
 5. Generate `strategy_note.md` with 2026 sector allocation recommendations and rebalancing trigger conditions.
 6. Generate `sources.md`.
-7. Output directory: `~/Desktop/multica-fin-bench/case-04/`.
 
 Execution requirements: Each allocation recommendation must explicitly state the verifiable metrics behind it.
diff --git a/scripts/e2e-finance-benchmark/cases/case-05-energy-transport-sensitivity.txt b/scripts/e2e-finance-benchmark/cases/case-05-energy-transport-sensitivity.txt
index 37993e7a..30b33398 100644
--- a/scripts/e2e-finance-benchmark/cases/case-05-energy-transport-sensitivity.txt
+++ b/scripts/e2e-finance-benchmark/cases/case-05-energy-transport-sensitivity.txt
@@ -13,6 +13,5 @@ Requirements:
 4. Generate an Excel file (.xlsx) with sheets: `raw_financials`, `oil_scenarios`, `sensitivity_map`, `trade_ideas`.
 5. Generate `hedge_plan.md` proposing at least 2 hedging or paired trade strategies, including conditions under which they would fail.
 6. Generate `sources.md`.
-7. Output to: `~/Desktop/multica-fin-bench/case-05/`.
 
 Execution requirements: Conclusions must include "base position + hedge position + trigger thresholds."
diff --git a/scripts/e2e-finance-benchmark/cases/case-06-cross-asset-allocation.txt b/scripts/e2e-finance-benchmark/cases/case-06-cross-asset-allocation.txt
index 19f9c5fa..d4557551 100644
--- a/scripts/e2e-finance-benchmark/cases/case-06-cross-asset-allocation.txt
+++ b/scripts/e2e-finance-benchmark/cases/case-06-cross-asset-allocation.txt
@@ -12,6 +12,5 @@ Requirements:
 4. Generate an Excel file (.xlsx) with sheets: `price_returns`, `risk_metrics`, `corr_matrix`, `portfolio_defensive`, `portfolio_offensive`, `scenario_test`.
 5. Generate `allocation_memo.md` explaining why these two portfolios are actionable in 2026.
 6. Generate `sources.md`.
-7. Output directory: `~/Desktop/multica-fin-bench/case-06/`.
 
 Execution requirements: Explicitly state rebalancing frequency, stop-loss rules, and re-entry conditions for each portfolio.
diff --git a/scripts/e2e-finance-benchmark/cases/case-07-reit-rate-risk.txt b/scripts/e2e-finance-benchmark/cases/case-07-reit-rate-risk.txt
index 1a5bedf6..17a02303 100644
--- a/scripts/e2e-finance-benchmark/cases/case-07-reit-rate-risk.txt
+++ b/scripts/e2e-finance-benchmark/cases/case-07-reit-rate-risk.txt
@@ -9,6 +9,5 @@ Requirements:
 4. Generate an Excel file (.xlsx) with sheets: `reit_raw`, `debt_profile`, `rate_scenarios`, `selection_result`.
 5. Generate `reit_investment_note.md`.
 6. Generate `sources.md`.
-7. Output to: `~/Desktop/multica-fin-bench/case-07/`.
 
 Execution requirements: If data is missing, it must be explicitly marked as NA in the tables; silent omission is not allowed.
diff --git a/scripts/e2e-finance-benchmark/cases/case-08-earnings-quality-forensics.txt b/scripts/e2e-finance-benchmark/cases/case-08-earnings-quality-forensics.txt
index 15c2f8b0..fa9b1cc5 100644
--- a/scripts/e2e-finance-benchmark/cases/case-08-earnings-quality-forensics.txt
+++ b/scripts/e2e-finance-benchmark/cases/case-08-earnings-quality-forensics.txt
@@ -9,6 +9,5 @@ Requirements:
 3. Generate an Excel file (.xlsx) with sheets: `quality_raw`, `forensic_flags`, `rating_summary`, `watchlist_2026`.
 4. Generate `forensic_report.md` summarizing the 5 most concerning red flags.
 5. Generate `sources.md`.
-6. Output directory: `~/Desktop/multica-fin-bench/case-08/`.
 
 Execution requirements: The report must clearly distinguish "which conclusions are factual vs. which are inferred."
diff --git a/scripts/e2e-finance-benchmark/cases/case-09-post-earnings-drift-study.txt b/scripts/e2e-finance-benchmark/cases/case-09-post-earnings-drift-study.txt
index d8445072..4830f0f7 100644
--- a/scripts/e2e-finance-benchmark/cases/case-09-post-earnings-drift-study.txt
+++ b/scripts/e2e-finance-benchmark/cases/case-09-post-earnings-drift-study.txt
@@ -10,6 +10,5 @@ Requirements:
 4. Generate an Excel file (.xlsx) with sheets: `universe`, `signal_definition`, `group_performance`, `risk_controls`, `pilot_plan_2026`.
 5. Generate `pead_study.md` (covering methodology, results, sources of bias, and implementation recommendations).
 6. Generate `sources.md`.
-7. Output to: `~/Desktop/multica-fin-bench/case-09/`.
 
 Execution requirements: Must provide "failure scenarios" and objective conditions for "stopping the pilot."
diff --git a/scripts/e2e-finance-benchmark/cases/case-10-investment-committee-pack.txt b/scripts/e2e-finance-benchmark/cases/case-10-investment-committee-pack.txt
index 512a4d8c..83c26176 100644
--- a/scripts/e2e-finance-benchmark/cases/case-10-investment-committee-pack.txt
+++ b/scripts/e2e-finance-benchmark/cases/case-10-investment-committee-pack.txt
@@ -8,6 +8,5 @@ Requirements:
 3. In `action_tracker`, provide actionable items for Q2 2026, each with: trigger condition, target position change, risk control threshold, and review date.
 4. Additionally output `devil_advocate.md`, specifically rebutting your own core investment views with at least 5 counter-arguments.
 5. Additionally output `sources.md` listing key data sources and dates.
-6. Place all files in a new directory: `~/Desktop/multica-fin-bench/case-10/`.
 
 Execution requirements: Plan first, then execute; conclude with a "10-minute oral briefing outline for the investment committee."