feat(benchmark): parallelize finance e2e runs

Jiayuan Zhang 2026-02-16 02:15:55 +08:00
parent b897719775
commit c38a576b8f
2 changed files with 125 additions and 68 deletions

@@ -32,6 +32,8 @@ scripts/e2e-finance-benchmark/run.sh
The script defaults:
- Providers: `kimi-coding claude-code`
- Case glob: `case-*.txt`
- Max parallel workers: `2`
- Per-case timeout: `900s` (set `CASE_TIMEOUT_SEC=0` to disable)
- Output directory: `.context/finance-e2e-runs/<timestamp>/`
Generated artifact:
@@ -51,6 +53,12 @@ Run only specific cases by glob:
```bash
CASE_GLOB="case-0[1-3]*.txt" scripts/e2e-finance-benchmark/run.sh
```
Run with higher parallelism for long-horizon tasks:
```bash
MAX_PARALLEL=4 CASE_TIMEOUT_SEC=2700 scripts/e2e-finance-benchmark/run.sh
```
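The two settings above can be read as a bounded worker pool with a per-case time limit. The following is a minimal sketch of that idea, not the actual contents of `run.sh`: the `run_cases` function and its `RUNNER` argument are hypothetical names, and the sketch assumes GNU coreutils `timeout` and an `xargs` that supports `-P`.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; this is NOT the contents of run.sh.
# Assumes GNU coreutils `timeout` and an `xargs` that supports -P.

MAX_PARALLEL="${MAX_PARALLEL:-2}"            # README default: 2 workers
CASE_TIMEOUT_SEC="${CASE_TIMEOUT_SEC:-900}"  # README default: 900s, 0 disables

# run_cases RUNNER CASE_FILE...
# RUNNER is a hypothetical external command that executes one case file.
# At most MAX_PARALLEL runners execute at once; each is stopped after
# CASE_TIMEOUT_SEC seconds unless the timeout is disabled with 0.
# Case file names are assumed to contain no quotes or newlines.
run_cases() {
  local runner="$1"
  shift
  if [ "$CASE_TIMEOUT_SEC" -gt 0 ]; then
    printf '%s\n' "$@" |
      xargs -P "$MAX_PARALLEL" -I {} timeout "$CASE_TIMEOUT_SEC" "$runner" {}
  else
    printf '%s\n' "$@" |
      xargs -P "$MAX_PARALLEL" -I {} "$runner" {}
  fi
}
```

With this sketch, a call such as `run_cases ./run-one-case.sh case-*.txt` (where `./run-one-case.sh` is a placeholder) would return a non-zero status if any case fails or hits the timeout, because `xargs` reports failed invocations through its own exit code.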
## Case List
1. `case-01-top10-financial-reports.txt`