Commit graph

5 commits

Author SHA1 Message Date
Jiayuan Zhang
63e7541149 chore(benchmark): remove hardcoded output directories from case prompts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 11:17:24 +08:00
Jiayuan Zhang
486b090577 chore(benchmark): translate finance e2e case prompts to English
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 02:39:35 +08:00
Jiayuan Zhang
c38a576b8f feat(benchmark): parallelize finance e2e runs
2026-02-16 02:15:55 +08:00
Jiayuan Zhang
b897719775 fix(benchmark): support long-horizon timeout control
2026-02-16 01:54:20 +08:00
Jiayuan Zhang
edc55390cf feat(benchmark): add finance e2e case suite
2026-02-16 01:29:24 +08:00