multica

marketing-shibata50/multica

Commit graph

Author	SHA1	Message	Date
Jiayuan Zhang	10c57c0f7a	docs: add SWE-bench runner guide Covers the full pipeline: dataset download, agent execution, result analysis, and official Docker evaluation. Includes runner options, output format, known limitations, and initial benchmark results. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 18:30:58 +08:00

1 commit