- download-dataset.py: fetches SWE-bench Lite/Verified/Full from HuggingFace
- run.ts: core runner that clones repos, runs Agent, collects git diff patches
- evaluate.sh: wrapper for official SWE-bench Docker evaluation harness
- analyze.ts: summarizes run results with per-repo and timing breakdowns

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

| Name |
|---|
| swe-bench |
| archive-dev-data.sh |
| build-cli.js |
| dev-local.sh |
| generate-code-stats-report.sh |
| reset-user-data.sh |
| set-telegram-webhook.sh |