multica

Jiayuan Zhang 90d374ffd5	2026-02-15 18:05:17 +08:00
feat(scripts): add SWE-bench runner for Multica agent evaluation

- download-dataset.py: fetches SWE-bench Lite/Verified/Full from HuggingFace
- run.ts: core runner that clones repos, runs Agent, collects git diff patches
- evaluate.sh: wrapper for official SWE-bench Docker evaluation harness
- analyze.ts: summarizes run results with per-repo and timing breakdowns

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
swe-bench	feat(scripts): add SWE-bench runner for Multica agent evaluation	2026-02-15 18:05:17 +08:00
archive-dev-data.sh	feat(scripts): add dev:local:archive to snapshot dev data for debugging	2026-02-15 00:58:39 +08:00
build-cli.js	chore(cli): update package.json and build script for unified CLI	2026-02-01 23:09:54 +08:00
dev-local.sh	refactor: unify API URL env var to MULTICA_API_URL	2026-02-15 06:31:00 +08:00
generate-code-stats-report.sh	feat(report): add code stats report generator	2026-02-15 04:32:30 +08:00
reset-user-data.sh	chore: update reset scripts and docs for dev data directory	2026-02-15 00:39:25 +08:00
set-telegram-webhook.sh	chore(telegram): add webhook setup script	2026-02-10 17:07:45 +08:00