Agent Benchmarking
Unverified
Head-to-head coding agent comparison tool: YAML task definitions with judge criteria, git worktree isolation per agent run, pass rate/cost/time/consistency metrics, and reproducible benchmarking across Claude Code, Aider, Codex, and other agents.
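As a rough illustration of how the pass rate, cost, time, and consistency metrics might be aggregated per agent, here is a minimal Python sketch. It is not the skill's actual implementation: the `RunResult` record, its field names, and the `summarize` function are hypothetical. The definition of consistency used here (the share of tasks whose repeated runs all ended with the same outcome) is also an assumption, since the description does not say how consistency is measured.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class RunResult:
    """One agent run on one task (hypothetical record shape)."""
    agent: str
    task: str
    passed: bool
    cost_usd: float
    seconds: float


def summarize(runs):
    """Aggregate per-agent metrics from a list of RunResult records."""
    by_agent = defaultdict(list)
    for run in runs:
        by_agent[run.agent].append(run)

    summary = {}
    for agent, agent_runs in by_agent.items():
        # Collect the distinct outcomes seen for each task across repeats.
        outcomes_by_task = defaultdict(set)
        for run in agent_runs:
            outcomes_by_task[run.task].add(run.passed)

        # Assumed definition: a task is "consistent" when every repeated
        # run of it produced the same pass/fail outcome.
        consistent = sum(1 for o in outcomes_by_task.values() if len(o) == 1)

        summary[agent] = {
            "pass_rate": mean([1.0 if r.passed else 0.0 for r in agent_runs]),
            "mean_cost_usd": mean([r.cost_usd for r in agent_runs]),
            "mean_seconds": mean([r.seconds for r in agent_runs]),
            "consistency": consistent / len(outcomes_by_task),
        }
    return summary
```

Keeping one flat record per run makes it easy to add further metrics later without changing how runs are collected.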
Skill Details
- Gate Verdict: Unverified
- Publication State: published
- Risk Tier: low
- Manifest Hash: 4b623ffa
More Skills
dmux Multi-Agent Workflows
Unverified
Multi-agent orchestration using dmux tmux pane manager: parallel agent workflows across Claude Code, Codex, OpenCode, and other harnesses.
Agent Team Builder
Unverified
Interactive agent picker for composing and dispatching parallel teams. Select agents by capability, assign tasks, and coordinate execution.
Next.js Turbopack
Unverified
Next.js 16+ and Turbopack: incremental bundling, FS caching, dev speed improvements, and migration from webpack.