Agent Benchmarking

Elevated · Review

Head-to-head coding agent comparison tool: YAML task definitions with judge criteria, git worktree isolation per agent run, pass rate/cost/time/consistency metrics, and reproducible benchmarking across Claude Code, Aider, Codex, and other agents.
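
The four metrics named above could be aggregated roughly as follows. This is a minimal sketch, not the tool's actual implementation: the `RunResult` record, the `summarize` function, the agent and task names, and the sample figures are all hypothetical. The page does not define "consistency", so the sketch assumes it means the share of tasks whose repeated runs all reach the same pass/fail outcome.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RunResult:
    """One agent attempt at one task (hypothetical record shape)."""
    agent: str
    task: str
    passed: bool
    cost_usd: float
    seconds: float


def summarize(runs):
    """Aggregate per-agent pass rate, mean cost, mean time, and consistency."""
    by_agent = {}
    for r in runs:
        by_agent.setdefault(r.agent, []).append(r)

    summary = {}
    for agent, rs in by_agent.items():
        # Group pass/fail outcomes by task to see whether repeated runs agree.
        outcomes = {}
        for r in rs:
            outcomes.setdefault(r.task, []).append(r.passed)
        # Assumed definition: share of tasks whose repeated runs all agree.
        consistent = sum(1 for o in outcomes.values() if len(set(o)) == 1)
        summary[agent] = {
            "pass_rate": sum(r.passed for r in rs) / len(rs),
            "mean_cost_usd": mean(r.cost_usd for r in rs),
            "mean_seconds": mean(r.seconds for r in rs),
            "consistency": consistent / len(outcomes),
        }
    return summary


# Made-up sample data: two runs per task, purely for illustration.
runs = [
    RunResult("agent-a", "fix-bug", True, 0.10, 30.0),
    RunResult("agent-a", "fix-bug", True, 0.20, 50.0),
    RunResult("agent-a", "add-test", True, 0.30, 40.0),
    RunResult("agent-a", "add-test", False, 0.40, 80.0),
    RunResult("agent-b", "fix-bug", False, 0.05, 20.0),
    RunResult("agent-b", "fix-bug", False, 0.05, 20.0),
]
summary = summarize(runs)
```

Under this assumed definition an agent that fails every run still scores a consistency of 1.0, so consistency only makes sense when read alongside pass rate.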

Governance Receipt

Signer: sovereign-claw-ed25519
Signed At: 6/4/2026
Risk Tier: T2
Receipt Hash: cee94f86
Manifest Hash: 9bb29a8b037ffd9dd19de60289d7292858c4c5cc8688632eaa09e9be2e9c7d25
Signature: s/aDylbU
Root Public Key: 349b0348

Skill Details

Gate Verdict: Elevated · Review
Publication State: published

More Skills

MCP Builder

Elevated · Review

Guide for creating high-quality MCP (Model Context Protocol) servers that enable LLMs to interact with external services through well-designed tools. Use when building MCP servers to integrate external APIs or services, whether in Python (FastMCP) or Node/TypeScript (MCP SDK).

Theme Factory

Elevated · Review

Toolkit for styling artifacts with a theme. These artifacts can be slides, docs, reports, HTML landing pages, etc. There are 10 preset themes with colors and fonts that you can apply to any artifact that has been created, or you can generate a new theme on the fly.

Claude API

Elevated · Review

Reference for the Claude API / Anthropic SDK — model ids, pricing, params, streaming, tool use, MCP, agents, caching, token counting, model migration. TRIGGER — read BEFORE opening the target file; don't skip because it "looks like a one-liner" — whenever: the prompt names Claude/Anthropic in any form (Claude, Anthropic, Opus, Sonnet, Haiku, `anthropic`, `@anthropic-ai`, `claude-*`, `us.anthropic.*`, `[1m]`); the user asks about an LLM (pricing/model choice/limits/caching) — never answer from memory; OR the task is LLM-shaped with provider unstated (agent/MCP/tool-definition/multi-agent/RAG…