Claude Code Integration

CROAK integrates natively with Claude Code through slash commands. This is the recommended way to use CROAK for an interactive, guided experience.

Setup

  1. Initialize your project — this automatically sets up Claude Code integration:

    npx croak-cv init
  2. Open your project in Claude Code (VS Code with Claude extension, or Claude Code CLI)

  3. Start with the Router — type /croak-router to get guidance on next steps

When you run croak init, CROAK creates:

  • .claude/skills/croak-*/SKILL.md — Skill files for each agent
  • CLAUDE.md — Project context file that Claude Code reads automatically

Claude Code discovers these files and makes them available as slash commands.

Agent Commands

Each command activates a specialized AI persona with domain expertise, guardrails, and a knowledge base.

| Command | Agent | What It Does |
| --- | --- | --- |
| /croak-router | Dispatcher | Start here. Pipeline coordinator that guides you through the workflow |
| /croak-data | Scout | Scan directories, validate images, manage vfrog SSAT or classic annotations |
| /croak-training | Coach | Configure training across local GPU, Modal, or vfrog platform |
| /croak-evaluation | Judge | Evaluate models, analyze errors, generate reports |
| /croak-deployment | Shipper | Deploy to vfrog inference, Modal serverless, or edge devices |

Workflow Commands

End-to-end pipelines that chain multiple agent steps together.

| Command | Description |
| --- | --- |
| /croak-data-preparation | Full data pipeline: scan, validate, annotate, split, export |
| /croak-model-training | Training pipeline: recommend, configure, execute, handoff |
| /croak-model-evaluation | Evaluation pipeline: evaluate, analyze, diagnose, report |
| /croak-model-deployment | Deployment pipeline: export, optimize, deploy, verify |

Example Session

You: /croak-router

Claude: Dispatcher here! I see this is a new CROAK project.
Current stage: uninitialized

Let me help you get started. Do you have images ready to train on?

You: Yes, I have 500 product images in ~/photos/products

Claude: Great! Let me hand you off to Scout (Data Agent) to scan
and validate them.

You: /croak-data

Claude: Scout reporting for duty! I'll help you prepare your dataset.
Let me scan ~/photos/products...
[Runs: croak scan ~/photos/products]

Found 500 images. 487 valid, 13 have issues...

Agent Details

Dispatcher (Router)

The pipeline coordinator. Tracks your progress through the workflow stages, recommends next actions, and routes you to the right specialist agent.

Commands: status, init, reset, next, help, vfrog setup

Scout (Data)

Data quality specialist. Scans directories for images, validates annotations, checks class balance, and manages the annotation workflow — either vfrog SSAT or classic import.

Commands: scan, validate, convert, split, annotate, stats, visualize

Quality guardrails:

  • Minimum 100 images, 50+ per class
  • Maximum 10:1 class imbalance
  • Annotation coverage checks
  • Corrupt image detection
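The guardrail logic amounts to a few threshold checks over per-class image counts. A minimal sketch (thresholds come from the list above; the function name and report format are illustrative, not CROAK's actual API):

```python
from collections import Counter

# Thresholds from Scout's documented guardrails.
MIN_IMAGES = 100
MIN_PER_CLASS = 50
MAX_IMBALANCE = 10  # largest class may be at most 10x the smallest


def check_guardrails(labels):
    """Return a list of guardrail violations for a list of per-image class labels."""
    issues = []
    counts = Counter(labels)
    if len(labels) < MIN_IMAGES:
        issues.append(f"only {len(labels)} images (minimum {MIN_IMAGES})")
    for cls, n in counts.items():
        if n < MIN_PER_CLASS:
            issues.append(f"class '{cls}' has {n} images (minimum {MIN_PER_CLASS})")
    if counts and max(counts.values()) > MAX_IMBALANCE * min(counts.values()):
        issues.append("class imbalance exceeds 10:1")
    return issues


# A dataset that passes the size check but fails per-class count and balance:
print(check_guardrails(["bottle"] * 120 + ["can"] * 8))
```

When Scout reports "13 have issues", it is this kind of check (plus corrupt-file detection) doing the work.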

Coach (Training)

Training specialist. Recommends model architectures based on your dataset, estimates training cost and time, configures hyperparameters, and manages experiment tracking via MLflow or Weights & Biases.

Supported architectures: YOLOv8, YOLOv11, RT-DETR

Supported providers: Local GPU, Modal.com, vfrog platform
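Coach's cost and time estimates are back-of-envelope arithmetic over dataset size, epochs, and provider pricing. A sketch of that calculation (the throughput and price figures below are made-up placeholders, not CROAK or provider values):

```python
def estimate_training(num_images, epochs, imgs_per_sec, usd_per_gpu_hour):
    """Estimate GPU-hours and cost for a training run.

    imgs_per_sec and usd_per_gpu_hour are placeholder inputs; real values
    depend on the architecture, image size, and chosen provider.
    """
    hours = (num_images * epochs) / imgs_per_sec / 3600
    return hours, hours * usd_per_gpu_hour


hours, cost = estimate_training(
    num_images=500, epochs=100, imgs_per_sec=40, usd_per_gpu_hour=1.10
)
print(f"~{hours:.2f} GPU-hours, ~${cost:.2f}")
```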

Judge (Evaluation)

Evaluation specialist. Calculates metrics (mAP, precision, recall, F1), performs error analysis, identifies failure patterns, and generates detailed performance reports.

Metrics: mAP@50, mAP@50-95, per-class precision/recall, confusion matrices
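The per-class precision, recall, and F1 figures follow directly from true/false positive and false negative counts. A minimal sketch (real mAP additionally integrates precision over recall across IoU thresholds, which is omitted here):

```python
def prf1(tp, fp, fn):
    """Precision, recall, and F1 from raw detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


p, r, f1 = prf1(tp=90, fp=10, fn=30)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```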

Shipper (Deployment)

Deployment specialist. Exports models to optimized formats, handles quantization, and deploys to your target environment.

Export formats: ONNX, TensorRT, CoreML, TFLite, TorchScript

Deploy targets: vfrog inference API, Modal serverless, edge devices

Agent Handoffs

Agents communicate through validated handoff contracts. When one agent completes its work, it produces a structured artifact that the next agent can pick up. For example:

  1. Scout validates the dataset and produces a DatasetArtifact
  2. Coach receives the artifact, trains, and produces a ModelArtifact
  3. Judge evaluates the model and produces an evaluation report
  4. Shipper deploys using the model and evaluation results

Handoff files are stored in .croak/handoffs/ and can be inspected for debugging.
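An artifact of this kind is just structured data one agent serializes and the next deserializes and validates. The sketch below shows the general shape with a hypothetical `DatasetArtifact`; the actual field names and schema used in .croak/handoffs/ may differ, so inspect the JSON files there to see the real contract:

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DatasetArtifact:
    # Illustrative fields only -- not CROAK's actual schema.
    path: str
    num_images: int
    classes: list
    split: dict  # e.g. {"train": 0.8, "val": 0.1, "test": 0.1}


artifact = DatasetArtifact(
    path="datasets/products",
    num_images=487,
    classes=["bottle", "can"],
    split={"train": 0.8, "val": 0.1, "test": 0.1},
)

# Scout writes the artifact as JSON; Coach reads it back before training.
payload = json.dumps(asdict(artifact))
restored = DatasetArtifact(**json.loads(payload))
print(restored.num_images)
```

Because the handoff is a plain file rather than in-memory state, you can delete or edit it to replay a pipeline stage while debugging.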

Next Steps