Skip to content

Campaign Configuration

Campaign configuration files define multi-model, multi-engine benchmark runs. KITT expands the campaign into a matrix of models x engines x suites and executes each combination. Files live in configs/campaigns/.

Schema

Field Type Required Description
campaign_name str Yes Unique campaign identifier
description str No Human-readable description
schedule str No Cron expression for scheduled runs
auto_compare bool No Compare with previous run on completion
models list[Model] Yes Models to benchmark
engines list[Engine] Yes Engines to test against
disk DiskConfig No Disk space management
notifications NotifyConfig No Notification settings
quant_filter QuantFilter No Quantization file filters
resource_limits ResourceLimits No Skip models exceeding limits
parallel bool No Run combinations in parallel (default: false)
devon_managed bool No Use Devon for model downloads (default: false)

Model entry

Field Type Description
name str Model display name
params str Parameter count label (e.g. "8B")
safetensors_repo str HuggingFace repo for safetensors format
gguf_repo str HuggingFace repo for GGUF format
ollama_tag str Ollama model tag
estimated_size_gb float Approximate disk footprint

Engine entry

Field Type Description
name str Registered engine name
suite str Suite to run for this engine
formats list[str] Accepted model formats (safetensors, gguf)
config dict Engine-specific parameter overrides
quant_filter str Specific quantization to select

Example

campaign_name: example-campaign
description: "Test 2 models across llama.cpp and Ollama"

models:
  - name: Llama-3.1-8B-Instruct
    params: "8B"
    safetensors_repo: meta-llama/Llama-3.1-8B-Instruct
    gguf_repo: bartowski/Meta-Llama-3.1-8B-Instruct-GGUF
    ollama_tag: "llama3.1:8b"
    estimated_size_gb: 16.0

  - name: Qwen2.5-7B-Instruct
    params: "7B"
    safetensors_repo: Qwen/Qwen2.5-7B-Instruct
    gguf_repo: Qwen/Qwen2.5-7B-Instruct-GGUF
    ollama_tag: "qwen2.5:7b"
    estimated_size_gb: 14.0

engines:
  - name: llama_cpp
    suite: standard
    formats: [gguf]
    config: {}

  - name: ollama
    suite: standard
    formats: [gguf]
    config: {}

disk:
  reserve_gb: 100.0
  cleanup_after_run: true

notifications:
  desktop: true
  on_complete: true
  on_failure: true

quant_filter:
  skip_patterns:
    - "IQ1_*"
    - "IQ2_*"
    - "Q4_0_4_4"
    - "Q4_0_4_8"
    - "Q4_0_8_8"

parallel: false
devon_managed: true

Matrix expansion

Given 2 models and 2 engines, KITT creates 4 benchmark runs (one per combination). Each run uses the suite specified in the engine entry. Models are matched to engines based on the formats field -- for example, a model without a gguf_repo will be skipped for engines that require the gguf format.

Scheduled campaigns

Add a schedule field with a cron expression to run the campaign automatically:

schedule: "0 2 * * *"   # 2 AM daily
auto_compare: true