Skip to content

CI/CD Integration

KITT integrates with CI/CD pipelines to automate benchmark runs, post results as PR comments, and gate deployments on performance thresholds.


CI Report Command

The kitt ci report command generates a summary from benchmark results and optionally posts it to a GitHub pull request:

kitt ci report \
    --results-dir ./benchmark-output \
    --baseline-dir ./baseline-results \
    --github-token "$GITHUB_TOKEN" \
    --repo owner/repo \
    --pr 42
Flag Description
--results-dir Directory containing the latest benchmark metrics.json
--baseline-dir Previous results to compare against (optional)
--github-token GitHub API token for posting comments
--repo Repository in owner/repo format
--pr Pull request number
--output Write the report to a local file instead of posting

When both --results-dir and --baseline-dir are provided, the report includes a comparison showing regressions and improvements.

If a KITT comment already exists on the PR (identified by a hidden HTML marker), it is updated in place rather than creating a duplicate.


GitHub Actions Workflow

Below is a complete workflow that runs benchmarks on every pull request and posts results as a comment:

name: KITT Benchmark

on:
  pull_request:
    branches: [main]

jobs:
  benchmark:
    runs-on: [self-hosted, gpu]
    steps:
      - uses: actions/checkout@v4

      - name: Install KITT
        run: pip install kitt-bench

      - name: Pull engine image
        run: kitt engines setup vllm

      - name: Run benchmarks
        run: |
          kitt run \
            -m ./models/llama-3-8b \
            -e vllm \
            -s quick \
            -o ./results

      - name: Post CI report
        if: github.event_name == 'pull_request'
        run: |
          kitt ci report \
            --results-dir ./results \
            --github-token "${{ secrets.GITHUB_TOKEN }}" \
            --repo "${{ github.repository }}" \
            --pr "${{ github.event.pull_request.number }}"

      - name: Upload artifacts
        uses: actions/upload-artifact@v4
        with:
          name: benchmark-results
          path: ./results/

Using a Baseline

To detect regressions, store baseline results as a build artifact or in a dedicated branch and compare against them:

      - name: Download baseline
        uses: actions/download-artifact@v4
        with:
          name: benchmark-baseline
          path: ./baseline
        continue-on-error: true

      - name: Post report with comparison
        run: |
          kitt ci report \
            --results-dir ./results \
            --baseline-dir ./baseline \
            --github-token "${{ secrets.GITHUB_TOKEN }}" \
            --repo "${{ github.repository }}" \
            --pr "${{ github.event.pull_request.number }}"

Artifact Collection

Benchmark runs produce the following files under the output directory:

File Contents
metrics.json Raw benchmark metrics
summary.md Human-readable Markdown summary
hardware.json Hardware fingerprint of the runner
config.json Configuration used for the run

Upload these as workflow artifacts to preserve a history of benchmark results across builds.


Local Report Generation

Generate a report file without posting to GitHub:

kitt ci report --results-dir ./results --output report.md

This is useful for local review or integration with other reporting systems.


Exit Codes

KITT commands use standard exit codes so CI systems can detect failures:

Code Meaning
0 Success
1 Benchmark failure, missing results, or API error

Use these exit codes to gate merge or deployment steps in your pipeline.