Installation
KITT can run from a pre-built Docker image or be installed from source with Poetry. The Docker method is the fastest way to get started; the source install gives you direct access to the CLI and development tools.
Docker (Primary Method)
Run benchmarks from the KITT image with a single command. No Python environment is required on the host.
docker run --rm --network host \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /path/to/models:/models:ro \
  -v ./kitt-results:/app/kitt-results \
  kitt run -m /models/llama-7b -e vllm
| Mount | Purpose |
|---|---|
| `/var/run/docker.sock` | Lets KITT manage engine containers from inside its own container |
| `/path/to/models` (read-only) | Model weights accessible to both KITT and the engine |
| `./kitt-results` | Benchmark output written back to the host |
Warning
Mounting the Docker socket grants the container full control over Docker on the host. Only use images you trust.
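The mounts above can also be assembled with absolute host paths, which avoids any ambiguity about how a relative path such as `./kitt-results` is resolved. The sketch below only prints the command so it can be reviewed before it is run. The helper name `kitt_docker_cmd` and its argument order are assumptions made for illustration; the image name `kitt` and the flags are copied from the command above.

```shell
# Hypothetical helper: print (do not run) the docker command shown above,
# with the host model and results directories resolved to absolute paths.
# Usage: kitt_docker_cmd MODELS_DIR RESULTS_DIR [kitt arguments...]
kitt_docker_cmd() (
  # The model directory must already exist on the host.
  [ -d "$1" ] || { echo "model directory not found: $1" >&2; exit 1; }
  models_dir="$(cd "$1" >/dev/null 2>&1 && pwd)" || exit 1
  # Create the results directory if needed, then resolve it.
  mkdir -p "$2" || exit 1
  results_dir="$(cd "$2" >/dev/null 2>&1 && pwd)" || exit 1
  shift 2
  printf 'docker run --rm --network host \\\n'
  printf '  -v /var/run/docker.sock:/var/run/docker.sock \\\n'
  printf '  -v "%s:/models:ro" \\\n' "$models_dir"
  printf '  -v "%s:/app/kitt-results" \\\n' "$results_dir"
  printf '  kitt %s\n' "$*"
)
```

For example, `kitt_docker_cmd /path/to/models ./kitt-results run -m /models/llama-7b -e vllm` prints the full command with both host paths expanded. Note that the model path passed to `-m` is the path inside the container (`/models/...`), not the host path.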
Source Install
Prerequisites
- Python 3.10+ and Poetry
- Docker for running inference engines
- System build tools for native dependencies
- NVIDIA Container Toolkit for GPU support (required by all engines except CPU-only llama.cpp builds)
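A quick way to confirm the command-line prerequisites before installing is to check that each tool is on `PATH` and that the Python version is new enough. This is a minimal sketch, not part of KITT: the helper names `missing_cmds` and `python_is_310_plus` are hypothetical, and it assumes the interpreter is installed as `python3`.

```shell
# Hypothetical helper: print the name of every listed command
# that cannot be found on PATH. Prints nothing if all are present.
missing_cmds() (
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
)

# Hypothetical helper: succeed only if python3 is version 3.10 or newer.
# Assumes the interpreter is available under the name python3.
python_is_310_plus() {
  python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 10) else 1)'
}
```

Running `missing_cmds python3 poetry docker` lists whichever tools still need to be installed, and `python_is_310_plus || echo "Python 3.10+ required"` flags an interpreter that is too old. Neither check covers the system build tools or the NVIDIA Container Toolkit.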
Install
Clone the repository and install with Poetry, activate the virtual environment, then verify the installation.
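The clone, install, and activate steps can be sketched as two small shell functions. This is an illustration under stated assumptions rather than the project's exact commands: the function names are hypothetical, the repository URL is left as an argument because it is not given here, and the activation step relies on `poetry env info --path` and a POSIX-style `bin/activate` script (Linux or macOS).

```shell
# Hypothetical helper: clone the repository and install it with Poetry.
# Usage: kitt_source_install REPO_URL [DEST_DIR]
# Runs in a subshell, so the caller's working directory is unchanged.
kitt_source_install() (
  repo_url="$1"
  dest="${2:-kitt}"
  git clone "$repo_url" "$dest" || exit 1
  cd "$dest" || exit 1
  poetry install
)

# Hypothetical helper: activate the Poetry-managed virtual environment
# in the current shell. Run it from inside the project directory.
kitt_activate() {
  . "$(poetry env info --path)/bin/activate"
}
```

After `kitt_source_install <repository-url>`, change into the project directory and call `kitt_activate`. A simple way to verify the result is to check that the `kitt` command resolves, for example with `command -v kitt`; running `kitt --help` should also work, assuming the CLI provides the conventional help flag.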
Optional Extras
KITT ships with optional dependency groups for features that not every user needs. Install them individually or pull in everything at once.
# Individual extras
poetry install -E datasets
poetry install -E web
poetry install -E cli_ui
# Everything
poetry install -E all
| Extra | What It Adds | Required For |
|---|---|---|
| `datasets` | HuggingFace Datasets | Quality benchmarks (MMLU, GSM8K, TruthfulQA, HellaSwag) |
| `web` | Flask | `kitt web` dashboard and REST API |
| `cli_ui` | Textual | `kitt compare` interactive TUI |
| `all` | All of the above | Full feature set |
Note
Performance benchmarks (throughput, latency, memory, warmup) have no extra dependencies -- they work with the base install.
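When more than one extra is needed but not the full `all` set, it is usually safer to name them in a single `poetry install` call. Poetry's documentation notes that extras not specified on an install are removed, so running the individual commands one after another may leave only the last extra installed; check this against your Poetry version. The sketch below builds such a combined command; the helper name `kitt_extras_cmd` is hypothetical and it only prints the command.

```shell
# Hypothetical helper: print one `poetry install` command that
# requests every extra passed as an argument.
# Usage: kitt_extras_cmd EXTRA [EXTRA...]
kitt_extras_cmd() (
  cmd="poetry install"
  for extra in "$@"; do
    cmd="$cmd -E $extra"
  done
  printf '%s\n' "$cmd"
)
```

For example, `kitt_extras_cmd datasets web` prints `poetry install -E datasets -E web`, which installs the quality-benchmark and dashboard dependencies together.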
Verify GPU Access
After installation, confirm that Docker can see your GPU:
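One common way to run this check is the sample workload from the NVIDIA Container Toolkit documentation, which starts a throwaway container with all GPUs attached and runs `nvidia-smi` inside it. The wrapper below is a sketch: the function name `docker_gpu_check` is hypothetical and the default `ubuntu` image is an assumption, not something KITT prescribes.

```shell
# Hypothetical helper: run nvidia-smi in a temporary container with all
# GPUs attached. The image defaults to `ubuntu`; the NVIDIA runtime is
# expected to inject the driver utilities into the container.
# Usage: docker_gpu_check [IMAGE]
docker_gpu_check() {
  docker run --rm --gpus all "${1:-ubuntu}" nvidia-smi
}
```

If the toolkit is installed correctly, `docker_gpu_check` shows the usual `nvidia-smi` table listing each GPU. An error about an unknown or unavailable GPU device driver typically means the NVIDIA Container Toolkit is missing or Docker has not been restarted since it was installed.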
Then check KITT's hardware detection:
This prints a full system profile including GPU model, VRAM, CPU, RAM, storage type, CUDA version, and driver version.
Next Steps
- Tutorial: First Benchmark -- run an end-to-end test
- Tutorial: Docker Quickstart -- container-based workflows