Changelog
All notable changes to KITT are documented on this page.
1.6.2
- Added real agent campaign dispatch via heartbeat — campaigns on real agents now break into individual quick test rows, queue them one at a time, and poll for completion using the existing heartbeat mechanism
- Campaign executor tracks progress with campaign-level log streaming, handles cancellation, and enforces a 30-minute per-test timeout
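
The dispatch loop described above can be pictured with a minimal sketch. Every name here (`run_campaign`, `queue_test`, `get_status`, `is_cancelled`, the status strings, and the poll interval) is hypothetical and chosen for illustration; only the one-at-a-time queueing, the cancellation handling, and the 30-minute per-test timeout come from the entries above.

```python
import time

PER_TEST_TIMEOUT_S = 30 * 60  # 30-minute per-test timeout from the changelog entry


def run_campaign(test_ids, queue_test, get_status, is_cancelled,
                 timeout_s=PER_TEST_TIMEOUT_S, poll_s=5.0,
                 clock=time.monotonic, sleep=time.sleep):
    """Queue tests one at a time and poll each until it finishes.

    queue_test, get_status and is_cancelled are hypothetical callbacks
    standing in for the server-side heartbeat plumbing.
    """
    results = {}
    for test_id in test_ids:
        if is_cancelled():
            results[test_id] = "cancelled"
            continue
        queue_test(test_id)  # the agent picks this up on its next heartbeat
        deadline = clock() + timeout_s
        while True:
            status = get_status(test_id)
            if status in ("completed", "failed"):
                results[test_id] = status
                break
            if is_cancelled():
                results[test_id] = "cancelled"
                break
            if clock() >= deadline:
                results[test_id] = "timeout"
                break
            sleep(poll_s)
    return results
```

Injecting `clock` and `sleep` is a design choice for this sketch only: it lets the loop be exercised without waiting in real time.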
1.6.1
- Fixed campaign live log streaming stuck on "waiting for logs..." — simulation published events to test ID channels but the detail page subscribed to the campaign ID channel
- Replaced HTMX SSE with JavaScript EventSource + Alpine.js in campaign detail template (matching the quick test pattern)
- Persisted campaign logs to database (`campaign_logs` table, schema v10) so page refresh and post-completion views show log history
- Added `GET /api/v1/campaigns/<id>/logs` endpoint for stored log retrieval
- Fixed SSE connection not closing on cancelled status
- Added cancellation check between iterations in campaign simulation loop
1.6.0
- Added interactive campaign creation wizard with step-by-step flow: agent selection, engine compatibility filtering with format badges and platform warnings, searchable model multi-select, and compatibility matrix review
- Added virtual test agents for end-to-end UI testing without real GPU hardware — configurable hardware specs, always-online status, TEST badge in agent list
- Test agents simulate quick test and campaign execution with realistic delays, live SSE log streaming, and fake result generation through the normal ResultStore pipeline
- Campaign and quicktest APIs detect test agents and spawn simulation threads instead of dispatching to agents
- Added result generator producing realistic metrics for throughput, latency, memory, and accuracy benchmarks
- Fixed stored XSS in campaign row rendering by escaping user values
- Added thread safety with `db_write_lock` on all quicktest DB writes
- Added input validation for integer form fields in test agent creation
1.5.0
- Added agent-aware engine filtering to quick test UI — engines dynamically populated from API based on selected agent's CPU architecture
- Added `get_engine_compatibility()` in image resolver reporting per-engine ARM64/x86_64 compatibility
- Added `GET /api/v1/quicktest/agent-capabilities` endpoint returning per-agent engine compatibility
- Added override toggle and `force` flag to bypass both platform and model-format validation
- Fixed Devon UI "connecting" stuck state (`htmx:afterSwap` → `htmx:afterRequest`)
- ARM64 engine fixes: Docker CLI static binary, GGUF directory resolution, Ollama local import, vLLM NGC prefix detection
1.4.1
- Fixed `SCHEMA_VERSION` not incremented after adding migration v9 (`cpu_arch` column), so the migration never ran on server startup
- Added `cpu_arch` to Postgres fresh-install schema
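
To see why a migration can silently never run, consider a minimal sketch of a version-gated migration runner. This is an assumption about the general pattern, not KITT's actual migration code: the `MIGRATIONS` table, the `migrate()` helper, and the use of SQLite's `PRAGMA user_version` are all illustrative.

```python
import sqlite3

# Hypothetical migration table; version numbers are illustrative only.
MIGRATIONS = {
    1: "CREATE TABLE agents (id TEXT PRIMARY KEY, name TEXT)",
    2: "ALTER TABLE agents ADD COLUMN cpu_arch TEXT DEFAULT ''",
}


def migrate(conn, target_version):
    """Apply every migration above the stored version, up to target_version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version in range(current + 1, target_version + 1):
        conn.execute(MIGRATIONS[version])
        # PRAGMA does not accept bound parameters; version is a trusted int.
        conn.execute(f"PRAGMA user_version = {version}")
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


def column_names(conn, table):
    """Return the column names of a table via PRAGMA table_info."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
```

If the target version constant stays at the old value after a new migration is added, the loop range never reaches the new entry, which matches the symptom described in the fix above.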
1.4.0
- Added platform-aware image selection for ARM64 and multi-arch support — image resolver considers CPU architecture alongside GPU compute capability
- ARM64 boards (DGX Spark, Jetson Orin) get platform-specific images for engines without multi-arch builds
- Added `--platform` flag to `DockerManager.pull_image()` and `build_image()`
- Added `cpu_arch` field to agent registration, models, and database (migration v9)
- Added `kitt/llama-cpp:arm64` build recipe and Dockerfile for ARM64 hosts
- Fixed arm64/aarch64 architecture mismatch in agent Docker checks; added `normalize_arch()` for consistent naming
- Added `kitt remote` CLI commands for managing remote GPU servers via SSH
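
As a rough illustration of the arm64/aarch64 fix, a normalizer along these lines would make both spellings compare equal. Only the function name `normalize_arch()` comes from the entry above; the alias table and the choice of `arm64` and `x86_64` as canonical names are assumptions.

```python
# Assumed alias table; the canonical names are illustrative choices.
_ARCH_ALIASES = {
    "aarch64": "arm64",
    "arm64": "arm64",
    "amd64": "x86_64",
    "x86_64": "x86_64",
}


def normalize_arch(arch: str) -> str:
    """Map common architecture spellings to one canonical name.

    Unknown values are returned lowercased and stripped, unchanged otherwise.
    """
    key = arch.strip().lower()
    return _ARCH_ALIASES.get(key, key)
```

Comparing `normalize_arch(host_arch) == normalize_arch(image_arch)` then avoids false mismatches when one side reports `aarch64` and the other `arm64`.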
1.3.0
- Fixed end-to-end remote execution on DGX Spark
- Added agent 404 recovery with hostname fallback and canonical ID sync
- Added model format validation preflight checks — engines declare supported formats and KITT blocks incompatible model/engine combinations before container launch
- Added web UI engine/model compatibility filtering in quick test form
- Added configurable engine images via `~/.kitt/engines.yaml`
- Added `--auto-pull` flag to `kitt run` for automatic engine image pulling
- Added `kitt remote engines setup` command for remote engine image management
1.2.1
- Fixed agent benchmark results never reaching the server: `_execute_test()` now reads `metrics.json` from the output directory and forwards it as `result_data` in `_report()`
- Fixed `PermissionError` when `kitt run` defaults to relative `kitt-results/` inside a Docker container; agent now passes `-o` with a writable temp directory to `kitt run`
- Changed default output directory for `kitt run` from relative `kitt-results/` to `~/.kitt/results/` for robustness across environments
- Temp output directories (`/tmp/kitt-results-*`) are cleaned up after agent benchmarks complete
- Fixed architecture mismatch in agent Docker image selection: now checks image arch against host arch before use, falling back to locally-built `kitt:latest` when the registry image is the wrong platform
- Fixed `_report()` using agent name instead of agent ID in URL, causing 404 on result submission
- Fixed Docker entrypoint override for benchmark containers: added `--entrypoint kitt` since the KITT image has `ENTRYPOINT ["kitt", "web"]`
- Reverted Docker CLI package name in Dockerfiles back to `docker.io`; `docker-cli` does not exist in Debian bookworm repos
- Updated hardcoded `kitt_version` references from `1.1.0` to `1.2.1`
- Fixed `_check_auth()` timing attack: replaced `==` with `hmac.compare_digest` for constant-time token comparison
- Fixed thread safety: write lock now wraps entire SQLite transactions (execute + commit), not just the commit
- Fixed `_find_on_share()` path traversal: resolved candidates are validated against the share mount root
- Fixed result detail 404: `query()` and `get_result()` now include the database ID in returned results
- Fixed `kitt --version` reporting wrong version; now reads from `kitt.__version__` dynamically
- Fixed cleanup button in agent detail page targeting the wrong endpoint (API instead of blueprint)
- Added `@require_auth` to `GET /api/v1/agents/<id>/settings` endpoint
- Added `kitt_image` to migration v8 defaults for consistency with `AgentManager._DEFAULT_SETTINGS`
- Added CSRF protection (`@csrf_protect`) to all state-changing web blueprint endpoints
- Added SHA-256 integrity verification for agent package downloads and build context
- Added Docker environment variable redaction in container start logs
- Added rotating file logging to agent (`~/.kitt/logs/agent.log`, 5MB per file, 3 backups)
- Fixed preflight server reachability check: tries TLS verification first, falls back to insecure for self-signed certs
- Fixed preflight `URLError`-wrapped SSL errors bypassing the insecure fallback
- Removed unused `verify` and `client_cert` parameters from `_report()` function
- Fixed cleanup endpoints bypassing `_write_lock`: extracted `AgentManager.queue_cleanup_command()` method
- Fixed `register()` TOCTOU race: moved SELECT inside write lock to prevent concurrent duplicate inserts
- Fixed `_find_on_share()` glob results not validated against share root (symlink traversal)
- Fixed `_find_on_share()` path validation to use `relative_to()` instead of `str.startswith()` for robustness
- Fixed `csrf_protect` Bearer token exemption; now validates the token before bypassing CSRF check
- Fixed hardcoded `kitt_version` in `run.py` and `json_reporter.py`; now uses `kitt.__version__` dynamically
- Fixed HTMX storage polling in agent detail page targeting JSON endpoint instead of HTML page
- Fixed `tarfile.extractall(filter="data")` incompatibility with Python 3.10; guarded with version check
- Removed obsolete `docker/agent/Dockerfile` referencing deleted full agent
- Fixed `docker-cli` references in documentation (`docs/reference/docker-files.md`)
- Replaced f-string logger calls with lazy `%s` formatting across agent package and server
- Refactored API token verification: extracted `check_agent_auth()` and `check_agent_auth_by_name()` methods on `AgentManager`, removing raw `_conn` access from API endpoints
- Fixed `tarfile.extractall(filter="data")` guard to use `try/except TypeError` instead of version check, correctly handling Python 3.11.4+ backport
- Fixed `docs/reference/api.md` auth column for `GET /api/v1/agents/<id>/settings`; was incorrectly listed as unauthenticated
- Removed stale `docker/agent/Dockerfile` row from `docs/reference/docker-files.md`
- Removed dead `docker/agent/Dockerfile` reference in stack generator
- Fixed remaining f-string logger calls in `migrations.py` and `agent_install.py`
- Updated stale `kitt_version` in test fixtures from `1.1.0` to `1.2.1`
- Reverted `docker-cli` back to `docker.io` in Dockerfiles; `docker-cli` does not exist in Debian bookworm repos
- Added `save_result()` method to `ResultService`; `report_result` endpoint no longer accesses private `_store` directly
- Fixed `docs/reference/api.md` auth columns for `PATCH` and `DELETE` agent endpoints; were incorrectly listed as unauthenticated
- Fixed `docs/concepts/architecture.md` result reporting URL from `{name}` to `{id}`
- Fixed install script preflight check: `if [ $? -ne 0 ]` was dead code under `set -euo pipefail`, replaced with `if ! command` pattern
- Fixed Docker container leak on health check timeout: container is now stopped before reporting failure
- Removed unused `--foreground` flag from `kitt agent start` proxy command
- Fixed all `docs/reference/api.md` auth columns to match actual `@require_auth` decorators (results, campaigns, models sections)
- Updated README thin agent architecture description to reflect model resolution, Docker benchmark execution, and heartbeat command dispatch
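
Two of the security fixes in this release, constant-time token comparison and `relative_to()` path validation, can be illustrated with a short sketch. The helper names `token_matches` and `is_within_root` are hypothetical and are not the KITT functions; only `hmac.compare_digest` and `Path.relative_to()` are taken from the entries above.

```python
import hmac
from pathlib import Path


def token_matches(provided: str, expected: str) -> bool:
    """Compare tokens in constant time instead of with ==."""
    # Encoding to bytes avoids compare_digest's ASCII-only limit on str input.
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))


def is_within_root(root, candidate) -> bool:
    """Return True only if candidate resolves to a path under root.

    relative_to() raises ValueError for paths outside root, which also
    rejects sibling directories such as "/mnt/share-evil" that a plain
    str.startswith("/mnt/share") check would accept.
    """
    root_resolved = Path(root).resolve()
    try:
        Path(candidate).resolve().relative_to(root_resolved)
    except ValueError:
        return False
    return True
```

Resolving both paths before the check matters: `relative_to()` is purely lexical, so `..` segments and symlinks have to be collapsed first.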
1.2.0
- Agent model workflow: copy models from NFS share to local storage, benchmark, cleanup
- Per-agent settings configurable from the web UI (model storage, share mount, cleanup, heartbeat interval)
- Agent settings synced to agents via heartbeat response
- NFS share mounting support with fstab and explicit mount fallback
- Preflight prerequisite checks (`kitt-agent preflight`): Docker, GPU, drivers, NFS, disk space, connectivity
- Install script runs preflight before completing installation
- Heartbeat throttling during benchmarks (auto-increases interval to 60s minimum)
- `cleanup_storage` command for remote model cleanup via heartbeat dispatch
- Storage usage reporting in heartbeat payload
- Removed full agent (`src/kitt/agent/`); thin agent (`agent-package/`) is now the only agent
- `kitt agent` CLI commands now proxy to `kitt-agent` binary
- Agent settings REST API endpoints (`GET/PUT /api/v1/agents/<id>/settings`)
- Storage cleanup REST API (`POST /api/v1/agents/<id>/cleanup`)
- DB migration v8: `agent_settings` key-value table per agent
agent_settingskey-value table per agent - Daemon refactored — consolidated duplicated run methods into shared helpers
- Version policy: every PR must increment version going forward
- `kitt-agent build` command for native-arch Docker image building
- Docker container is the preferred benchmark execution method (local CLI is fallback)
- Build context API endpoint (`/api/v1/agent/build-context`)
- Install script auto-builds Docker image during agent installation
- Preflight check for KITT Docker image availability
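
The heartbeat throttling rule listed above amounts to a one-line clamp. The following sketch assumes a hypothetical helper name and signature; only the 60-second minimum during benchmarks comes from the entry.

```python
BENCHMARK_MIN_INTERVAL_S = 60  # minimum interval while a benchmark runs


def effective_heartbeat_interval(configured_s: int, benchmark_running: bool) -> int:
    """Return the heartbeat interval to use for the next cycle.

    While a benchmark is running the interval is raised to at least
    60 seconds; otherwise the configured value is used as-is.
    """
    if benchmark_running:
        return max(configured_s, BENCHMARK_MIN_INTERVAL_S)
    return configured_s
```

Using `max()` rather than a fixed override means an agent already configured with a longer interval keeps it during benchmarks.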
1.1.0
- Added composable Docker deployment stacks (`kitt stack`)
- Added web UI and distributed agent architecture
- Added monitoring stack generation and remote deployment
- Added documentation site with MkDocs Material
- Added UI-configurable settings — Model Directory, Devon URL, and Results Directory can be edited from the Settings page with live updates
- Added inline Devon URL setup form on the Devon page
- Added searchable model dropdown to Quick Test; loads from Devon's `manifest.json` with fuzzy search
- Added heartbeat-based command dispatch: agents pull queued quick tests via heartbeat response
- Added live SSE log streaming to Quick Test — real-time output with status progression
- Added Quick Test API endpoints for log forwarding and status updates
- Added Quick Test history page with status filtering and pagination
- Added Quick Test detail page with SSE live logs and stored log retrieval
- Added persistent log storage — log lines are saved to the database for post-run viewing
- Added `kitt-agent test list` and `kitt-agent test stop` CLI commands for managing tests from the agent host
- Fixed thin agent (`kitt-agent`) log forwarding and command dispatch: heartbeat now processes queued commands, `run_test`/`run_container` extract `test_id` and forward logs and status updates to the server
- Added `agent_name` query parameter to the quick test list API endpoint
1.0.0
- Initial release
- Multi-engine support: vLLM, TGI, llama.cpp, Ollama
- Quality benchmarks: MMLU, GSM8K, TruthfulQA, HellaSwag
- Performance benchmarks: throughput, latency, memory, warmup
- Hardware fingerprinting
- KARR results storage
- Web dashboard and REST API
- CI integration