Products · Introspect
Your cluster's X-ray
Discover every model, runtime, and batch configuration across your fleet — automatically. No agents to install, no instrumentation to maintain.
- Auto-discovers vLLM, Triton, KServe, SGLang, Ollama, TGI
- Per-workload memory shape, KV cache profile, batch dynamics
- Per-workload CPU:GPU ratio monitoring — catches imbalances invisible at the cluster level
- Fleet-wide visibility into model deployments, replicas, and runtime configuration
- Zero-touch deployment — read-only by default
introspect.yaml
discover: runtimes: [vllm, triton, kserve, sglang] depth: deep emit: - workload.memory_shape - workload.kv_cache_profile - workload.batch_dynamics mode: read-only