The Invisible Observer
Kernel-level governance for AI infrastructure. Patent-pending. We attach below the agent, in the host kernel, and read every cudaLaunchKernel and every ioctl(2). Host-native — bare metal, AWS EC2, GCP Vertex, private cloud.
Two shapes.
One kernel.
Same proprietary eBPF engine. Host-native. We deploy it.
The Sentry · HOST-NATIVE AGENT
BARE METAL · PRIVATE CLOUD · LINUX VM
The Sentry sits on every GPU host you protect — bare metal, AWS EC2, GCP Vertex, or private cloud. Two eBPF probes, one local SLM, zero exfil. We deploy it. We tune it. You see one Grafana pane.
- ✓ Host-native · attaches into the host kernel via Aya eBPF
- ✓ Microsecond intercept · cudaLaunchKernel + ioctl(2)
- ✓ On-host Phi-3 SLM · driver-level exfil scoring
- ✓ Turn-key Prometheus + Grafana dashboard · we ship it
The Auditor · FORENSIC ENGAGEMENT
AUDIT · INVESTIGATE · QUALIFY
The Auditor is a time-boxed engagement. Our engineers attach the Sentry engine to your fleet, capture a kernel-grade record of every cudaLaunchKernel and ioctl(2), and hand you a waste + exfil report with the receipts.
- ✓ On-prem · read-only · no agent SDK installed
- ✓ Kernel-grade event record · proprietary signed format
- ✓ Zombie waste analysis · GPU-hour reclamation report
- ✓ Exfil exposure assessment · ioctl + SLM scoring
Three primitives.
Built into the kernel.
Non-invasive interception. We attach below the agent — in the host kernel, in the syscall path. No Python touched. No YAML touched.
The Nvidia Hook
Total visibility into the black box of GPU compute. Microsecond intercept on cudaLaunchKernel and ioctl(2) via Aya-compiled eBPF. The agent doesn’t see us. The driver doesn’t see us. We see everything.
The Zombie Sentry
Stop paying for AI that isn't thinking. We watch the kernel-launch fingerprint of every PID. Hung agents and runaway loops get killed in real time — save up to 20% on GPU bills.
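As a rough illustration of what a "kernel-launch fingerprint" check could look like, here is a minimal sketch of a per-PID sliding-window detector. It is not the Sentry's implementation. The type names (`LaunchDims`, `ZombieDetector`), the definition of a fingerprint as identical grid and block dimensions, and the reset-on-change rule are all assumptions made for illustration. The window and count are parameters; the test values mirror the >1000 ops / 2s threshold quoted elsewhere on this page.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::Duration;

/// Grid and block dimensions of one launch (hypothetical "fingerprint").
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct LaunchDims {
    pub grid: (u32, u32, u32),
    pub block: (u32, u32, u32),
}

/// Per-PID state: the current fingerprint and the timestamps of the
/// run of identical launches still inside the window.
struct PidWindow {
    dims: LaunchDims,
    stamps: VecDeque<Duration>,
}

/// Flags PIDs that repeat the same launch more than `max_ops` times
/// within `window` (assumed policy, for illustration only).
pub struct ZombieDetector {
    window: Duration,
    max_ops: usize,
    pids: HashMap<u32, PidWindow>,
}

impl ZombieDetector {
    pub fn new(window: Duration, max_ops: usize) -> Self {
        Self { window, max_ops, pids: HashMap::new() }
    }

    /// Record one launch at monotonic time `now`.
    /// Returns true when the PID has exceeded the threshold.
    pub fn observe(&mut self, pid: u32, dims: LaunchDims, now: Duration) -> bool {
        let entry = self.pids.entry(pid).or_insert_with(|| PidWindow {
            dims,
            stamps: VecDeque::new(),
        });
        if entry.dims != dims {
            // The grid changed, so the loop is not static: restart the run.
            entry.dims = dims;
            entry.stamps.clear();
        }
        entry.stamps.push_back(now);
        // Drop timestamps that have aged out of the window.
        while let Some(&oldest) = entry.stamps.front() {
            if now.saturating_sub(oldest) > self.window {
                entry.stamps.pop_front();
            } else {
                break;
            }
        }
        entry.stamps.len() > self.max_ops
    }
}
```

A caller would feed `observe` one call per intercepted launch and treat a `true` return as the point where an alert is staged. What happens after that (alerting, killing the process) is outside this sketch.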
DOC · ZOMBIE SENTRY →
The Exfiltration Gate
Phi-3 powered intelligence analyzing intent at the driver level. The local SLM scores every suspect ioctl against a learned exfil profile. Your PII never leaves the server.
This is what the kernel sees.
A 4-event slice of what the Sentry emits in real time — every CUDA launch, every suspect ioctl, every Phi-3 verdict, every reclaimed dollar. The data shape below is the actual on-the-wire format your engineers will get.
- 01 CUDA_LAUNCH_KERNEL: every grid / block dim is captured at the uprobe. No agent SDK. No sampling.
- 02 ZOMBIE_DETECTION_TRIGGER: a high-frequency static-grid loop crosses the >1000 ops / 2s threshold. The Sentry stages the alert.
- 03 IOCTL_MEM_COPY · 2 GiB: the on-host Phi-3 scores intent at 88 / 100 (“potential weight / PII smuggling”). SIGKILL issued at the driver level.
- 04 RESOURCE_RECLAMATION: the zombie’s GPU-hours stop billing. Estimated savings $3.50 / hr on this single PID.
FIELDS · pid · event · kernel_addr · slm_analysis · action
TRANSPORT · loopback · no host egress · ever
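For engineers planning a consumer, the five fields listed above could be modeled roughly as follows. This is a sketch, not the Sentry's schema. The field names and event names come from this page; the field types, the optional `slm_analysis` and `action`, and the one-line `key=value` rendering are assumptions, and the proprietary wire encoding is not reproduced here.

```rust
use std::fmt;

/// Event names as they appear on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EventKind {
    CudaLaunchKernel,
    ZombieDetectionTrigger,
    IoctlMemCopy,
    ResourceReclamation,
}

impl EventKind {
    pub fn as_str(self) -> &'static str {
        match self {
            EventKind::CudaLaunchKernel => "CUDA_LAUNCH_KERNEL",
            EventKind::ZombieDetectionTrigger => "ZOMBIE_DETECTION_TRIGGER",
            EventKind::IoctlMemCopy => "IOCTL_MEM_COPY",
            EventKind::ResourceReclamation => "RESOURCE_RECLAMATION",
        }
    }
}

/// One event record. Types are assumed; only the field names are from the page.
#[derive(Debug, Clone, PartialEq)]
pub struct SentryEvent {
    pub pid: u32,
    pub event: EventKind,
    pub kernel_addr: u64,
    /// SLM verdict text, absent for events the model does not score (assumption).
    pub slm_analysis: Option<String>,
    /// Action taken, such as "SIGKILL", absent when nothing was done (assumption).
    pub action: Option<String>,
}

impl fmt::Display for SentryEvent {
    /// Hypothetical log-line rendering; missing optional fields print as "-".
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "pid={} event={} kernel_addr={:#x} slm_analysis={} action={}",
            self.pid,
            self.event.as_str(),
            self.kernel_addr,
            self.slm_analysis.as_deref().unwrap_or("-"),
            self.action.as_deref().unwrap_or("-"),
        )
    }
}
```

Keeping `slm_analysis` and `action` optional lets a plain launch event and a scored ioctl event share one record type, which is one plausible reading of the field list.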
You don’t install Arca.
We do.
Kernel-level instrumentation is not a curl-bash. Our engineering team deploys the Sentry into your hosts, tunes the eBPF attach for your driver and kernel build, and hands you a turn-key Grafana pane.
Discovery
One scoping call. Hosts, kernel versions, GPU fleet, where your audit logs land. We confirm fit before anyone touches a box.
30 MIN · NDA
Deployment
Our engineers attach the Sentry into the host kernel — bare metal, AWS EC2, GCP Vertex, or private cloud. Zero code changes to your agents. Zero YAML in your stack.
ON-PREM · WHITE-GLOVE
The Pane
You get a turn-key Grafana dashboard fed by Prometheus on loopback — kernel launches, zombie kills, exfil scores. One pane. Yours forever.
GRAFANA · PROMETHEUS · YOURS
We do not sell a download. We sell a deployed, tuned, signed-off governance layer with a real engineer on call. That is the product.
↗ START INTEGRATION
The Sentry on every GPU you ship.
We deploy the Sentry across your hosts — bare metal, AWS EC2, GCP Vertex, private cloud — without touching a single line of your training, inference, or notebook code. Then we keep it tuned.
DISCUSS ENTERPRISE TERMS →
Automatic instrumentation
When traffic surges, the eBPF hot path absorbs additional CUDA calls without dropping events or affecting GPU job latency. When the fleet is quiet, the Sentry quiesces. Zero code in your agent. Zero YAML in your stack.
Per-engagement, per-host
The Auditor is a scoped forensic engagement — priced per fleet, per audit window. The Sentry is continuous defense — priced per host, per year, with our engineering team on call. Air-gap and sovereign-AI terms available.
Air-gap and offline first
The Sentry runs as a single signed binary on each host with a TOML config and a local GGUF model file. No control plane. No phone-home. Air-gapped fleets are first-class — our engineers install through your artifact registry or USB rotation.
Turn-key Grafana, SIEM ready
Every kernel-launch and ioctl the Sentry sees lands in structured logs and Prometheus counters on loopback, surfaced through the Grafana pane we ship. SIEM exporters — Splunk, Datadog, Loki, OpenTelemetry — are part of the enterprise integration.
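To give a feel for what "Prometheus counters on loopback" means for a scraper, the sketch below renders per-event counters in the Prometheus text exposition format. The metric name `sentry_events_total`, the `event` label, and the `EventCounters` type are hypothetical; the real metric names shipped with the Sentry are not stated on this page.

```rust
use std::collections::BTreeMap;
use std::fmt::Write;

/// In-memory counters keyed by event name (hypothetical helper type).
pub struct EventCounters {
    counts: BTreeMap<String, u64>,
}

/// Escape a label value per the Prometheus text format:
/// backslash, double quote, and newline must be escaped.
fn escape_label(v: &str) -> String {
    v.replace('\\', "\\\\").replace('"', "\\\"").replace('\n', "\\n")
}

impl EventCounters {
    pub fn new() -> Self {
        Self { counts: BTreeMap::new() }
    }

    /// Increment the counter for one event type.
    pub fn inc(&mut self, event: &str) {
        *self.counts.entry(event.to_string()).or_insert(0) += 1;
    }

    /// Render all counters as a scrape body, one sample per event type,
    /// in sorted order (BTreeMap iteration order).
    pub fn render(&self) -> String {
        let mut out = String::new();
        out.push_str("# HELP sentry_events_total Events observed, by type.\n");
        out.push_str("# TYPE sentry_events_total counter\n");
        for (event, n) in &self.counts {
            // Writing to a String cannot fail, so the result is ignored.
            let _ = writeln!(
                out,
                "sentry_events_total{{event=\"{}\"}} {}",
                escape_label(event),
                n
            );
        }
        out
    }
}
```

An HTTP endpoint bound to 127.0.0.1 would return the `render()` output as the scrape body, which keeps the metrics on the host, consistent with the loopback-only transport described above.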
Designed for 4,416 GPUs.
Eleven regions. Sample fleet.
One pane.
Every kernel across the fleet.
The turn-key Grafana pane we ship with every Sentry deployment.
Where kernel-level matters.
High-Compliance Healthcare
Kernel-grade record of every GPU launch and ioctl — the evidence chain HIPAA, NOM-024, and FDA SaMD reviews keep asking for. We deploy. You audit.
Financial Services
Black-box agentic leaks die at the ioctl boundary. The on-host Phi-3 SLM scores intent. Your weights and your customer PII never leave the server.
Autonomous Robotics
Microsecond visibility into the GPU compute path on safety-critical hosts. Bare metal. AWS EC2. GCP Vertex. No agent SDK. No code changes.