// FEATURES — KERNEL-LEVEL AISPM

Three primitives.
Built into the kernel.

The Hook. The Sentry. The Gate. One proprietary engine, attached below the agent, in the host kernel, via Aya-compiled eBPF. Read the spec.

FEATURE 01 →The Nvidia Hook FEATURE 02 →The Zombie Sentry FEATURE 03 →The Exfiltration Gate

FEATURE 01VISIBILITY · KERNEL · eBPF

The Nvidia Hook

Total visibility into the black box of GPU compute. Microsecond intercept on every CUDA launch and every ioctl.

The Hook attaches below the agent — in the host kernel — via Aya-compiled eBPF. A uprobe on libcuda.so:cudaLaunchKernel records every GPU launch. The syscalls:sys_enter_ioctl tracepoint records every driver-bound ioctl. The agent doesn't see us. The driver doesn't see us. We see everything.

Granularity

per-syscall · per-launch

Overhead

sub-1% target · benchmarks pending

Coverage

cudaLaunchKernel + ioctl(2)

Surface

Bare metal · AWS EC2 · GCP Vertex · private cloud

// TECHNICAL SPECIFICATION

Mechanism

eBPF uprobe + tracepoint via Aya (Rust)

Attach point

libcuda.so:cudaLaunchKernel + syscalls:sys_enter_ioctl

Event payload

POD struct · kernel timestamp · resolved tgid

Buffer

aya RingBuf · 256 KiB · per program

Footprint

Host-native · no agent SDK · no YAML · no code changes

FEATURE 02ROI · WASTE · KILL

The Zombie Sentry

Stop paying for AI that isn't thinking. Save up to 20% on GPU bills by killing hung agents in real time.

Forgotten notebooks. Runaway loops. Agents stuck firing the same kernel a thousand times a second while the bill clock runs. We watch the kernel-launch fingerprint of every PID through the eBPF ring buffer. When a process crosses the threshold, we alert — and if policy says so, the kernel signals SIGKILL within milliseconds. Your CFO gets the GPU-hours back.

ROI

up to 20% reclaim on GPU spend

Default threshold

1000 identical launches / 2s

Latency

real-time · sub-second decision

Auto-kill

policy-driven · alert by default

// TECHNICAL SPECIFICATION

Signal

kernel-launch rate per (pid, fn_ptr, grid+block dims)

Heuristic

identical-launch sliding window · O(1) per event

Action

log alert · SIGKILL when policy = kill

Reporting

Prometheus counters + the turn-key Grafana pane

Override

allowlist + protected_pids · tuned per-host

FEATURE 03PRIVACY · LOCAL AI · ZERO EGRESS

The Exfiltration Gate

Phi-3 powered intelligence analyzing intent at the driver level. Your PII never leaves the server.

Every suspect ioctl gets two passes. Stage one is a kernel-side heuristic on the hot path: size estimate plus reservoir sample, sub-microsecond. Stage two hands the survivors to an on-host Phi-3 mini that scores intent against a learned exfil profile and returns a 0–100 risk rating with a reason. The model runs on the same host as the workload — no cloud round-trip, no third-party API. Your weights, your customer PII, your competitive data: never out the door.

Stage 1

size + xorshift sample · sub-µs hot path

Stage 2

Phi-3 mini Q4_K_M · on-host · greedy decode

Action

alert | SIGKILL · policy-driven

Privacy

no host egress · no cloud · no third party

// TECHNICAL SPECIFICATION

S1 · Heuristic

size_estimate ≥ heuristic_min_bytes · sample_rate (xorshift)

S2 · SLM

Phi-3 mini · GGUF · llama-cpp-2 · CPU or GPU

Risk threshold

configurable 0–100 · default 70

Block action

alert (log) | SIGKILL pid (protected_pids excluded)

Update channel

signed model bundle · we ship · air-gap safe

Three primitives.Built into the kernel.

The Nvidia Hook

The Zombie Sentry

The Exfiltration Gate

Three primitives.
Built into the kernel.