FEATURE 01 · VISIBILITY · KERNEL · eBPF

The Nvidia Hook

A host-observable model of GPU behavior, built by correlating four signals — ioctl traffic to /dev/nvidia*, libcuda uprobes, NVML device state, and /proc process metadata.

The Hook attaches in the host kernel through a Rust eBPF runtime. Uprobes on libcuda.so:cudaLaunchKernel and libcuda.so:cudaMalloc record host-side call sites with kernel symbol resolution. The syscalls:sys_enter_ioctl tracepoint, filtered to /dev/nvidia*, records the driver-bound request stream. NVML reports per-PID device state, and /proc anchors the process context. The four signals correlate into one continuously attributed record.

What we don't claim: SM-level execution traces, hardware command-queue state, or L1/L2 telemetry. Those require GPU-vendor instrumentation (CUPTI / Nsight) and are orthogonal to the security, compliance, and cost use cases this product addresses.
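As a rough illustration of how four independent signals can fold into one per-process record, here is a minimal sketch in std-only Rust. The names Signal, ProcessRecord, and Correlator, and every field on them, are hypothetical and chosen for illustration. They are not the product's schema, and a real correlator would also need to handle timestamps, PID reuse, and process exit.

```rust
use std::collections::HashMap;

/// One observation from any of the four host-side sources.
/// Variants and fields are illustrative assumptions, not the product's schema.
#[derive(Debug, Clone, PartialEq)]
pub enum Signal {
    /// An ioctl issued against a /dev/nvidia* node.
    Ioctl { request: u64 },
    /// A libcuda uprobe hit, named by the probed symbol.
    Launch { symbol: String },
    /// Per-process device memory in bytes, as reported by NVML.
    Nvml { used_gpu_memory: u64 },
    /// Process metadata read from /proc/<pid>.
    Proc { comm: String },
}

/// The merged, per-process view built from the four signals.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ProcessRecord {
    pub comm: Option<String>,
    pub ioctl_count: u64,
    pub launch_count: u64,
    pub used_gpu_memory: Option<u64>,
}

/// Folds incoming signals into records keyed by tgid (the process id).
#[derive(Default)]
pub struct Correlator {
    records: HashMap<u32, ProcessRecord>,
}

impl Correlator {
    /// Apply one signal to the record of the process that produced it.
    pub fn observe(&mut self, tgid: u32, signal: Signal) {
        let rec = self.records.entry(tgid).or_default();
        match signal {
            Signal::Ioctl { .. } => rec.ioctl_count += 1,
            Signal::Launch { .. } => rec.launch_count += 1,
            // State-like signals overwrite: the latest reading wins.
            Signal::Nvml { used_gpu_memory } => rec.used_gpu_memory = Some(used_gpu_memory),
            Signal::Proc { comm } => rec.comm = Some(comm),
        }
    }

    /// Look up the current record for a process, if any signal has been seen.
    pub fn record(&self, tgid: u32) -> Option<&ProcessRecord> {
        self.records.get(&tgid)
    }
}
```

Keying on tgid rather than thread id is the design choice that matters here: ioctl and uprobe events fire per thread, while NVML and /proc report per process, so the process id is the one key all four sources share.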

Signals

ioctl · libcuda uprobes · NVML · /proc

Granularity

per-syscall · per-launch · per-PID

Overhead

sub-1% target · benchmarks pending

Surface

Bare metal · AWS EC2 · GCP Vertex · private cloud
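The ioctl signal is described as filtered to /dev/nvidia*. The sys_enter_ioctl tracepoint exposes a file descriptor rather than a path, so one plausible way to apply that filter is to resolve the descriptor through /proc in user space. The sketch below assumes that approach. The function names are hypothetical, and the source does not say whether the Hook filters in the kernel or in user space.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// True if `path` names an NVIDIA device node such as /dev/nvidia0,
/// /dev/nvidiactl, or /dev/nvidia-uvm.
pub fn is_nvidia_device(path: &Path) -> bool {
    path.to_str().map_or(false, |s| s.starts_with("/dev/nvidia"))
}

/// Resolve a process's file descriptor to the path it refers to by reading
/// the /proc/<pid>/fd/<fd> symlink. Returns None if the process or the
/// descriptor no longer exists, or if /proc is unavailable.
pub fn resolve_fd(pid: u32, fd: i32) -> Option<PathBuf> {
    fs::read_link(format!("/proc/{pid}/fd/{fd}")).ok()
}

/// Decide whether an ioctl on (pid, fd) belongs in the driver-bound stream.
pub fn ioctl_targets_nvidia(pid: u32, fd: i32) -> bool {
    resolve_fd(pid, fd).map_or(false, |p| is_nvidia_device(&p))
}
```

Resolving through /proc is racy, because the descriptor can be closed or reused between the syscall and the lookup. A production filter would likely cache resolutions or match on the device number instead.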

// TECHNICAL SPECIFICATION

Mechanism

eBPF uprobe + tracepoint · Rust runtime

Attach point

libcuda.so:cudaLaunchKernel, cudaMalloc + syscalls:sys_enter_ioctl

Correlation inputs

ioctl(/dev/nvidia*) · libcuda · NVML · /proc/<pid>

Graphs API

cudaGraphLaunch tracked separately · arca_gpu_cuda_graphs_launched_total

Event payload

POD struct · kernel timestamp · resolved tgid

Buffer

aya RingBuf · 256 KiB · per program

Footprint

Host-native · no agent SDK · no YAML · no code changes

Scope we don't claim

SM-level execution · GPU command-queue state · L1/L2 telemetry
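To make the "Event payload" and "Buffer" rows above concrete, the following sketch shows what a POD event with a kernel timestamp and resolved tgid might look like, and how many such events a 256 KiB ring buffer could hold. The GpuEvent layout and all of its field names are assumptions made for illustration. The two facts it leans on are standard eBPF behavior: bpf_get_current_pid_tgid() returns the tgid in the upper 32 bits and the thread id in the lower 32, and the kernel's BPF ring buffer prepends an 8-byte header to each record and rounds the total up to a multiple of 8 bytes.

```rust
use std::mem::size_of;

/// Hypothetical fixed-layout event shared between the eBPF program and
/// user space. Field names and sizes are illustrative only.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct GpuEvent {
    /// Kernel monotonic timestamp in nanoseconds.
    pub timestamp_ns: u64,
    /// Thread-group id, i.e. the user-visible process id.
    pub tgid: u32,
    /// Thread id of the calling task.
    pub tid: u32,
    /// Event kind discriminant (ioctl, launch, and so on).
    pub kind: u32,
    /// Explicit padding so the struct carries no implicit padding bytes.
    pub _pad: u32,
    /// Kind-specific argument, for example the ioctl request number.
    pub arg: u64,
}

/// Payload size of one event in bytes.
pub const EVENT_SIZE: usize = size_of::<GpuEvent>();

/// Split the u64 returned by bpf_get_current_pid_tgid() into (tgid, tid).
pub fn split_pid_tgid(raw: u64) -> (u32, u32) {
    ((raw >> 32) as u32, raw as u32)
}

/// Size of the header the kernel prepends to every BPF ring buffer record.
pub const RINGBUF_HDR_SIZE: usize = 8;

/// Bytes one record occupies in the ring buffer: header plus payload,
/// rounded up to the next multiple of 8.
pub const fn record_size(payload: usize) -> usize {
    (payload + RINGBUF_HDR_SIZE + 7) & !7
}

/// Upper bound on unconsumed records of `payload` bytes that fit in a
/// ring buffer of `capacity` bytes.
pub const fn max_records(capacity: usize, payload: usize) -> usize {
    capacity / record_size(payload)
}
```

Under these assumptions each 32-byte event occupies 40 bytes, so max_records(256 * 1024, EVENT_SIZE) gives 6553 unconsumed events per program before new events start to be dropped. A larger payload lowers that ceiling proportionally.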