ARCA.VISION
PATENT-PENDING · KERNEL-LEVEL GOVERNANCE

The Invisible
Observer

Kernel-level governance for AI infrastructure. Patent-pending. We attach below the agent, in the host kernel, and read every cudaLaunchKernel and every ioctl(2). Host-native — bare metal, AWS EC2, GCP Vertex, private cloud.

RUST · AYA · eBPF · HOST-NATIVE · NON-INVASIVE INTERCEPTION · WHITE-GLOVE DEPLOYED
RUN // LIVE ENGINE · host-native · live · 0 / 11
10:24:46  arca engineer · white-glove window · open  [online]
10:24:46  Sentry binary · signed · launching on host  [verified]
10:24:47  Attach · uprobe · libcuda.so:cudaLaunchKernel  [1 of 2]
10:24:47  Attach · tracepoint · syscalls:sys_enter_ioctl  [2 of 2]
10:24:48  Ring buffer · EVENTS · per-program · ready  [armed]
10:24:48  Ring buffer · IOCTLS · per-program · ready  [armed]
10:24:49  Zombie Sentry · sliding window · tuned  [armed]
10:24:49  Exfil Gate · heuristic + on-host SLM  [armed]
10:24:50  Phi-3 mini · loaded · greedy decode  [ready]
10:24:50  Grafana pane · provisioned · loopback  [live]
10:24:51  Engineer signs off · on call  [READY]
01 · INTERCEPT
Microsecond visibility
Total visibility into the black box of GPU compute. Every CUDA launch. Every ioctl.

02 · GOVERN
Phi-3 driver-level gate
On-host intelligence analyzing intent at the syscall boundary. Your PII never leaves the server.

03 · RECOVER
Zombie kill switch
Stop paying for AI that isn't thinking. Save up to 20% on GPU bills by killing hung agents in real time.

04 · INTEGRATE
White-glove deployed
We deploy. We tune. You get a turn-key Grafana dashboard.

// PRODUCTS — THE DELIVERY LAYER

Two shapes.
One kernel.

Same proprietary eBPF engine. Host-native. We deploy it.

The Sentry · HOST-NATIVE AGENT

BARE METAL · PRIVATE CLOUD · LINUX VM

The Sentry sits on every GPU host you protect — bare metal, AWS EC2, GCP Vertex, or private cloud. Two eBPF probes, one local SLM, zero exfil. We deploy it. We tune it. You see one Grafana pane.

$ # arca.vision handles the integration
  • Host-native · attaches into the host kernel via Aya eBPF
  • Microsecond intercept · cudaLaunchKernel + ioctl(2)
  • On-host Phi-3 SLM · driver-level exfil scoring
  • Turn-key Prometheus + Grafana dashboard · we ship it
REQUEST SENTRY DEPLOYMENT →

The Auditor · FORENSIC ENGAGEMENT

AUDIT · INVESTIGATE · QUALIFY

The Auditor is a time-boxed engagement. Our engineers attach the Sentry engine to your fleet, capture a kernel-grade record of every cudaLaunchKernel and ioctl(2), and hand you a waste + exfil report with the receipts.

$ # scoped engagement · contact engineering
  • On-prem · read-only · no agent SDK installed
  • Kernel-grade event record · proprietary signed format
  • Zombie waste analysis · GPU-hour reclamation report
  • Exfil exposure assessment · ioctl + SLM scoring
REQUEST AN AUDIT →
// FEATURES — KERNEL-LEVEL AISPM

Three primitives.
Built into the kernel.

Non-invasive interception. We attach below the agent — in the host kernel, in the syscall path. No Python touched. No YAML touched.

FEATURE 01

The Nvidia Hook

Total visibility into the black box of GPU compute. Microsecond intercept on cudaLaunchKernel and ioctl(2) via Aya-compiled eBPF. The agent doesn’t see us. The driver doesn’t see us. We see everything.

cudaLaunchKernel · sys_enter_ioctl
DOC · NVIDIA HOOK →
FEATURE 02

The Zombie Sentry

Stop paying for AI that isn't thinking. We watch the kernel-launch fingerprint of every PID. Hung agents and runaway loops get killed in real time — save up to 20% on GPU bills.

!!! · GPU 0–23 · idle <5% util · 3 ZOMBIE
DOC · ZOMBIE SENTRY →
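
The page does not publish the detection logic. As a rough illustration only, here is a minimal sketch of how a sliding-window detector over launch fingerprints could work. It assumes the (pid, fn_ptr, dims) key and the ">1000 ops / 2s" threshold that appear in the sample feed and engine log on this page. The class name, method name, and parameters are hypothetical and are not Arca's API.

```python
from collections import defaultdict, deque


class LaunchWindow:
    """Flags a launch fingerprint that repeats more than `max_ops` times in `window_s` seconds."""

    def __init__(self, max_ops=1000, window_s=2.0):
        self.max_ops = max_ops      # assumption: threshold taken from the sample feed
        self.window_s = window_s    # assumption: window length taken from the sample feed
        self._seen = defaultdict(deque)  # fingerprint -> timestamps of recent launches

    def observe(self, pid, fn_ptr, grid_dim, block_dim, ts):
        """Record one launch at time `ts` (seconds). Return True if the window is over threshold."""
        key = (pid, fn_ptr, tuple(grid_dim), tuple(block_dim))
        stamps = self._seen[key]
        stamps.append(ts)
        # Evict launches that have fallen out of the window.
        while stamps and ts - stamps[0] > self.window_s:
            stamps.popleft()
        return len(stamps) > self.max_ops
```

Keying on the full fingerprint rather than the PID alone means a process that launches many different kernels is not flagged, while one that replays the same static grid in a tight loop is. A real detector would also need to expire state for exited PIDs, which this sketch omits.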
FEATURE 03

The Exfiltration Gate

Phi-3-powered intelligence analyzing intent at the driver level. The local SLM scores every suspect ioctl against a learned exfil profile. Your PII never leaves the server.

STAGE 1 · heuristic → STAGE 2 · SLM → ioctl · ALERT / SIGKILL
DOC · EXFIL GATE →
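
To make the two-stage flow concrete, here is a hedged sketch of a gate that applies a cheap size heuristic first and only then calls a scorer. The 100 MB cutoff mirrors the "size_estimate < 100 MB · drop" line in the engine log on this page. The risk threshold of 80, the `enforce` mode name, and the `IoctlEvent` fields are assumptions made for illustration. The scorer is a plain callable standing in for the on-host SLM, not a real model call.

```python
from dataclasses import dataclass
from typing import Callable

STAGE1_MIN_BYTES = 100 * 1000 * 1000  # assumption: the "100 MB" cutoff from the sample log
RISK_THRESHOLD = 80                   # assumption: illustrative cutoff, not a documented value


@dataclass
class IoctlEvent:
    pid: int
    size_estimate: int  # bytes


def gate(event: IoctlEvent, score: Callable[[IoctlEvent], int], mode: str = "alert") -> str:
    """Return "drop", "pass", "alert", or "sigkill" for one ioctl event."""
    # Stage 1: cheap heuristic. Small transfers never reach the model.
    if event.size_estimate < STAGE1_MIN_BYTES:
        return "drop"
    # Stage 2: the scorer stands in for the on-host SLM and returns 0..100.
    risk = score(event)
    if risk < RISK_THRESHOLD:
        return "pass"
    # Hypothetical "enforce" mode escalates to a kill; any other mode only alerts.
    return "sigkill" if mode == "enforce" else "alert"
```

Putting the heuristic in front keeps the expensive scoring step off the hot path for the bulk of ioctls, which is the usual reason for a staged design like this.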
// LIVE SENTRY FEED — KERNEL HEARTBEAT

This is what the
kernel sees.

A 4-event slice of what the Sentry emits in real time — every CUDA launch, every suspect ioctl, every Phi-3 verdict, every reclaimed dollar. The data shape below is the actual on-the-wire format your engineers will get.

LIVE // KERNEL HEARTBEAT · ARCA SENTRY · 0 / 4 events
OBSERVED 0 · BLOCKED 0 · RECLAIMED

20:45:01.002 · pid 14468 · CUDA_LAUNCH_KERNEL · OBSERVED
  kernel_addr 0xcafe000a · grid_dim [32, 1, 1] · block_dim [1024, 1, 1]

20:45:01.450 · pid 14468 · ZOMBIE_DETECTION_TRIGGER · PENDING ALERT
  reason High-frequency static grid loop detected · severity CRITICAL · threshold >1000 ops / 2s

20:45:02.110 · pid 18290 · IOCTL_MEM_COPY · SIGKILL ISSUED
  size 2.00 GiB · slm.model Phi-3-Mini · slm.risk_score 88 / 100
  intent: Unusual volume: Potential Weight/PII Smuggling

20:45:03.500 · pid 14468 · RESOURCE_RECLAMATION · RECLAIMED
  savings_estimate $3.50/hr · status ZOMBIE_TERMINATED

Verdict · 1 zombie killed · 1 exfil blocked · $3.50/hr reclaimed · 0 host egress.
// READ THE TAPE
  • 01 · CUDA_LAUNCH_KERNEL: every grid / block dim is captured at the uprobe. No agent SDK. No sampling.
  • 02 · ZOMBIE_DETECTION_TRIGGER: a high-frequency static-grid loop crosses the >1000 ops / 2s threshold. The Sentry stages the alert.
  • 03 · IOCTL_MEM_COPY · 2 GiB: the on-host Phi-3 scores intent at 88 / 100 (“potential weight / PII smuggling”). SIGKILL issued at the driver level.
  • 04 · RESOURCE_RECLAMATION: the zombie’s GPU-hours stop billing. Estimated savings $3.50 / hr on this single PID.
FORMAT · NDJSON over the integration channel
FIELDS · pid · event · kernel_addr · slm_analysis · action
TRANSPORT · loopback · no host egress · ever
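
The FIELDS line names five keys but the page does not show a raw record. As a sketch under that limitation, this is what consuming such a stream might look like, assuming one JSON object per line with exactly those keys. The value shapes (for example `slm_analysis` as a nested object with a `risk_score`) and the helper names are guesses for illustration, not the documented wire format.

```python
import json

# Hypothetical records: key names come from the FIELDS line, value shapes are assumed.
SAMPLE = "\n".join([
    json.dumps({"pid": 14468, "event": "CUDA_LAUNCH_KERNEL", "kernel_addr": "0xcafe000a",
                "slm_analysis": None, "action": "OBSERVED"}),
    json.dumps({"pid": 18290, "event": "IOCTL_MEM_COPY", "kernel_addr": None,
                "slm_analysis": {"risk_score": 88}, "action": "SIGKILL ISSUED"}),
])


def parse_ndjson(text):
    """Yield one dict per non-empty line of an NDJSON stream."""
    for line in text.splitlines():
        if line.strip():
            yield json.loads(line)


def blocked_pids(records):
    """Collect the pids whose action reports a SIGKILL."""
    return sorted({r["pid"] for r in records if "SIGKILL" in r["action"]})
```

Because each line is a complete JSON document, a consumer can tail the stream and parse records one at a time without buffering the whole feed.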
// INTEGRATION — WHITE-GLOVE

You don’t install Arca.
We do.

Kernel-level instrumentation is not a curl-bash. Our engineering team deploys the Sentry into your hosts, tunes the eBPF attach for your driver and kernel build, and hands you a turn-key Grafana pane.

STEP 01

Discovery

One scoping call. Hosts, kernel versions, GPU fleet, where your audit logs land. We confirm fit before anyone touches a box.

30 MIN · NDA
STEP 02

Deployment

Our engineers attach the Sentry into the host kernel — bare metal, AWS EC2, GCP Vertex, or private cloud. Zero code changes to your agents. Zero YAML in your stack.

ON-PREM · WHITE-GLOVE
STEP 03

The Pane

You get a turn-key Grafana dashboard fed by Prometheus on loopback — kernel launches, zombie kills, exfil scores. One pane. Yours forever.

GRAFANA · PROMETHEUS · YOURS

We do not sell a download. We sell a deployed, tuned, signed-off governance layer with a real engineer on call. That is the product.

↗ START INTEGRATION
// SCALE — FROM ONE HOST TO THE FLEET

The Sentry on every GPU
you ship.

We deploy the Sentry across your hosts — bare metal, AWS EC2, GCP Vertex, private cloud — without touching a single line of your training, inference, or notebook code. Then we keep it tuned.

DISCUSS ENTERPRISE TERMS →

Automatic instrumentation

When traffic surges, the eBPF hot path absorbs additional CUDA calls without dropping events or affecting GPU job latency. When the fleet is quiet, the Sentry quiesces. Zero code in your agent. Zero YAML in your stack.

The Auditor · scoped audit: per engagement · contact
The Sentry · single fleet: per host · contact
The Sentry · enterprise: per host · contact
Air-gap / sovereign AI: annual · contact
Custom SIEM exporters: scoped · contact
On-call engineering: included · enterprise

Per-engagement, per-host

The Auditor is a scoped forensic engagement — priced per fleet, per audit window. The Sentry is continuous defense — priced per host, per year, with our engineering team on call. Air-gap and sovereign-AI terms available.

Air-gap and offline first

The Sentry runs as a single signed binary on each host with a TOML config and a local GGUF model file. No control plane. No phone-home. Air-gapped fleets are first-class — our engineers install through your artifact registry or USB rotation.

Turn-key Grafana, SIEM ready

Every kernel-launch and ioctl the Sentry sees lands in structured logs and Prometheus counters on loopback, surfaced through the Grafana pane we ship. SIEM exporters — Splunk, Datadog, Loki, OpenTelemetry — are part of the enterprise integration.

// FIELD — HOST-NATIVE BY DESIGN

Designed for 4,416 GPUs.
Eleven regions. Sample fleet.

// SAMPLE FLEET · ILLUSTRATIVE · UPDATED 3s AGO · HOOKS/sec 2.4M · HOSTS 4,416
SOURCE · sample fleet · not from a live deployment

US-WEST · 412 GPU
US-EAST · 1,204 GPU
MX-MTY · 96 GPU
BR-SAO · 184 GPU
EU-LON · 240 GPU
EU-FRA · 820 GPU
ME-UAE · 1,840 GPU
IN-BLR · 480 GPU
AP-SIN · 308 GPU
AP-TOK · 612 GPU
AU-SYD · 220 GPU

GPUs MONITORED: 4,416
HOOKS · 24h: 208B
EXFIL EVENTS FLAGGED: 14,209
COMPUTE RECLAIMED: $1.84M
// LIVE OPS — THE PANE

One pane.
Every kernel across the fleet.

The turn-key Grafana pane we ship with every Sentry deployment.

LIVE · FLEET
HOOKS: 2.4M / s
OVH: <1%
FLAGGED: 14,209
RECLAIMED: $1.84M
GPUs: 4,416
ZOMBIES: 3
P95: 78µs
RING: OK
arca-fleet · v2.0 · streaming

▸ ENGINE ACTIVITY · LIVE · streaming
13:09:38  score :: SLM · risk_score=12 · low  [ok 244ms]
13:09:39  score :: SLM · Phi-3 mini · greedy decode · 96 tokens  [ok 571ms]
13:09:40  detect :: stage 1 · large ioctl · forwarded to SLM  [ok 518ms]
13:09:41  detect :: stage 1 · size_estimate < 100 MB · drop  [ok 465ms]
13:09:42  detect :: sliding window · (pid, fn_ptr, dims) seen  [ok 412ms]
13:09:43  ring :: IOCTLS · IoctlEvent submitted  [ok 359ms]
13:09:44  ring :: EVENTS · KernelLaunchEvent submitted  [ok 306ms]
13:09:45  hook :: tracepoint · syscalls:sys_enter_ioctl  [ok 253ms]
13:09:46  hook :: uprobe · libcuda.so:cudaLaunchKernel  [ok 200ms]
2 probes · 2 ring buffers · 1 SLM worker

▸ INCIDENT INTEL · LIVE · 4 active signals
[MED] 142 idle SMs across train-prod-* nodes
  2m ago · auto-tag · reclaim queued · PRO
[HIGH] Forgotten notebook · 1000 launches / 2 s window
  11m ago · zombie detector · alert fired · PRO
[MED] SLM gate · risk_score 84 on a 14 GB ioctl
  14m ago · stage-2 forward · routed to SOC · PRO
[HIGH] Large ioctl from short-lived pid · stage 1 pass
  24s ago · alert · mode=alert default · PRO

▸ ENGINE TELEMETRY · LIVE · streaming
EVENTS · 24h: 208B
FLAGGED · 24h: 14,209
SLM EVALS · 24h: 48,302
ATTACHED HOSTS: 4,416
PROBE LOSS: 0.00%
RING · STATE: OK

▸ STRESS RADAR · BY REGION · 1 region elevated
ASIA-PAC · HIGH · CN / JP / KR
EUROPE · MED · DE / NL / SE
INDIA · MED · IND
AFRICA · MED · ZA / DRC
AMERICAS · LOW · US / CA / BR
OCEANIA · LOW · AU
// USE CASES — THE VERTICAL WEDGE

Where kernel-level matters.

STAGE 01

High-Compliance Healthcare

Kernel-grade record of every GPU launch and ioctl — the evidence chain HIPAA, NOM-024, and FDA SaMD reviews keep asking for. We deploy. You audit.

HIPAA | NOM-024 | FDA · SaMD
▸ PRESS START
STAGE 02

Financial Services

Black-box agentic leaks die at the ioctl boundary. The on-host Phi-3 SLM scores intent. Your weights and your customer PII never leave the server.

SOC 2 · TYPE II | PCI-DSS | FFIEC
▸ PRESS START
STAGE 03

Autonomous Robotics

Microsecond visibility into the GPU compute path on safety-critical hosts. Bare metal. AWS EC2. GCP Vertex. No agent SDK. No code changes.

ISO 26262 | IEC 61508 | DO-178C
▸ PRESS START
// DEPLOY

Put the
invisible observer
on your hosts.