ARCA.VISION
// FEATURE · THE PERSONA SWITCHBOARD · PHASE 6

Intelligence is hot-swappable.

Why settle for one-size-fits-all security? The same proprietary eBPF engine that runs the Exfiltration Gate now hot-swaps its on-host SLM at the ioctl boundary, with compliance personas available today and more on the rack. Local SLM governance. Zero downtime. Air-gap delivered.

// SECTION 01 · RACK · 8 BAYS · HOT-SWAP

The rack-mount.

Eight cartridges. One active at a time. Tap a bay to hot-swap the persona that scores every suspect ioctl on the host.

Most AI security tools ship a single, frozen detection model. Arca.Vision treats the SLM as a cartridge: the kernel boundary stays put, the on-host inference gate stays put, and the persona (the thing that knows what your sector counts as a leak) slides in or out. PHI signals are not PCI signals are not weapons-systems CUI. The persona changes; the engine does not.

$ arca persona swap --bay 03 --to dod-sentinel
Bays: 8 · one active at a time
Swap path: pointer flip · no probe detach
Downtime: 0 ms · ring buffers stay armed
Delivery: signed bundle · air-gap safe
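
The rack model described above (eight bays, exactly one active) can be sketched as a small data structure. This is a minimal illustration in Python, not Arca.Vision's implementation: the `Rack` class, its methods, and the `hipaa-guardian` slug are hypothetical names; only `dod-sentinel`, bay 03, and the eight-bay limit come from the page.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

BAY_COUNT = 8  # from the page: eight bays, one active at a time


@dataclass
class Rack:
    """Hypothetical model of the rack: up to eight loaded bays, one active."""
    bays: Dict[int, str] = field(default_factory=dict)  # bay number -> persona
    active_bay: Optional[int] = None

    def load(self, bay: int, persona: str) -> None:
        """Seat a persona cartridge in a bay without activating it."""
        if not 1 <= bay <= BAY_COUNT:
            raise ValueError(f"bay must be 1..{BAY_COUNT}, got {bay}")
        self.bays[bay] = persona

    def swap(self, bay: int) -> str:
        """Make `bay` the single active bay and return its persona."""
        if bay not in self.bays:
            raise KeyError(f"bay {bay:02d} is empty")
        self.active_bay = bay
        return self.bays[bay]

    @property
    def active_persona(self) -> Optional[str]:
        return None if self.active_bay is None else self.bays[self.active_bay]


rack = Rack()
rack.load(1, "hipaa-guardian")  # hypothetical slug
rack.load(3, "dod-sentinel")
rack.swap(1)
rack.swap(3)  # loosely mirrors: arca persona swap --bay 03 --to dod-sentinel
```

Because `active_bay` is a single field, activating one bay implicitly deactivates the previous one, which is the "one active at a time" invariant.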
// RACK · MAIN BAY · ACTIVE: HIPAA Guardian · p99 1.2 ms · FORMATS: GGUF · LoRA

// GAUGE · PERSONA LATENCY · DESIGN TARGET (ms) · REF 2.0 ms · HIPAA 1.2 ms · HEADROOM 40% UNDER TARGET

// illustrative design targets · GPU-backed deployments return a verdict in well under a second
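
The 40% headroom on the gauge follows from the 2.0 ms reference and the 1.2 ms HIPAA target: (2.0 - 1.2) / 2.0 = 0.40. The sketch below applies the same arithmetic to the p99 figures listed in the cartridge catalog. Measuring every cartridge against the same 2.0 ms reference is an assumption; the page only shows the gauge for HIPAA.

```python
REF_MS = 2.0  # reference line shown on the gauge

# p99 design targets as listed in the cartridge catalog (ms)
P99_TARGETS_MS = {
    "HIPAA Guardian": 1.2,
    "LATAM Health Sentinel": 1.4,
    "DoD Sentinel": 1.6,
    "Sovereign AI Warden": 1.3,
    "Robotics Safety Officer": 0.9,
    "PII Redactor": 1.1,
    "Code Sentinel": 1.5,
}


def headroom_pct(p99_ms: float, ref_ms: float = REF_MS) -> float:
    """Percentage by which a p99 target sits under the reference."""
    return round((ref_ms - p99_ms) / ref_ms * 100, 1)


headroom = {name: headroom_pct(ms) for name, ms in P99_TARGETS_MS.items()}

for name, pct in headroom.items():
    print(f"{name}: {pct}% under target")
```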

// SECTION 02 · SUB-SECOND · GPU

Context-aware latency.

GPU-backed deployments return a persona verdict in well under a second; CPU-only hosts target seconds, not minutes.

Every cartridge ships as a Q4_K_M GGUF sized to fit the latency budget of a single sys_enter_ioctl event. Stage 1, the kernel-side heuristic, runs in sub-microsecond time and forwards only qualifying ioctls to the persona. Stage 2, the SLM verdict, is greedy-decoded on the host, and the result is returned over a loopback channel before the next bursty event lands. Customer-specific p50/p95/p99 figures are produced during the integration window.
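
The two-stage flow can be sketched in userspace as a cheap filter in front of an expensive scorer. This is an illustration only: the real Stage 1 runs kernel-side in eBPF, and the `IoctlEvent` fields, the payload-size threshold, and `stub_persona` are all assumptions made for the example rather than details from the product.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class IoctlEvent:
    pid: int
    request: int      # ioctl request code
    payload_len: int  # bytes associated with the call (hypothetical field)


def stage1_qualifies(event: IoctlEvent, min_payload: int = 4096) -> bool:
    """Stand-in for the kernel-side heuristic: a cheap, constant-time check.

    The size threshold is an assumption made for illustration only.
    """
    return event.payload_len >= min_payload


def gate(event: IoctlEvent,
         persona: Callable[[IoctlEvent], str]) -> Optional[str]:
    """Return None when Stage 1 drops the event, else the Stage 2 verdict."""
    if not stage1_qualifies(event):
        return None        # the persona is never invoked
    return persona(event)  # Stage 2: the (stubbed) SLM verdict


calls: List[int] = []


def stub_persona(event: IoctlEvent) -> str:
    """Placeholder for the SLM; records which events reached Stage 2."""
    calls.append(event.pid)
    return "block" if event.payload_len > 1_000_000 else "allow"


small = gate(IoctlEvent(pid=10, request=0x46, payload_len=128), stub_persona)
large = gate(IoctlEvent(pid=11, request=0x46, payload_len=8_000_000), stub_persona)
```

The point of the split is that the expensive call happens only for the events Stage 1 lets through: here `calls` ends up containing only pid 11.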

// SECTION 03 · UNDER THE HOOD

Three things that make hot-swap actually safe.

The Persona Switchboard isn't model-routing in userspace. It's a kernel-adjacent inference gate orchestrated by our Rust control plane, with strict memory isolation between observer and observed.

PROOF 01

Rust-orchestrated load · zero downtime

SLM weights are orchestrated by our Rust-based control plane and served to the local inference gate over a loopback channel. Hot-swap is a pointer flip, not a process restart. The eBPF probes never detach. The kernel-side heuristic keeps absorbing events the entire time.
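
A "pointer flip" swap can be sketched as rebinding one shared reference while readers keep whatever reference they already hold. The Python below is an analogy under stated assumptions, not the Rust control plane: `PersonaSlot` and the two stub personas are hypothetical, and the source does not say which atomic primitive the real implementation uses.

```python
import threading
from typing import Callable

Persona = Callable[[str], str]


class PersonaSlot:
    """Holds the active persona; swap() rebinds a single reference."""

    def __init__(self, persona: Persona) -> None:
        self._active = persona
        self._swap_lock = threading.Lock()  # serialises concurrent swaps only

    def current(self) -> Persona:
        # Readers take no lock: they grab whichever reference is live.
        return self._active

    def swap(self, new_persona: Persona) -> Persona:
        with self._swap_lock:
            old, self._active = self._active, new_persona
        return old  # caller can release the old weights once idle


def hipaa(event: str) -> str:
    return f"hipaa-guardian scored {event}"


def dod(event: str) -> str:
    return f"dod-sentinel scored {event}"


slot = PersonaSlot(hipaa)
in_flight = slot.current()         # a scorer that started before the swap
previous = slot.swap(dod)          # the "pointer flip"
before = in_flight("ioctl-1")      # still served by the old persona
after = slot.current()("ioctl-2")  # new events see the new persona
```

Nothing is torn down during the swap: work that began under the old persona finishes against it, and the event source never has to pause, which is the property the page describes as probes that never detach.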

PROOF 02

Strict memory isolation · observer ≠ observed

Each persona runs in a strictly isolated memory space, separated from the workload the Sentry is governing. The observer cannot be compromised by the observed: the SLM lives behind the same kernel boundary as the rest of the gate, and a compromised CUDA process has no read/write path to the persona weights.

PROOF 03

Model-agnostic · GGUF today · LoRA on the roadmap

Cartridges ship as signed bundles. Today we run llama-cpp-2 with GGUF (Q4_K_M is the default). LoRA-adapter cartridges, including the Custom LoRA slot, are on the cartridge roadmap. All inference is local with no cloud round-trip; air-gap deployments are first-class.
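
Verifying a signed bundle before loading it can be sketched as follows. The page does not say which signature scheme Arca.Vision uses; this example swaps in HMAC-SHA256 from the Python standard library as a stand-in, whereas a production design would more plausibly use an asymmetric signature so hosts hold only a public key. The function names and the sample bytes are hypothetical.

```python
import hashlib
import hmac


def sign_bundle(bundle: bytes, key: bytes) -> str:
    """Produce a hex tag over the bundle bytes (HMAC-SHA256 stand-in)."""
    return hmac.new(key, bundle, hashlib.sha256).hexdigest()


def verify_bundle(bundle: bytes, signature: str, key: bytes) -> bool:
    """Check the tag in constant time before the cartridge is loaded."""
    expected = sign_bundle(bundle, key)
    return hmac.compare_digest(expected, signature)


key = b"demo-key-not-for-production"      # hypothetical key material
bundle = b"GGUF...persona-weights..."     # placeholder bytes, not a real model
signature = sign_bundle(bundle, key)

ok = verify_bundle(bundle, signature, key)
tampered_ok = verify_bundle(bundle + b"\x00", signature, key)
```

Because the check needs only the bundle, its tag, and local key material, it works on an air-gapped host with no network call.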

// SECTION 04 · CARTRIDGE CATALOG

Eight cartridges.
One eBPF kernel.

One cartridge per regulated sector, plus an empty slot you can fill with your own LoRA.

STANDBY · HIPAA
HIPAA Guardian
US Healthcare · PHI
Model: Phi-3 mini · Q4_K_M
p99 latency: 1.2 ms
HIPAA §164.312 | HITECH §13402 | 42 CFR Part 2 | FDA · SaMD

Tuned for PHI patterns, ICD-10 codes, NPI numbers, and EHR exfil signatures. Catches PHI smuggling at the cudaMemcpyDeviceToHost boundary before the bytes leave VRAM.

FORMAT · GGUF

STANDBY · LATAM HEALTH
LATAM Health Sentinel
LATAM Healthcare · es-MX / pt-BR
Model: Phi-3 mini · bilingual
p99 latency: 1.4 ms
NOM-024-SSA3 | ANPD · LGPD | GDPR · Art. 9 | Ley 26.529

Bilingual PHI scoring with Mexican CURP, Brazilian CPF, and LATAM clinical record templates baked into the prompt. Air-gap-deployed across MX, BR, AR, and CO health networks.

FORMAT · GGUF

STANDBY · DoD
DoD Sentinel
Defense · IL5 / IL6
Model: Phi-3 mini · Q4_K_M
p99 latency: 1.6 ms
FedRAMP High | DoD IL5 / IL6 | NIST SP 800-171 | CMMC L3

Air-gapped persona tuned for classified-handling guidance. Recognizes CUI markings, weapons-system telemetry, and cleared-contractor egress patterns. Signed by Arca engineering. A LoRA variant is on the cartridge roadmap.

FORMAT · GGUF

ON THE RACK · 2026 · SOVEREIGN
Sovereign AI Warden
Government · Civilian
Model: Phi-3 mini · Q4_K_M
p99 latency: 1.3 ms
FedRAMP Mod | FISMA | eIDAS | ISO 27001 | +1

Sovereign-cloud aware. Watches citizen PII patterns across SSA, IRS, DMV, and EU eIDAS attestation chains. Designed for federal civilian and StateRAMP deployments.

FORMAT · GGUF

ON THE RACK · 2026 · ROBOTICS
Robotics Safety Officer
Autonomous · Safety-Critical
Model: TinyLlama 1.1B · Q4
p99 latency: 0.9 ms
ISO 26262 · ASIL-D | IEC 61508 | DO-178C | UN R155 | +1

Functional-safety persona. Flags actuator-bound ioctls, watchdog skips, and CAN-bus exfil patterns. Sub-millisecond decision budget: the safety loop cannot wait.

FORMAT · GGUF

ON THE RACK · 2026 · PII
PII Redactor
Finance · General PII
Model: Phi-3 mini · Q4_K_M
p99 latency: 1.1 ms
PCI-DSS | GDPR | CCPA | NYDFS · Part 500 | +1

Card numbers, SSN, IBAN, BIK, account routing, and global financial PII patterns. The default cartridge for trading desks, fintech, and any cluster touching customer data.

FORMAT · GGUF

ON THE RACK · 2026 · CODE
Code Sentinel
IP · Trade Secrets
Model: DeepSeek-Coder · LoRA
p99 latency: 1.5 ms
Trade Secrets Act | DTSA | EU TS Directive | Patent IP

Embedding-aware persona. Detects model-weight dumps, proprietary algorithm leakage, and source-code exfil: the failure mode that loses you the company, not the lawsuit.

FORMAT · LoRA

EMPTY · CUSTOM
Custom LoRA
Bring your own
Model: Open · GGUF today · LoRA roadmap
p99 latency: n/a
YOUR · POLICY

Empty cartridge slot. Co-design with our engineers using your audit logs, your regulator language, your domain. Signed bundle, air-gap delivered.

FORMAT · BYO · GGUF / LoRA

// SECTION 05 · DYNAMIC COMPLIANCE & SHIELDING

Two personas.
One multi-tenant cluster.

A scenario you can run today: a single GPU fleet serving regulated tenants under different rule sets, with a different persona enforced per namespace.

// SCENARIO

Dynamic Compliance & Shielding

A multi-tenant H100 cluster serves Healthcare workloads in Namespace-A and Finance workloads in Namespace-B from the same physical fleet.

The Sentry loads HIPAA Guardian on Namespace-A and PII Redactor on Namespace-B from the same kernel attach. Each tenant gets policy enforcement that understands intent, not just regex. Hallucinated data leaks die at the cudaMemcpyDeviceToHost boundary, before bytes ever leave VRAM.

// REGULATIONS COVERED
HIPAA §164.312 | HITECH §13402 | PCI-DSS | GDPR | CCPA | NYDFS Part 500
STEP 01

Namespace-A · HIPAA Guardian

PHI exfil scoring · HIPAA §164.312 mapping

STEP 02

Namespace-B · PII Redactor

PCI-DSS / GDPR / CCPA pattern scoring

STEP 03

Shared kernel · isolated personas

Two SLMs · one Sentry · zero cross-tenant leakage

STEP 04

Hot-swap on policy change

No agent restart · no Helm rollover · no downtime
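
The four steps above amount to a per-namespace binding table that can be rebound at runtime. The following is a minimal sketch under stated assumptions: `NamespaceRouter` is a hypothetical name, and the two stub personas use toy substring checks purely as placeholders for the SLMs, which the page says score intent rather than patterns.

```python
from typing import Callable, Dict, Tuple

Persona = Callable[[bytes], str]


def hipaa_guardian_stub(payload: bytes) -> str:
    # Toy stand-in: the real persona is an SLM, not a substring match.
    return "block" if b"ICD-10" in payload else "allow"


def pii_redactor_stub(payload: bytes) -> str:
    # Toy stand-in for the finance persona.
    return "block" if b"IBAN" in payload else "allow"


class NamespaceRouter:
    """Maps each tenant namespace to exactly one persona."""

    def __init__(self) -> None:
        self._bindings: Dict[str, Tuple[str, Persona]] = {}

    def bind(self, namespace: str, name: str, persona: Persona) -> None:
        """Bind or rebind a namespace; rebinding models a policy change."""
        self._bindings[namespace] = (name, persona)

    def score(self, namespace: str, payload: bytes) -> Tuple[str, str]:
        name, persona = self._bindings[namespace]
        return name, persona(payload)


router = NamespaceRouter()
router.bind("Namespace-A", "HIPAA Guardian", hipaa_guardian_stub)
router.bind("Namespace-B", "PII Redactor", pii_redactor_stub)

a = router.score("Namespace-A", b"dx ICD-10 E11.9")
b = router.score("Namespace-B", b"dx ICD-10 E11.9")

# Step 04: a policy change rebinds one namespace without touching the other.
router.bind("Namespace-B", "HIPAA Guardian", hipaa_guardian_stub)
b_after = router.score("Namespace-B", b"dx ICD-10 E11.9")
```

The same payload gets a different verdict depending on which tenant produced it, and rebinding Namespace-B leaves Namespace-A's binding untouched.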

// BUILD YOUR PERSONA

Pick a cartridge.
Or co-design a custom LoRA with us.

Every persona is signed and air-gap delivered. Custom cartridges are built with our engineering team using your audit logs and your regulator language, typically 4–6 weeks from kickoff to a signed bundle on the host.