Skip to content

Monitoring

Operators need to know that every layer is alive and within latency budgets. Three probes per layer, all exposed via HTTP /health and Prometheus.

Health probes

Layer	Endpoint	What it checks
L1	`api.useoris.xyz/v1/health/l1`	CCIP-Read gateway reachable, latest block within 5 minutes
L2	`api.useoris.xyz/v1/health/l2`	Policy engine cache hit rate, p95 latency under 10 ms
L3	`api.useoris.xyz/v1/health/l3`	Veris gRPC ping, sanctions feed freshness
L5	`api.useoris.xyz/v1/health/l5`	Tree builder flush cadence, root commit lag
L6	`api.useoris.xyz/v1/verify/health`	Verifier signature throughput, pubkey availability
L7	`api.useoris.xyz/v1/audit/health`	Hash chain head age, anchor lag

SLA targets

Metric	Target
Bundle assembly p95	< 50 ms
Policy evaluation p95	< 10 ms
Veris attestation p50	4.4 ms
Verifier verdict p95	< 18 ms
Audit anchor lag	< 1 hour
Sanctions feed freshness	< 6 hours

Alert routing

Three severity levels:

CRIT — any layer down or SLA breach above threshold. Pagerduty.
WARN — degraded performance, single source outage, or partial cache miss spike. Slack.
INFO — anchor lag, scheduled rotation, planned downtime. Logged.

Where to go next

Incident response for the runbook when probes fail.
Deployment for the layer-by-layer architecture.
Key rotation for scheduled rotation procedures.