Walking-Skeleton MVP

AegisOps

Production SRE / Ops Agent

Runs typed operational runbooks through a semantic policy engine against mock target systems. Every step is recorded in a hash-chained, tamper-evident audit log.

by Nguyen Dinh Quang Khanh

Policy Verdicts: auto/ask/deny (semantic rules)
Audit Chain: SHA-256 (verify_chain())
Default Runbook: 1 (service.crash_loop)
Service Port: 8003 (FastAPI /docs)

🛡️ PolicyEngine: matches rules on operation × target × condition. A hard deny always wins, and a request that matches no rule defaults to ask.
📜 Runbook Engine: runs typed Runbook steps and stops cleanly on a deny verdict or a tool error.
🔒 Hash-chained Audit: entries are linked by SHA-256 hashes, and verify_chain() detects tampering.
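
The PolicyEngine behaviour described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the names `Verdict`, `Rule`, `evaluate`, and `RULES`, the `"*"` wildcard, and the example rules are all hypothetical. The source only states that a hard deny wins and that unmatched requests default to ask; letting ask take precedence over auto when both match is an extra assumption made here for safety.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Iterable, Mapping


class Verdict(str, Enum):
    AUTO = "auto"
    ASK = "ask"
    DENY = "deny"


def always(_ctx: Mapping[str, Any]) -> bool:
    """Default condition: the rule applies regardless of context."""
    return True


@dataclass(frozen=True)
class Rule:
    operation: str  # "*" matches any operation (assumed wildcard syntax)
    target: str     # "*" matches any target
    verdict: Verdict
    condition: Callable[[Mapping[str, Any]], bool] = always


def evaluate(
    rules: Iterable[Rule], operation: str, target: str, ctx: Mapping[str, Any]
) -> Verdict:
    # Collect the verdicts of every rule whose three dimensions all match.
    matched = {
        r.verdict
        for r in rules
        if r.operation in ("*", operation)
        and r.target in ("*", target)
        and r.condition(ctx)
    }
    if Verdict.DENY in matched:
        return Verdict.DENY  # a hard deny wins over everything else
    if matched == {Verdict.AUTO}:
        return Verdict.AUTO  # only auto rules matched
    return Verdict.ASK       # nothing matched, or an ask rule matched


# Hypothetical rule set, for illustration only.
RULES = [
    Rule("restart", "service", Verdict.AUTO, lambda ctx: ctx.get("restarts", 0) < 3),
    Rule("*", "db", Verdict.ASK),
    Rule("drop", "db", Verdict.DENY),
]
```

With these example rules, restarting a service that has restarted fewer than three times is allowed automatically, any database operation needs approval, and dropping a database is always denied even though the broader ask rule also matches.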

Architecture Overview

End-to-end pipeline from request intake through policy gating, tool execution, and audit emission.

flowchart LR
    I[Incident] --> Eng[RunbookEngine]
    Eng --> Pol{PolicyEngine}
    Pol -->|auto| Tgt[TargetSystem]
    Pol -->|ask| H[Human approve]
    Pol -->|deny| Stop[Halt + audit]
    Tgt --> Aud[AuditLog]
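
The control flow in the diagram can be sketched as a small loop. This is an illustrative sketch under stated assumptions, not the real RunbookEngine: `Step`, `run_runbook`, the callback parameters, and the status strings are hypothetical names. The sketch also assumes that a rejected approval halts the run and that an approved step proceeds to the target, neither of which the diagram spells out.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Step:
    operation: str
    target: str


def run_runbook(
    steps: Iterable[Step],
    decide: Callable[[Step], str],     # returns "auto", "ask", or "deny"
    execute: Callable[[Step], None],   # runs the step against the target system
    approve: Callable[[Step], bool],   # human approval hook for "ask"
    audit: List[Tuple[str, str]],      # stand-in for the audit log
) -> str:
    for step in steps:
        verdict = decide(step)
        if verdict == "deny":
            # Halt immediately, but still record why the run stopped.
            audit.append((step.operation, "denied"))
            return "halted"
        if verdict == "ask" and not approve(step):
            audit.append((step.operation, "rejected"))
            return "halted"
        try:
            execute(step)
        except Exception as exc:
            # A tool error stops the runbook; later steps are never attempted.
            audit.append((step.operation, f"error: {exc}"))
            return "failed"
        audit.append((step.operation, "ok"))
    return "completed"
```

Passing the policy decision, the executor, and the approval hook as callables keeps the loop testable without a real target system.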

Components

TargetSystem (env): in-memory mock cluster (services + db)
PolicyEngine (guard): semantic rules (auto / ask / deny)
RunbookEngine (core): executes typed steps through policy
AuditLog (audit): SHA-256 hash-chained entries
IncidentStore (state): in-memory record keyed by incident id
FastAPI (surface): POST /v1/incidents · GET /v1/audit
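
To make the AuditLog component concrete, here is one way a SHA-256 hash chain with a `verify_chain()` check could work. It is a sketch under assumptions: the entry layout (`prev`, `payload`, `hash`), the all-zero genesis value, and the JSON canonicalisation are illustrative choices, not the project's documented format.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def _digest(prev_hash: str, payload: Dict[str, Any]) -> str:
    # Canonical JSON (sorted keys) so the same payload always hashes the same.
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class AuditLog:
    def __init__(self) -> None:
        self.entries: List[Dict[str, Any]] = []

    def append(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        # Each entry commits to the hash of the entry before it.
        prev = self.entries[-1]["hash"] if self.entries else GENESIS
        entry = {"prev": prev, "payload": payload, "hash": _digest(prev, payload)}
        self.entries.append(entry)
        return entry

    def verify_chain(self) -> bool:
        # Recompute every hash and check each link back to the genesis value.
        prev = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev:
                return False
            if entry["hash"] != _digest(entry["prev"], entry["payload"]):
                return False
            prev = entry["hash"]
        return True
```

Because each hash covers the previous hash, editing or removing an entry in the middle breaks the check unless every later entry is rewritten too. Truncating the tail is not detected by this sketch, since nothing anchors the latest hash outside the log.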

Quickstart

Up and running in under two minutes. Requires Python 3.11+ and Make.

terminal
# clone and install
$ git clone https://github.com/ndqkhanh/aegis-ops
$ cd aegis-ops
$ make install

# run the test suite
$ make test

# start the FastAPI service
$ make run

# smoke-test the API
$ curl http://localhost:8003/healthz
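
The same smoke test can be scripted in Python, which helps when the service needs a moment to start after `make run`. This is a sketch using only the standard library. The helper names `http_status` and `wait_until_healthy` are hypothetical, and it assumes `/healthz` answers with HTTP 200 when the service is up, which the curl command above does not confirm.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code for url, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, timeout, DNS failure, ...


def wait_until_healthy(
    url: str,
    attempts: int = 10,
    delay: float = 0.5,
    fetch: Callable[[str], int] = http_status,
) -> bool:
    # Poll until the endpoint returns 200 (assumed healthy status) or we give up.
    for _ in range(attempts):
        if fetch(url) == 200:
            return True
        time.sleep(delay)
    return False


# Example call (requires the service from `make run` to be listening):
# wait_until_healthy("http://localhost:8003/healthz")
```

The `fetch` parameter exists so the polling logic can be exercised without a running server.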

Design Docs

Architecture decisions, trade-offs, system design, and block-level diagrams.