AI Guardrails: Gateway Architecture
This architecture defines a single gateway endpoint (“One endpoint. Seven guardrails. Three write paths. Every call.”) that sits between LLM clients and the models to block attacks, enforce permissions, and govern costs.

Core Request Flow
The client sends a request to the main API endpoint:
POST /v1/infer → Orchestrator → 7 Sequential Guardrails → Bedrock Converse (Haiku, Sonnet, Nova)
Outputs are routed to CloudWatch Logs and a unified_log database.
The 7 Guardrails (Sequential Execution)
- Identity: Validates user identity, permissions, and context. Checks who is making the call.
- Prompt: Resolves and fetches the prompt manager’s rules. Prompts are stored in the database.
- Tools: Resolves tool registrations. Implements a security boundary: not all prompts/users can query all tools or MCP servers.
- Skills: Loads post-processing and formatting extensions (e.g., a “Humanize” skill to format model replies to sound natural).
- Log: Unified call logger capturing request metadata.
- Cost: Real-time token counter and cost calculation to prevent abuse or budget overruns.
- Bedrock Guardrails: Final safety layer checking the request/response for content moderation, toxic language, and PII leaks.
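The sequential execution described above can be sketched as a list of functions run in a fixed order, where any guardrail may reject the call before it reaches the model. This is a minimal illustration, not the gateway's actual code: the `CallContext` fields, the `GuardrailRejected` exception, the in-memory lookup tables, and the pass-through stages are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable, List


class GuardrailRejected(Exception):
    """Raised when a guardrail blocks the request."""


@dataclass
class CallContext:
    # Hypothetical request context; field names are illustrative only.
    user_id: str
    prompt_id: str
    requested_tools: List[str]
    estimated_tokens: int
    trace: List[str] = field(default_factory=list)


Guardrail = Callable[[CallContext], None]

# Hypothetical in-memory stand-ins for data the gateway would load from its database.
KNOWN_USERS = {"alice"}
ALLOWED_TOOLS = {"alice": {"search"}}
TOKEN_BUDGET = {"alice": 1000}


def identity(ctx: CallContext) -> None:
    if ctx.user_id not in KNOWN_USERS:
        raise GuardrailRejected("identity: unknown user")


def tools(ctx: CallContext) -> None:
    # Security boundary: a user may only request tools registered for them.
    denied = set(ctx.requested_tools) - ALLOWED_TOOLS.get(ctx.user_id, set())
    if denied:
        raise GuardrailRejected(f"tools: not permitted: {sorted(denied)}")


def cost(ctx: CallContext) -> None:
    if ctx.estimated_tokens > TOKEN_BUDGET.get(ctx.user_id, 0):
        raise GuardrailRejected("cost: token budget exceeded")


def passthrough(name: str) -> Guardrail:
    # Placeholder for stages not modelled in this sketch.
    def guardrail(ctx: CallContext) -> None:
        pass
    guardrail.__name__ = name
    return guardrail


# Same order as the list above.
PIPELINE: List[Guardrail] = [
    identity,
    passthrough("prompt"),
    tools,
    passthrough("skills"),
    passthrough("log"),
    cost,
    passthrough("bedrock_guardrails"),
]


def run_guardrails(ctx: CallContext) -> CallContext:
    # Run each guardrail in order; the first rejection aborts the call.
    for guardrail in PIPELINE:
        guardrail(ctx)
        ctx.trace.append(guardrail.__name__)
    return ctx
```

Calling `run_guardrails(CallContext("alice", "p1", ["search"], 500))` passes all seven stages, while requesting a tool outside the user's allowlist raises `GuardrailRejected` at the Tools stage and skips the later stages.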
Database & Write Paths
The orchestrator reads prompt templates, tool schemas, and routing logic from DynamoDB, which stores Prompts, Tools, Skills, and Routing data. To reduce latency, a cache with a 30-second TTL sits between DynamoDB and the Orchestrator, so a change written to the database can take up to 30 seconds to reach live traffic.
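A read-through cache with a 30-second TTL could look roughly like the sketch below. The `TTLCache` class and its `loader` callback are hypothetical; in the real system the loader would be a DynamoDB read, which is omitted here so the example stays self-contained.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Read-through cache: entries older than ttl_seconds are reloaded."""

    def __init__(
        self,
        loader: Callable[[str], Any],
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader    # hypothetical stand-in for a database read
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]        # still fresh: skip the database
        value = self._loader(key)
        self._entries[key] = (now, value)
        return value
```

With this design, repeated reads of the same key within 30 seconds hit memory only, and the cost is that a direct database write may stay invisible to the orchestrator until the cached entry expires.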
Changes to this database come from three distinct write paths:
- Platform Team: Monthly deployments (stable infrastructure & core schemas).
- Product Team: Weekly Pull Requests (application-specific prompt/tool updates).
- Business/Ops: Daily direct DynamoDB writes (hot-fixes, copy tweaks, operational variables).