Guardrails
Anti-hallucination, approval, and review controls
The system blocks unsafe outgoing replies, keeps high-risk actions behind approval, and records a reviewable trace for every low-confidence path.
Pre-send Validator
- Forbidden persona phrases
- Unapproved prices
- Overlong replies
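
The three checks above could be combined into a single pre-send function. The following is only a minimal sketch: the phrase list, the approved price set, the length limit, and the names validate_reply and ValidationResult are all hypothetical placeholders, not values or identifiers taken from the actual system.

```python
import re
from dataclasses import dataclass, field

# Hypothetical settings: the real phrases, prices, and limit are not given in the source.
FORBIDDEN_PHRASES = ("as an ai language model",)
APPROVED_PRICES = {"499", "1299"}
MAX_REPLY_CHARS = 600

# Matches dollar amounts such as "$499" or "$12.50" and captures the number.
PRICE_PATTERN = re.compile(r"\$\s?(\d+(?:\.\d{2})?)")


@dataclass
class ValidationResult:
    allowed: bool
    reasons: list = field(default_factory=list)


def validate_reply(text: str) -> ValidationResult:
    """Run every check and collect all reasons a reply should be blocked."""
    reasons = []
    lowered = text.lower()

    # 1. Forbidden persona phrases (case-insensitive substring match).
    for phrase in FORBIDDEN_PHRASES:
        if phrase in lowered:
            reasons.append(f"forbidden_phrase:{phrase}")

    # 2. Any quoted price must be in the approved set.
    for amount in PRICE_PATTERN.findall(text):
        if amount not in APPROVED_PRICES:
            reasons.append(f"unapproved_price:{amount}")

    # 3. Overlong replies.
    if len(text) > MAX_REPLY_CHARS:
        reasons.append("overlong_reply")

    return ValidationResult(allowed=not reasons, reasons=reasons)
```

Collecting every reason, rather than stopping at the first failure, gives a reviewer the full picture when a blocked reply is inspected later.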
Action Approval
- Payment confirmation
- Lifecycle updates
- High-value enrollments
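
One way to express the approval rule is a small gate that every action passes through before it executes. This sketch assumes a hypothetical Action enum and a hypothetical HIGH_VALUE_THRESHOLD; the source does not say what amount counts as high-value.

```python
from enum import Enum


class Action(Enum):
    SEND_REPLY = "send_reply"
    PAYMENT_CONFIRMATION = "payment_confirmation"
    LIFECYCLE_UPDATE = "lifecycle_update"
    ENROLLMENT = "enrollment"


# Hypothetical cutoff for a "high-value" enrollment; not specified in the source.
HIGH_VALUE_THRESHOLD = 1000.0


def requires_approval(action: Action, amount: float = 0.0) -> bool:
    """Return True when an admin must approve the action before it runs."""
    # Payment confirmations and lifecycle updates are always gated.
    if action in (Action.PAYMENT_CONFIRMATION, Action.LIFECYCLE_UPDATE):
        return True
    # Enrollments are gated only at or above the high-value threshold.
    if action is Action.ENROLLMENT:
        return amount >= HIGH_VALUE_THRESHOLD
    return False
```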
Review Queue
- Low confidence
- Blocked reply
- Admin audit
- KB correction
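
The queue entry reasons listed above can be modeled as an enum, with automated routing for the two cases the system can detect on its own. This is a sketch under assumptions: CONFIDENCE_FLOOR, ReviewItem, and route_to_review are hypothetical names, and admin audit and KB correction items are assumed to be filed manually rather than by this function.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ReviewReason(Enum):
    LOW_CONFIDENCE = "low_confidence"
    BLOCKED_REPLY = "blocked_reply"
    ADMIN_AUDIT = "admin_audit"      # assumed to be filed by an admin
    KB_CORRECTION = "kb_correction"  # assumed to be filed by an admin


# Hypothetical confidence cutoff; the source does not state one.
CONFIDENCE_FLOOR = 0.7


@dataclass
class ReviewItem:
    conversation_id: str
    reason: ReviewReason
    detail: str


def route_to_review(
    conversation_id: str, confidence: float, blocked_reasons: List[str]
) -> Optional[ReviewItem]:
    """Return a review item for blocked or low-confidence replies, else None."""
    # A blocked reply takes priority over a low-confidence score.
    if blocked_reasons:
        detail = ", ".join(blocked_reasons)
        return ReviewItem(conversation_id, ReviewReason.BLOCKED_REPLY, detail)
    if confidence < CONFIDENCE_FLOOR:
        detail = f"confidence={confidence:.2f}"
        return ReviewItem(conversation_id, ReviewReason.LOW_CONFIDENCE, detail)
    return None
```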
Usage Metering
- Messages
- AI tokens
- Media classification
- Storage
- Handoff count
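
The five metered quantities above could be tracked with a simple in-memory ledger keyed by account and meter. The sketch below is illustrative only: the Meter names, the UsageLedger class, the account_id key, and the choice of bytes as the storage unit are assumptions, and a real deployment would presumably persist these counts.

```python
from collections import Counter
from enum import Enum


class Meter(Enum):
    MESSAGES = "messages"
    AI_TOKENS = "ai_tokens"
    MEDIA_CLASSIFICATIONS = "media_classifications"
    STORAGE_BYTES = "storage_bytes"  # unit is an assumption
    HANDOFFS = "handoffs"


class UsageLedger:
    """Accumulates usage totals per (account_id, meter) pair."""

    def __init__(self) -> None:
        self._totals = Counter()

    def record(self, account_id: str, meter: Meter, quantity: int = 1) -> None:
        # Usage only ever increases; reject negative quantities.
        if quantity < 0:
            raise ValueError("quantity must be non-negative")
        self._totals[(account_id, meter)] += quantity

    def total(self, account_id: str, meter: Meter) -> int:
        # Counter returns 0 for keys that were never recorded.
        return self._totals[(account_id, meter)]
```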
Current Safe Default
Payment confirmation, lifecycle writes, and high-value enrollment changes stay admin-approved until live data shows they can safely run without approval.