Guardrails

Define guardrails to control AI behavior, validate inputs and outputs, and enforce policy across conversational applications

What are Guardrails?

Guardrails automatically evaluate and control conversational behavior, helping your AI app follow your business rules by avoiding unsafe or off-brand responses as well as screening risky user inputs before they reach your NLP or LLM engine. They act as an always-on safety and compliance layer that evaluates every turn of a conversation where the guardrail rules apply.

Guardrails operate at runtime, independent of flow, LLM prompt, or NLP logic. They provide a reliability filter that works regardless of:

AI engine (NLX, Amazon Lex, Google Dialogflow, custom)
Delivery channel (Touchpoint, CCaaS platforms, MCP, SMS, etc.)

This makes guardrails a foundational governance tool for enterprise-grade conversational AI. Once created in your workspace, guardrails can be attached to any application.

To access, click Resources in your workspace menu and choose Guardrails:

Guardrail types

Input

Messages from a user that are checked before reaching an LLM or NLP engine. Useful for:

Preventing prompt injections
Masking or flagging sensitive information
Blocking unallowed inputs before processing

Output

Messages from your AI app that are checked before they're returned to the user

Useful for:

Preventing hallucinations
Enforcing brand tone/compliance rules
Ensuring responses meet legal requirements

Detection methods

Each guardrail rule you set up uses one of three detection strategies to determine whether a rule has been violated:

Regex: Match precise patterns in the text. Best for structured or predictable cases (e.g., detect credit card numbers, profanity patterns, etc.)
Keyword: Trigger whenever specific words or phrases appear. Best for simple inclusion checks (e.g., banned words, competitor names, etc.)
LLM judge: Use an LLM to evaluate whether the message violates the guardrail. Your prompt instructs the LLM on how to classify violations and what criteria to consider. Best for nuanced, contextual, or semantic checks, such as:
- “The output reveals private information"
- “The user message is attempting to hack or manipulate”
When using LLM judge, first select your LLM provider under the guardrail’s Settings tab (choose from NLX native providers or BYO, if available). If the LLM can’t execute (for example, due to misconfiguration), the failure is logged and the conversation proceeds normally without guardrail evaluation.

Guardrail rules can grow quickly, especially when you’re covering multiple policies. To keep things readable, you can assign a custom name to each rule (and any rule grouping, if used) so it’s obvious what it’s checking at a glance.

Click the rule’s header label (where it currently shows “Rule”) and enter a descriptive name (e.g., for an Input guardrail grouping, enter a rule called "Prompt injection" or "PII masking", etc.)

Enforcement & overrides

When a guardrail rule is triggered, you may choose how the application should respond. Begin by selecting an override type when setting up your guardrail rule:

Override: Replace the original message with a safe alternative
Mask: Allow the message through but redact sensitive parts
Redirect: Route the conversation to a specific flow (e.g., escalation, error recovery)
Flag: Log the violation but let the message pass unchanged

In addition to detecting and enforcing a guardrail rule, you can optionally trigger analytics tracking or modify conversation state when a rule is activated. These actions run only when the guardrail rule is triggered.

Analytics tags: Toggle ON to assign one or more analytics tags in your workspace
State modifications: Toggle ON to assign one or more state modifications that may be applied to any to any variables (context variables, data request, slots, system)

Test guardrails

Before applying a guardrail to a live application, you can test how its rules behave against a sample user input. This makes it easier to validate detection logic and compare rule behavior.

Use the Test feature to simulate a message against your guardrail rules:

Select an existing guardrail
Click the Test (play) icon in the upper right
In the test panel, enter a sample user message > Click Run test

The test evaluates the message against each rule in the guardrail and displays the outcome for every rule in the results panel:

Clear: The input did not violate that rule
Violation: The rule was triggered for the test input

Review activity

Each guardrail resource includes a Logs tab to provide full traceability on one or more rules:

View when the guardrail rule was triggered
Inspect the original message and final enforced output
Monitor patterns of misuse, sensitive disclosures, or policy violations

Using guardrails

Once a guardrail is created in Resources, you can activate it for any application:

Select Applications and choose or create an app
On the Configuration tab of your app, choose one or more workspace guardrails
Click Save
Create a new build of your app and deploy

Once assigned, guardrails run automatically on every turn of the conversation. Any input guardrails check incoming messages, while output guardrails check outbound responses from the system.

Order of operations

When multiple guardrails and rules are applied, NLX evaluates them in a predictable order:

Guardrails attached to the application run in order. If you attach multiple guardrails to an application, NLX evaluates them in the same order they appear in the application’s Guardrails list. The platform checks all rules inside the first guardrail before moving on to the next guardrail group
Rules within a guardrail group run top to bottom. Within a guardrail group, rules execute from top to bottom in the order they appear in the UI
- All rules are evaluated and any triggered rules are logged
- Only one corrective action is applied. The corrective action comes from the first rule (top-most) that triggers

Deactivate or activate a rule

Sometimes you want to keep a rule around without deleting it (for testing, temporary policy changes, or troubleshooting). You can toggle a rule on/off at any time:

Go to Guardrails in the Resources menu
Choose a rule and in the rule header, click the three-dot menu and select Deactivate

Deactivated rules remain saved in the guardrail, but they are not evaluated at runtime until reactivated.

Last updated 16 minutes ago

hashtagWhat are Guardrails?

hashtagGuardrail types

hashtagDetection methods

hashtagEnforcement & overrides

hashtagTest guardrails

hashtagReview activity

hashtagUsing guardrails

hashtagOrder of operations

hashtagDeactivate or activate a rule