Agentic Voice+

Create a bidirectional agent for multimodal voice-driven experiences using the NLX platform

Bidirectional Voice+ agents create hands-free, multimodal experiences across web and mobile apps. Users can speak naturally while the AI assistant answers questions, navigates pages, fills forms, and triggers custom actions on the frontend in real time.

The agent uses tools, frontend context, and configured actions to decide how to respond and what to do next.

For example, a hotel guest might say:

"I want to switch to the garden suite."

The Voice+ agent can understand the request, compare the guest's message to available room options, select the correct room ID, and send a structured command to the frontend so the room selection updates automatically.

What you'll build

In this guide, you'll create a bidirectional Voice+ agent that can:

  • Answer user questions using a Knowledge base

  • Use agentic tools

  • Navigate pages, fill forms, and trigger custom frontend actions

You'll also connect the agent to your frontend:

  • Install NLX Touchpoint with bidirectional command handlers

  • Use the NLX Context API to tell the agent what the user is currently viewing

  • Handle structured action payloads received via NLX Touchpoint

Step 1: Create the Voice+ agent flow

  1. Select Resources > Flows in your workspace menu

  2. Click New flow > Enter a descriptive flow name with no spaces or special characters

  3. Click Create flow

  4. On the Canvas, add a Voice+ node and link it to the Start node

  5. Select the Voice+ node and choose Agentic mode

Agentic mode enables bidirectional Voice+ experiences in web or mobile apps through the NLX Touchpoint SDK. In this mode, users can speak naturally and the AI can take on-screen actions through configured tools and frontend commands.

Step 2: Write the agent instructions

In the Voice+ node's configuration panel, provide a clear prompt that explains the agent's role, sole task(s), tone, and boundaries.

Good Voice+ agent prompts should include:

  • The assistant's role and persona

  • The only task(s) the agent is responsible for

  • What the agent should avoid or not permit

  • Tone and brevity expectations

  • Any special instructions for multimodal or voice-based interactions

  • Context variables to pass to the agent, such as Customer profile: {customerProfile}

After writing the prompt, select the LLM that powers the agent's reasoning and messaging output.

Step 3: Add tools to the Voice+ agent

Tools give the agent the ability to retrieve information, answer questions, execute workflows, or collect structured details.

Tool scenarios

Use the following examples to decide which tools your agent needs:

  • If the frontend already loads available rooms, use a custom action like select_room to receive the list of available rooms as context and pass the selected room ID

  • If the agent needs to retrieve a profile from a backend service, attach a Data request or Flow as a tool

  • If the agent needs to answer approved questions, attach a Knowledge base

  • If the agent needs to collect structured user details for Data requests, attach slots to the flow's settings and assign them under Data capture

In the Voice+ node configuration panel, expand Tools and assign one or more resources from your workspace.

Applicable tool types for bidirectional Voice+ include:

  • Custom Data requests

  • Managed Data requests

  • Knowledge bases

  • Flows

  • Data capture

Optional tool configurations

Interim message

An interim message is a short message the agent delivers while an applicable tool is running, such as:

One moment while I look that up.

This is useful when a tool call may take a few seconds to complete, so users know the agent is working.

To add one, assign a tool, expand its options, and enter one or more interim messages.

Scope tags

Scope tags restrict a tool so the agent can only use it on specific pages or UI states in your frontend.

This is helpful when a tool is only relevant in certain areas or could conflict with other page-specific actions. For example, a close_modal tool may listen for phrases like "cancel" or "exit," which could conflict with different actions on a checkout page.

Custom or managed Data requests

Use Data requests when the agent needs to call an API or service to retrieve or send information that is not already handled by your frontend.

If your frontend already has access to the information or already performs the action, you do not need to set up and attach a Data request tool.

In that case, configure a custom action on the Voice+ node that allows the agent to pass the selected value or command to your app.

Examples

| Scenario | Recommended setup |
| --- | --- |
| The frontend already has a list of available inventory and only needs the agent to pass the user's selection | Use a custom action instead of a Data request (see Step 4) |
| The agent should look up weather data before making a recommendation | Use a Data request |
| The agent needs to retrieve a customer profile from a backend service | Use a Data request or MCP-enabled flow |
| The agent needs to check availability before showing options | Use a Data request or MCP-enabled flow |

If attaching a custom Data request previously set up in the workspace, provide:

  • A brief tool prompt explaining what it is, its purpose, and when to call it

  • Descriptions for any Request model schema properties. Alternatively, add variables to pass during the call directly from the placeholder menu: type an open curly bracket ({) to select a variable

Knowledge bases

Attach a Knowledge base when the agent should answer questions from approved content.

Examples:

  • Product or service details

  • Policies and procedures

  • Pricing or plan information

  • FAQs and troubleshooting guidance

  • Location, hours, or contact information

When attaching a Knowledge base, provide a brief tool prompt explaining that it's a knowledge base and when to call it.

Flows as tools

Agentic Voice+ nodes support two ways to use flows as tools:

  • MCP-enabled flows

  • Handover flows

MCP-enabled flow

Use an MCP-enabled flow when the agent should remain in control while calling a structured workflow.

This is usually the preferred flow-tool pattern for bidirectional Voice+, because the agent can call the flow, receive the result, and continue supporting hands-free navigation, form fill, and custom actions.

Examples:

  • Send a confirmation message

  • Read legal statements and collect consent

  • Retrieve weather disruption details

Handover flow

Use handover flows sparingly in bidirectional Voice+ experiences.

A handover flow temporarily transfers control away from the Voice+ agent. While the handover flow is running, bidirectional Voice+ agent behavior is paused until the conversation returns to the agent node. This means the agent will not orchestrate hands-free actions, tools, or page-aware behavior during that time.

Use a handover flow only when you intentionally want the user to enter a deterministic flow that takes over the conversation.

At the end of the handover flow, add a Redirect node back to the flow containing the Voice+ node.

On the Redirect node:

  1. Select Flow > Choose the destination flow where the Voice+ agent node lives

  2. Choose Select target under the destination flow > Select Custom node

  3. Enter the Voice+ node ID from the original flow

You can copy the Voice+ node ID from the node's three-dot menu.

Data capture

Use Data capture when the agent needs to extract slot values from the user and send them in the payload of a Data request.

Use this when the agent needs to collect structured values like:

  • Guest name

  • Confirmation number

  • Room preference

  • Checkout date

  • Email address

First attach the slots in the flow's Settings, then select them in the Data capture section of the agentic Voice+ node.

Within the node's Instructions section, provide guidance on how the agent should extract slot information from the user.

Example:

Step 4: Add agent actions

Actions tell the agent what it can do on your frontend, such as navigating to a page, filling form fields, or triggering custom UI behavior.

Custom actions are useful for frontend behaviors that are specific to your application, such as selecting a room, opening a modal, filtering results, checking a box, or highlighting an option.

When configuring a custom action, provide:

  • Action name: A stable identifier your frontend can listen for, such as select_room

  • Prompt: A clear description of when the agent should use the action

Optional configurations

  • Input schema: The context or available options the agent should consider, passed from the frontend

  • Output schema: The structured value the agent should return to the frontend

  • Scope tags: Tags that help determine which actions and tools apply on the current page or UI state

The custom action names and schemas configured on the Voice+ node should match what your frontend expects. For example, if your Voice+ node defines a custom action called select_room, your frontend should register and handle a select_room command. Likewise, the input and output schemas should align with the data structures used in your frontend code.

Scope tags are especially important for bidirectional Voice+ because they prevent actions from being used in the wrong place. For example, a select_room action should only be available when the user is on the room selection screen, while a modify_checkout_date action may only apply on the checkout or stay modification screen.
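The scoping rule above can be illustrated with a minimal sketch. It is not how the platform implements scope matching; it only shows the idea that an action applies when one of its tags is active on the current screen. The `ScopedAction` shape, the `modify_stay` tag, and the choice to treat untagged actions as always available are assumptions made for this example.

```typescript
// Hypothetical shape for an action or tool configured on the Voice+ node.
interface ScopedAction {
  name: string;
  // Scope tags under which the action applies.
  // Assumption: an empty list means the action is always available.
  scopes: string[];
}

// Return the actions that apply given the scope tags active on the current screen.
function actionsInScope(actions: ScopedAction[], activeScopes: string[]): ScopedAction[] {
  const active = new Set(activeScopes);
  return actions.filter(
    (action) => action.scopes.length === 0 || action.scopes.some((tag) => active.has(tag)),
  );
}

const configured: ScopedAction[] = [
  { name: "select_room", scopes: ["room_selection"] },
  { name: "modify_checkout_date", scopes: ["modify_stay"] }, // hypothetical tag
];

// On the room selection screen, only select_room is offered to the agent.
const onRoomSelection = actionsInScope(configured, ["room_selection"]).map((a) => a.name);
console.log(onRoomSelection); // [ 'select_room' ]
```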

Example custom action: select_room

Use this action when the guest asks to select a room from available options.

| Field | Value |
| --- | --- |
| Action name | select_room |
| Action prompt | Use this action to select the ID of the room from the available options based on what the user said. For example, "Garden Suite" should map to the matching room ID. |
| Scope tag | room_selection |

Input schema

The input schema tells the agent what room options are available.

Input schema description: The list of available rooms. Each room includes an ID and room name.

Output schema

The output schema tells the frontend what value the agent will return.

Output schema description: Room ID of the selected room.
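As a rough illustration, the two schemas might be written as JSON Schema objects like the ones below. This is a sketch under assumptions: the property names `id`, `name`, and `roomId` are hypothetical, and the output is modeled as an object with a numeric room ID. Use whatever shapes your frontend actually works with.

```typescript
// Hypothetical JSON Schema for the select_room input: the rooms the frontend offers.
const selectRoomInputSchema = {
  type: "array",
  description: "The list of available rooms. Each room includes an ID and room name.",
  items: {
    type: "object",
    properties: {
      id: { type: "number", description: "Room ID" },
      name: { type: "string", description: "Room name" },
    },
    required: ["id", "name"],
  },
};

// Hypothetical JSON Schema for the select_room output: the value returned to the frontend.
const selectRoomOutputSchema = {
  type: "object",
  description: "Room ID of the selected room.",
  properties: {
    roomId: { type: "number" },
  },
  required: ["roomId"],
};

// Matching TypeScript types that frontend code could share with its handler.
type Room = { id: number; name: string };
type SelectRoomPayload = { roomId: number };

const exampleInput: Room[] = [{ id: 2, name: "Garden Suite" }];
const exampleOutput: SelectRoomPayload = { roomId: 2 };
```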

How it works

When the user says:

"I'll take the Garden Suite."

The agent can return:

Your frontend can then use that value to highlight and select the correct room on the page.
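A minimal sketch of that last step, assuming the action result arrives as an object with a hypothetical `roomId` property. The payload is validated before it is trusted, then matched against the rooms the page already knows about. The `Standard King` room is made up for the example.

```typescript
type Room = { id: number; name: string };

// Hypothetical payload shape for select_room; the real shape follows your output schema.
type SelectRoomPayload = { roomId: number };

// Narrow an unknown payload before trusting it.
function isSelectRoomPayload(payload: unknown): payload is SelectRoomPayload {
  return (
    typeof payload === "object" &&
    payload !== null &&
    typeof (payload as { roomId?: unknown }).roomId === "number"
  );
}

// Return the room to mark as selected, or undefined if the payload is invalid
// or does not match any room on the page.
function resolveSelectedRoom(rooms: Room[], payload: unknown): Room | undefined {
  if (!isSelectRoomPayload(payload)) return undefined;
  return rooms.find((room) => room.id === payload.roomId);
}

const rooms: Room[] = [
  { id: 1, name: "Standard King" }, // hypothetical room
  { id: 2, name: "Garden Suite" },
];

const selected = resolveSelectedRoom(rooms, { roomId: 2 });
console.log(selected?.name); // Garden Suite
```

Returning undefined for an invalid or unknown ID lets the page leave the current selection untouched instead of guessing.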

Step 5: Configure Voice+ node paths

The Voice+ node can route to different paths depending on how the agent session ends: the agent succeeds, times out, fails, or receives a specific user request.

Exit conditions

Exit conditions define the outcomes that allow the agent to leave the Voice+ node and proceed through the rest of the flow.

Use exit conditions for meaningful conversation outcomes that should route differently after the Voice+ experience.

Example conditions:

  • Goodbye: The user clearly indicates they are done, no longer need help, or are ending the conversation

  • Escalation requested: The user specifically asks to speak to a human agent

For each exit condition:

  1. Add the exit condition in the Voice+ node.

  2. Give it a short, clear name.

  3. Add a description that tells the agent when to use it.

  4. Link the exit path to the appropriate next node in the flow.

Example

| Field | Value |
| --- | --- |
| Exit condition name | Goodbye |
| Description | Call this path when the user clearly indicates they are done, no longer need help, or are ending the conversation, such as saying goodbye or expressing completion. |

You might route this path to a final Basic node that says:

Thanks for contacting us. Have a great day.

Timeout

Use the Timeout path when the agent does not respond within the configured timeout period.

Example:

I'm sorry, something took too long. Let me try again.

Failure

Use the Failure path when the agent or its tools cannot connect or complete as configured.

Example:

I'm having trouble accessing that information right now.

Voice+ Script paths

The following paths relate to Voice+ script steps: predefined, controlled voice lines that trigger when certain UI elements load on your frontend. Voice+ scripts are configured under Resources in your workspace. See the guide on Voice+ scripts in the course for more information.

Continuation

Use the Continuation path only when your Voice+ experience also uses Voice+ script steps, which are set up as Voice+ Scripts in the Resources menu.

This path corresponds to the Continue action configured on a Voice+ script step. When that scripted step is reached, the experience proceeds from the Continuation edge of the Voice+ node.

Escalation

Use the Escalation path only when your Voice+ experience also uses Voice+ script steps.

This path corresponds to the Escalate action configured on a Voice+ script step. When that scripted step is reached, the experience proceeds from the Escalation edge of the Voice+ node, typically to begin an escalation process or route to an Escalation node.

Step 6: Set up the app

  1. Select Applications in your workspace menu

  2. Click New application

  3. Enter a name for your application and click Create

Your app opens on the Configuration tab. This tab defines your AI engine, delivery channel, guardrails, and connected workflows.

AI engine

An AI engine helps interpret user speech and package the application whenever a new build is created.

Keep the built-in NLX model for a seamless setup.

Delivery

Every custom core application includes an API channel with native Voice+ support for web and mobile implementation.

Open the API channel settings:

  1. Under the General tab, whitelist your frontend domain (including [IP_ADDRESS]) to prevent CORS errors

  2. Choose the Voice tab

  3. Select Amazon Polly or ElevenLabs as the TTS provider and choose a voice persona

  4. Click Update channel

Functionality and default behavior

Attach the flow that will launch your Voice+ agent:

  1. Select Default behavior and assign the flow to the Welcome default

  2. Click Save

Step 7: Deploy the app

Creating a build of the application creates a package that captures the current state of all resources comprising your application. Then, a successful deployment pushes that build to all delivery channels defined on your application.

To build and deploy:

  1. Click the deployment status in the upper-right corner of your app

  2. Choose Build and deploy

  3. Review the validation results > Resolve any critical issues

  4. Click Build (enter a description of modifications as a changelog, if desired)

  5. Choose Deploy on the successful build

  6. Confirm again by clicking Deploy

After deployment, select your app's API channel under Delivery.

From the API channel's Setup instructions tab, make a note of:

  • Application URL

  • API key

Step 8: Install Touchpoint

Use the Application URL and API key from your deployed app's API channel setup details.

Touchpoint is the frontend layer that enables voice input, bidirectional Voice+ behavior, and command handling in your web or mobile app.

In your frontend setup:

  1. Install the Touchpoint SDK

  2. Configure the SDK with your application URL and API key

  3. Set input to voiceMini

  4. Enable bidirectional command handling

  5. Register handlers for custom actions, navigation, and form fill

Example Touchpoint setup
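The sketch below assembles the options described in the steps above without importing the SDK. The option names (`config`, `headers`, `languageCode`, `bidirectional.custom`) and the `nlx-api-key` header are assumptions that should be checked against the Touchpoint SDK reference. Only `voiceMini`, the application URL, and the API key come from this guide.

```typescript
// Hypothetical option shape; verify every field name against the Touchpoint SDK reference.
interface TouchpointOptions {
  config: {
    applicationUrl: string;
    headers: Record<string, string>;
    languageCode: string;
  };
  input: "voiceMini";
  bidirectional: {
    // Called when the agent sends a custom command to the frontend.
    custom: (action: string, payload: unknown) => void;
  };
}

function buildTouchpointOptions(
  applicationUrl: string,
  apiKey: string,
  onCustomCommand: (action: string, payload: unknown) => void,
): TouchpointOptions {
  return {
    config: {
      applicationUrl,
      headers: { "nlx-api-key": apiKey }, // assumed header name
      languageCode: "en-US",
    },
    input: "voiceMini", // from step 3 above
    bidirectional: { custom: onCustomCommand },
  };
}

const received: string[] = [];
const options = buildTouchpointOptions(
  "https://example.invalid/your-application-url", // placeholder, use your Application URL
  "YOUR_API_KEY", // placeholder, use your API key
  (action) => received.push(action),
);
// In a real app, these options would be passed to the SDK's initializer.
console.log(options.input); // voiceMini
```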

Your frontend team should map each action returned by the Voice+ agent to the correct UI behavior.

Step 9: Register bidirectional commands

For bidirectional Voice+ actions to work, your frontend needs to listen for the commands the agent may send through Touchpoint.

  • A component such as ChatWidgetBootstrap.tsx can initialize Touchpoint and configure bidirectional command handling.

  • A file such as bidiCommands.ts can define the functions that execute when a bidirectional command is received.

Example custom command handler

The SDK registers the event and passes the command to your frontend. What happens next (such as updating state, opening a modal, selecting an option, or changing the page) is handled by your frontend code.

For example, a page component like ModifyStayPage.tsx could listen for the relevant event and run the UI logic needed to update the screen.
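One way to organize a file like bidiCommands.ts is a small registry that maps command names to handler functions. The sketch below is independent of the SDK: `registerCommand`, `dispatchCommand`, and `pageState` are hypothetical names, and the `roomId` payload property is an assumption. Your app would call `dispatchCommand` from wherever Touchpoint hands it a command.

```typescript
type CommandHandler = (payload: unknown) => void;

// Registry mapping command names (as configured on the Voice+ node) to frontend handlers.
const commandHandlers = new Map<string, CommandHandler>();

function registerCommand(name: string, handler: CommandHandler): void {
  commandHandlers.set(name, handler);
}

// Run the handler for a received command; returns false when no handler is registered.
function dispatchCommand(name: string, payload: unknown): boolean {
  const handler = commandHandlers.get(name);
  if (!handler) {
    console.warn(`No handler registered for command "${name}"`);
    return false;
  }
  handler(payload);
  return true;
}

// Stand-in for page state that a component such as ModifyStayPage.tsx might own.
const pageState: { selectedRoomId: number | null } = { selectedRoomId: null };

registerCommand("select_room", (payload) => {
  if (typeof payload !== "object" || payload === null) return;
  const roomId = (payload as { roomId?: unknown }).roomId; // hypothetical payload shape
  if (typeof roomId === "number") pageState.selectedRoomId = roomId;
});

dispatchCommand("select_room", { roomId: 2 });
console.log(pageState.selectedRoomId); // 2
```

Keeping the registry separate from the UI code means each page can register only the commands that make sense for it.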

Step 10: Send frontend context with the NLX Context API

The NLX Context API lets your frontend asynchronously send page context to an active Voice+ conversation.

This helps the Voice+ agent understand:

  • The current page or route

  • Available form fields

  • Current field values

  • Valid navigation destinations

  • Available frontend actions

  • Scope tags for the current screen

Unlike journey tracking, which tracks scripted steps, the Context API sends real-time page metadata so the Voice+ agent knows what the user is looking at.

Your frontend should send context whenever the page or UI state changes, including when:

  • The user navigates to a new page

  • A form appears

  • Field values change

  • New actions become available

  • Scope tags change
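Because these triggers can fire often, it may help to send context only when it has actually changed. The sketch below wraps any send function so that identical consecutive contexts are skipped. `PageContext` and `createContextSender` are hypothetical names, and the comparison relies on JSON serialization, so it assumes keys are always built in the same order.

```typescript
// Hypothetical subset of the page context used for this example.
interface PageContext {
  uri: string;
  scopes: string[];
}

// Wrap a send function so identical consecutive contexts are not re-sent.
function createContextSender(send: (context: PageContext) => void) {
  let lastSent = "";
  return (context: PageContext): boolean => {
    const serialized = JSON.stringify(context);
    if (serialized === lastSent) return false; // nothing changed, skip
    lastSent = serialized;
    send(context);
    return true;
  };
}

const sentUris: string[] = [];
const sendIfChanged = createContextSender((context) => sentUris.push(context.uri));

sendIfChanged({ uri: "/rooms", scopes: ["room_selection"] }); // sent
sendIfChanged({ uri: "/rooms", scopes: ["room_selection"] }); // skipped, unchanged
sendIfChanged({ uri: "/checkout", scopes: ["checkout"] }); // sent
console.log(sentUris); // [ '/rooms', '/checkout' ]
```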

Context API endpoint

Use the Conversation API context endpoint:

Header:

Context API request body

The request must include:

The nlx:vpContext object can include:

  • uri

  • fields

  • actions

  • destinations

  • scopes
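A small helper for assembling the body might look like the sketch below. Only the nlx:vpContext key and its five field names come from this guide. The outer `context` wrapper and the value types are assumptions, so compare the result with the Conversation API reference before relying on it.

```typescript
// Fields listed above; the value types are assumptions for illustration.
interface VoicePlusContext {
  uri?: string;
  fields?: unknown[];
  actions?: unknown[];
  destinations?: string[];
  scopes?: string[];
}

// Serialize a request body.
// Assumption: the API expects the Voice+ context nested under a "context" key.
function buildContextBody(vpContext: VoicePlusContext): string {
  return JSON.stringify({ context: { "nlx:vpContext": vpContext } });
}

const body = buildContextBody({ uri: "/rooms", scopes: ["room_selection"] });
console.log(body);
// {"context":{"nlx:vpContext":{"uri":"/rooms","scopes":["room_selection"]}}}
```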

Example: Send room selection context

In this example, the frontend tells the Voice+ agent that the user is on the room selection page and that three rooms are available.
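A context value for this scenario could be built as sketched below. The route, the `Standard King` room, and the shape of each `actions` entry (an `action` name plus its `input`) are assumptions for illustration. The `room_selection` scope and `select_room` action match the configuration from Step 4.

```typescript
type Room = { id: number; name: string };

// Build a hypothetical nlx:vpContext value for the room selection page.
function roomSelectionContext(rooms: Room[]) {
  return {
    uri: "/rooms", // hypothetical route
    scopes: ["room_selection"],
    actions: [
      {
        action: "select_room",
        // Input for the action: the options the agent may choose from.
        input: rooms,
      },
    ],
  };
}

const context = roomSelectionContext([
  { id: 1, name: "Standard King" }, // hypothetical room
  { id: 2, name: "Garden Suite" },
  { id: 3, name: "Ocean View" },
]);
console.log(JSON.stringify({ "nlx:vpContext": context }, null, 2));
```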

When this context is active, the user can say:

"Switch me to the Ocean View room."

The agent can identify the matching room and send the correct action result to the frontend.

Example: Send form field context

In this example, the frontend tells the Voice+ agent which checkout form fields are available.

When the user says:

"Change my checkout date to Friday and pick the Garden Suite."

The Voice+ agent can use the field context to understand which values should be filled or selected.
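On the frontend side, applying the values the agent returns might look like the following sketch. The `FieldDescriptor` shape, the field IDs, and the date strings are hypothetical. The function copies the field list, sets values for known IDs, and reports any IDs the form does not contain.

```typescript
// Hypothetical descriptor for one form field; property names are assumptions.
interface FieldDescriptor {
  id: string;
  label: string;
  type: "text" | "date" | "select";
  value?: string;
  options?: string[];
}

const checkoutFields: FieldDescriptor[] = [
  { id: "checkout_date", label: "Checkout date", type: "date", value: "2025-06-12" },
  { id: "room", label: "Room", type: "select", options: ["Garden Suite", "Ocean View"] },
];

// Apply values the agent asks to fill; unknown field IDs are reported, not applied.
function applyFieldUpdates(
  fields: FieldDescriptor[],
  updates: Record<string, string>,
): { fields: FieldDescriptor[]; unknownIds: string[] } {
  const knownIds = new Set(fields.map((field) => field.id));
  const unknownIds = Object.keys(updates).filter((id) => !knownIds.has(id));
  const updated = fields.map((field) => {
    const next = updates[field.id];
    return next === undefined ? field : { ...field, value: next };
  });
  return { fields: updated, unknownIds };
}

const result = applyFieldUpdates(checkoutFields, {
  checkout_date: "2025-06-13",
  room: "Garden Suite",
  loyalty_number: "12345", // not on this form, so it is reported as unknown
});
console.log(result.fields.map((field) => field.value)); // [ '2025-06-13', 'Garden Suite' ]
console.log(result.unknownIds); // [ 'loyalty_number' ]
```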

Best practices

Describe every tool clearly

The agent relies on tool descriptions and schema property descriptions to decide:

  • Whether a tool is relevant

  • What information to collect

  • What values to pass

  • What order to execute tools in

Use scope tags to prevent actions in the wrong place

Scope tags help prevent the agent from trying to select a room while the user is on an unrelated page.

Example:

Then configure the related action to only apply when the room_selection scope is active.

Keep Voice+ action schemas aligned with frontend code

The Voice+ node and frontend should use the same command names and data structures.

| Voice+ node | Frontend |
| --- | --- |
| Action name: select_room | Handler listens for: select_room |
| Output schema: room ID as a number | Handler expects: room ID as a number |

Send context whenever the page changes

Your frontend should send updated Context API payloads when:

  • The user navigates to a new page

  • A form appears

  • Field values change

  • New actions become available

  • Scope tags change

This keeps the Voice+ agent aligned with the user's current screen.
