Streamable HTTP

The Streamable HTTP protocol allows you to consume AI app responses in real-time. Instead of waiting for the entire response to generate (which can take seconds for complex Generative AI tasks), this API streams the response chunks as they become available.

When to use this API


Generative AI experiences

When your AI app generates long-form content, streaming reduces the perceived "Time to First Byte," making the app feel faster


Real-time UIs

For typing indicators or "ghost text" effects, similar to ChatGPT

How it works

This protocol uses Server-Sent Events (SSE). You use the same endpoint and authentication as the standard REST API, but enable the stream flag in your request body.

Endpoint configuration

  • URL: https://apps.nlx.ai/c/{deploymentKey}/{channelKey}-{languageCode}

  • Header: nlx-api-key: YOUR_KEY

  • Content-Type: application/json

Note: Ensure your URL ends with the language code (e.g., -en-US).

Enabling the Stream

Add "stream": true to the root of your JSON body.

curl -N -X POST "https://apps.nlx.ai/c/xxxx/xxxx-en-US" \
  -H "nlx-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "stream": true,
    "request": {
      "unstructured": {
        "text": "Write a short poem about coding."
      }
    }
  }'

Consuming the stream

The server will respond with Content-Type: text/event-stream. Data is sent in chunks prefixed with data:.

JavaScript client example

In a browser or Node.js environment, you handle the stream by reading chunks from the response body's reader.
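A minimal sketch of such a client is shown below. It uses the standard fetch API and the response body's ReadableStream reader; the deployment URL and API key are placeholders, and the content of each data: payload is treated as an opaque string here, since the exact event schema is not specified in this section (consult the full API reference for it).

```javascript
// Parse one SSE chunk into the payloads of its "data:" lines.
// Note: this simple sketch assumes each read delivers whole lines;
// production code should buffer partial lines across reads.
function extractDataLines(chunk) {
  return chunk
    .split("\n")
    .filter((line) => line.startsWith("data:"))
    .map((line) => line.slice("data:".length).trim());
}

// Send a message with "stream": true and invoke onChunk for each
// data payload as it arrives.
async function streamResponse(text, onChunk) {
  const res = await fetch("https://apps.nlx.ai/c/xxxx/xxxx-en-US", {
    method: "POST",
    headers: {
      "nlx-api-key": "YOUR_API_KEY",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      stream: true,
      request: { unstructured: { text } },
    }),
  });

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  // Read until the server closes the stream.
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    const chunk = decoder.decode(value, { stream: true });
    for (const payload of extractDataLines(chunk)) {
      onChunk(payload);
    }
  }
}
```

You might call streamResponse("Write a short poem about coding.", (p) => console.log(p)) and append each payload to the UI as it arrives, which is what enables the "ghost text" effect described above.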
