Skip to main content
Streaming changes response delivery. It does not remove the need for request, tool, tool-result, and output checks.

Supported Streaming Calls

The OpenAI wrappers support these streaming paths:
  • client.responses.create({ stream: true, ... })
  • client.responses.stream(...)
  • client.chat.completions.create({ stream: true, ... })
In Python, use the native OpenAI Python call shape:
  • client.responses.create(stream=True, ...)
  • client.responses.stream(...)
  • client.chat.completions.create(stream=True, ...)

Before the Stream Opens

Before opening the provider stream, Averta can:
  • evaluate the request
  • filter blocked tools
  • evaluate pending tool results in continuation calls
If the request or tool result is blocked, the wrapper throws before the provider stream starts.

During the Stream

The wrapper checks streamed text as it is produced. If the output is allowed, your app consumes the provider stream normally. If policy blocks streamed content, the wrapper fails closed instead of continuing to emit unsafe text.

Rewrite Limit

Streaming output rewrite is not supported yet. A rewrite decision during streaming is treated as a failure. Use non-streaming calls when you need Averta to rewrite final output.

Current Limits

SurfaceLimit
Responses streamingOne output text stream per response.
Chat Completions streamingOne streamed text choice.
Output rewriteNon-streaming only.

Debugging

SymptomLikely cause
Stream never opensRequest, tool exposure, or tool-result checkpoint blocked before provider execution.
Chat stream fails with multiple choicesStreaming with n > 1 is unsupported.
Stream stops when output policy triggersStreaming rewrite is unsupported, so the wrapper fails closed.

Responses API

See Responses streaming entry points.

Output checks

Use non-streaming calls for rewrite behavior.