StreamingResponse wraps a Server-Sent Events (SSE) stream and yields content tokens as they arrive from the backend. It is returned by client.query(..., stream=True).

Basic usage

for token in client.query("Summarize Q4 results", stream=True):
    print(token, end="", flush=True)

Accessing metadata

Metadata (sources, fact counts, entities) is populated during iteration and available after the stream completes:
resp = client.query("What is ACME's revenue?", stream=True)

for token in resp:
    print(token, end="", flush=True)

# Available after iteration
print(resp.full_text)
print(resp.total_facts)
print(resp.total_chunks)
print(resp.sources)
print(resp.entities)

Properties

full_text (str): The complete generated text, accumulated from all content deltas.
session_id (Optional[str]): Conversation session ID, if conversation context was maintained.
metadata (Dict[str, Any]): Full metadata dict from the backend.
sources (List[Dict[str, Any]]): Source documents referenced in the answer.
thinking_steps (List[str]): Reasoning chain steps (populated in thinking and research modes).
entities (List[str]): Entities identified in the query and used for graph traversal.
total_facts (int): Number of knowledge graph facts used to generate the answer.
total_chunks (int): Number of vector search chunks used.
model (Optional[str]): The LLM model that generated the response.
search_time (Optional[str]): Time spent on retrieval (graph + vector search).
error (Optional[str]): Error message if the stream encountered an error.
insufficient_coverage (bool): True if the knowledge base had no relevant facts for the query.
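
Because `error` and `insufficient_coverage` are only reliable once iteration finishes, a common pattern is to consume the stream fully and then guard on those flags. A minimal sketch of that pattern, using a hypothetical stand-in class (the real `StreamingResponse` ships with the SDK):

```python
# Hypothetical stand-in mirroring a slice of the StreamingResponse
# interface, used here only to illustrate the post-iteration guard.
class FakeStreamingResponse:
    def __init__(self, tokens, error=None, insufficient_coverage=False):
        self._tokens = tokens
        self.full_text = ""
        self.error = error
        self.insufficient_coverage = insufficient_coverage

    def __iter__(self):
        for t in self._tokens:
            self.full_text += t  # accumulated during iteration
            yield t


def answer_or_raise(resp):
    """Consume the stream, then check the error/coverage flags."""
    for _ in resp:  # metadata is only populated by this loop
        pass
    if resp.error:
        raise RuntimeError(resp.error)
    if resp.insufficient_coverage:
        return None  # no relevant facts in the knowledge base
    return resp.full_text


print(answer_or_raise(FakeStreamingResponse(["AC", "ME"])))  # → ACME
```

With the real client, `FakeStreamingResponse(...)` would be replaced by `client.query(..., stream=True)`.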

to_dict()

Convert the completed stream into a dict matching the non-streaming response format:
resp = client.query("What is ACME's revenue?", stream=True)
for token in resp:
    pass  # consume the stream

result = resp.to_dict()
# Same shape as client.query("...", stream=False)

Context manager

StreamingResponse supports the context manager protocol for explicit cleanup:
with client.query("Summarize results", stream=True) as resp:
    for token in resp:
        print(token, end="", flush=True)
# Stream closed automatically
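
The context-manager guarantee matters when you stop consuming early: leaving the `with` block closes the underlying connection even if you break out mid-stream. A sketch of that behavior with a stand-in class (the real class provides `close()` via the SDK; this mock only illustrates the protocol):

```python
# Stand-in demonstrating the context-manager cleanup contract.
class FakeStream:
    def __init__(self, tokens):
        self._tokens = iter(tokens)
        self.closed = False

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._tokens)

    def close(self):
        self.closed = True  # would release the HTTP connection

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()   # runs even on early exit or exception
        return False   # do not swallow exceptions


with FakeStream(["a", "b", "c"]) as resp:
    first = next(iter(resp))
    # stop early: __exit__ still closes the stream

assert resp.closed  # cleanup happened without full consumption
```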

SSE event types

The stream delivers these event types internally:
content: Text delta, yielded to the iterator
metadata: Session ID, fact/chunk counts, entities, model info
reasoning: Thinking steps and reasoning chains
sources: Source document references
done: Stream complete; may include final text or error
error: Error occurred; raises StreamingError
You do not need to handle these directly — StreamingResponse processes them and exposes the data through properties.
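
For intuition, the internal dispatch might look roughly like the sketch below, assuming each decoded SSE event arrives as a `(type, payload)` pair; the payload field names here are illustrative, not the backend's actual wire format:

```python
# Illustrative sketch: folding parsed SSE events into response state.
# Event names match the list above; payload shapes are assumptions.
def consume_events(events):
    state = {"full_text": "", "sources": [], "thinking_steps": [],
             "metadata": {}, "error": None}
    for kind, payload in events:
        if kind == "content":
            state["full_text"] += payload           # text delta
        elif kind == "metadata":
            state["metadata"].update(payload)
        elif kind == "reasoning":
            state["thinking_steps"].append(payload)
        elif kind == "sources":
            state["sources"].extend(payload)
        elif kind == "error":
            state["error"] = payload                # real client raises StreamingError
        elif kind == "done":
            break
    return state


events = [
    ("metadata", {"total_facts": 3}),
    ("content", "ACME revenue "),
    ("content", "was $10M."),
    ("sources", [{"doc": "10-K"}]),
    ("done", None),
]
state = consume_events(events)
print(state["full_text"])  # → ACME revenue was $10M.
```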