StreamingResponse wraps a Server-Sent Events (SSE) stream and yields content tokens as they arrive from the backend. It is returned by client.query(..., stream=True).
Basic usage
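A minimal sketch of consuming the stream, assuming the object returned by client.query(..., stream=True) is iterated with an ordinary for loop and yields str tokens. The helper name print_tokens is hypothetical, and a plain list stands in for the real stream so the snippet runs on its own.

```python
from typing import Iterable


def print_tokens(stream: Iterable[str]) -> str:
    """Print each token as it arrives and return the full text."""
    parts = []
    for token in stream:
        # flush=True so partial output appears immediately
        print(token, end="", flush=True)
        parts.append(token)
    print()
    return "".join(parts)


# With the real client this would be, per the description above:
#     stream = client.query(..., stream=True)
# A list stands in for the stream here (assumption for illustration).
answer = print_tokens(["Streaming ", "works ", "token by token."])
```

Joining the tokens by hand mirrors what the accumulated-text property described under Properties is said to provide.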
Accessing metadata
Metadata (sources, fact counts, entities) is populated during iteration and available after the stream completes.
Properties
The complete generated text, accumulated from all content deltas.
Conversation session ID, if conversation context was maintained.
Full metadata dict from the backend.
Source documents referenced in the answer.
Reasoning chain steps (populated in thinking and research modes).
Entities identified in the query and used for graph traversal.
Number of knowledge graph facts used to generate the answer.
Number of vector search chunks used.
The LLM model that generated the response.
Time spent on retrieval (graph + vector search).
Error message if the stream encountered an error.
True if the knowledge base had no relevant facts for the query.
to_dict()
Convert the completed stream into a dict matching the non-streaming response format.
Context manager
StreamingResponse supports the context manager protocol for explicit cleanup.
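The pattern can be sketched as follows. FakeStream is a hypothetical stand-in written for this example, not the library's class; it only assumes that leaving the with block closes the stream, which is the usual meaning of context manager support.

```python
class FakeStream:
    """Hypothetical stand-in; not the library's StreamingResponse."""

    def __init__(self, tokens):
        self._tokens = iter(tokens)
        self.closed = False

    def __iter__(self):
        return self._tokens

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions


# With the real client the first line would presumably read:
#     with client.query(..., stream=True) as stream:
received = []
with FakeStream(["a", "b", "c"]) as stream:
    for token in stream:
        received.append(token)
        if token == "b":
            break  # leaving early still triggers cleanup
```

The with form is most useful when iteration may stop early or raise, since cleanup then runs without a separate try/finally.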
SSE event types
The stream delivers these event types internally:

| Event | Description |
|---|---|
| content | Text delta; yielded to the iterator |
| metadata | Session ID, fact/chunk counts, entities, model info |
| reasoning | Thinking steps and reasoning chains |
| sources | Source document references |
| done | Stream complete; may include final text or error |
| error | Error occurred; raises StreamingError |
StreamingResponse processes them and exposes the data through properties.
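To illustrate the dispatch described above, here is a rough sketch of how SSE events of these types could be parsed and routed. It is not the library's implementation: the class SketchStream, its attribute names, the local StreamingError definition, and the JSON payload keys (text, sources, message, error) are all assumptions made for the example.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Tuple


class StreamingError(Exception):
    """Local stand-in for the library's StreamingError (assumed shape)."""


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Group raw SSE lines into (event, data) pairs; a blank line ends an event."""
    event, data_lines = "message", []
    for raw in lines:
        line = raw.rstrip("\n")
        if line == "":
            if data_lines:
                yield event, json.loads("\n".join(data_lines))
            event, data_lines = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())


class SketchStream:
    """Hypothetical consumer: yields content deltas, stores everything else."""

    def __init__(self, lines: Iterable[str]) -> None:
        self._events = parse_sse(lines)
        self.text = ""
        self.metadata: Dict[str, Any] = {}
        self.reasoning = []
        self.sources = []
        self.error = None

    def __iter__(self) -> Iterator[str]:
        for event, data in self._events:
            if event == "content":
                delta = data.get("text", "")  # payload key is an assumption
                self.text += delta
                yield delta
            elif event == "metadata":
                self.metadata.update(data)
            elif event == "reasoning":
                self.reasoning.append(data)
            elif event == "sources":
                self.sources.extend(data.get("sources", []))
            elif event == "error":
                self.error = data.get("message", "unknown error")
                raise StreamingError(self.error)
            elif event == "done":
                # done may carry a final text or an error
                self.text = data.get("text", self.text)
                self.error = data.get("error", self.error)
                return


raw = [
    "event: metadata\n", 'data: {"model": "m"}\n', "\n",
    "event: content\n", 'data: {"text": "Hel"}\n', "\n",
    "event: content\n", 'data: {"text": "lo"}\n', "\n",
    "event: sources\n", 'data: {"sources": ["doc-1"]}\n', "\n",
    "event: done\n", "data: {}\n", "\n",
]
stream = SketchStream(raw)
tokens = list(stream)
```

Only content events reach the caller's loop. The other events update attributes as a side effect, which is why metadata is complete only once iteration has finished.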