## query()

Query the knowledge base with natural language. Returns either a complete result dict or a streaming response.

### Parameters
- Natural-language question to answer.
- If `True`, return a `StreamingResponse` that yields tokens as they arrive.
- Controls answer depth and reasoning style:
| Mode | Description |
|---|---|
| `"chat"` | Concise, direct answers. Fastest. |
| `"thinking"` | Includes reasoning chains and cross-document analysis. |
| `"research"` | Exhaustive multi-hop research with parallel strategies. Slowest. |
- Override the retrieval depth independently of response mode. Values: `"basic"`, `"thinking"`, `"research"`.
- LLM model override (e.g. `"gpt-4o"`). Uses the default cost-efficient model if not specified.
- Explicit conversation session ID to continue. Use this or `maintain_context`, not both.
- If `True`, maintain conversation state across queries. The client tracks the session ID automatically.
- If `True`, include the AI-generated summary. Set to `False` for raw fact retrieval.
- Enable web search augmentation for questions that may need external information.
- List of upload IDs to include as additional context for this query.
### Non-streaming response
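A non-streaming call might look like the sketch below. The `VrinClient` class name, its constructor, and the result keys (`"summary"`, `"facts"`) are assumptions based on the parameter descriptions above, not a confirmed API; a tiny stub stands in for the real client so the shape is concrete.

```python
# Hypothetical sketch: `VrinClient` and the result dict keys are assumptions,
# not the confirmed API. A stub stands in for the real client.
class VrinClient:
    """Stub standing in for the real Vrin client."""

    def query(self, query, mode="chat", stream=False, **kwargs):
        # The real client would retrieve facts and generate a summary.
        return {"summary": f"[{mode}] answer to: {query}", "facts": []}

client = VrinClient()
resp = client.query("What changed in the v2 pricing model?", mode="chat")
print(resp["summary"])
```

With `stream=False` (the default under this sketch), the call blocks until the full result dict is available.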
### Streaming response
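With `stream=True`, the documented behavior is a `StreamingResponse` that yields tokens as they arrive. The sketch below is hypothetical: a plain generator stands in for the streaming object, and the client class is an assumption.

```python
# Hypothetical sketch: a generator stands in for the StreamingResponse that
# `stream=True` is documented to return; the client class is an assumption.
def fake_streaming_response(tokens):
    """Yield tokens one at a time, as if they arrived over the wire."""
    for t in tokens:
        yield t

class VrinClient:
    def query(self, query, stream=False, **kwargs):
        if stream:
            return fake_streaming_response(["The ", "answer ", "is ", "42."])
        return {"summary": "The answer is 42.", "facts": []}

client = VrinClient()
for token in client.query("Any question", stream=True):
    print(token, end="")
print()
```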
### Response modes
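One way to compare the three modes from the table above is to issue the same question under each. As before, the client is a stub and the call signature is an assumption drawn from the parameter descriptions.

```python
# Hypothetical sketch: the same question under each documented mode.
# `VrinClient` is a stub; only the mode names come from the table above.
class VrinClient:
    def query(self, query, mode="chat", **kwargs):
        return {"summary": f"[{mode}] ...", "facts": []}

client = VrinClient()
for mode in ("chat", "thinking", "research"):
    resp = client.query("Summarise the onboarding flow", mode=mode)
    print(mode, "->", resp["summary"])
```

In practice you would expect `"chat"` to return fastest and `"research"` slowest, per the table.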
## query_facts()

Fast fact retrieval without AI summary generation. Returns the same dict structure as `query()`, but with raw graph facts and vector chunks only.

### Parameters
- Natural-language question.
- Maximum number of results to return.
### Returns

Same dict structure as `query()`, but `summary` will be empty or minimal since no LLM generation occurs.
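A hypothetical `query_facts()` call might look like this. The `limit` parameter name, the client class, and the result keys are assumptions; the stub just illustrates the documented shape — empty summary, raw facts capped at the requested count.

```python
# Hypothetical sketch: `query_facts` skips LLM generation, so the stub
# returns an empty summary and a capped list of raw facts. The class and
# parameter names are assumptions, not the confirmed API.
class VrinClient:
    def query_facts(self, query, limit=10):
        facts = [f"fact-{i}" for i in range(25)]  # stand-in for graph facts
        return {"summary": "", "facts": facts[:limit]}

client = VrinClient()
resp = client.query_facts("deployment checklist", limit=5)
print(len(resp["facts"]))   # at most the requested limit
print(resp["summary"])      # empty: no LLM generation occurred
```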
### Insufficient coverage
When the knowledge base has zero relevant facts for a query, Vrin returns early without calling the LLM. For streaming responses, check `resp.insufficient_coverage` after iteration completes.
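The check described above can be sketched as follows. The `FakeStreamingResponse` class is an assumption standing in for the real streaming object; only the `insufficient_coverage` attribute comes from the text, and the key point is that it is meaningful only once the stream has been fully consumed.

```python
# Hypothetical sketch: `insufficient_coverage` is set on the response object
# once iteration finishes. This stub class is an assumption, not the real API.
class FakeStreamingResponse:
    def __init__(self, tokens, covered):
        self._tokens = tokens
        self._covered = covered
        self.insufficient_coverage = False

    def __iter__(self):
        yield from self._tokens
        # Only meaningful after the stream is fully consumed.
        self.insufficient_coverage = not self._covered

resp = FakeStreamingResponse(tokens=[], covered=False)
for token in resp:
    print(token, end="")
if resp.insufficient_coverage:
    print("No relevant facts found; the LLM was not called.")
```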