The VrinClient exposes three query methods, each tuned for a different integration pattern:
| Method | Returns | Use when |
| --- | --- | --- |
| query() | QueryResult (resolves once complete) | You just want the final answer. |
| queryStream() | AsyncGenerator&lt;string&gt; (content deltas) | You want token-by-token streaming. |
| queryEvents() | AsyncGenerator&lt;StreamEvent&gt; (full SSE frames) | You want progress, sources, and metadata, not just text. |

client.query(options)

const result = await client.query({
  query: "What is ACME's Q4 revenue?",
  mode: "chat",               // "chat" | "expert" | "brainstorm" | "context" | "raw_facts"
  depth: "thinking",          // "basic" | "thinking" | "research"
  model: "gpt-5.2",
  sessionId: "sess_abc",
  maintainContext: true,
  includeSummary: true,
  webSearchEnabled: false,
  conversationUploadIds: ["upload_123"],
});

result.summary;       // "ACME Corp reported $50M in Q4 2025..."
result.session_id;    // "sess_abc" — echoed back for multi-turn
result.total_facts;   // 12
result.sources;       // [{ title, chunk_id, score, ... }, ...]
result.entities;      // ["ACME Corp", "Jane Smith"]
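
Inferring from the fields accessed above, the result has roughly the following shape. This is a sketch, not the SDK's exported type; optionality and any fields beyond those shown are assumptions.

interface QuerySource {
  title: string;
  chunk_id: string;
  score: number;
  // other per-source fields elided in the example above
}

interface QueryResult {
  summary: string;          // LLM-generated answer
  session_id?: string;      // echoed back for multi-turn conversations
  total_facts: number;      // number of graph facts behind the answer
  sources: QuerySource[];   // retrieval provenance
  entities: string[];       // entities referenced by the answer
}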

QueryOptions fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| query | string | required | The natural-language question. |
| mode | ResponseMode | "chat" | Output shape. See response modes. |
| depth | QueryDepth | backend-chosen | "basic", "thinking", or "research". Only applies in context mode. |
| model | string | backend-chosen | LLM override, e.g. "gpt-5.2", "claude-4-haiku". Must be in your plan’s allowed models. |
| sessionId | string | — | Continue a multi-turn conversation. |
| maintainContext | boolean | false | If true, a new session_id is created and returned. |
| includeSummary | boolean | true | Set false to skip LLM generation and return raw facts only. |
| webSearchEnabled | boolean | false | Augment retrieval with live web search (brainstorm mode). |
| conversationUploadIds | string[] | — | Scope retrieval to specific uploads. |
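
Putting sessionId and maintainContext from the table above together, a typical multi-turn exchange looks like this. It is a sketch mirroring the query() example above; the exact interaction between maintainContext and an existing sessionId is not specified here.

// Turn 1: set maintainContext so the backend creates a session.
const first = await client.query({
  query: "What is ACME's Q4 revenue?",
  maintainContext: true,
});

// Turn 2: pass the returned session_id to continue the conversation.
const followUp = await client.query({
  query: "How does that compare to Q3?",
  sessionId: first.session_id,
  maintainContext: true,
});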

client.queryStream(options)

Yields content deltas as strings. Use for chat UIs.
for await (const token of client.queryStream({ query: "Summarize ACME's year" })) {
  process.stdout.write(token);
}
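
Because queryStream() yields only text deltas, a caller that needs the complete answer can accumulate them itself (a minimal sketch):

let answer = "";
for await (const token of client.queryStream({ query: "Summarize ACME's year" })) {
  answer += token; // each yielded value is a content delta
}
console.log(answer);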

client.queryEvents(options)

Yields raw SSE events. Agents that render progress, show sources, or trace reasoning should use this.
for await (const event of client.queryEvents({ query: "..." })) {
  switch (event.type) {
    case "progress": {
      const d = event.data as { stage: string; step: number; total_steps: number };
      console.log(`[${d.step}/${d.total_steps}] ${d.stage}`);
      break;
    }
    case "content": {
      const d = event.data as { delta: string };
      process.stdout.write(d.delta);
      break;
    }
    case "sources": {
      const d = event.data as { sources: Array<{ title: string }> };
      console.log("\nSources:", d.sources.map((s) => s.title));
      break;
    }
    case "done":
      return;
    case "error": {
      const d = event.data as { message: string };
      throw new Error(d.message);
    }
  }
}

Event types

| type | Payload fields |
| --- | --- |
| progress | stage, label, step, total_steps, elapsed_ms |
| metadata | session_id, total_facts, total_chunks, entities, model |
| content | delta (partial token text) |
| reasoning | chains or steps (expert mode) |
| sources | sources: [{ title, chunk_id, score }] |
| done | — |
| error | message |
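
In TypeScript terms, the table above corresponds roughly to a discriminated union like the one below. The SDK's exported StreamEvent type may be looser (the example above casts event.data), so treat this as a sketch of the payload shapes, with optionality assumed.

type StreamEvent =
  | { type: "progress"; data: { stage: string; label: string; step: number; total_steps: number; elapsed_ms: number } }
  | { type: "metadata"; data: { session_id: string; total_facts: number; total_chunks: number; entities: string[]; model: string } }
  | { type: "content"; data: { delta: string } }
  | { type: "reasoning"; data: { chains?: unknown; steps?: unknown } }  // expert mode
  | { type: "sources"; data: { sources: Array<{ title: string; chunk_id: string; score: number }> } }
  | { type: "done" }
  | { type: "error"; data: { message: string } };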

Response modes

See the agent-facing mode table for guidance on which to pick. Short version:
  • chat — concise prose answer. Default for chat UIs.
  • context — structured facts for an agent to synthesize. Default for agent tools.
  • expert — includes reasoning chain. Slower, more expensive.
  • brainstorm — combines Vrin + web search. Use with webSearchEnabled: true.
  • raw_facts — no LLM generation; just the graph facts.
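
For example, these calls map the modes above onto common integration patterns (a sketch reusing the options documented earlier):

// Agent tool: structured facts for the caller to synthesize.
const facts = await client.query({
  query: "List ACME's Q4 revenue drivers",
  mode: "context",
  depth: "research", // depth only applies in context mode
});

// Graph facts only, no LLM generation.
const raw = await client.query({
  query: "List ACME's Q4 revenue drivers",
  mode: "raw_facts",
});

// Vrin retrieval combined with live web search.
const ideas = await client.query({
  query: "Which markets could ACME expand into?",
  mode: "brainstorm",
  webSearchEnabled: true,
});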