Vrin is a retrieval-time reasoning layer that curates context for AI agents. Your agent asks a question; Vrin walks a knowledge graph built from your documents, reasons across it, and hands back precise facts with citations. Model-agnostic, cloud-agnostic, and designed to be driven by agents.

This page is written for AI agents. If a coding assistant brought you here, follow the steps below; they are self-contained. No human clarification required.

Documentation Index
Fetch the complete documentation index at: https://docs.vrin.cloud/llms.txt
Use this file to discover all available pages before exploring further.
MCP Server
One-line install + JSON config. Best for chat UIs and IDEs.
Python / TypeScript SDK
pip install vrin or npm install @vrin/sdk. Best for agents you build yourself.

CLI
pip install vrin → vrin query "...". Best when your agent shells out.

Prerequisite: get an API key
Sign up at vrin.cloud and copy the API key from Dashboard → API Keys. It starts with vrin_live_. Keep it in an environment variable, never in code:
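For example, in your shell profile (the key value below is a placeholder):

```shell
# Export once per shell session; never commit the key to source control
export VRIN_API_KEY="vrin_live_xxxxxxxxxxxx"
```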
1. MCP Server (fastest for Claude Desktop, Cursor, Windsurf)
Vrin ships an MCP server that exposes two tools to any MCP client:

| Tool | What it does |
|---|---|
| vrin_query_async | Start a query. Returns immediately with a job_id. |
| vrin_check_job | Long-poll a job. Returns completed / working / failed. |

Call vrin_query_async once, then keep calling vrin_check_job with the returned job_id until status is completed. The server long-polls internally (up to 55s per call), so 1–3 calls usually suffice.
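The call pattern can be sketched in Python; here `call_tool(name, args)` stands in for however your MCP client invokes a tool, and the response field names (job_id, status, result) mirror the descriptions above but their exact shapes are assumptions:

```python
def query_vrin(call_tool, question: str, max_checks: int = 10) -> dict:
    """Sketch of the async-query / long-poll pattern (field names assumed)."""
    job = call_tool("vrin_query_async", {"query": question, "mode": "context"})
    for _ in range(max_checks):
        check = call_tool("vrin_check_job", {"job_id": job["job_id"]})
        if check["status"] == "completed":
            return check["result"]
        if check["status"] == "failed":
            raise RuntimeError(check.get("error", "Vrin job failed"))
        # status == "working": the server already long-polled up to 55s,
        # so there is no need to sleep before checking again
    raise TimeoutError("job still working after max_checks calls")
```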
Install
Configure your MCP client
- Claude Desktop
- Claude Code
- Cursor
- Remote (HTTP)
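For Claude Desktop, a minimal config entry might look like the sketch below. The `command` and package name (`@vrin/mcp-server`) are assumptions not confirmed by this page; substitute the actual command from the MCP reference. Only the `vrin` server name and the API-key environment variable are grounded above.

```json
{
  "mcpServers": {
    "vrin": {
      "command": "npx",
      "args": ["-y", "@vrin/mcp-server"],
      "env": { "VRIN_API_KEY": "vrin_live_xxxxxxxxxxxx" }
    }
  }
}
```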
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows), then restart Claude Desktop. vrin should appear in the connectors menu.

Use it
Once configured, the agent can call the tools directly. Most MCP clients surface them automatically. Ask something like “what do we know about ACME Corp’s Q4 revenue?” and the agent will:

- Call vrin_query_async(query=..., mode="context") → receives job_id.
- Call vrin_check_job(job_id=...) in a loop until status="completed".
- Read the returned result.context (facts with sources) and write the final answer.
2. SDK (fastest for custom agents)
Use the SDK when your agent is code you control: a LangGraph graph, a CrewAI crew, an OpenAI Agents SDK runner, or something bespoke.

- Python
- TypeScript
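A minimal Python sketch, under stated assumptions: the package imports as `vrin` and exposes `VRINClient` (the class named in the Python SDK reference below), but the constructor and `query()` signature here are illustrative, not canonical; verify them against the SDK docs.

```python
import os

def fetch_context(question: str, client=None) -> dict:
    """Query Vrin in context mode and return structured facts.

    The query() method name and the response shape are assumptions;
    check the Python SDK reference for the real signatures.
    """
    if client is None:
        from vrin import VRINClient  # requires `pip install vrin`
        client = VRINClient(api_key=os.environ["VRIN_API_KEY"])
    return client.query(question, mode="context")
```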
Bind Vrin to an agent framework
- LangGraph
- OpenAI Agents SDK
- CrewAI
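Whatever the framework, the binding reduces to a plain function the framework registers as a tool. A hedged sketch follows; the client's `query()` method and the placement of the `insufficient_coverage` flag in the response are assumptions:

```python
def vrin_lookup(question: str, client) -> str:
    """Retrieve facts from Vrin for use as an agent-framework tool.

    Register this with LangGraph, the OpenAI Agents SDK, or CrewAI
    using that framework's own tool decorator / wrapper.
    """
    result = client.query(question, mode="context")
    if result.get("insufficient_coverage"):
        return "Vrin has no facts relevant to this question."
    facts = result.get("context", [])
    return "\n".join(f"- {fact}" for fact in facts)
```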
3. CLI (fastest for shell-driven agents)
Agents that shell out (bash-in-the-loop, Claude Code sub-agents, CI pipelines) should use the CLI. The CLI emits JSON to stdout when not attached to a TTY, so parsing is trivial. Pipe vrin --describe into your agent to hand it a machine-readable manifest of every command, flag, and exit code.
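A sketch of shelling out from Python; the exact JSON output shape is an assumption here, so confirm it via `vrin --describe`:

```python
import json
import subprocess

def vrin_query_cli(question: str, run=subprocess.run) -> dict:
    """Shell out to the Vrin CLI and parse its JSON stdout.

    The CLI emits JSON when stdout is not a TTY; the parsed
    structure depends on the mode and is an assumption here.
    """
    proc = run(
        ["vrin", "query", question],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(proc.stdout)
```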
Paste this into your coding agent
If a user asks their AI agent to “set up Vrin,” the agent should run through this script verbatim. It’s idempotent, so it is safe to re-run.

Response modes (when using SDK or MCP)
Vrin returns different shapes depending on mode. Agents should pick based on whether they want Vrin to write prose or hand back structured facts:
| Mode | Use when | Returns |
|---|---|---|
| context | You’re an agent synthesizing the final answer yourself. | Structured facts + source chunks for the agent to reason over. |
| chat | You want Vrin to write a concise conversational answer. | Short prose summary + sources. |
| expert | You want Vrin’s deep reasoning (slower). | Reasoning chain + final answer + sources. |
| brainstorm | You want Vrin to combine its knowledge with live web search. | Answer + web sources + graph facts. |
| raw_facts | You only want the graph facts, no LLM generation. | Triples (subject, predicate, object) + confidence + source. |
Alongside mode, Vrin offers three traversal depths: basic (default) runs a single graph hop, thinking traverses across documents, and research is exhaustive and expensive; reserve it for genuinely multi-hop queries.
FAQ for AI agents
How do I know if my query timed out?
The MCP vrin_check_job tool returns status: "working" if the job is still running after 55s. Call it again; the job is still alive server-side. Do not give up until you see completed or failed. Typical jobs finish in 30–90s.
What do I do if `insufficient_coverage` is true?
Vrin found no facts relevant to the question. Either the user hasn’t ingested the right documents, or the question is out of scope. Tell the user plainly. Don’t hallucinate an answer.
Can I use Vrin in a loop? Will I hit rate limits?
Free-tier accounts: 100 queries/month. Paid plans start at 10,000/month. Check your limits with vrin limits or GET /api/user/limits. Rate limit errors return HTTP 429; back off and retry, don't hammer.
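A backoff loop can be sketched as follows; the exception carrying a `status_code` attribute is an assumption, so adapt the check to whatever error type your HTTP client or the SDK actually raises:

```python
import random
import time

def query_with_backoff(do_query, max_retries: int = 5, sleep=time.sleep):
    """Retry a Vrin call on HTTP 429 with jittered exponential backoff.

    `do_query` is any zero-argument callable that performs the query.
    The `status_code` attribute on the raised exception is an assumption.
    """
    for attempt in range(max_retries):
        try:
            return do_query()
        except Exception as exc:
            if getattr(exc, "status_code", None) != 429:
                raise
            sleep((2 ** attempt) + random.random())  # ~1s, 2s, 4s, ...
    raise RuntimeError("still rate-limited after retries")
```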
How do I scope queries to a specific subset of documents?
Pass conversation_upload_ids: ["upload_abc", "upload_def"] to restrict retrieval to those uploads only. Useful for per-conversation context isolation.
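For example (the parameter name is from above, but the client object and its `query()` keyword-argument style are assumptions):

```python
def scoped_query(client, question: str, upload_ids: list) -> dict:
    """Restrict retrieval to specific uploads for per-conversation
    context isolation. client.query() is an assumed SDK method."""
    return client.query(
        question,
        mode="context",
        conversation_upload_ids=upload_ids,
    )
```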
Does Vrin work with enterprise / air-gapped data?
Yes. Enterprise API keys start with vrin_ent_ and route queries through your own AWS account. Your data never leaves your cloud. See data sovereignty.
Where's the canonical manifest of Vrin's capabilities?
Three machine-readable entry points:
- CLI: vrin --describe prints a JSON schema of every command.
- Docs: https://docs.vrin.cloud/llms.txt is a curated agent-readable index.
- MCP: tool definitions include full input schemas per the MCP spec.
Next steps
MCP reference
Tool signatures, polling patterns, remote deployment.
Python SDK
VRINClient methods, streaming, conversations, exceptions.

TypeScript SDK
VrinClient for Node, Bun, and browser runtimes.

Data sovereignty
Enterprise routing, vrin_ent_ keys, customer-owned infra.