Vrin is a retrieval-time reasoning layer that curates context for AI agents. Your agent asks a question; Vrin walks a knowledge graph built from your documents, reasons across it, and hands back precise facts with citations. Model-agnostic, cloud-agnostic, and designed to be driven by agents.

This page is written for AI agents. If a coding assistant brought you here, follow the steps below; they are self-contained. No human clarification required.

Documentation Index
Fetch the complete documentation index at: https://docs.vrin.cloud/llms.txt
Use this file to discover all available pages before exploring further.
MCP Server
One-line install + JSON config. Best for chat UIs and IDEs.
Python / TypeScript SDK
pip install vrin or npm install @vrin/sdk. Best for agents you build yourself.

CLI
pip install vrin → vrin query "...". Best when your agent shells out.

Prerequisite: get an API key
Sign up at vrin.cloud and copy the API key from Dashboard → API Keys. It starts with vrin_live_. Keep it in an environment variable, never in code:
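For example, in your shell profile (the key value below is a placeholder):

```shell
# Export once per shell session; never commit the key to source control
export VRIN_API_KEY="vrin_live_xxxxxxxxxxxx"
```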
1. MCP Server (fastest for Claude Desktop, Cursor, Windsurf)
Vrin ships an MCP server that exposes two tools to any MCP client:

| Tool | What it does |
|---|---|
| vrin_query_async | Start a query. Returns immediately with a job_id. |
| vrin_check_job | Long-poll a job. Returns completed / working / failed. |

Call vrin_query_async once, then keep calling vrin_check_job with the returned job_id until status is completed. The server long-polls internally (up to 55s per call), so 1–3 calls usually suffice.
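The call pattern can be sketched in Python; here `call_tool(name, args)` stands in for however your MCP client invokes a tool, and the response field names (job_id, status, result) mirror the descriptions above but their exact shapes are assumptions:

```python
def query_vrin(call_tool, question: str, max_checks: int = 10) -> dict:
    """Sketch of the async-query / long-poll pattern (field names assumed)."""
    job = call_tool("vrin_query_async", {"query": question, "mode": "context"})
    for _ in range(max_checks):
        check = call_tool("vrin_check_job", {"job_id": job["job_id"]})
        if check["status"] == "completed":
            return check["result"]
        if check["status"] == "failed":
            raise RuntimeError(check.get("error", "Vrin job failed"))
        # status == "working": the server already long-polled up to 55s,
        # so there is no need to sleep before checking again
    raise TimeoutError("job still working after max_checks calls")
```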
Install
Configure your MCP client
- Claude Desktop
- Claude Code
- Cursor
- Remote (HTTP)
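For Claude Desktop, a minimal config entry might look like the sketch below. The `command` and package name (`@vrin/mcp-server`) are assumptions not confirmed by this page; substitute the actual command from the MCP reference. Only the `vrin` server name and the API-key environment variable are grounded above.

```json
{
  "mcpServers": {
    "vrin": {
      "command": "npx",
      "args": ["-y", "@vrin/mcp-server"],
      "env": { "VRIN_API_KEY": "vrin_live_xxxxxxxxxxxx" }
    }
  }
}
```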
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows), then restart Claude Desktop. vrin should appear in the connectors menu.

Use it
Once configured, the agent can call the tools directly. Most MCP clients surface them automatically. Ask something like “what do we know about ACME Corp’s Q4 revenue?” and the agent will:

- Call vrin_query_async(query=..., mode="context") → receives job_id.
- Call vrin_check_job(job_id=...) in a loop until status="completed".
- Read the returned result.context (facts with sources) and write the final answer.
2. SDK (fastest for custom agents)
Use the SDK when your agent is code you control: a LangGraph graph, a CrewAI crew, an OpenAI Agents SDK runner, or something bespoke.

- Python
- TypeScript
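A minimal Python sketch, under stated assumptions: the package imports as `vrin` and exposes `VRINClient` (the class named in the Python SDK reference below), but the constructor and `query()` signature here are illustrative, not canonical; verify them against the SDK docs.

```python
import os

def fetch_context(question: str, client=None) -> dict:
    """Query Vrin in context mode and return structured facts.

    The query() method name and the response shape are assumptions;
    check the Python SDK reference for the real signatures.
    """
    if client is None:
        from vrin import VRINClient  # requires `pip install vrin`
        client = VRINClient(api_key=os.environ["VRIN_API_KEY"])
    return client.query(question, mode="context")
```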
Bind Vrin to an agent framework
- LangGraph
- OpenAI Agents SDK
- CrewAI
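Whatever the framework, the binding reduces to a plain function the framework registers as a tool. A hedged sketch follows; the client's `query()` method and the placement of the `insufficient_coverage` flag in the response are assumptions:

```python
def vrin_lookup(question: str, client) -> str:
    """Retrieve facts from Vrin for use as an agent-framework tool.

    Register this with LangGraph, the OpenAI Agents SDK, or CrewAI
    using that framework's own tool decorator / wrapper.
    """
    result = client.query(question, mode="context")
    if result.get("insufficient_coverage"):
        return "Vrin has no facts relevant to this question."
    facts = result.get("context", [])
    return "\n".join(f"- {fact}" for fact in facts)
```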
3. CLI (fastest for shell-driven agents)
Agents that shell out (bash-in-the-loop, Claude Code sub-agents, CI pipelines) should use the CLI. The CLI emits JSON to stdout when not attached to a TTY, so parsing is trivial. Pipe vrin --describe into your agent to hand it a machine-readable manifest of every command, flag, and exit code.
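A sketch of shelling out from Python; the exact JSON output shape is an assumption here, so confirm it via `vrin --describe`:

```python
import json
import subprocess

def vrin_query_cli(question: str, run=subprocess.run) -> dict:
    """Shell out to the Vrin CLI and parse its JSON stdout.

    The CLI emits JSON when stdout is not a TTY; the parsed
    structure depends on the mode and is an assumption here.
    """
    proc = run(
        ["vrin", "query", question],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(proc.stdout)
```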
Paste this into your coding agent
If a user asks their AI agent to “set up Vrin,” the agent should run through this script verbatim. It’s idempotent, so it is safe to re-run.

Response modes (when using SDK or MCP)
Vrin returns different shapes depending on mode. Agents should pick based on whether they want Vrin to write prose or hand back structured facts:
| Mode | Use when | Returns |
|---|---|---|
| context | You’re an agent synthesizing the final answer yourself. | Structured facts + source chunks for the agent to reason over. |
| chat | You want Vrin to write a concise conversational answer. | Short prose summary + sources. |
| expert | You want Vrin’s deep reasoning (slower). | Reasoning chain + final answer + sources. |
| brainstorm | You want Vrin to combine its knowledge with live web search. | Answer + web sources + graph facts. |
| raw_facts | You only want the graph facts, no LLM generation. | Triples (subject, predicate, object) + confidence + source. |
Alongside mode, Vrin offers three traversal depths: basic (default) runs a single graph hop, thinking traverses across documents, and research is exhaustive and expensive; reserve it for genuinely multi-hop queries.
FAQ for AI agents
How do I know if my query timed out?
The MCP vrin_check_job tool returns status: "working" if the job is still running after 55s. Call it again; the job is still alive server-side. Do not give up until you see completed or failed. Typical jobs finish in 30–90s.
What do I do if `insufficient_coverage` is true?
Vrin found no facts relevant to the question. Either the user hasn’t ingested the right documents, or the question is out of scope. Tell the user plainly. Don’t hallucinate an answer.
Can I use Vrin in a loop? Will I hit rate limits?
Free-tier accounts: 100 queries/month. Paid plans start at 10,000/month. Check your limits with vrin limits or GET /api/user/limits. Rate limit errors return HTTP 429; back off and retry, don't hammer.
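A backoff loop can be sketched as follows; the exception carrying a `status_code` attribute is an assumption, so adapt the check to whatever error type your HTTP client or the SDK actually raises:

```python
import random
import time

def query_with_backoff(do_query, max_retries: int = 5, sleep=time.sleep):
    """Retry a Vrin call on HTTP 429 with jittered exponential backoff.

    `do_query` is any zero-argument callable that performs the query.
    The `status_code` attribute on the raised exception is an assumption.
    """
    for attempt in range(max_retries):
        try:
            return do_query()
        except Exception as exc:
            if getattr(exc, "status_code", None) != 429:
                raise
            sleep((2 ** attempt) + random.random())  # ~1s, 2s, 4s, ...
    raise RuntimeError("still rate-limited after retries")
```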
How do I scope queries to a specific subset of documents?
Pass conversation_upload_ids: ["upload_abc", "upload_def"] to restrict retrieval to those uploads only. Useful for per-conversation context isolation.
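For example (the parameter name is from above, but the client object and its `query()` keyword-argument style are assumptions):

```python
def scoped_query(client, question: str, upload_ids: list) -> dict:
    """Restrict retrieval to specific uploads for per-conversation
    context isolation. client.query() is an assumed SDK method."""
    return client.query(
        question,
        mode="context",
        conversation_upload_ids=upload_ids,
    )
```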
Does Vrin work with enterprise / air-gapped data?
Yes. Enterprise API keys start with vrin_ent_ and route queries through your own AWS account. Your data never leaves your cloud. See data sovereignty.
Where's the canonical manifest of Vrin's capabilities?
Three machine-readable entry points:
- CLI: vrin --describe prints a JSON schema of every command.
- Docs: https://docs.vrin.cloud/llms.txt is a curated agent-readable index.
- MCP: tool definitions include full input schemas per the MCP spec.
Next steps
MCP reference
Tool signatures, polling patterns, remote deployment.
Python SDK
VRINClient methods, streaming, conversations, exceptions.

TypeScript SDK
VrinClient for Node, Bun, and browser runtimes.

Data sovereignty
Enterprise routing, vrin_ent_ keys, customer-owned infra.