insert()

Insert text content into the knowledge base. Vrin chunks the text, extracts facts (entities + relationships), and indexes everything for retrieval.
result = client.insert(
    "ACME Corp reported $50M revenue in Q4 2025, up 23% YoY.",
    title="ACME Q4 Earnings"
)
By default, insert() waits for processing to complete. Pass wait=False to get a job ID and poll later.

Parameters

content
string
required
Text content to insert into the knowledge base.
title
string
default:"Untitled"
Document title. Used in search results and source attribution.
tags
List[str]
Optional tags for categorization and filtering.
metadata
Dict[str, Any]
Optional metadata dict attached to the document.
wait
bool
default:"True"
If True, poll until processing completes and return the result dict. If False, return the job ID string immediately.
poll_interval
float
default:"2.0"
Seconds between status polls when wait=True.
max_wait
float
default:"300.0"
Maximum seconds to wait when wait=True. Raises TimeoutError if exceeded.
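
Because wait changes the return type (a result dict when True, a job ID string when False), calling code may want to normalize the two cases. The following is a minimal sketch under assumptions: insert_document is a hypothetical helper, not part of the Vrin client, and client stands for an already-constructed client object. Only the insert() parameters listed above are used.

```python
from typing import Any, Dict, List, Optional


def insert_document(
    client: Any,
    content: str,
    title: str = "Untitled",
    tags: Optional[List[str]] = None,
    metadata: Optional[Dict[str, Any]] = None,
    wait: bool = True,
) -> Dict[str, Any]:
    """Hypothetical wrapper around client.insert() with a uniform return shape."""
    kwargs: Dict[str, Any] = {"title": title, "wait": wait}
    # Only forward optional arguments that the caller actually supplied.
    if tags is not None:
        kwargs["tags"] = tags
    if metadata is not None:
        kwargs["metadata"] = metadata

    outcome = client.insert(content, **kwargs)

    # wait=False yields a job ID string; wait=True yields the result dict.
    if isinstance(outcome, str):
        return {"job_id": outcome, "result": None}
    return {"job_id": None, "result": outcome}
```

Callers can then branch on whether "result" is None instead of checking types at every call site.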

Synchronous (default)

result = client.insert(
    "ACME Corp reported $50M revenue in Q4 2025.",
    title="ACME Financials",
    tags=["earnings", "2025"]
)
# result contains facts_extracted, chunk_id, etc.

Asynchronous

job_id = client.insert(
    "Long document content...",
    title="Annual Report",
    wait=False
)
print(f"Job started: {job_id}")

# Check status manually
status = client.get_job_status(job_id)
print(status["status"])  # "pending" | "chunking" | "extracting" | "storing" | "completed"

# Or wait for completion later
result = client.wait_for_job(job_id)

get_job_status()

Check the status of an async insertion job.
status = client.get_job_status("job_abc123")

Parameters

job_id
string
required
The job ID returned by insert(wait=False).

Returns

{
  "job_id": "job_abc123",
  "status": "extracting",
  "progress": 0.65,
  "message": "Extracting facts from chunks..."
}
Job statuses progress through: pending -> chunking -> extracting -> storing -> completed.
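
The status sequence above can be followed with a manual polling loop when custom progress handling is needed. This is a rough sketch, not how wait_for_job() is implemented: poll_job, on_stage, sleep, and clock are hypothetical names introduced here, and the sketch assumes "completed" is the only terminal status, since no failure status is documented. A failed job would therefore run until the timeout.

```python
import time
from typing import Any, Callable, Dict, Optional


def poll_job(
    client: Any,
    job_id: str,
    poll_interval: float = 2.0,
    max_wait: float = 300.0,
    on_stage: Optional[Callable[[str], None]] = None,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll get_job_status() until the job reports "completed" or time runs out."""
    deadline = clock() + max_wait
    last_stage = None
    while True:
        status = client.get_job_status(job_id)
        stage = status["status"]
        # Report each stage once, when the job first enters it.
        if stage != last_stage and on_stage is not None:
            on_stage(stage)
        last_stage = stage
        if stage == "completed":
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still in stage {stage!r}")
        sleep(poll_interval)
```

Passing sleep and clock as arguments keeps the loop testable without real delays.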

wait_for_job()

Poll a job until it completes and return its result. Raises TimeoutError if max_wait elapses first. Logs progress as the job moves through stages.
result = client.wait_for_job("job_abc123", poll_interval=3.0, max_wait=120.0)

Parameters

job_id
string
required
The job ID to wait on.
poll_interval
float
default:"2.0"
Seconds between status polls.
max_wait
float
default:"300.0"
Maximum seconds to wait. Raises TimeoutError if exceeded.

Exceptions

  • JobFailedError — The job failed during processing.
  • TimeoutError — The job did not complete within max_wait seconds. Use get_job_status() to check current state.
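
One way to act on the TimeoutError advice above is to fall back to a single status check. In this sketch, wait_or_report is a hypothetical helper, and it assumes the TimeoutError raised by the client is Python's built-in exception. JobFailedError is deliberately not caught, because its import path is not given here, so it propagates to the caller.

```python
from typing import Any, Dict


def wait_or_report(client: Any, job_id: str, max_wait: float = 120.0) -> Dict[str, Any]:
    """Wait for a job; on timeout, return its current status instead of raising."""
    try:
        result = client.wait_for_job(job_id, max_wait=max_wait)
        return {"done": True, "result": result}
    except TimeoutError:
        # Assumption: the client raises the built-in TimeoutError.
        # The job may still be running, so report where it currently stands.
        status = client.get_job_status(job_id)
        return {"done": False, "status": status}
```

A caller can inspect the returned status and decide whether to call wait_for_job() again or give up.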

get_knowledge_graph()

Get knowledge graph visualization data showing entities and their relationships.
graph = client.get_knowledge_graph(limit=50)
# Returns nodes (entities) and edges (relationships)

Parameters

limit
int
default:"100"
Maximum number of graph elements to return.
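
The returned nodes and edges can be turned into an adjacency map for quick traversal. The exact response shape is not specified here, so this sketch assumes a dict with "nodes" and "edges" lists, where each node has an "id" and each edge has "source" and "target" fields. Those key names, and the build_adjacency helper, are hypothetical and should be adjusted to the actual payload.

```python
from typing import Any, Dict, List


def build_adjacency(graph: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each entity ID to the IDs it points to (assumed key names)."""
    adjacency: Dict[str, List[str]] = {}
    # Register every node first so isolated entities still appear.
    for node in graph.get("nodes", []):
        adjacency.setdefault(node["id"], [])
    # Add one outgoing link per edge.
    for edge in graph.get("edges", []):
        adjacency.setdefault(edge["source"], []).append(edge["target"])
    return adjacency
```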

What happens during insertion

When you call insert(), Vrin:
  1. Chunks the text into overlapping segments optimized for retrieval
  2. Extracts facts — entities, relationships, and attributes using an LLM
  3. Stores facts in the knowledge graph (Neptune) with {model, timestamp, confidence} metadata
  4. Indexes chunks in the vector store (OpenSearch) with BM25 + kNN embeddings
  5. Returns a summary with fact counts and chunk IDs
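
To illustrate what "overlapping segments" means in step 1, here is a simple character-based sliding window. It is only an illustration of the general idea: Vrin's actual chunking strategy, chunk size, and overlap are not documented here, and chunk_text with its size and overlap parameters is hypothetical.

```python
from typing import List


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into windows of `size` characters that share `overlap` characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")

    step = size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once a window has reached the end of the text.
        if start + size >= len(text):
            break
    return chunks


print(chunk_text("abcdefghij", size=4, overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
```

The overlap means a fact that straddles a chunk boundary still appears whole in at least one chunk, which is the usual motivation for this design.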