Platform

One platform for every regulated RAG workflow.

Ingest. Retrieve. Answer. Adjudicate. Audit.

Citorum is a packaged Retrieval-Augmented Generation (RAG) platform built for organizations whose data cannot leave the perimeter. It connects to the document stores you already run, indexes them inside your environment, and answers questions with citations and a faithfulness score on every response.

What Citorum does

A team using Citorum sees four things: a chat interface where they ask questions about their corpus, a document library showing what's been ingested, an admin surface for managing access and tags, and an audit view showing the lineage of any answer.

Underneath, the platform handles five jobs:

  1. Ingest documents from the connectors you already run: direct upload, Server Message Block (SMB) and Network File System (NFS) shares, object storage (Amazon Simple Storage Service (S3)-compatible endpoints, including MinIO, Pure Storage FlashBlade, NetApp StorageGRID, and Dell Elastic Cloud Storage), ownCloud, OpenCloud, Nextcloud, web crawl, audio and video transcription, or the ingestion Application Programming Interface (API).
  2. Index parsed text and metadata into your Postgres database, with embeddings stored in your pgvector indexes and access-control attributes attached to every record.
  3. Retrieve with a hybrid pipeline that runs dense vector similarity, full-text search, and metadata filtering in parallel and rank-fuses the results. The user only sees documents they have permission to see.
  4. Answer by composing the retrieved context into a prompt for the Large Language Model (LLM) running on Graphics Processing Units (GPUs) inside your deployment. The model returns the answer plus citations to the document spans it drew from.
  5. Adjudicate the answer against its sources before returning it to the user. An ensemble of grounding signals plus an independent model judge assigns one of three faithfulness labels: "Verified — Cite Source", "Review Recommended", or "Do Not Rely — Consult Expert". Every per-signal score is captured in the audit log alongside the response.
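
To make the retrieval step concrete, here is a minimal sketch of rank fusion with access-control filtering. It uses reciprocal rank fusion, a common fusion method chosen here for illustration; this page does not say which method Citorum uses. The function name `rank_fuse`, the constant `k=60`, and the document ids are all hypothetical.

```python
from collections import defaultdict


def rank_fuse(ranked_lists, allowed_ids, k=60):
    """Reciprocal rank fusion over several ranked lists of document ids.

    Illustrative only: documents outside allowed_ids are dropped before
    scoring, so a document the user cannot see never reaches the result.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        # Apply the access-control filter first, then rank what remains.
        visible = [doc_id for doc_id in ranking if doc_id in allowed_ids]
        for rank, doc_id in enumerate(visible, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical output of the three retrieval signals for one query.
dense = ["contract-7", "policy-2", "memo-9"]
full_text = ["policy-2", "contract-7", "hr-1"]
metadata = ["policy-2", "memo-9"]

# Hypothetical set of documents this user is permitted to see.
allowed = {"contract-7", "policy-2", "memo-9"}

fused = rank_fuse([dense, full_text, metadata], allowed)
print(fused)  # ['policy-2', 'contract-7', 'memo-9']
```

In this sketch `hr-1` matches the full-text query but is outside the user's access scope, so it is removed before fusion and cannot influence the ranking.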

How the architecture works →

Capabilities

What you can build on it

Knowledge Q&A

Ask questions across your full corpus — contracts, policies, clinical references, regulatory documents, customer files. Get cited answers with confidence labels.

Discovery & research

Surface the documents that matter for a brief, an audit, an investigation, or a clinical review. Faceted filtering on tags, dates, and access scope.

Drafting & summarization

Produce first drafts grounded in your corpus, with every claim traceable back to source spans. Particularly strong for compliance memos and regulatory filings.

Audio & video ingestion

Transcribe and index meeting recordings, depositions, clinical dictations, and earnings calls. Same retrieval, citations, and adjudication as text documents.

Crawled web sources

Pull regulatory bulletins, case law, vendor documentation, and competitor filings into the same index, with provenance attached at ingestion.

Embedded in your application

Use Citorum as a backend for an internal tool you build yourself, via the RAG API. Same isolation, same audit, same faithfulness scoring.

What runs where

Three deployment topologies are available, all keeping the corpus and the inference path inside the customer's control boundary: on-premises on customer-owned hardware; inside a customer-owned Virtual Private Cloud (VPC) on Amazon Web Services, Microsoft Azure, Google Cloud Platform, or Oracle Cloud Infrastructure; or as a dedicated tenant that Citorum manages on the customer's behalf in a Citorum-operated account.

The architectural shape is identical in all three modes. Topology only changes who operates the runtime; it does not change how the system works. Identity stays in your Identity Provider, encryption keys stay in your Key Management Service, audit logs export to your Security Information and Event Management (SIEM) system, and the model runs on hardware you control.

Architecture & Security →

15+ document connectors out of the box
Direct upload, SMB/NFS, object storage, ownCloud, OpenCloud, Nextcloud, web crawl, transcription, ingestion API

3 retrieval signals combined per query
Dense vector, full-text, and metadata, rank-fused with access-control filtering

5 faithfulness signals per answer
Grounding, contradiction, citation, and independent-judge checks combined per answer

100% of prompts, responses, and signal scores logged
With chain-of-custody metadata; default seven-year retention
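
As a rough illustration of how per-signal scores could map to a faithfulness label and travel with the response into the audit log, consider the sketch below. The thresholds, the "weakest signal decides" rule, the score scale, and the record fields are assumptions made for this example; they are not Citorum's actual scoring logic or log schema.

```python
import json

# The three labels named on this page.
LABELS = (
    "Verified — Cite Source",
    "Review Recommended",
    "Do Not Rely — Consult Expert",
)


def assign_label(signals, verified_at=0.8, reject_below=0.5):
    """Map per-signal scores in [0, 1] (higher is better) to a label.

    Hypothetical rule: the weakest signal decides, so a single failing
    check is enough to downgrade the answer.
    """
    weakest = min(signals.values())
    if weakest >= verified_at:
        return LABELS[0]
    if weakest >= reject_below:
        return LABELS[1]
    return LABELS[2]


def audit_record(prompt, response, signals):
    """Bundle the prompt, response, label, and every per-signal score."""
    return {
        "prompt": prompt,
        "response": response,
        "label": assign_label(signals),
        "signals": dict(signals),
    }


# Hypothetical scores for one answer; the signal names are illustrative.
signals = {"grounding": 0.91, "contradiction": 0.88, "citation": 0.62, "judge": 0.85}
record = audit_record("What is the notice period?", "30 days [contract-7, p. 4]", signals)
print(json.dumps(record, indent=2, ensure_ascii=False))
```

Here the citation score of 0.62 falls between the two assumed thresholds, so the answer is labeled "Review Recommended" even though the other signals are strong.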

See it on your corpus.

A pilot starts with a sample of your documents and a list of the questions your team most often needs answered. We stand the platform up inside your environment, you run real questions through it, and your security team gets the documentation it needs to sign off.