M-Files, Box, SharePoint Copilot, and Hyland all share the same architectural constraint: a file storage system built in a different era with AI features layered on top. Layered AI produces generic summaries from raw files. Native AI produces structured intelligence from every field of every document — something you can actually query, export, and act on.
You cannot retrofit the AI.DI Document Warehouse onto a file storage system. The warehouse is not a feature. It is the foundation. Building it requires starting over. None of them will.
Every enterprise runs a document system and a data warehouse in parallel. The document system stores files. The data warehouse stores numbers. The information in the document — the obligation term, the coverage ratio, the commitment date, the counterparty agreement — lives in neither system. It is trapped in the PDF.
Abstract.DI ends this permanently. Every document becomes a structured database record the moment it enters the platform. The PDF is the backup. The warehouse row is the truth.
Enterprise AI deployments fail for a predictable reason: the documents feeding the model are unverified, duplicated, and structurally inconsistent. Copilot hallucinates because SharePoint is untrustworthy. The model is not the problem. The data is.
Sentry certifies every document before it enters the AI pipeline. An AI agent using AI.DI as its knowledge base cannot be given a falsified document — the fingerprint will not match. Every answer is traceable to a specific certified version with a confidence score. That is a different category of AI deployment entirely.
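The certification check can be sketched as a deterministic content-hash comparison. This is illustrative only: SHA-256 stands in for Sentry's actual fingerprinting scheme (patent pending and not public), and all function and variable names here are hypothetical.

```python
import hashlib

def fingerprint(content: bytes) -> str:
    """Deterministic content fingerprint (illustrative: SHA-256 stands in
    for Sentry's proprietary scheme)."""
    return hashlib.sha256(content).hexdigest()

def verify(content: bytes, certified_fingerprint: str) -> bool:
    """A document enters the AI pipeline only if its fingerprint
    matches the certified record."""
    return fingerprint(content) == certified_fingerprint

original = b"Coverage ratio: 1.85 as of 2024-12-31"
cert = fingerprint(original)

assert verify(original, cert)                                      # untouched document passes
assert not verify(b"Coverage ratio: 1.86 as of 2024-12-31", cert)  # single-character change fails
```

Because the hash is computed over content, even a one-character edit produces a completely different fingerprint, which is what makes a falsified document detectable regardless of how plausible it looks.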
Documents flow into Document Gateway. Abstract.DI extracts intelligence from every one. Sentry fingerprints and certifies them. The Document Warehouse stores all of it as structured, queryable data. The Warehouse improves Abstract.DI model accuracy. Better accuracy improves Sentry signals. Better signals make Document Gateway more valuable. More value drives more documents. After 18 months, switching costs are effectively permanent — and accuracy measurably exceeds any out-of-the-box alternative.
Competitors have slides about AI. AI.DI has 200+ React/TypeScript components, 29 live serverless edge functions, an ML Learning Studio with 30 self-improving engines, an MCP server, an AI Agent Gateway connecting to Claude/Copilot/GPT-4/Gemini, and a production AI.DI Studio running 27 active AI engines. The gap between what competitors promise and what we have already shipped is measured in years of engineering. This is the unfair advantage that cannot be purchased with a VC round.
The HITL Reduction AI engine monitors all other engines' human review rates and autonomously moves classifications to auto-approve when confidence consistently exceeds configurable thresholds. Standard document types trend toward zero human intervention at 12 months. Novel or edge-case documents always retain human oversight — the goal is the right humans reviewing the right exceptions, not zero humans.
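The promotion rule can be sketched as a rolling-confidence check per document type. The threshold, window size, and class names below are assumptions chosen for illustration, not product defaults.

```python
from collections import deque

class HitlPromoter:
    """Illustrative sketch of the HITL Reduction meta-engine's promotion rule:
    a document type is auto-approved only after a full window of
    consistently high-confidence classifications."""

    def __init__(self, threshold: float = 0.97, window: int = 200):
        self.threshold = threshold
        self.window = window
        self.scores: dict[str, deque] = {}  # doc_type -> recent confidence scores

    def record(self, doc_type: str, confidence: float) -> None:
        self.scores.setdefault(doc_type, deque(maxlen=self.window)).append(confidence)

    def route(self, doc_type: str) -> str:
        recent = self.scores.get(doc_type, deque())
        if len(recent) == self.window and min(recent) >= self.threshold:
            return "auto-approve"   # consistently high confidence over the full window
        return "human-review"       # novel or low-confidence types stay with humans

p = HitlPromoter(threshold=0.97, window=5)
for _ in range(5):
    p.record("invoice", 0.99)
p.record("novel-filing", 0.80)

assert p.route("invoice") == "auto-approve"
assert p.route("novel-filing") == "human-review"
```

Using `min(recent)` rather than a mean makes the rule conservative: a single low-confidence classification inside the window keeps the type in human review.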
Every legacy DMS has fixed classification models requiring expensive, time-consuming retraining. AI.DI's ML Learning Studio inverts this entirely — 30 engines improving continuously from production data, automatically, without engineering intervention. AI.DI gets cheaper and more accurate at scale. Every competitor's cost stays flat or increases.
| Tier | Focus | Example Engines | HITL Trajectory |
|---|---|---|---|
| Tier 1 — Foundation | Document type classification | Enterprise Type Classifier, PE Type Classifier, Legal Type Classifier | Near-zero for covered types |
| Tier 2 — Entity | Named entity extraction | Party Extractor, Property Identifier, Fund/Entity Linker | 5–15% at 6 months |
| Tier 3 — Date & Validity | Temporal signal extraction | Expiration Detector, Effective Date Parser, Renewal Classifier | Near-zero for standard formats |
| Tier 4 — Financial | Financial data extraction | Loan Terms Extractor, Critical Data Extract Parser, Appraisal Value Extractor | 10–20% at 6 months |
| Tier 5 — Compliance | Compliance validation | Coverage Gap Detector, Compliance Flag Engine, Signature Validator | 15–25% — domain expertise retained |
| Tier 6 — Cross-Document | Cross-document consistency | Portfolio Benchmark Engine, Anomaly Correlator, Reconciliation Engine | Complex analysis — strategic HITL |
Box cannot rebuild their data model for AI without breaking 150,000 customers. SharePoint's incentive is to protect Teams and Office revenue; Copilot exists to reinforce that bundle, not replace it. M-Files is 2–3 years behind on the data model and the Warehouse layer. Egnyte wins on storage reliability but has no awareness of what documents contain. Every dollar these platforms invest in AI is constrained by the need to not break existing products. That constraint does not exist for AI.DI.
| Capability | AI.DI Platform | Box | SharePoint | M-Files | Egnyte |
|---|---|---|---|---|---|
| Architecture & Philosophy | |||||
| AI native architecture (built for AI, not adapted) | Win 2024-2025. Zero compromise. AI is core, not a wrapper. | Bolt-on | Copilot wrapper | Aino — improving but bolted on | Minimal |
| Zero legacy technical debt | Win No codebase older than 18 months. | 2005 origin | 2001 origin | 2003 origin | 2009 origin |
| Edge compute architecture | Win All compute at edge. Scale to zero or infinity. | None | Azure Functions (partial) | None | None |
| Modular adoption (standalone or full suite) | Win Every engine has standalone value. | Partial | Module-based but complex | Partial | Partial |
| AI & Document Intelligence | |||||
| Structured data extraction from documents | Win Abstract.DI — any type, 94% day one, 100K batch. | None | Basic Copilot extraction | Aino — requires training | None |
| Day one extraction accuracy (no training) | Win 94%+ on prebuilt schemas. No training required. | N/A | N/A | Months of training | N/A |
| GPU accelerated OCR pipeline | Win DocTR — 10-50x speedup on GPU. | None | Azure OCR (limited) | Basic OCR | Basic OCR |
| Batch processing (100K+ archives) | Win 100K-chunk batch. ZIP, Box, SharePoint, S3. | None | None | Limited batch | None |
| 30 self-improving ML engines | Win Continuous production learning. No ML engineers. | None | Generic Copilot | Limited self-learning | None |
| HITL Reduction AI (autonomous meta-engine) | Win Autonomous promotion of high-confidence classifications. | None | None | None | None |
| Trust, Compliance & Security | |||||
| Document fingerprinting (deterministic, patent pending) | Win ~10,000 fingerprint catalog. Zero doc storage. | None | None | None | None |
| Zero document storage compliance model | Win Only fingerprints stored. GDPR minimization by math. | Full storage | Full storage | Full storage | Full storage |
| PII auto detection and redaction pipeline | Win Tokenization pipeline auto redacts at ingestion. | None | Purview (partial) | None | DLP (partial) |
| Fraud / document manipulation detection | Win Deterministic — single character change detectable. | None | None | None | None |
| Blockchain audit trail | Win On chain anchoring. 2,814+ documents on chain. | None | None | None | None |
| Data & AI Infrastructure | |||||
| Structured document intelligence warehouse | Win Every extracted field is a queryable row. Unique. | None | None | None | None |
| Snowflake Data Share (zero ETL) | Win Zero-copy. Join doc intelligence with financial data. | None | None | None | None |
| MCP server for AI agents | Win Production MCP. Claude, Cursor, LangChain — no wrapper. | None | None | None | None |
| Vector embeddings on certified chunks | Win Tied to certified versions. pg_vector native. | None | Azure AI Search (partial) | None | None |
| CTR Score (Continuous Transaction Readiness) | Win Live composite readiness score. Portfolio-wide. | None | None | None | None |
| 27 active AI engines in production | Win AI.DI Studio — live engine map with real time status. | None | None | None | None |
| Deployment & Integration | |||||
| Unlimited hierarchy depth (any org structure) | Win Enterprise → Group → Entity → Asset → Unit. Any depth. | Folders only | Sites/subsites | Metadata based | Folders/workspaces |
| 30 day deployment (no implementation project) | Win 30 days from contract to live. M-Files runs 3–6 months. | Weeks–months | Months–years | 3–6 months typical | Weeks–months |
| Installed base / existing trust relationships | Win 45 FileStar enterprise clients. 20+ year relationships. Zero CAC. | Large (hard to access) | Large (bundled) | Existing clients | Existing clients |
Every file enters a multi-stage AI pipeline before a human sees it. Abstract.DI classifies the document type, extracts key fields, scores confidence at the field level, checks for duplicates, detects anomalies, and routes to the correct steward queue automatically.
Three distribution modes with full audit trails and recipient access controls. Every document leaves the platform certified and tracked.
A purpose-built external submission portal that presents to counterparties as your own branded platform. No account creation required for submitters.
Real time operational view across the entire document corpus — by entity, by division, by document type, or by compliance obligation.
Five-tier organizational hierarchy providing structured, queryable document storage with role-enforced access at every level.
Configurable multi-step approval chains for any document type or business process. Every workflow is auditable end to end.
Every user action in Document Gateway is governed by a five-role permission model enforced at both the application and database layers via Supabase row-level security policies.
Zero legacy code. Entirely 2024 to 2026 stack designed for sub-30-day enterprise deployment.
Single-tenant, multitenant, Azure Cloud, AWS, on-premise, and hybrid deployments are all supported. Any file type. Any industry. Any org size. Average enterprise deployment: 30 days from contract to go-live. No professional services required for standard configurations.
Extracted from every document type regardless of content: node_path, hierarchy_path, doc_type, workflow_status, added_at, original_name, storage_path, period. These fields form the backbone of the Document Warehouse schema and enable cross-entity search across the entire corpus.
Present for documents where Abstract.DI has completed extraction: ai_fields (JSONB), extraction_confidence (numeric 0 to 100), entity_party (primary counterparty), primary_value (lead financial figure), start_date, end_date. These fields are present across a standard deployment corpus of thousands of documents.
Extracted from financial statements, loan documents, and operating reports: coverage ratios, loan-to-value metrics, net operating income, revenue, net income, return metrics, and performance multiples. All numeric fields stored with full precision for direct BI tool consumption without transformation.
Extracted from contracts, utilization reports, and entity records: utilization_rate, total_units, primary_counterparty, anomaly_flag (boolean — AI detected discrepancy). The anomaly flag is computed by comparing extracted values against corpus patterns. A coverage ratio of 0.4 in a corpus where the median is 1.8 raises the flag automatically.
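The median-comparison rule behind the anomaly flag can be sketched in a few lines. The 3x deviation cutoff and the function name are illustrative assumptions; the production engine presumably uses richer corpus statistics than a single median.

```python
from statistics import median

def anomaly_flag(value: float, corpus_values: list[float], max_ratio: float = 3.0) -> bool:
    """Flag a value that deviates from the corpus median by more than
    max_ratio in either direction (the 3.0 cutoff is an illustrative choice)."""
    m = median(corpus_values)
    if m <= 0 or value <= 0:
        return True  # non-positive ratios are anomalous on their face
    ratio = max(value / m, m / value)
    return ratio > max_ratio

coverage_ratios = [1.6, 1.7, 1.8, 1.9, 2.0]   # corpus median = 1.8

assert anomaly_flag(0.4, coverage_ratios)      # 1.8 / 0.4 = 4.5x below median -> flagged
assert not anomaly_flag(1.75, coverage_ratios) # within normal range -> no flag
```

This reproduces the example from the text: a coverage ratio of 0.4 against a 1.8 corpus median raises the flag automatically.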
Abstract.DI maintains per-tenant model statistics that improve continuously as stewards interact with the platform:
Every steward action is a labeled training example. The model does not require separate annotation workflows or data science involvement.
Abstract.DI ships with prebuilt schemas for 5,700+ document types across industries. For document types outside the standard taxonomy, the Custom Schema Builder allows admins to define extraction targets — specify the fields you need, provide 3 to 5 example documents, and the model learns the pattern. No code. No data science team. New document type schemas are typically operational within one business day.
Industry average: 40% or more of files are duplicates. In a corpus that is 40% duplicates, up to 40% of every LLM bill is spent recomputing content the pipeline has already processed. Sentry identifies all duplicates, consolidates to canonical records, preserves all metadata from every duplicate instance, then suppresses duplicates from AI queues. LLM compute costs drop 30 to 50% immediately, without changing a single prompt or model.
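The mechanics can be sketched with exact-hash deduplication, a deliberate simplification of Sentry's fingerprint matching; the file names and toy corpus below are illustrative.

```python
import hashlib

def dedup_savings(files: dict[str, bytes]) -> tuple[int, float]:
    """Group files by content hash. Every file beyond the first in a group
    is a duplicate whose LLM processing cost can be avoided entirely."""
    seen: dict[str, str] = {}   # content hash -> canonical file name
    duplicates = 0
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in seen:
            duplicates += 1     # suppress from AI queues; metadata is still kept
        else:
            seen[digest] = name
    return duplicates, duplicates / len(files)

corpus = {
    "q3_report.pdf":       b"Q3 operating report",
    "q3_report_copy.pdf":  b"Q3 operating report",   # byte-identical duplicate
    "q3_report_final.pdf": b"Q3 operating report",   # byte-identical duplicate
    "loan_agreement.pdf":  b"Loan agreement",
    "appraisal.pdf":       b"Appraisal",
}
dups, fraction = dedup_savings(corpus)
assert dups == 2 and fraction == 0.4   # 40% of the corpus suppressed from AI queues
```

Suppressing those two files means their content is embedded and summarized once rather than three times, which is where the compute savings come from.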
Advanced analytics on document distribution by source, format, status, and metadata profile. Enables evidence based prioritization of remediation, archival, and modernization initiatives.
Provides measurable KPIs for digital transformation monitoring, establishes compliance posture, and ensures data quality maturity. Decision makers can now act on document intelligence rather than document volume.
Enterprise wide visibility into unique and duplicate documents across all repositories. Quantifies storage, compliance, and operational risk exposure from redundant content.
Enables measurable cost reduction through defensible deduplication and lifecycle optimization — removing redundancy without sacrificing auditability or chain of custody.
Unified discovery of unique documents across siloed systems and business applications. Identifies misplaced sensitive content and policy exceptions across environments.
Supports strategic document migration planning, normalization, and information governance initiatives across the full enterprise stack.
Approximately 13 million PubMed abstracts and associated data have been imported, fingerprinted, and organized by publication year. Daily PubMed update files can be ingested and processed on an hourly schedule.
PubMed contains more than 39 million biomedical citations maintained by the National Center for Biotechnology Information at the US National Library of Medicine.
Ready for first Sentry academic, bioscience, biotechnology, and pharmaceutical clients.
Sentry registers, processes, and fingerprints every document flowing through Document Gateway — in both directions. Documents distributed externally are certified before leaving. Documents received are verified on arrival. The entire document corpus becomes a trusted, queryable intelligence layer with measurable readiness scores updated in real time. This integration is the foundation of the Continuous Transaction Readiness score.
The mv_document_universe materialized view provides a single queryable source across all document types and extraction fields simultaneously.

Webhook event stream: document.ingested, document.extracted, anomaly.detected, compliance.updated, sync.completed. Retry logic with exponential backoff. Used for ERP integration and downstream automation.

Python SDK: pip install document-gateway. Pandas-native output — client.query("SELECT * FROM documents") returns a DataFrame directly. Async support. Used in notebooks, data science workflows, and custom analysis scripts.

Query access: materialized views (mv_document_universe) and the Query Engine (PostgREST plus custom SQL).

Every competitor in the document management space stores files. AI.DI stores intelligence. The gap between those two statements is the entire moat. A corpus of 10,000 documents with 18 months of extraction history, anomaly signals, steward corrections, and financial field time-series data cannot be migrated to a competitor in any meaningful timeframe. The data structure is the lock-in — not the contract.
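The delivery semantics of the webhook stream (retry with exponential backoff) can be sketched from the consumer-facing contract. The base delay, growth factor, and retry count below are assumptions, not documented platform defaults, and `deliver` stands in for any HTTP POST to a subscriber endpoint.

```python
def backoff_schedule(base: float = 1.0, factor: float = 2.0, retries: int = 5) -> list[float]:
    """Exponential backoff delays for webhook redelivery
    (base delay, growth factor, and retry count are illustrative)."""
    return [base * factor ** attempt for attempt in range(retries)]

def dispatch(event: dict, deliver) -> bool:
    """Attempt delivery immediately, then retry on the backoff schedule.
    `deliver` is any callable returning True on a 2xx response."""
    for delay in [0.0] + backoff_schedule():
        # a real consumer would time.sleep(delay) here; omitted for the sketch
        if deliver(event):
            return True
    return False   # exhausted retries; event goes to a dead-letter path

attempts = []
def flaky_endpoint(event):
    attempts.append(event["type"])
    return len(attempts) >= 3   # subscriber recovers on the third attempt

assert dispatch({"type": "document.extracted"}, flaky_endpoint)
assert len(attempts) == 3
assert backoff_schedule() == [1.0, 2.0, 4.0, 8.0, 16.0]
```

The event type names (`document.extracted`, etc.) come from the stream description above; everything else in the sketch is an assumed consumer-side implementation.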
Effortlessly capture and ensure accurate classification from any application. FileStar centralizes all document types — paper or electronic — into a unified system with required fields and built-in approval workflows that guarantee consistent, accurate archiving.
FileStar enforces stringent controls and compliance with precision and accountability at every step. Complex workflows can be modeled to exact requirements — sequential and parallel routing, escalation paths, and automated notifications.
Protect your critical documents in a centralized repository with security and compliance built in from the foundation. The Archive is the governed system of record — every version, every action, every access event logged and preserved.
A document schema is the structured framework that defines how documents are identified, categorized, and related to one another. Just as a data warehouse relies on a data schema to bring order to large volumes of information, a document warehouse uses a document schema to create clarity, consistency, and predictable organization across all documents.
FileStar's schema spans more than 5,700 unique document types — giving it deep understanding of documents that support acquisitions, operations, financings, compliance, and every stage of the enterprise lifecycle. The schema automatically knows what a document is, how it should be classified, where it belongs, and what a complete document chain should look like. Documents are no longer scattered or mislabeled — they are organized consistently across systems and ready for audit, operations, and enterprise-wide decision making.
Metadata is to documents what structured fields are to data. FileStar identifies and extracts key attributes — document type, parties, dates, asset identifiers, and relationships — transforming unstructured files into structured intelligence. Without metadata, documents behave like raw data without schema. With metadata, they become organized, trustworthy knowledge assets that support search, governance, compliance, and AI.
Every document in FileStar is a governed asset aligned with a consistent taxonomy and storage structure — searchable through clear logical pathways by type, entity, process, source system, date, or business function. Dynamic views and dashboards give teams visibility into entire document collections, not just isolated files.
Auditability is a defining characteristic of a document warehouse. FileStar records every interaction with every document and preserves the full lineage of a record — from its originating system to every update or review. Auditors can see exactly where a document came from, how it has been handled, and whether it remains complete and accurate.
FileStar also captures the source system, timestamps, authorship, and movement of each document — creating a verified chain of custody. This transparency builds trust across the organization and satisfies regulatory requirements without additional documentation work.
FileStar operates within an SSAE 18 certified hosting facility with annual SOC II audits. Role-based access controls ensure only authorized users and groups can access specific documents. All protocols comply with HIPAA and SOX guidelines for PII and PHI.
Compliance becomes easier when documents follow a consistent structure and lifecycle. FileStar enforces rules for document retention, validation, storage, and access — providing real time visibility into document completeness, timeliness, and accuracy. This makes it simpler to prove adherence to regulations and internal policies, and reduces the risk associated with missing or misplaced documents.
FileStar governs documents. AI.DI makes them intelligent. FileStar managed documents automatically flow through Sentry certification and Abstract.DI extraction without any workflow change for existing users. All FileStar metadata syncs to the AI.DI Warehouse continuously.
Every FileStar client is one conversation away from the full AI.DI platform. No rip and replace. No migration project. No change management crisis. The upgrade path is a configuration change — the governance infrastructure is already in place.
imkore Millennia was founded in 1996 with a focus on tailored document solutions for complex requirements that standard document management software cannot easily meet. The combination of SaaS flexibility with customizable framework design means FileStar can be configured for specific industries, regulatory environments, and workflow structures without professional services for standard deployments.
The agent-gateway edge function receives all AI agent requests and dispatches them to the appropriate tool handlers. It enforces authentication, validates the requesting agent's access scope, applies row-level security policies, and logs every tool invocation for the audit trail.
Supports Bearer token authentication for API clients and session-based auth for browser-connected agents. Rate limiting per API key. Tool-level permission grants — a key can be scoped to read-only document retrieval without access to warehouse queries or compliance data.
A single Supabase Deno edge function serving both the Model Context Protocol (SSE transport for Claude and Cursor) and a REST/OpenAPI interface (for ChatGPT GPT Actions, LangChain, AutoGen, and any HTTP agent).
The same tool definitions, the same security model, the same data — two protocol surfaces from one deployment. ChatGPT integration operational. Deployed with --no-verify-jwt to support custom Bearer token auth independent of Supabase session auth.
Receives inbound webhook events from enterprise ERPs, CRM platforms, and any connected system. Validates payload signatures, routes events to the appropriate pipeline stage, and triggers document processing or metadata updates without human involvement.
When a contract is executed in an ERP, the erp-webhook fires the checkin-pipeline automatically — the document enters the AI extraction queue without anyone touching Document Gateway directly.
Cron-triggered orchestration functions that run batch operations on a configurable schedule. Batch pipeline runs process large document queues during off-peak hours. Scheduled reports generate and distribute compliance summaries, expiry alerts, and portfolio intelligence reports automatically.
No human trigger required for ongoing operations. The platform monitors itself, processes new documents, updates CTR scores, and delivers reports on schedule — continuously.
Access control is not application-layer middleware. Every Supabase table has PostgreSQL row-level security policies that enforce which rows a given user can read, write, or delete — based on their role, their organization, and their specific entity permissions.
An AI agent authenticating with an API key receives exactly the same data access as the human user who created that key — not more, not less. Even if the agent constructs a warehouse query attempting to access data outside its scope, PostgreSQL silently returns only authorized rows. The restriction is invisible to the caller and unbypassable by any query construction.
Every API key is scoped to a specific user, organization, and permission set at creation time. Keys can be restricted to specific tools, specific entities, or read-only operations.
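The key-scoping rule can be sketched as a simple grant check. The field names and scope model below are assumptions for illustration; in the real platform the final enforcement happens in PostgreSQL row-level security, as described above, with this check as the application-facing layer.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ApiKey:
    """Illustrative key scope model (field names are assumptions,
    not the platform's actual schema)."""
    user_id: str
    org_id: str
    tools: frozenset
    read_only: bool = True

def authorize(key: ApiKey, tool: str, write: bool = False) -> bool:
    """A tool call is allowed only if the tool is in the key's grant
    and the key permits the requested access mode."""
    if tool not in key.tools:
        return False
    if write and key.read_only:
        return False
    return True

key = ApiKey("u_42", "org_7", frozenset({"search_documents", "get_document_url"}))

assert authorize(key, "search_documents")                   # granted tool, read access
assert not authorize(key, "query_warehouse")                # tool not in the key's grant
assert not authorize(key, "search_documents", write=True)   # read-only key refuses writes
```

The point of the two-layer design is that even if this application check were bypassed, the database's row-level security would still return only the rows the key's owning user is authorized to see.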
Every enterprise deploying Copilot, GPT-4, Claude, or Gemini on their documents faces the same problem: the AI is only as good as the data it reasons from. Uncertified documents produce hallucinated answers. Unstructured files produce generic summaries. AI.DI is the certified, structured document foundation that transforms any LLM from a document summarizer into a reliable enterprise intelligence system.
AI.DI gives your organization Continuous Transaction Readiness — the state where every document across every system is accessible, authentic, current, and actionable at all times. Organizations that achieve this state lower their cost of capital, reduce audit risk, accelerate transactions, deploy AI with confidence, and eliminate the document scramble that precedes every critical business event.
Every document management platform ever built — M-Files, Hyland, Box, SharePoint, Laserfiche, OpenText — operates on the same model: a human asks a question, the system returns a file. The documents do not know they are incomplete. The system does not know a transaction is approaching. No one is told what is missing until the moment it matters.
CTR inverts this model. The platform continuously monitors the entire document estate against a dynamic requirement model, scores readiness in real time, and surfaces gaps before they become crises. The difference between reactive retrieval and proactive readiness is the difference between document management and document intelligence.
To calculate a CTR score, you need to know: which documents are required, which are present, which are valid, which are current, which have changed, and which are expired. A file storage system knows none of this. It knows filenames and folder paths.
AI.DI knows all of this because Abstract.DI has read every document, Sentry has fingerprinted and certified every document, and the Warehouse stores every extracted field — including expiry dates, version identifiers, compliance flags, and obligation terms — as queryable structured data. CTR is computed from that data continuously. No competitor has that data. None can build it without starting over.
Every organization faces recurrent high-stakes document events: regulatory audits, financing processes, M&A due diligence, partner onboarding, contract renewals, compliance filings, board reviews. In every case, the weeks before the event are consumed by document scramble — finding files, verifying versions, hunting for missing certificates, correcting outdated records.
CTR eliminates that scramble permanently. The organization is ready before the event is announced. That is not an incremental improvement. It is a fundamentally different value proposition — one that no existing platform can match because none of them understand what their documents say.
| Score | Status | Typical Situation | Time to Transact |
|---|---|---|---|
| 90–100 | Transaction Ready | All documents present, current, and certified. No violations. Counterparty package deployable in hours. | 48 hours |
| 75–89 | Near Ready | 1–3 documents missing or expiring. No active violations. Gaps identified and assigned. | 1–5 business days |
| 55–74 | Attention Required | Multiple gaps or 1–2 violations. Transaction possible but counterparty will surface issues. | 2–4 weeks |
| 35–54 | Not Ready | Significant document gaps. Will not survive regulatory or counterparty diligence in current state. | 30–60 days |
| 0–34 | Critical | Severely incomplete or noncompliant. Immediate remediation required across multiple dimensions. | 90+ days |
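The score bands above map directly to a lookup, sketched here for readers wiring CTR scores into downstream dashboards or alerts. The band names and cutoffs come straight from the table; the function itself is illustrative, not a platform API.

```python
def ctr_band(score: int) -> str:
    """Map a CTR score (0-100) to its readiness band, per the published table."""
    if score >= 90:
        return "Transaction Ready"
    if score >= 75:
        return "Near Ready"
    if score >= 55:
        return "Attention Required"
    if score >= 35:
        return "Not Ready"
    return "Critical"

assert ctr_band(94) == "Transaction Ready"
assert ctr_band(75) == "Near Ready"       # band boundaries are inclusive at the bottom
assert ctr_band(60) == "Attention Required"
assert ctr_band(40) == "Not Ready"
assert ctr_band(12) == "Critical"
```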
| Capability | M-Files / Hyland / OpenText | Box / SharePoint | AI.DI |
|---|---|---|---|
| Real time readiness score | None | None | CTR Score — continuous |
| Automatic gap detection | Manual checklist | None | Continuous AI monitoring |
| Document content intelligence | Metadata tags only | None | Full field extraction |
| Expiry and validity tracking | Manual with reminders | None | Automated from extracted dates |
| Counterparty package readiness | Manual assembly | Manual assembly | Pre-assembled, certified |
| Compliance posture visibility | Periodic reports | None | Continuous, real time |
| AI-ready data foundation | Raw files only | Raw files only | Certified structured data |
| Version certification | Version numbers only | Version numbers only | Sentry fingerprint certified |
AI.DI is not a document management UI with an API bolted on. It is a document intelligence data platform: a PostgreSQL warehouse of structured document intelligence, an MCP server, a webhook event stream, a REST/GraphQL API, Snowflake Data Share, JDBC/ODBC direct access, vector embeddings on certified document chunks, and a 30-engine ML pipeline that improves continuously. Every document becomes structured, provenance tracked, certified data — available to any model, pipeline, or analytics tool you're running.
| Table | Contents | Key Fields | Primary Use |
|---|---|---|---|
| document_records | Every document processed | id, original_name, document_type, workflow_status, asset_id, classification_confidence, storage_path | Document inventory, classification analysis |
| extracted_fields | Structured extraction from Abstract.DI | document_id, field_name, field_value, confidence_score, extraction_model, extraction_timestamp | Contract analytics, financial extraction |
| sentry_fingerprints | Cryptographic fingerprint records | document_id, fingerprint_hash, fingerprint_type, certified_at, version_chain, similarity_scores | Certification, duplicate detection, fraud monitoring |
| hierarchy_nodes | Full org hierarchy | id, parent_id, node_type, node_name, industry, ctr_score, completeness_pct | Portfolio analytics, CTR aggregation |
| document_activity_log | Every action on every document | document_id, event_type, actor_id, actor_role, timestamp, metadata | Audit trail, access pattern analysis |
| vector_embeddings | Embeddings on certified chunks | document_id, chunk_id, certified_version_hash, embedding_vector, model_version | Semantic search, RAG retrieval, clustering |
| ctr_score_history | CTR Score time series | node_id, score, dimension_scores, calculated_at, delta_from_prior | Readiness trending, portfolio benchmarking |
| Department | Acute Pain | AI.DI Entry Product | Expansion Path |
|---|---|---|---|
| Legal / GC | Contract version disputes, discovery liability, GDPR compliance | Sentry certification + Document Gateway distribution | Full Document Warehouse for corporate legal corpus |
| Finance / Accounting | Audit prep fire drills, financial document reconciliation | Abstract.DI batch (financial extraction) + Blueprint audit | Sentry certification + Warehouse integration to ERP |
| Compliance / Risk | Regulatory filing tracking, compliance gaps, audit exposure | Sentry + Warehouse (compliance corpus) + CTR Score | Full platform across regulated document types |
| Transactions / Deal Team | Due diligence prep time, data room chaos | Document Gateway + Distribution Studio + Transaction Rooms | Abstract.DI batch for portfolio wide extraction |
| IT / Data Engineering | Unstructured data not in Snowflake; LLM hallucinations | Document Warehouse + Snowflake + MCP Server | Full platform as enterprise document intelligence backbone |
| Operations / HR | Employee records, policy tracking, onboarding compliance | FileStar lifecycle governance + Abstract.DI HR extraction | Sentry certification + Document Gateway policy distribution |
The world's largest institutional real estate portfolios run on the same platform as a 12-asset regional operator starting their first compliance program. A single compliance officer in one department gets the same AI intelligence, the same CTR Score, the same Warehouse, the same MCP server as a 500-person investment management firm running 20 funds. We built for scale from day one — which means the smallest client gets the most powerful platform available at any price point. No feature tiers. No locked capabilities. No "upgrade to get the real thing."
Blueprint evaluates your entire document ecosystem — every repository, every system, every process — and delivers a scored readiness assessment and a prioritized AI.DI product roadmap. Blueprint invariably reveals exactly which products the client needs and why. The roadmap we deliver IS the AI.DI implementation plan for your organization.
You get the full platform from the moment you deploy — every engine, every view, every integration. There are no feature gates, no capability tiers, and no "enterprise unlock" for core functionality. Your first document gets the same AI pipeline as document number one million. We believe you should see the full value immediately, not earn access to it through a ramp-up process.
No. AI.DI layers over your existing infrastructure. Start with your highest-priority asset group or begin fresh with new documents. There is no requirement to migrate your entire historical archive before going live. The batch engine can process any legacy archive on its own timeline — you decide when and what to bring in.
Sentry generates a mathematical fingerprint — a unique hash derived from document content. Two identical documents always produce identical fingerprints. Any change produces a different fingerprint. The original document is never stored by Sentry. GDPR data minimization is achieved structurally — your documents never leave your control.
The MCP server exposes 6 tools: search_documents, get_compliance_status, get_obligations, query_warehouse, get_hierarchy, get_document_url. Add AI.DI to Claude, Cursor, LangChain, AutoGen, or any MCP-compatible environment and your agents immediately have certified document search and structured extraction queries. Authentication via OAuth2 — agents only access what the connecting user is authorized to see. Keys are revocable instantly.
Yes. Full platform via Docker containers — no Kubernetes required. Azure Cloud, AWS, fully on premise, and hybrid (metadata in cloud, documents on-prem) are all supported. Air-gapped environments with no internet connectivity are also supported. Contact the enterprise team for deployment architecture details.
Snowflake Data Share (zero copy, no ETL), Databricks connector (Delta Lake, streaming), Tableau and Power BI native connectors, dbt compatibility, BigQuery export, direct JDBC/ODBC access, REST API with OpenAPI 3.0 spec, Python SDK, and webhook event streaming to any HTTP endpoint. SSO via SAML 2.0 and OAuth 2.0.