Enterprise RAG · UK | Bid & Commercial | ~500 internal users | SharePoint-permissioned | Client under NDA

Corporate RAG — a permission-scoped knowledge platform for ~500 internal users

A UK building repair, maintenance and refurbishment group (under NDA) ran their bid and commercial work on terabytes of past projects scattered across SharePoint. Estimators routinely spent ~15 minutes hunting for the one analogous job that would let them scope and price a new client request. We built a permission-scoped RAG platform on AWS Kendra, Bedrock and OpenFGA that turned that hunt into a question and an answer. Then the rest of the business found it.

~150×: Faster doc search
~500: Internal users
TB: Indexed corpus
1: Permissioned source of truth

Client

A UK building repair, maintenance and refurbishment group (under NDA). Project delivered for the bid & commercial department, with role-based access for ~500 internal users across departments.

Engagement

Discovery → PoC → full internal portal → expansion into broader internal use cases. Delivered as a self-hosted platform on the client's AWS estate (eu-west-1 / eu-west-2).

Terabytes of project history, none of it findable when a bid was on the clock.

The bid team is responsible for scoping and pricing repair, maintenance and refurbishment work across the UK. Every new request from a client is, in practice, a question against the firm's own history: have we done this type of building before, in this region, at this scale — and what did it cost and how did it go? The answer was always somewhere in SharePoint: in tender packs, scopes of work, site visit notes, completion certificates, meeting summaries, letters, photo packs. The estimators just couldn't find it inside the time budget of a bid.

Two constraints turned this from "ship a RAG demo" into a real engineering problem. First: permissions. Not every estimator can see every project. SharePoint already encoded an ACL graph by department, role, and folder — and the platform had to respect it. A RAG system that returns content the user wasn't allowed to see is not a productivity tool; it's a data incident. Second: provenance. A pricing decision based on a hallucinated past project is worse than no answer at all. Every claim had to point back to a specific document at a specific page in SharePoint, behind the same login the user already uses.

Permissions before retrieval. ReAct agent with CRAG grading. Schema-typed answers with SharePoint citations.

Five layers, each tied to a specific failure mode of "RAG over SharePoint" in real enterprise life.

Identity and authorization, not just an API key. Every request enters through Keycloak SSO (the client's existing IdP). On top of Keycloak we layered OpenFGA for fine-grained authorization: users belong to departments, departments are granted access to applications and document scopes, and roles (admin, member, super_admin) gate write vs. read. Retrieval is filtered before the LLM sees anything — Kendra results are intersected with the FGA decision graph for the current user.
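A minimal sketch of the "intersect before the LLM sees anything" step, assuming the set of document IDs the user may read has already been resolved from OpenFGA. The type and function names (RetrievedChunk, intersectWithGrants) and the ID format are hypothetical, not taken from the Kendra or OpenFGA SDKs.

```typescript
// Hypothetical shape of a retrieved chunk; not an SDK type.
interface RetrievedChunk {
  documentId: string; // assumed to match the FGA object id, e.g. "document:tender-17"
  excerpt: string;
  score: number;
}

// Keep only chunks whose document the current user is allowed to read.
// Fails closed: a chunk with an empty or unknown document id is dropped.
function intersectWithGrants(
  chunks: RetrievedChunk[],
  allowedDocumentIds: ReadonlySet<string>,
): RetrievedChunk[] {
  return chunks.filter(
    (chunk) => chunk.documentId !== "" && allowedDocumentIds.has(chunk.documentId),
  );
}

// Made-up example data for illustration only.
const candidates: RetrievedChunk[] = [
  { documentId: "document:tender-17", excerpt: "Scope of works", score: 0.91 },
  { documentId: "document:hr-review-4", excerpt: "Salary bands", score: 0.88 },
  { documentId: "", excerpt: "Orphaned chunk", score: 0.8 },
];
const grants = new Set<string>(["document:tender-17", "document:sow-3"]);

const visible = intersectWithGrants(candidates, grants);
console.log(visible.map((chunk) => chunk.documentId));
```

Dropping anything that cannot be matched to a grant keeps the failure mode on the safe side: a missing ID costs recall, not confidentiality.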

Structured ingestion of real-world office documents. Past projects live as PDFs, PPTX slide decks, DOCX tender responses, XLSX cost models, and email archives. We built an AWS Lambda ingestion pipeline with dedicated layers per format (pdf-parse / pdf-lib for PDF, node-pptx-parser for slides, LibreOffice for legacy formats, an XLSX layer for cost models). Documents land in S3, are parsed into Kendra with metadata preserved (department, project ID, document type, SharePoint URL, owner, last-updated date), and stay re-syncable from SharePoint.
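The per-format routing described above can be sketched as a lookup from file extension to parser layer. This is an illustration under stated assumptions: the layer labels, the choice of .doc / .ppt / .xls as the "legacy formats", and the "unrouted" fallback are hypothetical, and the metadata field names simply mirror the list in the text.

```typescript
// Hypothetical labels for the Lambda layers named in the text.
type ParserLayer = "pdf" | "pptx" | "xlsx" | "libreoffice" | "unrouted";

// Metadata preserved alongside each indexed document (field names are assumptions).
interface IndexMetadata {
  department: string;
  projectId: string;
  documentType: string;
  sharePointUrl: string;
  owner: string;
  lastUpdated: string; // ISO 8601 date
}

// Assumption: "legacy formats" means the pre-2007 Office extensions.
const LAYER_BY_EXTENSION = new Map<string, ParserLayer>([
  ["pdf", "pdf"],
  ["pptx", "pptx"],
  ["xlsx", "xlsx"],
  ["doc", "libreoffice"],
  ["ppt", "libreoffice"],
  ["xls", "libreoffice"],
]);

// Pick a parser layer from the file name; anything unrecognised is left unrouted
// rather than guessed at.
function pickParserLayer(fileName: string): ParserLayer {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) return "unrouted";
  const extension = fileName.slice(dot + 1).toLowerCase();
  return LAYER_BY_EXTENSION.get(extension) ?? "unrouted";
}

// Made-up example record for illustration only.
const exampleMetadata: IndexMetadata = {
  department: "bid-commercial",
  projectId: "P-0001",
  documentType: "SOW",
  sharePointUrl: "https://example.sharepoint.com/sites/projects/P-0001/sow.pdf",
  owner: "estimating",
  lastUpdated: "2024-01-15",
};

console.log(pickParserLayer("Tender Pack.PDF"), exampleMetadata.documentType);
```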

Retrieval as a ReAct agent, not a single embedding call. Every question flows through an intent classifier (knowledge query, small talk, clarification, off-topic, meta, ambiguous) and a query analyzer that scores complexity, extracts entities and identifies temporal references. Complex multi-part questions are routed through a query decomposer that emits ordered sub-queries with dependency types (sequential, conditional, contextual). The ReAct agent then runs thought–action–observation loops over Kendra, with a CRAG-style document grader that labels each candidate highly_relevant / partially_relevant / not_relevant and either proceeds, rewrites the query and retries, or returns nothing.
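The proceed / rewrite / return-nothing behaviour of the grader can be illustrated with a bounded loop. This is a sketch, not the production agent: the decision policy (proceed on any highly_relevant hit, otherwise rewrite while attempts remain), the Tools interface and the synchronous stubs are all assumptions made for the example. Real retrieval and grading calls would be asynchronous.

```typescript
type Grade = "highly_relevant" | "partially_relevant" | "not_relevant";
type Decision = "proceed" | "rewrite" | "refuse";

interface Candidate {
  id: string;
  text: string;
}

// Hypothetical dependencies, injected so the loop can be run with stubs.
interface Tools {
  retrieve(query: string): Candidate[];
  grade(query: string, doc: Candidate): Grade;
  rewrite(query: string): string;
}

// Assumed policy: one strong hit is enough; otherwise retry until the budget runs out.
function decide(grades: Grade[], attempt: number, maxAttempts: number): Decision {
  if (grades.some((g) => g === "highly_relevant")) return "proceed";
  return attempt < maxAttempts ? "rewrite" : "refuse";
}

function retrieveWithGrading(
  query: string,
  tools: Tools,
  maxAttempts = 3,
): { decision: "proceed" | "refuse"; docs: Candidate[]; finalQuery: string } {
  let current = query;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const graded = tools
      .retrieve(current)
      .map((doc) => ({ doc, grade: tools.grade(current, doc) }));
    const decision = decide(graded.map((g) => g.grade), attempt, maxAttempts);
    if (decision === "proceed") {
      // Pass on everything except documents graded not_relevant.
      const docs = graded.filter((g) => g.grade !== "not_relevant").map((g) => g.doc);
      return { decision, docs, finalQuery: current };
    }
    if (decision === "refuse") break;
    current = tools.rewrite(current);
  }
  return { decision: "refuse", docs: [], finalQuery: current };
}

// Stub tools with made-up behaviour: the first query is too vague, the rewrite hits.
const stubTools: Tools = {
  retrieve: (q) => [{ id: q === "roof job" ? "doc-vague" : "doc-match", text: q }],
  grade: (_q, doc) => (doc.id === "doc-match" ? "highly_relevant" : "partially_relevant"),
  rewrite: (q) => `${q} flat roof refurbishment`,
};

const outcome = retrieveWithGrading("roof job", stubTools);
console.log(outcome.decision, outcome.finalQuery);
```

Bounding the loop matters as much as the grading itself: without a retry budget, a query with no good answer in the corpus would rewrite forever instead of returning nothing.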

Schema-typed answers with inline citations. Generation runs on AWS Bedrock (Anthropic Claude Sonnet, eu-west-1) via the Vercel AI SDK. The response is a Zod-typed object with two parts. The first is aiAnswer: text with inline citation markers [1] [2], backed by a references array of (title, SharePoint URL, page number). The second is documentResults: the supporting documents with type (MSA / SOW / Contract / Proposal / Report), owner, last-updated date and relevance percentage. The UI renders citations as clickable SharePoint links that respect the same Keycloak session, so provenance is one click away, behind the same authorization the user already has.
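To make the response shape concrete, here is a dependency-free sketch that mirrors it with plain TypeScript interfaces instead of Zod, plus a check that every inline marker resolves to a reference. It assumes references and documentResults sit next to aiAnswer as sibling keys and that marker [n] points at references[n - 1]. The nested field names and the danglingCitations helper are hypothetical.

```typescript
// Plain TypeScript mirror of the described response. Field names other than
// aiAnswer, references and documentResults are illustrative assumptions.
interface Reference {
  title: string;
  sharePointUrl: string;
  page: number;
}

interface DocumentResult {
  title: string;
  type: "MSA" | "SOW" | "Contract" | "Proposal" | "Report";
  owner: string;
  lastUpdated: string; // ISO 8601 date
  relevancePercent: number;
}

interface RagAnswer {
  aiAnswer: string;
  references: Reference[];
  documentResults: DocumentResult[];
}

// Return every citation number in aiAnswer that has no matching reference.
// Assumes marker [n] points at references[n - 1].
function danglingCitations(answer: RagAnswer): number[] {
  const marker = /\[(\d+)\]/g;
  const dangling: number[] = [];
  let match: RegExpExecArray | null;
  while ((match = marker.exec(answer.aiAnswer)) !== null) {
    const n = Number(match[1]);
    const missing = n < 1 || n > answer.references.length;
    if (missing && dangling.indexOf(n) === -1) dangling.push(n);
  }
  return dangling;
}

// Made-up example: marker [3] has no entry in references.
const sampleAnswer: RagAnswer = {
  aiAnswer: "A comparable roof refurbishment was completed [1]; see also the cost model [3].",
  references: [
    { title: "Completion certificate", sharePointUrl: "https://example.sharepoint.com/doc-a", page: 2 },
  ],
  documentResults: [],
};

console.log(danglingCitations(sampleAnswer));
```

A check like this complements schema validation: the schema guarantees the fields exist, while the marker check guarantees the text and the reference list agree with each other.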

Quality guards on the way in and on the way out. A prompt-injection / input-sanitization layer guards the prompt path. After generation, a response validator and a hallucination checker cross-reference each claim against the retrieved evidence and grade grounding per claim. The whole pipeline is wired into a promptfoo eval harness — 30+ use cases (document discovery, email summarise / respond / draft, SOW creator, business case writer, social posts, press releases, etc.) with rubric-as-code LLM-judge assertions running on every change.
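Per-claim grounding can be illustrated with a deliberately naive stand-in. The sketch below uses lexical token overlap between each sentence of the answer and the retrieved evidence. That is a swapped-in technique chosen to keep the example self-contained, not a description of how the platform's checker works. The function names, the 0.6 threshold and the four-character token filter are all assumptions.

```typescript
interface ClaimGrade {
  claim: string;
  support: number; // fraction of the claim's content tokens found in the evidence
  grounded: boolean;
}

// Lowercase, split on non-alphanumerics, and drop short tokens as a crude stop-word filter.
function contentTokens(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 3);
}

// Treat each sentence as one claim and score it against the pooled evidence.
function gradeClaims(answer: string, evidence: string[], threshold = 0.6): ClaimGrade[] {
  const evidenceTokens = new Set<string>(contentTokens(evidence.join(" ")));
  return answer
    .split(/[.!?]+(?:\s+|$)/)
    .map((sentence) => sentence.trim())
    .filter((sentence) => sentence.length > 0)
    .map((claim) => {
      const tokens = contentTokens(claim);
      const hits = tokens.filter((token) => evidenceTokens.has(token)).length;
      const support = tokens.length === 0 ? 0 : hits / tokens.length;
      return { claim, support, grounded: support >= threshold };
    });
}

// Made-up evidence and answer for illustration only.
const evidenceChunks = [
  "Completion certificate: flat roof replaced at the school in 2019 using single-ply membrane.",
];
const draftAnswer =
  "The flat roof at the school was replaced in 2019. The contractor won an industry award.";

const claimGrades = gradeClaims(draftAnswer, evidenceChunks);
for (const grade of claimGrades) {
  console.log(grade.grounded ? "grounded  " : "ungrounded", grade.claim);
}
```

Token overlap misses paraphrase and negation, so it is only useful as a cheap first filter. The point of the sketch is the shape of the output: one grade per claim, rather than one score per answer.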

Architecture (data flow)

1. Identity: Keycloak SSO → OpenFGA decision (user × department × application × document scope)
2. Ingestion: SharePoint → S3 → AWS Lambda layers (pdf-parse, pdf-lib, node-pptx-parser, xlsx, LibreOffice)
3. Index: AWS Kendra (eu-west-2), with metadata: department, project, doc type, SharePoint URL, owner
4. Intent: 6-way classifier → knowledge_query / small_talk / clarification / off_topic / meta / ambiguous
5. Analyze: Query complexity + named-entity + temporal extractor
6. Decompose: Multi-hop questions → sub-queries with dependency types (sequential / conditional / contextual)
7. Retrieve + Grade: Kendra retrieval ∩ FGA-allowed → CRAG grader (relevance + confidence) → proceed / rewrite / refuse
8. ReAct loop: Agent decides: search / rewrite / grade / generate, with bounded iterations
9. Generate: Bedrock Claude Sonnet (eu-west-1) via Vercel AI SDK → Zod schema (aiAnswer + references + documentResults)
10. Validate: Prompt-injection guard · response validator · hallucination checker · promptfoo CI gates

AWS Bedrock (Claude Sonnet) | AWS Kendra | OpenFGA | Keycloak SSO | NestJS 11 | AWS Lambda | AWS S3 | AWS SES | Vercel AI SDK | Prisma + MongoDB | Zod schemas | promptfoo | React 19 + Vite | LibreOffice | md-to-pdf · remark-docx · pdf-lib | Jira API | TypeScript | Docker · ECR
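The Decompose step in the data flow above produces sub-queries that depend on one another, so they have to run in dependency order. A minimal sketch of that ordering follows, assuming each sub-query depends on at most one other. The SubQuery shape and the executionOrder helper are hypothetical, and the sketch only orders the sub-queries: it does not interpret what sequential, conditional or contextual mean at run time.

```typescript
type DependencyType = "sequential" | "conditional" | "contextual";

// Hypothetical shape for one decomposed sub-query.
interface SubQuery {
  id: string;
  text: string;
  dependsOn?: { id: string; type: DependencyType };
}

// Depth-first ordering: every sub-query appears after the one it depends on.
// Throws on unknown ids and on dependency cycles instead of guessing.
function executionOrder(subQueries: SubQuery[]): SubQuery[] {
  const byId = new Map<string, SubQuery>();
  subQueries.forEach((q) => byId.set(q.id, q));

  const state = new Map<string, "visiting" | "done">();
  const ordered: SubQuery[] = [];

  function visit(q: SubQuery): void {
    const seen = state.get(q.id);
    if (seen === "done") return;
    if (seen === "visiting") throw new Error(`dependency cycle at ${q.id}`);
    state.set(q.id, "visiting");
    if (q.dependsOn) {
      const parent = byId.get(q.dependsOn.id);
      if (!parent) throw new Error(`unknown dependency ${q.dependsOn.id}`);
      visit(parent);
    }
    state.set(q.id, "done");
    ordered.push(q);
  }

  subQueries.forEach((q) => visit(q));
  return ordered;
}

// Made-up decomposition, deliberately listed out of order.
const decomposed: SubQuery[] = [
  { id: "q2", text: "What did those projects cost?", dependsOn: { id: "q1", type: "sequential" } },
  { id: "q1", text: "Which past projects involved school roof refurbishment?" },
  { id: "q3", text: "Were there overruns?", dependsOn: { id: "q2", type: "conditional" } },
];

const plan = executionOrder(decomposed);
console.log(plan.map((q) => q.id).join(" -> "));
```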

Document search collapsed from ~15 minutes to seconds, on a permissioned source of truth that the rest of the business asked for next.

~150×

Faster document search

From ~15 minutes of folder-spelunking to a sourced answer in seconds — measured against the estimator workflow the platform replaced.

~500

Internal users

Role-based access across the bid & commercial department and adjacent functions. Every retrieval is scoped through OpenFGA against the user's department membership and document grants.

TB

Indexed corpus

Tender packs, SOWs, contracts, proposals, reports, meeting summaries, letters and email archives — kept in sync with SharePoint, surfaced through a single search and chat surface.

30+

Use cases shipped

Document discovery was the wedge. Email summarise / respond / draft, SOW creator, business case writer, task estimator, meeting-minute refiner, press releases, social posts, and Jira ticket creation followed.

0

Permission bypasses

Retrieval is intersected with the FGA decision graph before the LLM sees a chunk. Users see only what they were already allowed to see in SharePoint.

CI

Eval gates on every change

promptfoo runs LLM-judge rubric assertions across the use-case library on every pipeline change. Quality regressions are caught before they ship.

Most "enterprise RAG" demos die the moment real permissions show up. Building this platform was the opposite engineering problem from the public RAG benchmarks: identity, authorization and provenance came first, and the retrieval architecture had to live inside that constraint. That's where the value lives.

— Viktor Andriichuk, Founder, DataFlux Software

Five decisions that made the platform survive contact with the actual business.

1. Authorize before you retrieve. OpenFGA isn't an afterthought layer over a search index — it's the gate that decides what Kendra is even allowed to return for this user. The decision graph mirrors SharePoint's department and folder ACLs, so the legal/HR question of "who can see what" never moves into the LLM's prompt.

2. Treat the document estate like a system, not a folder. Real enterprise corpora are PDFs, PPTX decks, DOCX tenders, XLSX cost models, .msg email archives and image-only scans. Dedicated Lambda layers per format (including LibreOffice for legacy conversions) gave us a unified pipeline into Kendra without losing tables, slide notes, or page hierarchy.

3. ReAct with CRAG, not one-shot retrieval. A document grader that can say "this is partially relevant, try a rewrite" beats a higher top-k and a longer context window. Most production failures we've seen are retrieval failures pretending to be generation failures.

4. Schema-typed responses with clickable provenance. A Zod schema that requires aiAnswer + references[] + documentResults[] with SharePoint URLs forces every claim to point somewhere a human can verify. That's also what made bid & commercial trust the system enough to actually use it.

5. Use cases as a portfolio with CI gates. Document discovery was the wedge but not the destination. promptfoo evals on every prompt/use case meant we could add SOW generation, email composition, Jira ticket creation, social posts and meeting-minute refining without quietly regressing search quality.

Discovery → PoC → platform → expand.

We've built this platform once. We can build it for you.

Bring us your document estate, your auth provider and the workflow you're trying to compress. We'll come back with the architecture, the eval set we'd run against it, and what it costs to ship and run.