EAAPL-RAG009 Proven

Graph Retrieval-Augmented Generation

[EAAPL-RAG009] Graph Retrieval-Augmented Generation

Category: Artificial Intelligence / Retrieval-Augmented Generation
Sub-category: Knowledge Graph and Structured Retrieval
Version: 1.1
Maturity: Emerging
Tags: rag knowledge-graph graph-traversal entity-linking neo4j sparql cypher explainability multi-hop
Regulatory Relevance: EU AI Act Article 13 (Explainability via graph reasoning paths), ISO/IEC 42001 Section 8.4, APRA CPG 234 (explainable AI in regulated decisions)

1. Executive Summary

Graph RAG combines vector-based semantic retrieval with structured knowledge graph traversal to answer complex, relationship-heavy questions with explainable reasoning paths. Where standard vector RAG retrieves semantically similar text passages, Graph RAG traverses explicit entity relationships to assemble contextual evidence — enabling answers to questions like "Which clinical trials reference the same molecular pathway as Drug X, and which of our researchers have published on that pathway?" that require following chains of typed relationships across a knowledge graph.

For enterprise leaders, the key differentiator of Graph RAG is not just retrieval breadth but explainability: every answer can be traced through a graph traversal path that names the entities and relationships used, providing the kind of structured reasoning chain that regulators, auditors, and domain experts can inspect and validate. In regulated industries where AI decisions must be explainable, a graph reasoning path ("Drug X → inhibits → Pathway Y → implicated_in → Disease Z → treated_by → Drug A") is far more auditable than "these five text passages were retrieved and the LLM generated this answer." Graph RAG represents the frontier of enterprise AI knowledge infrastructure — requiring significant investment in knowledge graph construction and maintenance, but delivering retrieval quality and explainability that vector-only approaches cannot match.

2. Problem Statement

Business Problem

Multi-hop relational queries (questions that require following chains of relationships between entities) cannot be answered by vector similarity search alone, because the relevant documents may not be semantically similar to the query; they are structurally connected through a chain of relationships. An insurance underwriter asking "Which of our policyholders have risk factors associated with the newly classified Category 3 flood zone announced this week?" needs entity-level reasoning: flood zone classification → properties in zone → policies covering those properties → policyholders. That is a chain of three typed relationships across four entity types.

Technical Problem

Vector embeddings encode semantic similarity, not relational structure, so a vector search cannot traverse relationships. Graph databases excel at relationship traversal, but graph queries match explicit structure rather than semantic similarity. Neither approach in isolation can answer relationship-heavy questions efficiently. A hybrid architecture is required, one that applies each approach to what it does best: semantic retrieval for passage-level context, graph traversal for relationship-level reasoning.

Symptoms

RAG system cannot answer questions about relationships between entities even though those relationships are documented in the corpus
Users who need to trace reasoning paths (auditors, regulators, compliance officers) cannot follow the basis of an AI-generated answer
Complex multi-entity queries require users to manually construct the relationship chain across multiple searches
Standard RAG retrieves many topically relevant but structurally irrelevant documents, forcing the LLM to reason about relationships that should have been traversed at the retrieval layer

Cost of Inaction

Inability to automate complex relational knowledge work (supply chain analysis, clinical trial matching, regulatory impact analysis)
Auditors and regulators may not accept AI-generated answers without explainable reasoning paths; for multi-hop relational questions, Graph RAG is often the only retrieval approach that satisfies explainability requirements
Missed enterprise intelligence: relationships between entities across the knowledge corpus remain unqueryable by AI

3. Context

When to Apply

Knowledge domains with rich, well-defined entity relationships (biomedical, legal, financial, supply chain)
Use cases requiring explainable reasoning paths as a regulatory or governance requirement
Multi-hop queries where the answer requires traversing 2–5 entity relationship hops
Organisations with existing knowledge graph infrastructure (Neo4j, Amazon Neptune, Azure Cosmos DB Gremlin) that can be extended for RAG
Domains where entity-level precision is more important than document-level recall

When NOT to Apply

Knowledge domains with no significant entity relationship structure (narrative documents without named entities)
Organisations without the data engineering capacity to build and maintain a knowledge graph
Queries that are purely semantic similarity-based with no multi-hop relational structure
Prototyping phases — knowledge graph construction is expensive; standard RAG should be deployed first and Graph RAG added incrementally

Prerequisites

An entity extraction and relationship extraction NLP pipeline (or a curated knowledge graph)
A graph database (Neo4j, Amazon Neptune, Azure Cosmos DB for Apache Gremlin)
A graph-to-text context generation component
An ontology or schema defining the entity types and relationship types in the domain
Entity linking capability to connect documents and text mentions to graph entities

Industry Applicability

Industry	Entity Types	Relationship Types	Multi-Hop Query Example
Biomedical	Drug, Gene, Protein, Disease, Clinical Trial	inhibits, activates, treats, associated_with, tested_in	"Which drugs share a target gene with Drug X and have been tested in clinical trials for Disease Y?"
Financial Services	Company, Person, Regulation, Product, Transaction	owns, regulates, issued_by, counterparty_to	"Which of our clients are counterparties to companies subject to the new sanctions?"
Legal	Case, Statute, Clause, Judge, Ruling	cites, overrules, interprets, decided_by	"Which cases have cited Smith v Jones in employment contexts decided after 2020?"
Supply Chain	Supplier, Product, Component, Facility, Region	supplies, produces, located_in, part_of	"Which Tier 2 suppliers are located in regions affected by the new trade restriction?"
Architecture / Engineering	Component, System, Standard, Test, Failure Mode	part_of, tested_by, compliant_with, exhibits	"Which system components are required by Standard X and have exhibited Failure Mode Y?"

4. Architecture Overview

Graph RAG is architecturally more complex than standard vector RAG because it requires maintaining two complementary knowledge representations — a vector index for semantic retrieval and a knowledge graph for relational traversal — and an orchestration layer that decides when to use each.

Knowledge Graph Construction

The knowledge graph is built through an extraction pipeline that processes the document corpus to identify entities and the relationships between them. The extraction pipeline applies:

Named Entity Recognition (NER): identify entities of defined types (Person, Organisation, Drug, Regulation, etc.) in each document
Coreference Resolution: link pronoun and noun phrase references to the canonical entity they refer to
Relationship Extraction: identify typed relationships between entity mentions (e.g., "Company A acquired Company B" → ACQUIRED(A, B))
Entity Linking: map extracted entity mentions to canonical graph nodes (e.g., "APRA" and "Australian Prudential Regulation Authority" → same node)
Graph Population: write canonical nodes and relationship edges to the graph database with source document provenance

Knowledge graph construction can be automated (using LLM-based extraction — GPT-4, Claude, or specialised models like REBEL) or semi-automated (LLM extraction with human review). For regulated domains (medical, legal), human domain expert review of extracted relationships is strongly recommended before ingestion.
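The linking and population steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: the alias table, the node ID scheme, the `Edge` record, and the 0.85 acceptance threshold are all assumptions made for the example, and an in-memory set stands in for the graph database.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical alias table: lower-cased surface form -> canonical node ID.
ALIASES = {
    "apra": "org:apra",
    "australian prudential regulation authority": "org:apra",
    "cpg 234": "reg:cpg-234",
}


@dataclass(frozen=True)
class Edge:
    subject: str       # canonical node ID
    predicate: str     # relationship type
    obj: str           # canonical node ID
    source_doc: str    # provenance: document the triple was extracted from
    confidence: float  # extractor confidence score


def link(mention: str) -> Optional[str]:
    """Entity linking: map a text mention to a canonical node ID, if known."""
    return ALIASES.get(mention.strip().lower())


def populate(raw_triples, min_confidence=0.85):
    """Split extracted triples into accepted edges and a human review queue."""
    edges, review = set(), []
    for subj, pred, obj, doc, conf in raw_triples:
        s, o = link(subj), link(obj)
        if s is None or o is None or conf < min_confidence:
            review.append((subj, pred, obj, doc, conf))  # unlinked or low confidence
            continue
        edges.add(Edge(s, pred, o, doc, conf))
    return edges, review


raw = [
    ("APRA", "ISSUED", "CPG 234", "doc-001", 0.97),
    ("Australian Prudential Regulation Authority", "ISSUED", "CPG 234", "doc-002", 0.91),
    ("APRA", "REGULATES", "Unknown Bank", "doc-003", 0.95),
]
edges, review = populate(raw)
```

Both spellings of the regulator resolve to the same node, each accepted edge keeps its own source document, and the triple with an unlinkable object is queued for review rather than written to the graph.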

Hybrid Query Routing

When a query arrives, an intent classifier determines whether the query is primarily:

Semantic: "Explain the concept of operational risk" → route to vector retrieval
Relational: "Which entities are connected to X via relationship type Y?" → route to graph traversal
Hybrid: "What does the policy say about entities connected to X?" → parallel vector + graph

For hybrid queries, graph traversal retrieves the relevant entity subgraph (nodes and edges within N hops of the query entity), and this subgraph is serialised as text context and combined with vector-retrieved passages in the context window.
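As a rough illustration of the routing decision, the sketch below uses keyword cues as a stand-in for the intent classifier. The cue lists are hypothetical and far too narrow for real traffic; a production router would more likely be a fine-tuned classifier or an LLM call. Defaulting to hybrid when no cue matches is an assumption of this sketch.

```python
import re

# Hypothetical cue patterns, chosen only to cover the three example queries.
RELATIONAL_CUES = re.compile(
    r"\b(connected to|related to|linked to|associated with)\b", re.IGNORECASE
)
SEMANTIC_CUES = re.compile(
    r"\b(explain|define|summarise|summarize|describe|what does .* say)\b",
    re.IGNORECASE,
)


def classify_intent(query: str) -> str:
    """Return 'semantic', 'relational', or 'hybrid' for a user query."""
    relational = bool(RELATIONAL_CUES.search(query))
    semantic = bool(SEMANTIC_CUES.search(query))
    if relational and not semantic:
        return "relational"
    if semantic and not relational:
        return "semantic"
    # Both kinds of cue, or none at all: run both retrievers.
    return "hybrid"
```

Falling back to hybrid costs an extra retrieval call but avoids silently dropping the graph path when the classifier is unsure.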

Graph Traversal for Context Assembly

Given a query entity (identified via entity linking), the graph traversal engine executes a configurable-depth Cypher (Neo4j) or SPARQL query to retrieve the entity neighbourhood: adjacent entities and their relationship types within a specified hop count. For example, MATCH (e:Drug {name: 'Drug X'})-[r1]->(n1)-[r2]->(n2) RETURN e, r1, n1, r2, n2 retrieves every outgoing path of exactly two hops from Drug X, together with the relationship on each hop. Because both relationship patterns are directed, incoming relationships are not followed.
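To make the hop-count behaviour concrete, here is a small breadth-first traversal over an in-memory adjacency map. It is a sketch only: the toy `GRAPH` and the function name are invented for illustration, and a real deployment would push this work to the graph database rather than traverse in application code.

```python
from collections import deque

# Toy directed graph: node -> list of (relationship type, neighbour).
GRAPH = {
    "Drug X": [("INHIBITS", "Enzyme A")],
    "Enzyme A": [("INVOLVED_IN", "Pathway B")],
    "Pathway B": [("IMPLICATED_IN", "Disease Y")],
    "Disease Y": [],
}


def n_hop_paths(graph, start, max_hops=2):
    """Return every outgoing path from `start` with 1..max_hops hops.

    Each path is a list of (subject, relationship, object) triples.
    """
    paths = []
    queue = deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if len(path) == max_hops:
            continue  # hop budget spent on this branch
        visited = {start} | {obj for _, _, obj in path}
        for rel, neighbour in graph.get(node, []):
            if neighbour in visited:
                continue  # do not walk around cycles
            new_path = path + [(node, rel, neighbour)]
            paths.append(new_path)
            queue.append((neighbour, new_path))
    return paths


paths = n_hop_paths(GRAPH, "Drug X", max_hops=2)
```

Unlike the fixed two-hop pattern, this returns the one-hop path as well as the two-hop path, which is closer to what a Cypher variable-length pattern such as `-[*1..2]->` would match.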

The retrieved subgraph is serialised into a structured text representation for the LLM context window: "Drug X INHIBITS Enzyme A; Enzyme A INVOLVED_IN Pathway B; Pathway B IMPLICATED_IN Disease Y." This serialisation preserves the relational structure in a form the LLM can reason over, while the graph traversal path constitutes an explicit reasoning chain for explainability.
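A template-based linearisation of this kind can be very small. The sketch below assumes the subgraph arrives as (subject, relationship, object) string triples; the function name is hypothetical.

```python
def serialise_triples(triples):
    """Linearise triples as 'A REL B; B REL C.' keeping first-seen order."""
    seen, parts = set(), []
    for triple in triples:
        if triple in seen:
            continue  # the same edge can appear on several paths
        seen.add(triple)
        parts.append(" ".join(triple))
    return "; ".join(parts) + "." if parts else ""


subgraph = [
    ("Drug X", "INHIBITS", "Enzyme A"),
    ("Enzyme A", "INVOLVED_IN", "Pathway B"),
    ("Drug X", "INHIBITS", "Enzyme A"),  # duplicate from a second path
    ("Pathway B", "IMPLICATED_IN", "Disease Y"),
]
graph_context = serialise_triples(subgraph)
```

Deduplicating before joining keeps the context short when many paths share their first hops.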

Combining Graph and Vector Context

The context assembler combines graph-derived context (serialised relationship chains) with vector-retrieved text passages. Graph context provides the relational structure; vector context provides the textual detail. The LLM is prompted to use both and to cite both — graph path citations take the form "[via: Drug X → INHIBITS → Enzyme A → INVOLVED_IN → Pathway B]".
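One possible shape for the assembler and the path citation is sketched below. The prompt wording and the section labels are assumptions for illustration; only the "[via: ...]" citation format comes from the text above.

```python
def format_path(path):
    """Render a list of (subject, REL, object) triples as a path citation."""
    if not path:
        return ""
    parts = [path[0][0]]
    for _, rel, obj in path:
        parts += [rel, obj]
    return "[via: " + " → ".join(parts) + "]"


def assemble_prompt(question, graph_context, passages):
    """Combine serialised graph context and numbered text passages."""
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the context below.\n"
        "Cite graph facts as [via: A → REL → B] and passages as [n].\n\n"
        f"GRAPH CONTEXT:\n{graph_context}\n\n"
        f"PASSAGES:\n{numbered}\n\n"
        f"QUESTION: {question}\n"
    )


citation = format_path([
    ("Drug X", "INHIBITS", "Enzyme A"),
    ("Enzyme A", "INVOLVED_IN", "Pathway B"),
])
prompt = assemble_prompt(
    "Which pathway does Drug X affect?",
    "Drug X INHIBITS Enzyme A; Enzyme A INVOLVED_IN Pathway B.",
    ["Enzyme A is a kinase described in the internal assay report."],
)
```

Keeping graph context and passages in separately labelled sections makes it easier to check afterwards which kind of evidence a citation refers to.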

5. Architecture Diagram

flowchart TD
    subgraph Ingestion["Knowledge Graph Construction"]
        A[Document Corpus]
        B[Entity + Relation Extractor]
        C[Graph Database]
        D[Vector Index]
    end
    subgraph Query["Hybrid Query Pipeline"]
        E[User Query]
        F{Intent Classifier}
        G[Graph Traversal]
        H[Vector Search]
    end
    subgraph Generation["Context + Generation"]
        I[Context Assembler]
        J[LLM + Citations]
    end
    A --> B --> C
    A --> D
    E --> F
    F -->|relational| G
    F -->|semantic| H
    G --> C
    H --> D
    G --> I
    H --> I
    I --> J --> E
    style A fill:#dbeafe,stroke:#3b82f6
    style B fill:#f0fdf4,stroke:#22c55e
    style C fill:#fef9c3,stroke:#eab308
    style D fill:#fef9c3,stroke:#eab308
    style E fill:#dbeafe,stroke:#3b82f6
    style F fill:#f3e8ff,stroke:#a855f7
    style G fill:#f0fdf4,stroke:#22c55e
    style H fill:#f0fdf4,stroke:#22c55e
    style I fill:#f0fdf4,stroke:#22c55e
    style J fill:#d1fae5,stroke:#10b981

6. Components

Component	Type	Responsibility	Technology Options	Criticality
Named Entity Recogniser	NLP	Extract entities from documents by type	spaCy, AWS Comprehend, Azure AI Language, fine-tuned BERT	Critical
Relationship Extractor	NLP / LLM	Identify typed relationships between entities	LLM-based (GPT-4 extraction), REBEL, OpenNRE	Critical
Entity Linker	NLP	Map entity mentions to canonical graph nodes	spaCy EntityLinker, REL, Wikipedia/Wikidata linking	High
Graph Database	Storage	Store and traverse entity-relationship graph	Neo4j, Amazon Neptune, Azure Cosmos DB (Gremlin), TigerGraph	Critical
Query Entity Detector	NLP	Identify entities in user query; link to graph nodes	spaCy NER + entity linker; vector similarity to graph node names	High
Query Intent Classifier	ML	Classify query as semantic / relational / hybrid	Fine-tuned classifier or LLM-based routing	High
Graph Traversal Engine	Retrieval	Execute Cypher/SPARQL queries for N-hop entity neighbourhood	Neo4j Python driver, Neptune Gremlin/SPARQL client	Critical
Subgraph Serialiser	Data Processing	Convert retrieved subgraph to LLM-readable text	Custom Python serialiser; template-based linearisation	High
Context Assembler	Orchestration	Combine graph text + vector chunks in LLM prompt	Custom; LangChain graph RAG modules	High
Reasoning Path Extractor	Post-processing	Extract and format graph traversal path for citation and audit	Custom extractor from Cypher query result metadata	High

7. Data Flow

Primary Flow

Step	Actor	Action	Output
1	NER + Relationship Extractor	Extract entities and relationships from corpus documents	Entity list + Relationship triples `(Subject, Predicate, Object)`
2	Entity Linker	Resolve mentions to canonical graph nodes	Canonical entity IDs
3	Graph Database	Ingest nodes and edges; create indexes on entity name, type, and source doc	Populated knowledge graph
4	User	Submit query	Query string
5	Query Entity Detector	Identify and link entities mentioned in query	`{entity_id, entity_type, canonical_name}` for each query entity
6	Query Intent Classifier	Classify query intent: semantic / relational / hybrid	Routing decision
7	Graph Traversal (if relational/hybrid)	Execute N-hop Cypher query from query entities	Subgraph: `{nodes, edges, properties}`
8	Subgraph Serialiser	Convert subgraph to "Entity A RELATIONSHIP Entity B; ..." text	Graph context text
9	Vector Retrieval (if semantic/hybrid)	Execute ANN search on query embedding	Top-K text chunks
10	Context Assembler	Combine graph context + text chunks	Multipart LLM prompt
11	LLM	Generate answer citing both graph paths and text sources	Raw response with citation markers
12	Reasoning Path Extractor	Extract graph traversal path(s) used	Structured citation: `[Entity A → REL → Entity B → REL → Entity C]`
13	Response	Return answer + graph reasoning paths + text citations	Final response

Error Flow

Error Condition	Detection	Recovery
Entity not found in graph (new entity post-graph-construction)	Graph query returns empty result	Fall back to vector retrieval only; flag missing entity for graph update
Graph query timeout (deep traversal on large graph)	Query timeout	Reduce hop depth; use indexed entity traversal; alert graph DBA
Relationship extraction produces incorrect triple	Quality monitoring; human spot check	Human review queue; reject low-confidence triples at extraction
Query intent misclassification (relational query routed to vector only)	User feedback; answer quality monitoring	Fallback to hybrid routing by default; tune intent classifier
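The first two recovery rows in the table above can be sketched as a wrapper around the retrievers. Here `graph_query` and `vector_search` are hypothetical callables supplied by the caller, and the status labels are invented for the example; the sketch assumes the graph client signals a timeout by raising `TimeoutError`.

```python
def retrieve_with_fallback(query, entity, graph_query, vector_search, max_hops=2):
    """Try graph traversal, shrinking hop depth on timeout; always fetch passages.

    graph_query(entity, hops) -> list of triples, may raise TimeoutError.
    vector_search(query) -> list of text passages.
    """
    subgraph, status, hops = [], "graph_timeout", max_hops
    while hops >= 1:
        try:
            subgraph = graph_query(entity, hops)
            # An empty result means the entity is not in the graph yet.
            status = "ok" if subgraph else "entity_missing"
            break
        except TimeoutError:
            hops -= 1  # retry with a shallower traversal
    return {
        "graph": subgraph,
        "passages": vector_search(query),
        "status": status,      # "entity_missing" should trigger a graph update
        "hops_used": hops,     # 0 means every attempt timed out
    }
```

Separating "entity_missing" from "graph_timeout" matters operationally: the first is a data gap to route to the graph update queue, the second is a performance issue for whoever runs the graph database.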

8. Security Considerations

Graph Database Access Control

The graph database must enforce node-level and edge-level access controls. In enterprise deployments, not all employees should be able to traverse all entity relationships (e.g., individual employee salary nodes should not be traversable by peers). Graph database access control is less mature than relational database RBAC — evaluate the access control capabilities of your chosen graph database (Neo4j roles, Neptune IAM, Cosmos Gremlin RBAC) carefully.

OWASP LLM Top 10 Mitigations

OWASP LLM Risk	Graph RAG Specific Concern	Mitigation
LLM01: Prompt Injection	Adversarial entity names or relationship labels in the graph	Validate graph node and edge labels against ontology schema; reject non-schema values
LLM06: Sensitive Information Disclosure	Graph traversal reveals entity connections the user should not see	Node-level ACL in graph DB; traversal path filtered by user permissions
LLM08: Excessive Agency	Agent-driven graph traversal with unlimited hop depth could traverse the entire graph	Hard hop-depth limit (≤ 5); traversal node count limit
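The three mitigations in the table can each be expressed as a small guard. The sketch below is illustrative only: the relationship whitelist, the role names, and the ACL map are hypothetical, and in practice the access check belongs in the graph database itself rather than in application code. Nodes missing from the ACL map are denied, which is a fail-closed assumption of this example.

```python
# Hypothetical ontology whitelist and hard traversal ceiling.
ALLOWED_RELATIONSHIPS = {"MEMBER_OF", "PAID_AT", "REPORTS_TO"}
MAX_HOPS = 5


def clamp_hops(requested):
    """Force a requested hop depth into the range 1..MAX_HOPS."""
    return max(1, min(int(requested), MAX_HOPS))


def is_schema_valid(triple):
    """Reject triples whose relationship label is not in the ontology."""
    _, rel, _ = triple
    return rel in ALLOWED_RELATIONSHIPS


def visible_paths(paths, node_acl, user_roles):
    """Keep only paths in which every node is readable by the user.

    node_acl maps node name -> set of roles allowed to see that node.
    """
    allowed = []
    for path in paths:
        nodes = {n for s, _, o in path for n in (s, o)}
        if all(node_acl.get(n, set()) & user_roles for n in nodes):
            allowed.append(path)
    return allowed
```

Filtering whole paths rather than individual triples avoids handing the LLM a fragment whose missing hop would itself reveal that a restricted connection exists.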

9. Governance Considerations

Knowledge Graph Quality Governance

The knowledge graph is a critical data asset. Every node and edge must have a provenance record (which document it was extracted from, by which model, and when). Incorrect relationships in the graph can produce systematically incorrect AI answers across all queries that traverse those relationships — a much higher risk than a single incorrect text chunk. Domain expert review of extracted relationships before graph population is strongly recommended for regulated domains.

Graph Schema (Ontology) Governance

The graph schema (entity types, relationship types, cardinality constraints) must be version-controlled and subject to a formal change management process. Schema changes may require partial or full re-extraction of the document corpus.

Governance Artefacts

Artefact	Owner	Frequency	Purpose
Graph Schema (Ontology)	Domain Architect + Knowledge Engineer	Per version	Defines entity and relationship types; reviewed by domain experts
Entity Extraction Precision/Recall Report	AI Operations	Monthly	Validate extraction quality on held-out test set
Graph Population Audit Log	Data Engineering	Per extraction run	Track which documents contributed which nodes and edges
Reasoning Path Audit Archive	Compliance	Per session	Immutable record of every graph traversal path used in a response
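One way to approximate the "immutable record" property for the Reasoning Path Audit Archive is a hash-chained, append-only log, sketched below. This is an assumption about how the archive could be built, not something the pattern prescribes; the record fields are hypothetical, and real deployments would normally add write-once storage on top.

```python
import hashlib
import json

GENESIS = "0" * 64  # previous-hash value used for the first record


def _digest(body):
    """SHA-256 over a canonical JSON encoding of the record body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_record(log, session_id, path_citation, timestamp):
    """Append a reasoning-path record that commits to the previous record."""
    body = {
        "session_id": session_id,
        "path": path_citation,
        "timestamp": timestamp,
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    log.append({**body, "hash": _digest(body)})
    return log[-1]


def verify(log):
    """Return True only if no record has been altered, removed, or reordered."""
    prev = GENESIS
    for record in log:
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev_hash"] != prev or _digest(body) != record["hash"]:
            return False
        prev = record["hash"]
    return True
```

Because each record includes the hash of its predecessor, editing any archived path invalidates every later record, which makes tampering detectable during a regulatory examination.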

10. Operational Considerations

Monitoring

Metric	Alert Threshold	Notes
Graph traversal P99 latency	> 500ms	Index query plan; add graph indexes on high-traversal entity types
Entity linking miss rate	> 15% of query entities	New entities post-construction; trigger graph update
Relationship extraction confidence	< 0.85 average	Model quality degradation; retrain or upgrade extraction model
Graph node count growth rate	Anomalous spike	Potential duplicate entity creation; entity linker quality issue
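The three numeric thresholds in the monitoring table translate directly into a check like the one below. The function and alert names are hypothetical; the thresholds are the ones listed above. The node-growth metric is omitted because "anomalous spike" needs a baseline that the table does not define.

```python
def check_alerts(p99_latency_ms, linked_entities, total_entities, mean_confidence):
    """Return the names of the monitoring alerts that should fire."""
    alerts = []
    if p99_latency_ms > 500:
        alerts.append("graph_latency_p99")
    if total_entities > 0:
        miss_rate = (total_entities - linked_entities) / total_entities
        if miss_rate > 0.15:
            alerts.append("entity_linking_miss_rate")
    if mean_confidence < 0.85:
        alerts.append("extraction_confidence")
    return alerts
```

For example, a window with a 620 ms P99 and 80 of 100 query entities linked fires both the latency and the miss-rate alerts, since a 20% miss rate exceeds the 15% threshold.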

Service Level Objectives

SLO	Target	Notes
Hybrid Graph+Vector query P95	≤ 4 seconds	Graph traversal adds ~100–300ms vs. vector-only
Entity linking coverage	≥ 85% of query entities linked	Measured monthly
Graph availability	≥ 99.9%	Graph DB cluster health

11. Cost Considerations

Cost Drivers

Cost Driver	Notes
Knowledge graph construction (one-time)	LLM-based extraction: $0.50–$5.00 per document; significant upfront cost for large corpora
Graph database hosting	Neo4j AuraDB: $200–$2,000/month; self-hosted: compute + storage
Ongoing extraction (new documents)	Incremental cost as corpus grows
Human expert review (if required)	$50–$200/hour; significant for regulated domains

Indicative Cost Range

Deployment Scale	Construction Cost (One-Time)	Monthly Operational Cost
Small (< 10K documents)	$5,000 – $20,000	$500 – $2,000
Medium (10K–100K documents)	$20,000 – $100,000	$2,000 – $8,000
Large (> 100K documents)	$100,000 – $500,000	$8,000 – $30,000
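For a first-pass estimate, the per-document and hourly figures from the cost drivers table can be multiplied out as below. This is plain arithmetic on the ranges quoted above, with hypothetical function names; it ignores hosting, incremental extraction, and engineering effort, so it is a lower bound on construction cost rather than a budget.

```python
def extraction_cost_range(n_documents, low_per_doc=0.50, high_per_doc=5.00):
    """One-time LLM extraction cost range in dollars for a corpus."""
    return n_documents * low_per_doc, n_documents * high_per_doc


def review_cost_range(review_hours, low_rate=50, high_rate=200):
    """Human expert review cost range in dollars."""
    return review_hours * low_rate, review_hours * high_rate


extraction = extraction_cost_range(10_000)
review = review_cost_range(40)
```

For 10,000 documents this gives $5,000 to $50,000 for extraction alone, and 40 hours of expert review adds $2,000 to $8,000. The width of that range is why sampling the corpus before committing to a budget is worthwhile.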

12. Trade-Off Analysis

Graph Database Selection

Database	Strengths	Weaknesses	Recommended For
Neo4j	Best-in-class Cypher; mature tooling; AuraDB managed	Cost at scale; less integrated with cloud AI services	Most enterprise deployments
Amazon Neptune	AWS-native; SPARQL + Gremlin; serverless option	Gremlin/SPARQL learning curve; higher latency than Neo4j	AWS-native deployments
Azure Cosmos DB (Gremlin)	Azure-native; globally distributed	Limited graph analytics; Gremlin only	Azure-native deployments
TigerGraph	Highest performance for deep analytics	Commercial; steep learning curve	Large-scale graph analytics

Graph Construction Approach

Approach	Quality	Cost	Maintenance
Manual curation by domain experts	Highest	Very High	Highest
LLM-based extraction + human review	High	Medium	Medium
Fully automated LLM extraction	Medium	Low	Low
Ontology alignment from structured sources (databases)	Very High (for structured domains)	Low	Low

Architectural Tensions

Tension	Trade-off	Recommendation
Graph completeness vs. construction cost	Complete graph: high recall; partial graph: lower cost	Prioritise high-value entity types first; expand incrementally
Deep traversal vs. latency	Deep (5+ hops): comprehensive; shallow (2 hops): fast	Default 2-hop; "deep research" mode up to 4 hops on user request

13. Failure Modes

Failure Mode	Likelihood	Impact	Detection	Recovery
Incorrect relationship in graph (extraction error)	Medium	High	Human spot-check; downstream QA; user feedback	Human review queue; relationship confidence threshold; soft-delete with provenance
Entity disambiguation error (two distinct entities merged)	Medium	High	Entity disambiguation accuracy monitoring	Human review of high-frequency entities; disambiguation test set
Graph schema staleness (new entity type not in ontology)	Low	Medium	Schema coverage monitoring	Schema extension process; re-extraction for new entity types
Large subgraph exceeds LLM context window	Medium	Medium	Token count monitoring before LLM call	Prune subgraph by relationship strength; summarise large subgraphs
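The recovery for the last failure mode, pruning by relationship strength, can be sketched as a greedy selection under a token budget. The strength scores, the budget, and the four-characters-per-token estimate are all assumptions for illustration; a real system would count tokens with the target model's tokenizer.

```python
def estimate_tokens(text):
    """Crude token estimate: roughly four characters per token."""
    return max(1, len(text) // 4)


def prune_to_budget(scored_triples, max_tokens):
    """Keep the strongest triples whose serialised form fits the budget.

    scored_triples is a list of ((subject, REL, object), strength) pairs.
    """
    kept, used = [], 0
    ranked = sorted(scored_triples, key=lambda item: item[1], reverse=True)
    for triple, _strength in ranked:
        cost = estimate_tokens(" ".join(triple) + "; ")
        if used + cost > max_tokens:
            break  # stop at the first triple that does not fit
        kept.append(triple)
        used += cost
    return kept


scored = [
    (("Pathway B", "IMPLICATED_IN", "Disease Y"), 0.4),
    (("Drug X", "INHIBITS", "Enzyme A"), 0.9),
    (("Enzyme A", "INVOLVED_IN", "Pathway B"), 0.8),
]
kept = prune_to_budget(scored, max_tokens=15)
```

Stopping at the first triple that does not fit keeps the result a strict prefix of the strength ranking, so a weak short triple never displaces a stronger long one.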

14. Regulatory Considerations

Regulation	Requirement	Graph RAG Response
EU AI Act Article 13	High-risk AI systems must be transparent enough for deployers to interpret their outputs and use them appropriately	Graph traversal path is the primary explainability artefact; available with every response on request
APRA CPG 234	Explainable AI for regulated decisions	Reasoning path archived per session; available for regulatory examination
Privacy Act 1988	Entity-level personal information in graph	Person entities require the same ACL controls as documents containing personal information
ISO/IEC 42001	AI system performance and bias monitoring	Entity extraction bias monitoring (which entities are systematically mislinked)

15. Reference Implementations

AWS

Graph DB: Amazon Neptune (Gremlin or SPARQL)
Extraction: SageMaker + custom NER model; Bedrock for LLM-based relationship extraction
Vector index: OpenSearch k-NN
Orchestration: Lambda + Step Functions; LangChain graph RAG

Azure

Graph DB: Azure Cosmos DB for Apache Gremlin, or Neo4j on Azure Marketplace
Extraction: Azure AI Language (NER) + Azure OpenAI for relationship extraction
Vector index: Azure AI Search
Orchestration: Prompt Flow with graph retriever

Self-Hosted

Graph DB: Neo4j Community or Enterprise on Kubernetes
Extraction: spaCy + REBEL (open-source relationship extraction) + Ollama for LLM extraction
Vector index: Weaviate or Qdrant
Orchestration: LangChain GraphCypherQAChain or LlamaIndex KnowledgeGraphIndex

16. Related Patterns

Pattern ID	Pattern Name	Relationship
EAAPL-RAG001	Enterprise RAG	Foundation for the vector retrieval path in hybrid Graph RAG
EAAPL-RAG007	Agentic RAG	Agent can use graph traversal as one of its retrieval tools
EAAPL-KNW001	Enterprise Knowledge Graph	Provides the knowledge graph infrastructure that Graph RAG queries
EAAPL-KNW005	Knowledge Graph for Explainability	Extends Graph RAG with explicit reasoning path explanation generation

17. Maturity Assessment

Overall Maturity: Emerging. Graph database technology is mature (Neo4j, Neptune) and LLM-based relationship extraction is proven, but end-to-end Graph RAG pipelines are in production at only a small number of leading enterprises; the pattern requires significant engineering investment and is not yet commoditised.

Dimension	Score (1–5)	Rationale
Technology Readiness	3	Graph databases are mature; LLM extraction and Graph RAG orchestration are evolving
Tooling Ecosystem	2	LangChain GraphCypherQAChain and LlamaIndex KnowledgeGraphIndex exist but are not production-grade without significant customisation
Operational Guidance	2	Limited production guidance; graph schema governance is organisation-specific
Security & Compliance	2	Node-level ACL in graph databases is less mature than document ACL
Scalability Evidence	2	Medium-scale deployments (< 10M nodes) proven; billion-node scale is specialised
Cost Predictability	2	Construction cost is significant and variable; difficult to estimate without corpus sampling

18. Revision History

Version	Date	Author	Changes
1.0	2024-09-01	EAAPL Working Group	Initial publication
1.1	2025-02-20	EAAPL Working Group	LLM-based extraction formalised; reasoning path citation format specified; regulatory explainability mapping added
