[EAAPL-RAG009] Graph Retrieval-Augmented Generation
Category: Artificial Intelligence / Retrieval-Augmented Generation
Sub-category: Knowledge Graph and Structured Retrieval
Version: 1.1
Maturity: Emerging
Tags: rag knowledge-graph graph-traversal entity-linking neo4j sparql cypher explainability multi-hop
Regulatory Relevance: EU AI Act Article 13 (Explainability via graph reasoning paths), ISO/IEC 42001 Section 8.4, APRA CPG 234 (explainable AI in regulated decisions)
1. Executive Summary
Graph RAG combines vector-based semantic retrieval with structured knowledge graph traversal to answer complex, relationship-heavy questions with explainable reasoning paths. Where standard vector RAG retrieves semantically similar text passages, Graph RAG traverses explicit entity relationships to assemble contextual evidence — enabling answers to questions like "Which clinical trials reference the same molecular pathway as Drug X, and which of our researchers have published on that pathway?" that require following chains of typed relationships across a knowledge graph.
For enterprise leaders, the key differentiator of Graph RAG is explainability as much as retrieval breadth: every answer can be traced through a graph traversal path that names the entities and relationships used, giving regulators, auditors, and domain experts a structured reasoning chain they can inspect and validate. In regulated industries where AI decisions must be explainable, a graph reasoning path ("Drug X → inhibits → Pathway Y → implicated_in → Disease Z → treated_by → Drug A") is far more auditable than "these five text passages were retrieved and the LLM generated this answer." The trade-off is cost: Graph RAG requires significant investment in knowledge graph construction and maintenance, and in return delivers multi-hop retrieval and traceable reasoning paths that vector-only approaches do not provide.
2. Problem Statement
Business Problem
Multi-hop relational queries, that is, questions that require following chains of relationships between entities, cannot be answered by vector similarity search alone, because the relevant documents may not be semantically similar to the query; they are structurally connected through a chain of relationships. An insurance underwriter asking "Which of our policyholders have risk factors associated with the newly classified Category 3 flood zone announced this week?" needs entity-level reasoning: flood zone classification → properties in zone → policies covering those properties → policyholders. That is a chain of three typed relationships across four entity types.
Technical Problem
Vector embeddings encode semantic similarity, not relational structure, so a vector search cannot traverse relationships. Graph databases excel at relationship traversal but, used purely as graph stores, do not rank free-text passages by semantic similarity. Neither approach in isolation answers relationship-heavy questions efficiently. What is required is a hybrid architecture that applies each approach to what it does best: semantic retrieval for passage-level context, graph traversal for relationship-level reasoning.
Symptoms
- RAG system cannot answer questions about relationships between entities even though those relationships are documented in the corpus
- Users who need to trace reasoning paths (auditors, regulators, compliance officers) cannot follow the basis of an AI-generated answer
- Complex multi-entity queries require users to manually construct the relationship chain across multiple searches
- Standard RAG retrieves many topically relevant but structurally irrelevant documents, forcing the LLM to reason about relationships that should have been traversed at the retrieval layer
Cost of Inaction
- Inability to automate complex relational knowledge work (supply chain analysis, clinical trial matching, regulatory impact analysis)
- Auditors and regulators are unlikely to accept AI-generated answers without explainable reasoning paths; Graph RAG is one of the few retrieval approaches that produces such a path as a by-product of retrieval
- Missed enterprise intelligence: relationships between entities across the knowledge corpus remain unqueryable by AI
3. Context
When to Apply
- Knowledge domains with rich, well-defined entity relationships (biomedical, legal, financial, supply chain)
- Use cases requiring explainable reasoning paths as a regulatory or governance requirement
- Multi-hop queries where the answer requires traversing 2–5 entity relationship hops
- Organisations with existing knowledge graph infrastructure (Neo4j, Amazon Neptune, Azure Cosmos DB Gremlin) that can be extended for RAG
- Domains where entity-level precision is more important than document-level recall
When NOT to Apply
- Knowledge domains with no significant entity relationship structure (narrative documents without named entities)
- Organisations without the data engineering capacity to build and maintain a knowledge graph
- Queries that are purely semantic similarity-based with no multi-hop relational structure
- Prototyping phases — knowledge graph construction is expensive; standard RAG should be deployed first and Graph RAG added incrementally
Prerequisites
- An entity extraction and relationship extraction NLP pipeline (or a curated knowledge graph)
- A graph database (Neo4j, Amazon Neptune, Azure Cosmos DB for Apache Gremlin)
- A graph-to-text context generation component
- An ontology or schema defining the entity types and relationship types in the domain
- Entity linking capability to connect documents and text mentions to graph entities
Industry Applicability
| Industry | Entity Types | Relationship Types | Multi-Hop Query Example |
| --- | --- | --- | --- |
| Biomedical | Drug, Gene, Protein, Disease, Clinical Trial | inhibits, activates, treats, associated_with, tested_in | "Which drugs share a target gene with Drug X and have been tested in clinical trials for Disease Y?" |
| Financial Services | Company, Person, Regulation, Product, Transaction | owns, regulates, issued_by, counterparty_to | "Which of our clients are counterparties to companies subject to the new sanctions?" |
| Legal | Case, Statute, Clause, Judge, Ruling | cites, overrules, interprets, decided_by | "Which cases have cited Smith v Jones in employment contexts decided after 2020?" |
| Supply Chain | Supplier, Product, Component, Facility, Region | supplies, produces, located_in, part_of | "Which Tier 2 suppliers are located in regions affected by the new trade restriction?" |
| Architecture / Engineering | Component, System, Standard, Test, Failure Mode | part_of, tested_by, compliant_with, exhibits | "Which system components are required by Standard X and have exhibited Failure Mode Y?" |
4. Architecture Overview
Graph RAG is architecturally more complex than standard vector RAG because it requires maintaining two complementary knowledge representations — a vector index for semantic retrieval and a knowledge graph for relational traversal — and an orchestration layer that decides when to use each.
Knowledge Graph Construction
The knowledge graph is built through an extraction pipeline that processes the document corpus to identify entities and the relationships between them. The extraction pipeline applies:
- Named Entity Recognition (NER): identify entities of defined types (Person, Organisation, Drug, Regulation, etc.) in each document
- Coreference Resolution: link pronoun and noun phrase references to the canonical entity they refer to
- Relationship Extraction: identify typed relationships between entity mentions (e.g., "Company A acquired Company B" → ACQUIRED(A, B))
- Entity Linking: map extracted entity mentions to canonical graph nodes (e.g., "APRA" and "Australian Prudential Regulation Authority" → same node)
- Graph Population: write canonical nodes and relationship edges to the graph database with source document provenance
Knowledge graph construction can be automated (using LLM-based extraction — GPT-4, Claude, or specialised models like REBEL) or semi-automated (LLM extraction with human review). For regulated domains (medical, legal), human domain expert review of extracted relationships is strongly recommended before ingestion.
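The linking and population steps above can be sketched in a few lines. This is a minimal illustration, not a reference pipeline: the alias table, the `Triple` record, the `hypothetical-extractor-v1` name, and the in-memory `graph` dictionary are all assumptions standing in for a real entity linker and graph database.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical alias table; a real entity linker would use a model or curated dictionary.
ALIASES = {
    "apra": "Australian Prudential Regulation Authority",
    "australian prudential regulation authority": "Australian Prudential Regulation Authority",
}


@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str
    source_doc: str    # provenance: document the triple was extracted from
    extractor: str     # provenance: model or rule set that produced it
    extracted_at: str  # provenance: ISO 8601 timestamp


def link_entity(mention: str) -> str:
    """Map a surface mention to its canonical node name (unchanged if unknown)."""
    cleaned = mention.strip()
    return ALIASES.get(cleaned.lower(), cleaned)


def to_triple(subject, predicate, obj, source_doc, extractor="hypothetical-extractor-v1"):
    """Canonicalise both ends of an extracted relationship and attach provenance."""
    return Triple(
        subject=link_entity(subject),
        predicate=predicate.upper(),
        obj=link_entity(obj),
        source_doc=source_doc,
        extractor=extractor,
        extracted_at=datetime.now(timezone.utc).isoformat(),
    )


def populate(graph: dict, triple: Triple) -> None:
    """Key edges by (subject, predicate, object) and keep every provenance record."""
    key = (triple.subject, triple.predicate, triple.obj)
    graph.setdefault(key, []).append(triple)


graph = {}
populate(graph, to_triple("APRA", "regulates", "Bank A", "doc-001"))
populate(graph, to_triple("Australian Prudential Regulation Authority", "regulates", "Bank A", "doc-002"))
print(len(graph))  # 1: both mentions resolve to the same canonical edge
```

Keeping one edge with a list of provenance records, rather than one edge per mention, is a design choice made here for illustration; it keeps the graph deduplicated while still recording every source document.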
Hybrid Query Routing
When a query arrives, an intent classifier determines whether the query is primarily:
- Semantic: "Explain the concept of operational risk" → route to vector retrieval
- Relational: "Which entities are connected to X via relationship type Y?" → route to graph traversal
- Hybrid: "What does the policy say about entities connected to X?" → parallel vector + graph
For hybrid queries, graph traversal retrieves the relevant entity subgraph (nodes and edges within N hops of the query entity), and this subgraph is serialised as text context and combined with vector-retrieved passages in the context window.
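As a rough illustration of the three-way routing decision, the sketch below uses keyword cues. The cue lists and the `classify_intent` function are hypothetical; a production router would more likely be a fine-tuned classifier or an LLM call, as listed in the Components section.

```python
import re

# Hypothetical cue patterns, chosen only to cover the three example queries above.
RELATIONAL_CUES = re.compile(
    r"\b(connected to|related to|linked to)\b|^\s*which\b", re.IGNORECASE
)
SEMANTIC_CUES = re.compile(
    r"\b(explain|describe|summari[sz]e)\b|\bwhat does\b.*\bsay\b", re.IGNORECASE
)


def classify_intent(query: str, linked_entities: list[str]) -> str:
    """Return 'semantic', 'relational', or 'hybrid' for a query.

    A query only counts as relational if at least one of its entities
    was linked to a graph node; otherwise there is nothing to traverse from.
    """
    relational = bool(linked_entities) and bool(RELATIONAL_CUES.search(query))
    semantic = bool(SEMANTIC_CUES.search(query))
    if relational and semantic:
        return "hybrid"
    if relational:
        return "relational"
    if semantic:
        return "semantic"
    # No clear signal: default to hybrid so neither retrieval path is skipped.
    return "hybrid"


print(classify_intent("Explain the concept of operational risk", []))
print(classify_intent("Which entities are connected to X via relationship type Y?", ["X"]))
print(classify_intent("What does the policy say about entities connected to X?", ["X"]))
```

Defaulting to hybrid on ambiguous input trades some latency for a lower risk of routing a relational query to vector retrieval only.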
Graph Traversal for Context Assembly
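Given a query entity (identified via entity linking), the graph traversal engine runs a Cypher (Neo4j) or SPARQL query of configurable depth to retrieve the entity neighbourhood: adjacent entities and their relationship types within a specified hop count. For example, MATCH (e:Drug {name: 'Drug X'})-[r1]->(n1)-[r2]->(n2) RETURN e, r1, n1, r2, n2 returns every outgoing two-hop path from Drug X, including the intermediate node and both relationships. Because this pattern is directed and fixed-length, it does not return incoming relationships, nor one-hop neighbours that have no onward edge; a variable-length or undirected pattern is needed to capture those.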
The retrieved subgraph is serialised into a structured text representation for the LLM context window: "Drug X INHIBITS Enzyme A; Enzyme A INVOLVED_IN Pathway B; Pathway B IMPLICATED_IN Disease Y." This serialisation preserves the relational structure in a form the LLM can reason over, while the graph traversal path constitutes an explicit reasoning chain for explainability.
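The traversal and serialisation steps can be illustrated without a graph database. The sketch below assumes a tiny in-memory adjacency map (`EDGES`) in place of Neo4j or Neptune; `traverse` and `serialise` are hypothetical helper names, and the serialised form follows the "Entity A RELATIONSHIP Entity B; ..." convention described above.

```python
# Hypothetical in-memory graph: node -> list of (relationship, target node).
EDGES = {
    "Drug X": [("INHIBITS", "Enzyme A")],
    "Enzyme A": [("INVOLVED_IN", "Pathway B")],
    "Pathway B": [("IMPLICATED_IN", "Disease Y")],
}


def traverse(start: str, max_hops: int) -> list:
    """Return every outgoing path of 1..max_hops hops from start.

    Each path is a list of (subject, relationship, object) hops.
    Nodes already on a path are not revisited, which avoids cycles.
    """
    paths = []
    frontier = [(start, [])]
    for _hop in range(max_hops):
        next_frontier = []
        for node, path in frontier:
            visited = {start} | {obj for _s, _r, obj in path}
            for rel, target in EDGES.get(node, []):
                if target in visited:
                    continue
                new_path = path + [(node, rel, target)]
                paths.append(new_path)
                next_frontier.append((target, new_path))
        frontier = next_frontier
    return paths


def serialise(paths: list) -> str:
    """Linearise paths as 'A REL B; B REL C.' with duplicate hops removed."""
    seen, parts = set(), []
    for path in paths:
        for hop in path:
            if hop not in seen:
                seen.add(hop)
                parts.append(" ".join(hop))
    return ("; ".join(parts) + ".") if parts else ""


print(serialise(traverse("Drug X", max_hops=3)))
# Drug X INHIBITS Enzyme A; Enzyme A INVOLVED_IN Pathway B; Pathway B IMPLICATED_IN Disease Y.
```

Returning whole paths rather than a flat edge set keeps the hop order available for later use as a reasoning path citation.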
Combining Graph and Vector Context
The context assembler combines graph-derived context (serialised relationship chains) with vector-retrieved text passages. Graph context provides the relational structure; vector context provides the textual detail. The LLM is prompted to use both and to cite both — graph path citations take the form "[via: Drug X → INHIBITS → Enzyme A → INVOLVED_IN → Pathway B]".
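One possible shape for the context assembler is sketched below. The prompt wording, the `[doc:...]` passage citation style, and the function names are assumptions for illustration; only the "[via: ...]" graph citation format comes from the text above.

```python
def path_citation(path: list) -> str:
    """Format a path of (subject, relationship, object) hops as a '[via: ...]' citation."""
    parts = [path[0][0]]
    for _subject, rel, obj in path:
        parts.extend([rel, obj])
    return "[via: " + " → ".join(parts) + "]"


def assemble_prompt(question: str, graph_paths: list, passages: list) -> str:
    """Build a two-part context: relationship chains first, then text passages.

    passages is a list of (doc_id, text) tuples; doc_id is a hypothetical
    identifier that the model is asked to cite as [doc:<id>].
    """
    graph_lines = [path_citation(p) for p in graph_paths]
    passage_lines = [f"[doc:{doc_id}] {text}" for doc_id, text in passages]
    return "\n".join(
        [
            "Answer using only the context below.",
            "Cite graph paths as [via: ...] and passages as [doc:<id>].",
            "",
            "Graph context:",
            *graph_lines,
            "",
            "Text passages:",
            *passage_lines,
            "",
            f"Question: {question}",
        ]
    )


path = [("Drug X", "INHIBITS", "Enzyme A"), ("Enzyme A", "INVOLVED_IN", "Pathway B")]
prompt = assemble_prompt(
    "Which pathway does Drug X affect?",
    graph_paths=[path],
    passages=[("p-17", "Enzyme A is a rate-limiting step in Pathway B.")],
)
print(path_citation(path))
# [via: Drug X → INHIBITS → Enzyme A → INVOLVED_IN → Pathway B]
```

Placing the graph context ahead of the passages is a design choice made here, not a requirement of the pattern; the relative ordering is worth testing against the chosen LLM.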
5. Architecture Diagram
flowchart TD
subgraph Ingestion["Knowledge Graph Construction"]
A[Document Corpus]
B[Entity + Relation Extractor]
C[Graph Database]
D[Vector Index]
end
subgraph Query["Hybrid Query Pipeline"]
E[User Query]
F{Intent Classifier}
G[Graph Traversal]
H[Vector Search]
end
subgraph Generation["Context + Generation"]
I[Context Assembler]
J[LLM + Citations]
end
A --> B --> C
A --> D
E --> F
F -->|relational| G
F -->|semantic| H
G --> C
H --> D
G --> I
H --> I
I --> J --> E
style A fill:#dbeafe,stroke:#3b82f6
style B fill:#f0fdf4,stroke:#22c55e
style C fill:#fef9c3,stroke:#eab308
style D fill:#fef9c3,stroke:#eab308
style E fill:#dbeafe,stroke:#3b82f6
style F fill:#f3e8ff,stroke:#a855f7
style G fill:#f0fdf4,stroke:#22c55e
style H fill:#f0fdf4,stroke:#22c55e
style I fill:#f0fdf4,stroke:#22c55e
style J fill:#d1fae5,stroke:#10b981
6. Components
| Component | Type | Responsibility | Technology Options | Criticality |
| --- | --- | --- | --- | --- |
| Named Entity Recogniser | NLP | Extract entities from documents by type | spaCy, AWS Comprehend, Azure AI Language, fine-tuned BERT | Critical |
| Relationship Extractor | NLP / LLM | Identify typed relationships between entities | LLM-based (GPT-4 extraction), REBEL, OpenNRE | Critical |
| Entity Linker | NLP | Map entity mentions to canonical graph nodes | spaCy EntityLinker, REL, Wikipedia/Wikidata linking | High |
| Graph Database | Storage | Store and traverse entity-relationship graph | Neo4j, Amazon Neptune, Azure Cosmos DB (Gremlin), TigerGraph | Critical |
| Query Entity Detector | NLP | Identify entities in user query; link to graph nodes | spaCy NER + entity linker; vector similarity to graph node names | High |
| Query Intent Classifier | ML | Classify query as semantic / relational / hybrid | Fine-tuned classifier or LLM-based routing | High |
| Graph Traversal Engine | Retrieval | Execute Cypher/SPARQL queries for N-hop entity neighbourhood | Neo4j Python driver, Neptune Gremlin/SPARQL client | Critical |
| Subgraph Serialiser | Data Processing | Convert retrieved subgraph to LLM-readable text | Custom Python serialiser; template-based linearisation | High |
| Context Assembler | Orchestration | Combine graph text + vector chunks in LLM prompt | Custom; LangChain graph RAG modules | High |
| Reasoning Path Extractor | Post-processing | Extract and format graph traversal path for citation and audit | Custom extractor from Cypher query result metadata | High |
7. Data Flow
Primary Flow
| Step | Actor | Action | Output |
| --- | --- | --- | --- |
| 1 | NER + Relationship Extractor | Extract entities and relationships from corpus documents | Entity list + Relationship triples (Subject, Predicate, Object) |
| 2 | Entity Linker | Resolve mentions to canonical graph nodes | Canonical entity IDs |
| 3 | Graph Database | Ingest nodes and edges; create indexes on entity name, type, and source doc | Populated knowledge graph |
| 4 | User | Submit query | Query string |
| 5 | Query Entity Detector | Identify and link entities mentioned in query | {entity_id, entity_type, canonical_name} for each query entity |
| 6 | Query Intent Classifier | Classify query intent: semantic / relational / hybrid | Routing decision |
| 7 | Graph Traversal (if relational/hybrid) | Execute N-hop Cypher query from query entities | Subgraph: {nodes, edges, properties} |
| 8 | Subgraph Serialiser | Convert subgraph to "Entity A RELATIONSHIP Entity B; ..." text | Graph context text |
| 9 | Vector Retrieval (if semantic/hybrid) | Execute ANN search on query embedding | Top-K text chunks |
| 10 | Context Assembler | Combine graph context + text chunks | Multipart LLM prompt |
| 11 | LLM | Generate answer citing both graph paths and text sources | Raw response with citation markers |
| 12 | Reasoning Path Extractor | Extract graph traversal path(s) used | Structured citation: [Entity A → REL → Entity B → REL → Entity C] |
| 13 | Response | Return answer + graph reasoning paths + text citations | Final response |
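Step 5 of the primary flow can be illustrated with a deliberately naive matcher. The node catalogue, identifiers, and `detect_query_entities` function below are hypothetical; a real detector would use NER plus an entity linker or vector similarity against node names, as listed in the Components section. Plain substring matching is used only to show the shape of the output record.

```python
# Hypothetical catalogue of graph nodes with their known surface forms.
NODES = [
    {"entity_id": "drug:001", "entity_type": "Drug",
     "canonical_name": "Drug X", "aliases": ["drug-x"]},
    {"entity_id": "pathway:007", "entity_type": "Pathway",
     "canonical_name": "Pathway B", "aliases": ["pathway-b"]},
]


def detect_query_entities(query: str) -> list:
    """Return {entity_id, entity_type, canonical_name} for each node named in the query.

    Matching is case-insensitive substring search over canonical names and
    aliases, so it will over-match on short names; this is a sketch only.
    """
    lowered = query.lower()
    hits = []
    for node in NODES:
        surface_forms = [node["canonical_name"], *node["aliases"]]
        if any(form.lower() in lowered for form in surface_forms):
            hits.append(
                {key: node[key] for key in ("entity_id", "entity_type", "canonical_name")}
            )
    return hits


print(detect_query_entities("Which clinical trials reference the same pathway as Drug X?"))
```

An empty result from this step is the signal for the "entity not found in graph" branch of the error flow.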
Error Flow
| Error Condition | Detection | Recovery |
| --- | --- | --- |
| Entity not found in graph (new entity post-graph-construction) | Graph query returns empty result | Fall back to vector retrieval only; flag missing entity for graph update |
| Graph query timeout (deep traversal on large graph) | Query timeout | Reduce hop depth; use indexed entity traversal; alert graph DBA |
| Relationship extraction produces incorrect triple | Quality monitoring; human spot check | Human review queue; reject low-confidence triples at extraction |
| Query intent misclassification (relational query routed to vector only) | User feedback; answer quality monitoring | Fall back to hybrid routing by default; tune intent classifier |
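The first two recovery rows can be combined into one retrieval wrapper. In this sketch `GraphTimeout`, `retrieve_with_fallback`, and the flag strings are hypothetical, and `graph_query` and `vector_search` are caller-supplied callables standing in for the real traversal engine and vector index.

```python
class GraphTimeout(Exception):
    """Raised by the (hypothetical) graph client when a traversal exceeds its time limit."""


def retrieve_with_fallback(entity, graph_query, vector_search, max_hops=2):
    """Try graph traversal, reducing hop depth on timeout, then always run vector search.

    Returns a dict with graph paths (possibly empty), passages, and a list of
    flags that downstream monitoring can use to trigger graph updates or alerts.
    """
    flags = []
    graph_paths = []
    hops = max_hops
    while hops >= 1:
        try:
            graph_paths = graph_query(entity, hops)
            break
        except GraphTimeout:
            flags.append(f"graph_timeout_at_{hops}_hops")
            hops -= 1
    if not graph_paths:
        # Entity missing from the graph, or every depth timed out: vector-only answer.
        flags.append("no_graph_context")
    passages = vector_search(entity)
    return {"graph_paths": graph_paths, "passages": passages, "flags": flags}


def flaky_graph(entity, hops):
    """Stub traversal that times out above one hop."""
    if hops > 1:
        raise GraphTimeout()
    return [[(entity, "INHIBITS", "Enzyme A")]]


result = retrieve_with_fallback("Drug X", flaky_graph, lambda e: ["passage-1"])
print(result["flags"])  # ['graph_timeout_at_2_hops']
```

Running the vector search unconditionally reflects the "hybrid by default" recovery; a latency-sensitive deployment might instead run it only when the router asks for it or the graph path fails.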
8. Security Considerations
Graph Database Access Control
The graph database must enforce node-level and edge-level access controls. In enterprise deployments, not all employees should be able to traverse all entity relationships (e.g., individual employee salary nodes should not be traversable by peers). Graph database access control is less mature than relational database RBAC — evaluate the access control capabilities of your chosen graph database (Neo4j roles, Neptune IAM, Cosmos Gremlin RBAC) carefully.
OWASP LLM Top 10 Mitigations
| OWASP LLM Risk | Graph RAG Specific Concern | Mitigation |
| --- | --- | --- |
| LLM01: Prompt Injection | Adversarial entity names or relationship labels in the graph | Validate graph node and edge labels against ontology schema; reject non-schema values |
| LLM06: Sensitive Information Disclosure | Graph traversal reveals entity connections the user should not see | Node-level ACL in graph DB; traversal path filtered by user permissions |
| LLM08: Excessive Agency | Agent-driven graph traversal with unlimited hop depth could traverse the entire graph | Hard hop-depth limit (≤ 5); traversal node count limit |
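The three mitigations can be sketched as small guard functions. The ontology sets, the path limit of 200, the ACL structure, and all function names are assumptions for illustration. Labels and hop bounds are interpolated into the Cypher string here, so they are validated first; the entity name stays a query parameter. The `LIMIT` clause caps returned paths, which is used as a simple proxy for a node count limit.

```python
# Hypothetical ontology and limits.
ALLOWED_NODE_LABELS = {"Drug", "Gene", "Protein", "Disease", "ClinicalTrial"}
ALLOWED_REL_TYPES = {"INHIBITS", "ACTIVATES", "TREATS", "ASSOCIATED_WITH", "TESTED_IN"}
MAX_HOPS = 5
MAX_PATHS = 200


def validate_edge(subject_label: str, rel_type: str, object_label: str) -> bool:
    """Reject any extracted edge whose labels are not in the ontology (LLM01)."""
    return (
        subject_label in ALLOWED_NODE_LABELS
        and rel_type in ALLOWED_REL_TYPES
        and object_label in ALLOWED_NODE_LABELS
    )


def build_neighbourhood_query(label: str, hops: int, max_paths: int = MAX_PATHS) -> str:
    """Build a bounded variable-length Cypher query (LLM08)."""
    if label not in ALLOWED_NODE_LABELS:
        raise ValueError(f"label not in ontology: {label!r}")
    if not isinstance(hops, int) or not 1 <= hops <= MAX_HOPS:
        raise ValueError(f"hops must be an integer between 1 and {MAX_HOPS}")
    return (
        f"MATCH p = (e:{label} {{name: $name}})-[*1..{hops}]->(n) "
        f"RETURN p LIMIT {int(max_paths)}"
    )


def visible_paths(paths: list, node_acl: dict, user_groups: set) -> list:
    """Drop every path touching a node the user may not see (LLM06).

    node_acl maps node name -> set of groups allowed to see it.
    Nodes missing from node_acl are treated as hidden (default deny).
    """
    def can_see(node):
        return bool(node_acl.get(node, set()) & user_groups)

    return [p for p in paths if all(can_see(s) and can_see(o) for s, _rel, o in p)]


print(build_neighbourhood_query("Drug", 2))
# MATCH p = (e:Drug {name: $name})-[*1..2]->(n) RETURN p LIMIT 200
```

Filtering paths in application code is a fallback; where the chosen graph database supports fine-grained access control, enforcing it in the database is preferable because the restricted nodes never leave it.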
9. Governance Considerations
Knowledge Graph Quality Governance
The knowledge graph is a critical data asset. Every node and edge must have a provenance record (which document it was extracted from, by which model, and when). Incorrect relationships in the graph can produce systematically incorrect AI answers across all queries that traverse those relationships — a much higher risk than a single incorrect text chunk. Domain expert review of extracted relationships before graph population is strongly recommended for regulated domains.
Graph Schema (Ontology) Governance
The graph schema (entity types, relationship types, cardinality constraints) must be version-controlled and subject to a formal change management process. Schema changes may require partial or full re-extraction of the document corpus.
Governance Artefacts
| Artefact | Owner | Frequency | Purpose |
| --- | --- | --- | --- |
| Graph Schema (Ontology) | Domain Architect + Knowledge Engineer | Per version | Defines entity and relationship types; reviewed by domain experts |
| Entity Extraction Precision/Recall Report | AI Operations | Monthly | Validate extraction quality on held-out test set |
| Graph Population Audit Log | Data Engineering | Per extraction run | Track which documents contributed which nodes and edges |
| Reasoning Path Audit Archive | Compliance | Per session | Immutable record of every graph traversal path used in a response |
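One way to make the Reasoning Path Audit Archive tamper-evident is a hash chain, where each record stores the hash of its predecessor. This is an assumption about how immutability might be approached, not a method prescribed by the pattern; write-once storage or a managed ledger service would serve the same purpose. The record fields and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first record


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_record(log: list, session_id: str, path: list) -> None:
    """Append a reasoning path, chained to the hash of the previous record."""
    body = {
        "session_id": session_id,
        "path": path,
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    log.append({**body, "hash": _digest(body)})


def verify(log: list) -> bool:
    """Recompute every hash and check each record points at its predecessor."""
    prev = GENESIS
    for record in log:
        body = {key: record[key] for key in ("session_id", "path", "prev_hash")}
        if record["prev_hash"] != prev or _digest(body) != record["hash"]:
            return False
        prev = record["hash"]
    return True


audit_log = []
append_record(audit_log, "session-1", ["Drug X", "INHIBITS", "Enzyme A"])
append_record(audit_log, "session-1", ["Enzyme A", "INVOLVED_IN", "Pathway B"])
print(verify(audit_log))  # True
```

A hash chain detects edits to past records but does not prevent them, so it complements rather than replaces access controls on the archive itself.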
10. Operational Considerations
Monitoring
| Metric | Alert Threshold | Notes |
| --- | --- | --- |
| Graph traversal P99 latency | > 500ms | Index query plan; add graph indexes on high-traversal entity types |
| Entity linking miss rate | > 15% of query entities | New entities post-construction; trigger graph update |
| Relationship extraction confidence | < 0.85 average | Model quality degradation; retrain or upgrade extraction model |
| Graph node count growth rate | Anomalous spike | Potential duplicate entity creation; entity linker quality issue |
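The three numeric thresholds in the monitoring table can be expressed as a small rule table. The metric key names and the `breached_alerts` function are hypothetical; the node count growth rule is omitted because "anomalous spike" needs a baseline that this sketch does not model.

```python
import operator

# Thresholds taken from the monitoring table; metric key names are assumptions.
ALERT_RULES = {
    "graph_traversal_p99_ms": (operator.gt, 500),
    "entity_linking_miss_rate": (operator.gt, 0.15),
    "relationship_extraction_confidence_avg": (operator.lt, 0.85),
}


def breached_alerts(metrics: dict) -> list[str]:
    """Return the names of metrics that cross their alert threshold.

    Metrics missing from the input are skipped rather than treated as breaches.
    """
    alerts = []
    for name, (compare, threshold) in ALERT_RULES.items():
        value = metrics.get(name)
        if value is not None and compare(value, threshold):
            alerts.append(name)
    return alerts


snapshot = {
    "graph_traversal_p99_ms": 620,
    "entity_linking_miss_rate": 0.10,
    "relationship_extraction_confidence_avg": 0.80,
}
print(breached_alerts(snapshot))
# ['graph_traversal_p99_ms', 'relationship_extraction_confidence_avg']
```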
Service Level Objectives
| SLO | Target | Notes |
| --- | --- | --- |
| Hybrid Graph+Vector query P95 | ≤ 4 seconds | Graph traversal adds ~100–300ms vs. vector-only |
| Entity linking coverage | ≥ 85% of query entities linked | Measured monthly |
| Graph availability | ≥ 99.9% | Graph DB cluster health |
11. Cost Considerations
Cost Drivers
| Cost Driver | Notes |
| --- | --- |
| Knowledge graph construction (one-time) | LLM-based extraction: $0.50–$5.00 per document; significant upfront cost for large corpora |
| Graph database hosting | Neo4j AuraDB: $200–$2,000/month; self-hosted: compute + storage |
| Ongoing extraction (new documents) | Incremental cost as corpus grows |
| Human expert review (if required) | $50–$200/hour; significant for regulated domains |
Indicative Cost Range
| Deployment Scale | Construction Cost (One-Time) | Monthly Operational Cost |
| --- | --- | --- |
| Small (< 10K documents) | $5,000 – $20,000 | $500 – $2,000 |
| Medium (10K–100K documents) | $20,000 – $100,000 | $2,000 – $8,000 |
| Large (> 100K documents) | $100,000 – $500,000 | $8,000 – $30,000 |
12. Trade-Off Analysis
Graph Database Selection
| Database | Strengths | Weaknesses | Recommended For |
| --- | --- | --- | --- |
| Neo4j | Best-in-class Cypher; mature tooling; AuraDB managed | Cost at scale; less integrated with cloud AI services | Most enterprise deployments |
| Amazon Neptune | AWS-native; SPARQL + Gremlin; serverless option | Gremlin/SPARQL learning curve; higher latency than Neo4j | AWS-native deployments |
| Azure Cosmos DB (Gremlin) | Azure-native; globally distributed | Limited graph analytics; Gremlin only | Azure-native deployments |
| TigerGraph | Highest performance for deep analytics | Commercial; steep learning curve | Large-scale graph analytics |
Graph Construction Approach
| Approach | Quality | Cost | Maintenance |
| --- | --- | --- | --- |
| Manual curation by domain experts | Highest | Very High | Highest |
| LLM-based extraction + human review | High | Medium | Medium |
| Fully automated LLM extraction | Medium | Low | Low |
| Ontology alignment from structured sources (databases) | Very High (for structured domains) | Low | Low |
Architectural Tensions
| Tension | Trade-off | Recommendation |
| --- | --- | --- |
| Graph completeness vs. construction cost | Complete graph: high recall; partial graph: lower cost | Prioritise high-value entity types first; expand incrementally |
| Deep traversal vs. latency | Deep (5+ hops): comprehensive; shallow (2 hops): fast | Default 2-hop; "deep research" mode up to 4 hops on user request |
13. Failure Modes
| Failure Mode | Likelihood | Impact | Detection | Recovery |
| --- | --- | --- | --- | --- |
| Incorrect relationship in graph (extraction error) | Medium | High | Human spot-check; downstream QA; user feedback | Human review queue; relationship confidence threshold; soft-delete with provenance |
| Entity disambiguation error (two distinct entities merged) | Medium | High | Entity disambiguation accuracy monitoring | Human review of high-frequency entities; disambiguation test set |
| Graph schema staleness (new entity type not in ontology) | Low | Medium | Schema coverage monitoring | Schema extension process; re-extraction for new entity types |
| Large subgraph exceeds LLM context window | Medium | Medium | Token count monitoring before LLM call | Prune subgraph by relationship strength; summarise large subgraphs |
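The pruning recovery for oversized subgraphs can be sketched as a greedy selection under a token budget. The per-edge `weight` (standing in for "relationship strength") and the whitespace token counter are assumptions; a real system would use the target model's tokenizer and whatever strength signal the graph records, such as extraction confidence.

```python
def prune_to_budget(triples, budget_tokens, count_tokens=lambda text: len(text.split())):
    """Keep the strongest relationships that fit within a token budget.

    triples: list of (subject, relationship, object, weight) tuples.
    Edges are considered in descending weight order; an edge that does not
    fit is skipped, so a cheaper lower-weight edge may still be included.
    """
    kept, used = [], 0
    for subject, rel, obj, weight in sorted(triples, key=lambda t: t[3], reverse=True):
        cost = count_tokens(f"{subject} {rel} {obj};")
        if used + cost > budget_tokens:
            continue
        kept.append((subject, rel, obj, weight))
        used += cost
    return kept


subgraph = [
    ("Drug X", "MENTIONED_WITH", "Drug Q", 0.2),
    ("Drug X", "INHIBITS", "Enzyme A", 0.9),
    ("Enzyme A", "INVOLVED_IN", "Pathway B", 0.7),
]
for triple in prune_to_budget(subgraph, budget_tokens=10):
    print(triple)
```

Pruning by weight alone can break a reasoning chain in the middle; a variant that scores whole paths rather than single edges would preserve chain integrity at the cost of a coarser budget fit.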
14. Regulatory Considerations
| Regulation | Requirement | Graph RAG Response |
| --- | --- | --- |
| EU AI Act Article 13 | AI system outputs must be interpretable and explainable | Graph traversal path is the primary explainability artefact; included in every response on request |
| APRA CPG 234 | Explainable AI for regulated decisions | Reasoning path archived per session; available for regulatory examination |
| Privacy Act 1988 | Entity-level personal information in graph | Person entities require the same ACL controls as documents containing personal information |
| ISO/IEC 42001 | AI system performance and bias monitoring | Entity extraction bias monitoring (which entities are systematically mislinked) |
15. Reference Implementations
AWS
- Graph DB: Amazon Neptune (Gremlin or SPARQL)
- Extraction: SageMaker + custom NER model; Bedrock for LLM-based relationship extraction
- Vector index: OpenSearch k-NN
- Orchestration: Lambda + Step Functions; LangChain graph RAG
Azure
- Graph DB: Azure Cosmos DB for Apache Gremlin, or Neo4j on Azure Marketplace
- Extraction: Azure AI Language (NER) + Azure OpenAI for relationship extraction
- Vector index: Azure AI Search
- Orchestration: Prompt Flow with graph retriever
Self-Hosted
- Graph DB: Neo4j Community or Enterprise on Kubernetes
- Extraction: spaCy + REBEL (open-source relationship extraction) + Ollama for LLM extraction
- Vector index: Weaviate or Qdrant
- Orchestration: LangChain GraphCypherQAChain or LlamaIndex KnowledgeGraphIndex
16. Related Patterns
| Pattern ID | Pattern Name | Relationship |
| --- | --- | --- |
| EAAPL-RAG001 | Enterprise RAG | Foundation for the vector retrieval path in hybrid Graph RAG |
| EAAPL-RAG007 | Agentic RAG | Agent can use graph traversal as one of its retrieval tools |
| EAAPL-KNW001 | Enterprise Knowledge Graph | Provides the knowledge graph infrastructure that Graph RAG queries |
| EAAPL-KNW005 | Knowledge Graph for Explainability | Extends Graph RAG with explicit reasoning path explanation generation |
17. Maturity Assessment
Overall Maturity: Emerging — Knowledge graph construction tooling is mature (Neo4j, Neptune); LLM-based relationship extraction is proven; end-to-end Graph RAG pipelines are in production at a small number of leading enterprises; the pattern requires significant engineering investment and is not yet commoditised.
| Dimension | Score (1–5) | Rationale |
| --- | --- | --- |
| Technology Readiness | 3 | Graph databases are mature; LLM extraction and Graph RAG orchestration are evolving |
| Tooling Ecosystem | 2 | LangChain GraphCypherQAChain and LlamaIndex KnowledgeGraphIndex exist but are not production-grade without significant customisation |
| Operational Guidance | 2 | Limited production guidance; graph schema governance is organisation-specific |
| Security & Compliance | 2 | Node-level ACL in graph databases is less mature than document ACL |
| Scalability Evidence | 2 | Medium-scale deployments (< 10M nodes) proven; billion-node scale is specialised |
| Cost Predictability | 2 | Construction cost is significant and variable; difficult to estimate without corpus sampling |
18. Revision History
| Version | Date | Author | Changes |
| --- | --- | --- | --- |
| 1.0 | 2024-09-01 | EAAPL Working Group | Initial publication |
| 1.1 | 2025-02-20 | EAAPL Working Group | LLM-based extraction formalised; reasoning path citation format specified; regulatory explainability mapping added |