Query-Aware Spreading Activation for Multi-Hop Retrieval over Knowledge Graphs

2026-06-29Machine Learning

Machine LearningArtificial IntelligenceInformation Retrieval
AI summary

The authors improved a method for answering complicated questions using knowledge graphs by making the search through the graph more aware of the question at every step. Instead of using a complex process that loads the whole graph into memory, they designed a simpler approach that uses similarity scores and fixed steps within the graph database itself. Their method matches or beats existing systems on some question-answering tests and is faster because it keeps the data inside the database and uses fewer steps. They also showed that their key improvement, the "semantic gate," is crucial for better accuracy and speed.

Retrieval-augmented generationKnowledge graphsMulti-hop question answeringGraph traversalQuery-aware traversalSpreading activationCosine similarityNeo4jCypher querySeed nodes
Authors
Illia Makarov, Mykola Glybovets
Abstract
Retrieval-augmented generation built on knowledge graphs (Graph RAG) outperforms flat passage retrieval on multi-hop question answering by leveraging graph structure. In most existing systems, however, the question only sets the seed nodes; the subsequent traversal becomes "query-blind", depending solely on the graph structure. The exception is QAFD-RAG, which implements query-aware traversal via a flow-diffusion solver with combined edge re-weighting. This architecture requires loading the full graph into Python memory and an iterative solver with a variable number of iterations complicating integration with the graph database. We propose a spreading-activation method that achieves the same query-aware traversal with a single per-step semantic gate: the step weight is the cosine similarity between the candidate entity's description and the question, and the number of iterations is fixed. The whole retrieval procedure - seed mapping, propagation, top-K selection and context assembly - is expressed as a single Cypher query executed in one round-trip to Neo4j; the graph never leaves the database. On MuSiQue our method matches QAFD-RAG by exact match (32.80 vs 33.50) and outperforms the strongest purely-structural baseline in our comparison, HippoRAG, by 5.3 EM and 3.4 F1; on 2WikiMultiHopQA HippoRAG and QAFD-RAG retain an advantage due to their phrase-node architectures. An ablation with the gate disabled confirms that the gate is the source of a simultaneous F1 gain of 3.6 to 7.4 points and a retrieval-latency reduction by a factor of 1.5 to 4.9.