AI summaryⓘ
The authors explain that current graph query languages like GQL and SQL/PGQ can't easily combine smaller queries into bigger ones, which limits what they can express, especially for certain complex problems. They point out that although these languages handle basic graph searches well, they miss out on problems requiring more advanced computational logic. To fix this, the authors propose a new, more flexible language that lets users build complex queries by combining simpler graph patterns and relational data. Their approach uses improved definitions of path queries and a new compositional language called #Datalog to create and manipulate graph elements more effectively. They suggest adding these ideas to future versions of GQL and SQL/PGQ to enhance their power and completeness.
GQLSQL/PGQgraph query languagescompositionalityNLOGSPACEregular path queries#Dataloggraph reachabilityfirst-order queriesknowledge graphs
Authors
Marcelo Arenas, Leonid Libkin, Wim Martens
Abstract
A major shortcoming of the recently standardized graph query languages GQL and SQL/PGQ is their lack of compositionality. Given the importance of these languages in querying knowledge graphs, we address this shortcoming and propose both theoretical solutions and a path to adding them to the new standards. The highlight of the non-compositionality problem is that while both GQL and SQL/PGQ can express graph reachability and all first-order queries, they fall short of the problems in NLOGSPACE. In view of the completeness of reachability for NLOGSPACE under first-order reductions, this is extremely counterintuitive. The issue is well recognized by the standards committee that has been searching for language extensions to fill the gaps at the level of some specific inexpressible queries. We address the issue in a systematic way and propose a language that fills expressivity gaps by allowing full compositionality between graph patterns and relational queries. It does so by using two key components: a cleaned up definition of regular path queries with variables and data value comparisons, and a fully compositional graph-to-graph language #Datalog with complete support for constructing new graph elements from nodes, edges, lists of nodes and edges, and even entire paths. We show that the resulting language addresses the issues facing the standards committee, and propose a concrete addition to GQL and SQL/PGQ that incorporates its main features.