Construction of Historical Knowledge Graphs Based on BERT and Graph Neural Networks

2026-06-01 • Computation and Language

Computation and Language, Artificial Intelligence

AI summary

The authors propose a system that combines two AI techniques—BERT, which understands language context, and graph neural networks, which handle relationships between entities—to turn old historical texts into structured knowledge graphs. They address language challenges in traditional history texts, such as unclear references and inconsistent grammar. Their new approach, tested on various historical documents, performs better than previous methods in accurately identifying important information and connections. This work shows how linking language understanding with graph learning helps automate building detailed historical data collections.

BERT, Graph Neural Networks (GNN), Knowledge Graphs, Digital Humanities, Historical Text Analysis, Natural Language Processing, Entity Extraction, Relationship Extraction, FastRQNet, Vision-Language Models

Authors

Ping Li, Bartlomiej Brzozka

Abstract

In digital humanities research and large-scale historical data analysis, a significant amount of traditional historical text is converted into structured knowledge graphs. This paper presents a high-level architecture that combines Bidirectional Encoder Representations from Transformers (BERT) with graph neural networks (GNNs) to extract entities and relationships from various types of historical texts. The architecture systematically addresses the linguistic ambiguities, context-dependent references, and lack of established grammatical norms found in traditional historical texts. The study also develops a new image retrieval system based on FastRQNet and the pre-trained vision-language model Vilt-qaformer+RoBInet. The experiments use a comprehensive collection of municipal records, parliamentary documents, and historical correspondence. Compared with conventional rule-based techniques and other popular deep-learning baselines, the joint BERT-GNN system achieves higher Precision, Recall, and F1-score (Table 2). The architecture handles complex nested structures and implicit references with sufficient accuracy and coverage for knowledge graph construction. These experiments show that combining relational graph learning with context-sensitive semantic representations makes it possible to extract historical information automatically and add it to a knowledge repository.
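
The abstract reports Precision, Recall, and F1-score for extracted entities and relationships. As a rough illustration of how such scores are commonly computed for knowledge graph extraction, the sketch below compares a set of predicted (head, relation, tail) triples against a gold set using exact matching. It is a minimal example under stated assumptions, not the authors' evaluation code: the `Triple` type, the function name `precision_recall_f1`, the exact-match criterion, and the toy records are all hypothetical and do not come from the paper.

```python
from typing import Iterable, NamedTuple, Tuple


class Triple(NamedTuple):
    """One knowledge graph edge (hypothetical representation)."""
    head: str
    relation: str
    tail: str


def precision_recall_f1(
    predicted: Iterable[Triple], gold: Iterable[Triple]
) -> Tuple[float, float, float]:
    """Score predicted triples against gold triples by exact match.

    Assumption: a prediction counts as correct only if head, relation,
    and tail all match a gold triple. The paper may use another criterion.
    """
    predicted_set, gold_set = set(predicted), set(gold)
    true_positives = len(predicted_set & gold_set)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Toy data, invented for illustration only.
gold_triples = {
    Triple("City Council", "issued", "Ordinance 12"),
    Triple("J. Smith", "member_of", "Parliament"),
    Triple("J. Smith", "wrote_to", "A. Jones"),
}
predicted_triples = {
    Triple("City Council", "issued", "Ordinance 12"),
    Triple("J. Smith", "member_of", "Parliament"),
    Triple("A. Jones", "member_of", "Parliament"),  # spurious prediction
}

precision, recall, f1 = precision_recall_f1(predicted_triples, gold_triples)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Exact matching is the strictest choice. Evaluations of historical text often relax it, for example by normalizing spelling variants of names before comparison, which would raise the scores without changing the extraction model.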
