Mapping Political-Elite Networks in Europe with a Multilingual Joint Entity-Relation Extraction Pipeline

2026-06-25 | Computation and Language

AI summary

The authors present an open-weight, multilingual pipeline that reads large news corpora and builds signed, temporal knowledge graphs of political actors and their relationships. It recognizes named entities, links each mention to a language-independent Wikidata identifier, and then extracts directed positive or negative relationships constrained by a domain ontology. In a spot-check against a 3491-relation gold standard, textual correctness ranged from 68.2% (strict) to 93.7% (lenient). Two case studies demonstrate the pipeline: in Austria it reconstructs the lifecycle of a political party, and in Poland it maps state-enterprise patronage networks and the conflict between Civic Platform (PO) and Law and Justice (PiS). The result turns unstructured news text into structured relational data for cross-national political research.

named-entity recognition, knowledge graphs, entity-relation extraction, large language models, Wikidata identifiers, ontology, signed networks, cross-lingual text analysis, political networks, computational social science
Authors
Kirill Solovev, Jana Lasser
Abstract
Whether political elites organise into rent-seeking coalitions that capture public resources or civic networks that sustain governance is a central question in comparative politics. Yet observing these complex, informal, and adversarial ties at scale has historically required intensive manual coding, while automated text-as-data methods have largely been limited to simple co-occurrence. Recent large language model (LLM) approaches offer a path forward but often rely on proprietary APIs, lack cross-lingual capability, and struggle with scalable entity resolution. We present a modular, fully open-weight pipeline for multilingual joint entity-relation extraction that builds signed, temporal knowledge graphs from massive unstructured news corpora. It combines span-based named-entity recognition (NER) with a three-stage linking cascade mapping mentions to language-independent Wikidata identifiers; a high-throughput, ontology-constrained mixture-of-experts model then uses guided decoding to extract directed, signed relationships grounded in a domain ontology. A full-coverage spot-check against a 3491-relation gold standard shows high textual correctness (68.2% strict to 93.7% lenient). Two large-scale case studies validate the pipeline against the public record. In Austria, it reconstructs a political party's complete lifecycle, dating internal fractures and tracking personnel into successor factions and court convictions. In a Polish corpus, it uncovers the overlapping economic and governance networks of state-enterprise patronage, alongside the structurally balanced, signed conflict network of the polarized Civic Platform (Platforma Obywatelska, PO)–Law and Justice (Prawo i Sprawiedliwość, PiS) duopoly. By bridging raw multilingual text and structured relational data, our framework provides a robust, replicable foundation for cross-national empirical computational social science.
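
To make the notion of a "structurally balanced, signed" network concrete, the following is a minimal sketch of a classical triad-based balance check on signed, dated relations. It is not the authors' implementation: the abstract does not say how balance was measured, so the `Relation` record, the sign-summing aggregation rule, and the entity IDs (`A1`, `B1`, and so on, standing in for Wikidata identifiers) are all illustrative assumptions.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Relation:
    """One extracted edge. Field names are hypothetical, not the paper's schema."""
    source: str  # placeholder entity ID (a Wikidata QID in a real pipeline)
    target: str
    sign: int    # +1 for a positive tie, -1 for a negative tie
    date: str    # ISO date of the article the relation came from


def undirected_signs(relations):
    """Collapse directed relations into one sign per unordered pair.

    Assumption: signs are summed per pair and exact ties are dropped.
    """
    totals = {}
    for r in relations:
        key = frozenset((r.source, r.target))
        totals[key] = totals.get(key, 0) + r.sign
    return {k: (1 if v > 0 else -1) for k, v in totals.items() if v != 0}


def balanced_triad_fraction(relations):
    """Share of closed triads whose three edge signs multiply to +1."""
    signs = undirected_signs(relations)
    nodes = sorted({n for pair in signs for n in pair})
    balanced = total = 0
    for a, b, c in combinations(nodes, 3):
        edges = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if all(e in signs for e in edges):
            total += 1
            product = signs[edges[0]] * signs[edges[1]] * signs[edges[2]]
            balanced += product > 0
    return balanced / total if total else None


# Toy two-camp example: allies within a camp, conflict across camps.
toy = [
    Relation("A1", "A2", +1, "2020-01-01"),
    Relation("B1", "B2", +1, "2020-01-02"),
    Relation("A1", "B1", -1, "2020-01-03"),
    Relation("A2", "B1", -1, "2020-01-04"),
    Relation("A1", "B2", -1, "2020-01-05"),
    Relation("A2", "B2", -1, "2020-01-06"),
]
print(balanced_triad_fraction(toy))  # prints 1.0
```

A perfectly polarized two-camp network like this toy example scores 1.0, because every triad contains either zero or two negative ties. Cross-camp alliances or within-camp conflicts would lower the fraction.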