Disentangling Speaker and Language Effects in Cross-Lingual Speaker Verification for Iberian Languages

2026-07-01 • Computation and Language

AI summary

The authors examine how well speaker verification systems perform when the enrollment and test recordings are in different languages. Standard evaluations use different speakers for each language, so it is hard to tell whether errors stem from the language change or from differences between speakers. To separate the two effects, they built a bilingual same-speaker test set covering five Iberian languages. They find that speaker differences explain part of the degradation, but language mismatch remains the main cause. This gives a more precise picture of where the difficulty in cross-lingual speaker verification comes from.

cross-lingual speaker verification, language mismatch, enrollment utterance, test utterance, HuBERT model, Iberian languages, speaker variability, Cross-Lingual Transfer Matrix (CLTM), bilingual evaluation set

Authors

Pol Buitrago, Javier Hernando

Abstract

Cross-lingual speaker verification (SV) systems typically exhibit performance degradation when enrollment and test utterances are spoken in different languages. However, standard evaluation protocols confound language mismatch with inter-speaker variability, as evaluation is generally performed with different speakers across languages. In this work, we introduce a bilingual same-speaker evaluation set for five Iberian languages, enabling analysis of cross-lingual SV under constant speaker identity. We apply this setup to a HuBERT-based SV system previously shown to exhibit strong language dependence, and analyze results using the Cross-Lingual Transfer Matrix (CLTM) to study pairwise cross-lingual transfer. Our results show that speaker-related variability accounts for part of the observed degradation, but language mismatch remains the main driver of cross-lingual performance loss. These findings provide a more precise characterization of language dependence in cross-lingual SV.
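
The abstract does not define the CLTM, so the following is only a rough sketch of the kind of pairwise analysis it describes, not the authors' method. It assumes each trial carries an enrollment language, a test language, a similarity score, and a target/non-target label, and it fills a matrix with the equal error rate (EER) for each language pair. The function names, the trial tuple layout, and the choice of EER as the metric are all assumptions made for illustration.

```python
import numpy as np


def equal_error_rate(scores, labels):
    """Approximate EER: the error rate where false accepts equal false rejects.

    scores: similarity scores, higher means "more likely the same speaker".
    labels: truthy for target (same-speaker) trials, falsy for non-target.
    Returns NaN when one of the two classes is missing.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    if labels.all() or not labels.any():
        return float("nan")
    thresholds = np.unique(scores)
    # A trial is accepted when its score is >= the threshold.
    far = np.array([(scores[~labels] >= t).mean() for t in thresholds])
    frr = np.array([(scores[labels] < t).mean() for t in thresholds])
    i = int(np.argmin(np.abs(far - frr)))
    return float((far[i] + frr[i]) / 2)


def pairwise_eer_matrix(trials, languages):
    """Hypothetical helper: EER per (enrollment language, test language) pair.

    trials: iterable of (enroll_lang, test_lang, score, is_target) tuples.
    languages: ordered list of language labels; row = enrollment, column = test.
    Pairs with no trials are left as NaN.
    """
    index = {lang: k for k, lang in enumerate(languages)}
    buckets = {}
    for enroll_lang, test_lang, score, is_target in trials:
        buckets.setdefault((enroll_lang, test_lang), []).append((score, is_target))
    matrix = np.full((len(languages), len(languages)), np.nan)
    for (enroll_lang, test_lang), items in buckets.items():
        pair_scores, pair_labels = zip(*items)
        matrix[index[enroll_lang], index[test_lang]] = equal_error_rate(
            pair_scores, pair_labels
        )
    return matrix


# Toy trials with placeholder language labels (not taken from the paper).
toy_trials = [
    ("lang_a", "lang_b", 0.9, True),
    ("lang_a", "lang_b", 0.8, True),
    ("lang_a", "lang_b", 0.2, False),
    ("lang_a", "lang_b", 0.1, False),
]
toy_matrix = pairwise_eer_matrix(toy_trials, ["lang_a", "lang_b"])
```

Under these assumptions, the diagonal would hold same-language results and the off-diagonal cells the cross-lingual ones. Computing the matrix once with same-speaker target trials and once with the usual different-speaker protocol is one way the two sources of degradation could be compared.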
