Rhythm of the Deep: A Computational-Linguistic Test of Duality of Patterning in Sperm Whale Codas

2026-06-15 • Artificial Intelligence

Artificial Intelligence, Computation and Language

AI summary

The authors studied the sounds made by sperm whales to see whether they have a two-level pattern like human language, where small parts combine into bigger units, which then form sequences. They used computational methods to analyze 1,483 sperm whale codas (short click sequences) and found evidence for two tiers: a lower tier in which clicks group into codas according to which clicks occur together and their rhythm, and an upper tier in which sequences of codas show dependence over time. Their results show that the lower-tier click units are strongly tied to tempo, while coda identity is more stable, and that rhythm alone does not reproduce the upper-tier sequential pattern. The authors do not claim these sounds are language, but their work suggests a dual-patterning-like structure in whale communication sounds.

Duality of patterning, Sperm whale codas, Acoustic similarity, Transfer entropy, Rhythm, Sequence dependence, Audio encoding, Combinatorial structure, Computational linguistics, Token systems

Authors

Mudit Sinha, Sanika Chavan

Abstract

Human language has often been described as combining structure at two levels: lower-level units combine into larger units, which then combine into larger sequences. We test for this design feature, duality of patterning, in sperm whale codas using 1,483 codas from the Dominica Sperm Whale Project. Because acoustic similarity can imitate symbolic structure, we treat the problem as computational-linguistic structure discovery from continuous audio rather than as a direct claim about language or meaning. We use a consensus of frozen audio encoders, held-out structural tests, per-statistic nulls, and acoustic-null recoverability gates. The evidence supports a narrow two-tier architecture. At the lower tier, clicks compose into codas not by a stable ordered rule, but by which clicks are present together with their inter-click rhythm. At the upper tier, coda tokens show bout-level sequential dependence, with an NSB second-order transfer-entropy lift of 0.132 bits (p = 0.002). Under tempo scaling, encoder-derived click identity is strongly rate-bound, while coda identity remains substantially more stable, yielding a measurable abstraction gradient across the click-to-coda step. Rhythm-only baselines recover substantial lower-tier structure but fail to reproduce the upper-tier sequential-dependence signal. We do not claim language, semantics, perception, or human-like phonemes. Instead, we report representation-level evidence for a duality-of-patterning-like architecture whose lower tier is rhythmic rather than segmental, and provide a portable null-controlled framework for testing combinatorial structure in induced acoustic token systems.
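
The upper-tier statistic mentioned above can be illustrated with a toy calculation. The following is a minimal sketch under stated assumptions, not the authors' pipeline: it uses a plug-in (maximum-likelihood) estimator rather than the NSB estimator named in the abstract, it assumes two parallel discrete token streams (the hypothetical `source` and `target`), and it substitutes a simple source-shuffle null for the paper's per-statistic nulls. All function names are illustrative.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target, k=2):
    """Plug-in estimate (bits) of transfer entropy source -> target.

    Uses a history of length k for both streams (k=2 is "second-order").
    ASSUMPTION: a plug-in estimator, not the NSB estimator from the abstract.
    """
    if len(source) != len(target):
        raise ValueError("source and target must have equal length")
    joint = Counter()    # counts of (x_t, x_history, y_history)
    hist_xy = Counter()  # counts of (x_history, y_history)
    pair_x = Counter()   # counts of (x_t, x_history)
    hist_x = Counter()   # counts of x_history
    n = 0
    for t in range(k, len(target)):
        xh = tuple(target[t - k:t])
        yh = tuple(source[t - k:t])
        xt = target[t]
        joint[(xt, xh, yh)] += 1
        hist_xy[(xh, yh)] += 1
        pair_x[(xt, xh)] += 1
        hist_x[xh] += 1
        n += 1
    te = 0.0
    for (xt, xh, yh), c in joint.items():
        p_full = c / hist_xy[(xh, yh)]             # p(x_t | x_hist, y_hist)
        p_reduced = pair_x[(xt, xh)] / hist_x[xh]  # p(x_t | x_hist)
        te += (c / n) * math.log2(p_full / p_reduced)
    return te


def shuffle_null(source, target, k=2, n_perm=200, seed=0):
    """Return (observed TE, permutation p-value) under a source-shuffle null.

    ASSUMPTION: shuffling the source destroys its temporal alignment with the
    target; this stands in for the paper's nulls, which are not specified here.
    """
    rng = random.Random(seed)
    observed = transfer_entropy(source, target, k)
    shuffled = list(source)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if transfer_entropy(shuffled, target, k) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data (hypothetical): the target copies the source with a lag of one.
    rng = random.Random(1)
    src = [rng.randint(0, 1) for _ in range(500)]
    tgt = [0] + src[:-1]
    print(shuffle_null(src, tgt))
```

On the toy data, where the target is a lagged copy of a random binary source, the estimate should come out close to 1 bit and the shuffled surrogates should fall well below it. Plug-in estimates are biased upward on small samples, which is one reason a bias-corrected estimator and an explicit null are useful when reporting a "lift" of this kind.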
