Topological data analysis of human vowels: Persistent homologies across representation spaces

Topological data analysis is a mathematical framework that characterizes the global shape of complex data by identifying structural features, such as clusters, loops, or holes, that remain stable despite noise or small variations. We studied how vocal signals, such as the example shown in (a), should be represented so that topology-based algorithms can classify them correctly.

We compared the topological features extracted from several signal representations, including spectrograms (b), spectrogram zeros (c), and Takens’ embeddings (d). Using a publicly available dataset of 11,200 recorded vowel utterances, we conducted an empirical analysis demonstrating that these topological features provide additional discriminative information for both speaker and vowel classification. Moreover, features derived from different signal representations appear to be complementary. Our results suggest that low-persistence topological features, often dismissed as “topological noise”, encode important information about speech.
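As an illustration of one of the representations named above, a Takens' (time-delay) embedding turns a one-dimensional signal into a point cloud whose shape can then be summarized with persistent homology. The following is a minimal sketch only: the function name `takens_embedding`, the `dimension` and `delay` values, and the synthetic sine wave are assumptions chosen for illustration, not the settings or data used in the paper.

```python
import numpy as np

def takens_embedding(signal, dimension=3, delay=1):
    """Return delay vectors (x[t], x[t+delay], ..., x[t+(dimension-1)*delay])."""
    x = np.asarray(signal, dtype=float)
    n_points = x.size - (dimension - 1) * delay
    if n_points <= 0:
        raise ValueError("signal too short for this dimension and delay")
    # Each column is the signal shifted by one more multiple of the delay.
    columns = [x[i * delay : i * delay + n_points] for i in range(dimension)]
    return np.stack(columns, axis=1)

# Synthetic periodic signal standing in for a vowel recording (not real data).
t = np.arange(200)
signal = np.sin(2 * np.pi * t / 50)

# Hypothetical parameters: a 2-D embedding with a delay of 12 samples.
cloud = takens_embedding(signal, dimension=2, delay=12)
```

For a periodic signal like this one, the embedded points trace a closed curve, which is the kind of loop a persistence diagram would register as a one-dimensional feature. Computing that diagram would require a dedicated TDA library and is not shown here.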

Guillem Bonafos, Pierre Pudlo, Jean-Marc Freyermuth, Samuel Tronçon, and Arnaud Rey.
2026.
Speech Communication 178 (March): 103363 — @HAL