Flexible Online Representation Learning Based on Similarity Matching
2026-06-01 • Machine Learning
Machine Learning
AI summaryⓘ
The authors present a new learning algorithm that works well with data having many dimensions but only a few important features (sparse high-dimensional data). Their method can find hidden groups (clustering), cover shapes in data (manifold tiling), or learn simple features (sparse coding). Unlike older methods, their algorithm runs efficiently online and respects certain math rules called row sum constraints, which help maintain useful properties like shift invariance. This makes their approach practical for large, complex datasets and biologically inspired learning models.
sparse high-dimensional representationclusteringmanifold tilingsparse codingonline learningrow sum constraintshift invariancecompletely positive matricesdoubly nonnegative matrices
Authors
Shagesh Sridharan, Yanis Bahroun, Anirvan M. Sengupta
Abstract
Sparse high-dimensional representations are conducive to uncovering nontrivial structures in unsupervised exploration of data. Such a representation can deal with the dense connectivity in graphs relevant to community detection problems. However, sparse high-dimensional representations are capable of doing more, including manifold tiling and feature learning. Conventional algorithms optimize in the space of computationally intractable completely positive matrices or relax the problem to the space of doubly nonnegative matrices that scale with sample size in a way rendering them impractical for large data sets. Some of these methods also impose a row sum constraint, such as double stochasticity. Row sum constraints have the added advantage of being shift-invariant, in the context of manifold tiling. Constraints on the row sum of output similarity matrices require nontrivial online learning rules. Addressing these needs, we propose a versatile online biologically plausible learning algorithm capable of learning sparse shift-invariant representations, useful for clustering, manifold tiling, or sparse coding, depending on the data structure.