Improving Embeddings by Flexible Exploitation of Side Information

Ali Ghodsi, Dana Wilkinson, Finnegan Southey

Dimensionality reduction is a much-studied task in machine learning in which high-dimensional data is mapped, possibly via a non-linear transformation, onto a low-dimensional manifold. The resulting embeddings, however, may fail to capture features of interest. One solution is to learn a distance metric which prefers embeddings that capture the salient features. We propose a novel approach to learning a metric from side information to guide the embedding process. Our approach admits the use of two kinds of side information. The first kind is class-equivalence information, where some limited number of pairwise "same/different class" statements are known. The second form of side information is a limited set of distances between pairs of points in the target metric space. We demonstrate the effectiveness of the method by producing embeddings that capture features of interest.