Exploiting Known Taxonomies in Learning Overlapping Concepts

Lijuan Cai, Thomas Hofmann

Many real-world classification problems involve large numbers of overlapping categories that are arranged in a hierarchy or taxonomy. We propose to incorporate prior knowledge of the category taxonomy directly into the learning architecture. We present two concrete multi-label classification methods: a generalized version of the Perceptron algorithm and a hierarchical multi-label SVM. Our methods work with arbitrary, not necessarily singly connected taxonomies, and can be applied more generally in settings where categories are characterized by attributes and relations that are not necessarily induced by a taxonomy. Experimental results on the WIPO-alpha collection show that our hierarchical methods significantly improve performance.