Subtree mining for Question Classification problem

Minh Le Nguyen, Tri Thanh Nguyen, Akira Shimazu

Question classification, i.e., assigning questions to semantic categories, is an important component of question answering. This paper introduces a new application of subtree mining to the question classification problem. First, we formulate the problem as assigning a tree one label from a set of labels. We then show how subtrees of the forest formed by the training data can be used for this tree classification problem, with maximum entropy and boosting models as classifiers. Experiments on standard question classification data show that subtrees combined with either maximum entropy or boosting models are promising. The results indicate that our method achieves performance comparable to, or even better than, kernel methods while also improving efficiency at test time.
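The formulation above can be illustrated with a minimal sketch. It assumes that each question has already been parsed into a tree, written here as nested tuples, and it restricts the mined patterns to complete subtrees (a node together with all of its descendants), which is a simplification and not necessarily the class of subtrees used in this paper. The function names, the `min_support` parameter, and the toy parses are hypothetical and serve only as illustration.

```python
from collections import Counter

# A tree is a nested tuple: (label, child, child, ...); a leaf is (label,).

def complete_subtrees(tree):
    """Yield every complete subtree: a node together with all its descendants."""
    yield tree
    for child in tree[1:]:
        yield from complete_subtrees(child)

def mine_frequent_subtrees(forest, min_support=2):
    """Return the subtrees that occur in at least min_support trees of the forest."""
    support = Counter()
    for tree in forest:
        # set() makes each tree contribute at most once to a subtree's support.
        support.update(set(complete_subtrees(tree)))
    frequent = [s for s, count in support.items() if count >= min_support]
    return sorted(frequent, key=repr)  # fixed order so feature indices are stable

def featurize(tree, patterns):
    """Binary feature vector: 1 if the pattern occurs in the tree, else 0."""
    present = set(complete_subtrees(tree))
    return [1 if p in present else 0 for p in patterns]

# Toy forest of hypothetical parses (not taken from the paper's data).
forest = [
    ("SBARQ", ("WHNP", ("WP", ("who",))), ("SQ", ("VBD", ("wrote",)))),
    ("SBARQ", ("WHNP", ("WP", ("who",))), ("SQ", ("VBD", ("invented",)))),
    ("SBARQ", ("WHADVP", ("WRB", ("where",))), ("SQ", ("VBZ", ("is",)))),
]

patterns = mine_frequent_subtrees(forest, min_support=2)
vectors = [featurize(tree, patterns) for tree in forest]
```

In this sketch, the resulting binary vectors would be passed, together with the question labels, to a maximum entropy or boosting classifier. Classifying a new question then only requires checking which mined subtrees it contains.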