Department of Applied Mathematics & Physics, Kyoto University

Technical Report 2009-011 (April 10, 2009)

Classification via Visualization of Sample-feature Bipartite Graphs
by Kazuya Haraguchi, Seok-Hee Hong, Hiroshi Nagamochi


Visualization plays an important role as an effective analysis tool for huge and complex data sets in many application domains such as financial markets, computer networks, biology, and sociology. However, in many cases, data sets are processed by existing analysis techniques (e.g., classification, clustering, PCA) before visualization is applied. In this paper, we study visual analysis of the classification problem, a significant research issue in the machine learning and data mining communities. The problem asks us to construct, from a given set of positive and negative samples, a classifier that predicts the classes of future samples with high accuracy. We first extract a bipartite graph structure from the sample set; its two node sets are the set of samples and a set of subsets of attributes. We then propose an algorithm that constructs a two-layered drawing of the bipartite graph by permuting the nodes using an edge crossing minimization technique. The resulting drawing can act as a new classifier. Surprisingly, experimental results on benchmark data sets show that our new classifier is competitive with the well-known decision tree generator C4.5 in terms of prediction error. Furthermore, the ordering of samples in the resulting drawing enables us to derive new analyses of and insights into the data, such as clustering.
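The abstract does not say which edge crossing minimization technique the report uses, so the following is only a minimal sketch of the general idea, assuming the standard barycenter heuristic for two-layer drawings as a stand-in. The function names (count_crossings, barycenter_pass, two_layer_drawing), the edge representation as (sample, feature) pairs, and the toy data are all hypothetical and are not taken from the report. The sketch also stops at the node ordering; how the drawing is turned into a classifier is not described here.

```python
from itertools import combinations


def count_crossings(edges, top_order, bottom_order):
    """Count pairwise edge crossings in a two-layer drawing.

    edges is a list of (top_node, bottom_node) pairs.
    """
    top_pos = {v: i for i, v in enumerate(top_order)}
    bot_pos = {v: i for i, v in enumerate(bottom_order)}
    crossings = 0
    for (u1, v1), (u2, v2) in combinations(edges, 2):
        # Two edges cross when their endpoints appear in opposite
        # relative order on the two layers.
        if (top_pos[u1] - top_pos[u2]) * (bot_pos[v1] - bot_pos[v2]) < 0:
            crossings += 1
    return crossings


def barycenter_pass(edges, fixed_order, free_order, free_side):
    """Reorder the free layer by the mean position of each node's
    neighbours on the fixed layer (barycenter heuristic, assumed here).

    free_side is 0 if free-layer nodes are the first endpoint of each
    edge, and 1 if they are the second endpoint.
    """
    fixed_pos = {v: i for i, v in enumerate(fixed_order)}
    current_pos = {v: i for i, v in enumerate(free_order)}
    neighbours = {v: [] for v in free_order}
    for edge in edges:
        neighbours[edge[free_side]].append(fixed_pos[edge[1 - free_side]])

    def key(v):
        positions = neighbours[v]
        # Nodes without neighbours keep their current position as key.
        return sum(positions) / len(positions) if positions else current_pos[v]

    # sorted() is stable, so ties keep their previous relative order.
    return sorted(free_order, key=key)


def two_layer_drawing(edges, samples, features, sweeps=10):
    """Alternate barycenter passes on both layers and return the best
    (crossings, sample_order, feature_order) seen."""
    s, f = list(samples), list(features)
    best = (count_crossings(edges, s, f), list(s), list(f))
    for _ in range(sweeps):
        f = barycenter_pass(edges, s, f, free_side=1)
        s = barycenter_pass(edges, f, s, free_side=0)
        c = count_crossings(edges, s, f)
        if c < best[0]:
            best = (c, list(s), list(f))
    return best


# Hypothetical toy data: four samples, two attribute subsets A and B.
toy_edges = [("s1", "A"), ("s2", "B"), ("s3", "A"), ("s4", "B")]
toy_result = two_layer_drawing(toy_edges, ["s1", "s2", "s3", "s4"], ["A", "B"])
```

On this toy input the initial ordering has one crossing; after the sweeps the samples attached to A are placed next to each other, as are those attached to B. This illustrates how a low-crossing sample ordering can expose groups of similar samples. The heuristic does not guarantee a minimum number of crossings in general.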