Department of Applied Mathematics & Physics, Kyoto University
Technical Report 2009-011 (April 10, 2009)
Classification via Visualization of Sample-feature Bipartite Graphs
by Kazuya Haraguchi, Seok-Hee Hong, Hiroshi Nagamochi
Visualization plays an important role as an effective analysis tool for
huge and complex data sets in many application domains such
as financial markets, computer networks, biology, and sociology.
However, in many cases, data sets are processed
by existing analysis techniques (e.g., classification, clustering, PCA)
before applying visualization.
In this paper, we study visual analysis of the classification problem,
a significant research issue in the machine learning and data mining
communities. The problem asks for a classifier, constructed from a given set of
positive and negative samples, that predicts the classes of future
samples with high accuracy.
We first extract from the sample set a bipartite graph structure
whose two node sets are the samples and a collection of subsets of attributes.
We then propose an algorithm that constructs a two-layered drawing
of the bipartite graph, by permuting the nodes using an edge crossing minimization technique.
The resulting drawing can act as a new classifier.
Surprisingly, experimental results on benchmark data sets
show that our new classifier is competitive with C4.5, a well-known decision tree generator,
in terms of prediction error.
Furthermore, the ordering of samples in the resulting drawing enables us to derive
new analyses of and insights into the data, such as clustering.
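
The two-layered drawing step described above can be illustrated with a small sketch. The abstract does not say which edge crossing minimization technique is used, so the code below substitutes the standard barycenter heuristic as an assumption; it is not the authors' algorithm. The function names (count_crossings, barycenter_pass, two_layer_drawing) and the toy sample and feature labels are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def count_crossings(edges, top_order, bottom_order):
    """Count pairwise edge crossings in a two-layered drawing."""
    top_pos = {v: i for i, v in enumerate(top_order)}
    bot_pos = {v: i for i, v in enumerate(bottom_order)}
    crossings = 0
    for (u1, v1), (u2, v2) in combinations(edges, 2):
        # Two edges cross when their endpoints appear in opposite orders
        # on the two layers.
        if (top_pos[u1] - top_pos[u2]) * (bot_pos[v1] - bot_pos[v2]) < 0:
            crossings += 1
    return crossings


def barycenter_pass(edges, fixed_order, free_nodes, free_is_top):
    """Reorder one layer by the mean position of each node's neighbors."""
    fixed_pos = {v: i for i, v in enumerate(fixed_order)}
    neighbor_pos = {v: [] for v in free_nodes}
    for u, v in edges:
        free, fixed = (u, v) if free_is_top else (v, u)
        neighbor_pos[free].append(fixed_pos[fixed])

    def barycenter(v):
        positions = neighbor_pos[v]
        return sum(positions) / len(positions) if positions else 0.0

    # sorted() is stable, so ties keep their current relative order.
    return sorted(free_nodes, key=barycenter)


def two_layer_drawing(edges, samples, features, rounds=10):
    """Alternate barycenter passes and keep the best ordering found."""
    top, bottom = list(samples), list(features)
    best = (count_crossings(edges, top, bottom), top, bottom)
    for _ in range(rounds):
        top = barycenter_pass(edges, bottom, top, free_is_top=True)
        bottom = barycenter_pass(edges, top, bottom, free_is_top=False)
        crossings = count_crossings(edges, top, bottom)
        if crossings < best[0]:
            best = (crossings, top, bottom)
    return best


# Hypothetical toy data: four samples (top layer), two features (bottom layer).
samples = ["s1", "s2", "s3", "s4"]
features = ["f1", "f2"]
edges = [("s1", "f1"), ("s2", "f2"), ("s3", "f1"), ("s4", "f2")]

initial_crossings = count_crossings(edges, samples, features)
crossings, sample_order, feature_order = two_layer_drawing(edges, samples, features)
```

In this toy instance the initial ordering has one crossing and the heuristic removes it by placing the samples that share a feature next to each other, which is the kind of sample ordering the abstract refers to when it mentions clustering. How the drawing is turned into a classifier is not specified in the abstract and is not modeled here.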