We present a new approach to large-scale graph mining
based on so-called backbone refinement classes. The method
efficiently mines tree-shaped subgraph descriptors under minimum frequency and significance constraints, using classes
of fragments to reduce feature set size and running times.
The classes are defined in terms of fragments sharing a common backbone. The method is able to optimize structural
inter-feature entropy as opposed to occurrences, which is
characteristic for open or closed fragment mining. In the
experiments, the proposed method reduces feature set sizes
by >90 % and >30 % compared to complete tree mining and
open tree mining, respectively. Evaluation using crossvalidation runs shows that their classification accuracy is similar to the complete set of trees but significantly better than
that of open trees. Compared to open or closed fragment
mining, a large part of the search space can be pruned due
to an improved statistical constraint (dynamic upper bound
adjustment), which is also confirmed in the experiments in
lower running times compared to ordinary (static) upper
bound pruning. Further analysis using large-scale datasets
yields insight into important properties of the proposed descriptors, such as the dataset coverage and the class size
represented by each descriptor. A final cross-validation run
confirms that the novel descriptors render large training sets
feasible which previously might have been intractable.
gnuplotでeps
12 年前
0 件のコメント:
コメントを投稿