The 12th ACM Conference on Knowledge Discovery and Data Mining (SIGKDD'2006)
Philadelphia, PA, August 20 - 23, 2006
by Jiawei Han*, Xifeng Yan*, and Philip S. Yu^
*Univ. of Illinois at Urbana-Champaign
^IBM T. J. Watson Research Center
Scalable methods for mining, indexing, and
similarity search in graphs and other complex structures, such
as sequences, trees, and networks, have become increasingly
important in data mining and database management, with broad
applications in social science, the Web, computer vision,
software engineering, cheminformatics, bioinformatics, etc.
Graph mining problems, such as mining network motifs, structural
patterns with constraints, and contrast graph patterns, as well as
graph clustering and graph classification, have been studied
extensively in recent years. The applications built on these
algorithms, such as graph indexing and similarity search, are
evolving into new components of data management systems for
handling complex structured data.
The motivation for this tutorial is to present to the data
engineering community a comprehensive overview of this growing
area and to discuss its potential research and application topics.

In this tutorial, we will present a survey of the state of the art in graph and structural pattern mining, indexing, and similarity search, and their applications. It will cover the following major themes: scalable methods for mining trees [Zak02, CXYM05], graphs [HCD94, IWM00, KK01, BB02, YH02, YH03], coherent graph patterns [HWB+04], network patterns with constraints [YZH05, PJZ05], graph indexing methods [WZJS94, PF97, GW97, CSF+01, SWG02, CLO03, SK03, YYH04], similarity search in tree and graph databases [WBD98, RGW02, KSBG02, KKSS04, YKT05, YYH05], mining networks in bioinformatics [HWB+04, KGS04, YZH05], mining social networks and the Web [KKR+99, LKF05], and other applications.
1. Why mining, indexing, and similarity search in graphs and structured databases?
2. From database systems to data mining and information management systems that handle complex structured data: A road map
3. Mining complex structural patterns: sequences, trees, and graphs
   (a) Apriori-based method
   (b) Pattern-growth method
   (c) Relational graph mining
4. Constrained graph pattern mining
   (a) Density constraints
   (b) Connectivity constraints
   (c) The framework of constraint-based graph mining
5. Indexing complex structures: graph indexing
   (a) Path-based indexing
   (b) Frequent discriminative pattern-based indexing
6. Similarity search in tree and graph databases
   (a) Feature-based distance
   (b) Tree/Graph edit distance
   (c) Maximum common subgraph
7. Graph classification
   (a) Path-based classification
   (b) Frequent graph-based classification
   (c) Kernel method
8. Graph clustering and pattern summarization
9. Applications
   (a) Structural motifs in biological networks
   (b) Graph classification using structural patterns: (1) chemical compounds, and (2) protein structures
REFERENCES (mainly Data
Mining and Database) (somewhat out of date; I'll try to update it
later)
