8. KDD 2002: Edmonton, Alberta, Canada
- Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, July 23-26, 2002, Edmonton, Alberta, Canada. ACM 2002, ISBN 1-58113-567-X
Statistical methods I
- Fatemah A. Alqallaf, Kjell P. Konis, R. Douglas Martin, Ruben H. Zamar:
Scalable robust covariance and correlation estimates for data mining. 14-23
- Kristin P. Bennett, Michinari Momma, Mark J. Embrechts:
MARK: a boosting algorithm for heterogeneous kernel models. 24-31
Frequent patterns I
- Pang-Ning Tan, Vipin Kumar, Jaideep Srivastava:
Selecting the right interestingness measure for association patterns. 32-41
- Cristian Bucila, Johannes Gehrke, Daniel Kifer, Walker M. White:
DualMiner: a dual-pruning algorithm for itemsets with constraints. 42-51
Graphs and trees
- Christopher R. Palmer, Phillip B. Gibbons, Christos Faloutsos:
ANF: a fast and scalable tool for data mining in massive graphs. 81-90
Streams and time series
- Eamonn J. Keogh, Shruti Kasetty:
On the need for time series data mining benchmarks: a survey and empirical demonstration. 102-111
Visualization
- Chris Stolte, Diane Tang, Pat Hanrahan:
Query, analysis, and visualization of hierarchically structured data using Polaris. 112-122
- Jörg A. Walter, Helge J. Ritter:
On interactive visualization of high-dimensional data using the hyperbolic plane. 123-132
Web search and navigation
- Corin R. Anderson, Pedro M. Domingos, Daniel S. Weld:
Relational Markov models and their application to adaptive web navigation. 143-152
Sequences and strings
Statistical methods II
- Jeremy Tantrum, Alejandro Murua, Werner Stuetzle:
Hierarchical model-based clustering of large datasets through fractionation and refractionation. 183-190
Text classification
- Inderjit S. Dhillon, Subramanyam Mallela, Rahul Kumar:
Enhanced word clustering for hierarchical text classification. 191-200
- Canasai Kruengkrai, Chuleerat Jaruskulchai:
A parallel learning algorithm for text classification. 201-206
- Haoran Wu, Tong-Heng Phang, Bing Liu, Xiaoli Li:
A refinement approach to handling model misfit in text categorization. 207-216
Frequent patterns II
- Alexandre V. Evfimievski, Ramakrishnan Srikant, Rakesh Agrawal, Johannes Gehrke:
Privacy preserving mining of association rules. 217-228
- Junqiang Liu, Yunhe Pan, Ke Wang, Jiawei Han:
Mining frequent item sets by opportunistic projection. 229-238
Web page classification
- Hwanjo Yu, Jiawei Han, Kevin Chen-Chuan Chang:
PEBL: positive example based learning for Web page classification using SVM. 239-248
- Martin Ester, Hans-Peter Kriegel, Matthias Schubert:
Web site mining: a new way to spot competitors, customers and suppliers in the world wide web. 249-258
Learning methods
- Edwin P. D. Pednault, Naoki Abe, Bianca Zadrozny:
Sequential cost-sensitive decision making with reinforcement learning. 259-268
Intrusion and privacy
- Kristin P. Bennett, Ayhan Demiriz, Richard Maclin:
Exploiting unlabeled data in ensemble methods. 289-296
Ensembles and boosting
- Mahesh V. Joshi, Ramesh C. Agarwal, Vipin Kumar:
Predicting rare classes: can boosting make any weak learner strong? 297-306
- Aleksander Kolcz, Xiaomei Sun, Jugal K. Kalita:
Efficient handling of high-dimensional feature spaces by randomized classifier ensembles. 307-313
Industry track papers
- Mohammad El-Ramly, Eleni Stroulia, Paul G. Sorenson:
From run-time behavior to usage scenarios: an interaction-pattern mining approach. 315-324
- Andrew Storey, Marc-David Cohen:
Exploiting response models: optimizing cross-sell and up-sell opportunities in banking. 325-331
- Saharon Rosset, Einat Neumann, Uri Eick, Nurit Vatnik, Yizhak Idan:
Customer lifetime value modeling and its use for customer retention planning. 332-340
- Satoshi Morinaga, Kenji Yamanishi, Kenji Tateishi, Toshikazu Fukushima:
Mining product reputations on the Web. 341-349
- Sheila Tejada, Craig A. Knoblock, Steven Minton:
Learning domain-independent string transformation weights for high accuracy object identification. 350-359
- Matthew V. Mahoney, Philip K. Chan:
Learning nonstationary models of normal network traffic for detecting novel attacks. 376-385
- Alexander Tuzhilin, Gediminas Adomavicius:
Handling very large numbers of association rules in the analysis of microarray data. 396-404
- Peter Antal, Patrick Glenisson, Geert Fannes:
On the potential of domain literature for clustering and Bayesian network learning. 405-414
- Yulan Liang, Arpad Kelemen:
Mining heterogeneous gene expression data with time lagged recurrent neural networks. 415-421
Poster papers
- Charu C. Aggarwal:
Collaborative crawling: mining user experiences for topical resource discovery. 423-428
- Jay Ayres, Jason Flannick, Johannes Gehrke, Tomi Yiu:
Sequential PAttern mining using a bitmap representation. 429-435
- Shai Ben-David, Johannes Gehrke, Reba Schuller:
A theoretical framework for learning from a pool of disparate data sources. 443-449
- Bin Chen, Peter J. Haas, Peter Scheuermann:
A new two-phase sampling based algorithm for discovering association rules. 462-468
- Christina Yip Chung, Bin Chen:
CVS: a Correlation-Verification based Smoothing technique on information retrieval and term clustering. 469-474
- William W. Cohen, Jacob Richman:
Learning to match and cluster large high-dimensional data sets for data integration. 475-480
- Tina Eliassi-Rad, Terence Critchlow, Ghaleb Abdulla:
Statistical modeling of large-scale simulation data. 488-494
- Dimitris Fragoudis, Dimitris Meretakis, Spiros Likothanassis:
Integrating feature and instance selection for text classification. 501-506
- Hichem Frigui:
SyMP: an efficient clustering approach to identify clusters of arbitrary shapes in large data sets. 507-512
- Shantanu Godbole, Sunita Sarawagi, Soumen Chakrabarti:
Scaling multi-class support vector machines using inter-class confusion. 513-518
- Tu Bao Ho, Trong Dung Nguyen, DucDung Nguyen:
Visualization support for a user-centered KDD process. 519-524
- Geoff Hulten, Pedro M. Domingos:
Mining complex models from arbitrarily large databases in constant time. 525-531
- Srinivasan Jagannathan, Jayanth Nayak, Kevin C. Almeroth, Markus Hofmann:
A model for discovering customer value for E-content. 532-537
- Xiaoming Jin, Yuchang Lu, Chunyi Shi:
Similarity measure based on partial information of time series. 544-549
- Eamonn J. Keogh, Stefano Lonardi, Bill Yuan-chi Chiu:
Finding surprising patterns in a time series database in linear time and space. 550-556
- Mahesh Kumar, Nitin R. Patel, Jonathan Woo:
Clustering seasonality patterns in the presence of errors. 557-563
- Cheng-Ru Lin, Chang-Hung Lee, Ming-Syan Chen, Philip S. Yu:
Distributed data mining in a chain store database of short transactions. 576-581
- Cheng-Ru Lin, Ming-Syan Chen:
A robust and efficient clustering algorithm based on cohesion self-merging. 582-587
- Bertis B. Little, Walter L. Johnston, Ashley C. Lovell, Roderick M. Rejesus, Steve A. Steed:
Collusion in the U.S. crop insurance program: applied data mining. 594-598
- Bhavani Raskutti, Herman L. Ferrá, Adam Kowalczyk:
Combining clustering and co-training to enhance text classification using unlabelled data. 620-625
- Naonori Ueda, Kazumi Saito:
Single-shot detection of multiple categories of text using parametric mixture models. 626-631
- Secil Ugurel, Robert Krovetz, C. Lee Giles:
What's the code?: automatic classification of source code archives. 639-644
- Jaideep Vaidya, Chris Clifton:
Privacy preserving association rule mining in vertically partitioned data. 639-644
- Michail Vlachos, Carlotta Domeniconi, Dimitrios Gunopulos, George Kollios, Nick Koudas:
Non-linear dimensionality reduction techniques for classification and visualization. 645-651
- Vasa Curcin, Moustafa Ghanem, Yike Guo, Martin Köhler, Anthony Rowe, Jameel Syed, Patrick Wendel:
Discovery net: towards a grid of knowledge discovery. 658-663
- Xintao Wu, Jianping Fan, Kalpathi R. Subramanian:
B-EM: a classifier incorporating bootstrap with EM approach for data mining. 670-675
- Kenji Yamanishi, Jun'ichi Takeuchi:
A unifying framework for detecting outliers and change points from non-stationary time series data. 676-681
- Yiling Yang, Xudong Guan, Jinyuan You:
CLOPE: a fast and effective clustering algorithm for transactional data. 682-687
- Bianca Zadrozny, Charles Elkan:
Transforming classifier scores into accurate multiclass probability estimates. 694-699