Unsupervised Learning
Lecture Notes
These lecture notes cover all the topics of the unsupervised learning part of the course. Read the chapter corresponding to each week's lecture beforehand so that you can follow the class and work through the practical exercises solved in it.
This is a link to the Python notebooks used in class.
Slides and other material
- Data Mining, a global perspective
- Data preprocessing/transformation
- Slides Data preprocessing/Transformation
- Algorithms for Clustering Data, Chapter 2, Jain & Dubes
- Continuous attributes discretization
- Dimensionality Reduction: PCA, ICA, Multidimensional Scaling, Random Projection, Locally Linear Embedding, ISOMAP
- Dimensionality reduction: See also sections 5 to 9 of chapter 14 of the book "The Elements of Statistical Learning", Hastie, Tibshirani, Friedman
- Datasets: authors, wheel
- Unsupervised machine learning
- Slides about unsupervised learning algorithms
- Slides about clustering evaluation
- Unsupervised learning
- Data Clustering: A Review (1999) A K Jain, M N Murty, P J Flynn ACM Computing Surveys
- Algorithms for Clustering Data, Chapter 3, Jain & Dubes
- Survey of clustering data mining techniques Pavel Berkhin
- See also sections 1 to 8 of the book "The Elements of Statistical Learning", Hastie, Tibshirani, Friedman
- Models of incremental concept formation J. Gennari, P. Langley, D. Fisher
- A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise (DBSCAN) Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu
- STING: A Statistical Information Grid Approach to Spatial Data Mining Wang, Yang, Muntz
- An efficient approach to clustering in large multimedia databases with noise (DENCLUE) Hinneburg, Keim
- Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications (CLIQUE) (1998) Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopulos, Prabhakar Raghavan
- Evaluation of clustering
- Comparison of clusterings
- Datasets: City
- Unsupervised methodologies in Knowledge Discovery and Data Mining
- Slides Clustering methodologies in Knowledge discovery
- Unsupervised methodologies in Data Mining
- Scaling Clustering Algorithms to Large Databases (1998) P.S. Bradley, Usama Fayyad, Cory Reina Knowledge Discovery and Data Mining
- CLARANS: A Method for Clustering Objects for Spatial Data Mining Raymond T. Ng, Jiawei Han
- BIRCH: An Efficient Data Clustering Method for Very Large Databases (1996) Tian Zhang, Raghu Ramakrishnan, Miron Livny
- CURE: An efficient clustering algorithm for large databases Sudipto Guha, Rajeev Rastogi, and Kyuseok Shim
- CHAMELEON: A Hierarchical Clustering Algorithm Using Dynamic Modeling George Karypis, Eui-Hong (Sam) Han, Vipin Kumar
- Semi-supervised Clustering
- Introduction to semi-supervised clustering
- S. Basu, A. Banerjee, R. Mooney, "Semi-supervised clustering by seeding", ICML-2002
- S. Basu, M. Bilenko, R. Mooney, "A probabilistic framework for semi-supervised clustering", 10th ACM SIGKDD 2004
- D. Cohn, R. Caruana, A. McCallum, "Semi-supervised Clustering with user feedback", TR2003-1892, Cornell University, 2003
- Other Topics in Clustering
- Association Rules
- Slides Association Rules
- Interest indices for association rules
- See the sample chapter about association rules from the book "Introduction to Data Mining", Tan, Steinbach, Kumar
- Association Rule Mining: A Survey Qiankun Zhao, Sourav S. Bhowmick
- Mining sequential and structured data
- Slides Mining sequential and structured data
- Discovery of frequent episodes in event sequences Heikki Mannila, Hannu Toivonen, and A. Inkeri Verkamo. Data Mining and Knowledge Discovery 1(3): 259 - 289, November 1997.
- Jaak Vilo Discovering Frequent Patterns from Strings. Technical Report C-1998-9 (pp. 20) May 1998. Department of Computer Science, University of Helsinki.
- Mining Sequential Patterns: Generalizations and Performance Improvements Ramakrishnan Srikant, Rakesh Agrawal (1996)
- Mining Sequential Patterns Rakesh Agrawal, Ramakrishnan Srikant (1995)
- An Apriori-based Algorithm for Mining Frequent Substructures from Graph Data Akihiro Inokuchi, Takashi Washio and Hiroshi Motoda
- gSpan: Graph-Based Substructure Pattern Mining. Xifeng Yan, Jiawei Han. UIUC Technical Report, UIUCDCS-R-2002-2296, 2002.
- Frequent Free Tree Discovery in Graph Data Ulrich Rückert, Stefan Kramer In: Special Track on Data Mining, ACM Symposium on Applied Computing (SAC2004), 2004.