TAVA/AAAC Theory

Unsupervised Learning

Syllabus

Machine Learning/Inductive Learning
- Slides Introduction to machine learning
- Slides Introduction to inductive learning
- Introduction to machine learning, Nils J. Nilsson. Chapter 1
Data Mining, a global perspective
- Slides Introduction to Data Mining
- From Data Mining to Knowledge Discovery in Databases, Usama Fayyad, Gregory Piatetsky-Shapiro, and Padhraic Smyth
Data preprocessing/transformation
- Slides Data preprocessing/Transformation
- Algorithms for Clustering Data, Chapter 2, Jain & Dubes
- Continuous attributes discretization
- Dimensionality reduction: PCA, ICA, Multidimensional Scaling, Random Projection, Locally Linear Embedding, ISOMAP
- Dimensionality reduction: see also Sections 5 to 9 of Chapter 14 of "The Elements of Statistical Learning", Hastie, Tibshirani, Friedman
Numerical taxonomy, a statistical approach. Unsupervised machine learning, an artificial intelligence approach
- Slides about numerical taxonomy and unsupervised learning
- Slides about clustering evaluation
Semi-supervised Clustering

Introduction to semi-supervised clustering
- S. Basu, A. Banerjee, R. Mooney, "Semi-supervised Clustering by Seeding", ICML-2002
- S. Basu, M. Bilenko, R. Mooney, "A probabilistic framework for semi-supervised clustering", 10th ACM SIGKDD 2004
- D. Cohn, R. Caruana, A. McCallum, "Semi-supervised Clustering with user feedback", TR2003-1892, Cornell University, 2003
Domain Theories

J. Bejar, Improving Knowledge Discovery using Domain Knowledge in Unsupervised Learning, ECML 2000

Unsupervised methodologies in Knowledge Discovery and Data Mining

Slides Clustering methodologies in Knowledge discovery
Slides Association Rules (examples)
Slides Mining sequential and structured data

Unsupervised methodologies in Data Mining
- Survey of clustering data mining techniques Pavel Berkhin
- Scaling Clustering Algorithms to Large Databases (1998) P.S. Bradley, Usama Fayyad, Cory Reina, Knowledge Discovery and Data Mining
- CLARANS: A Method for Clustering Objects for Spatial Data Mining Raymond T. Ng, Jiawei Han
- BIRCH: An Efficient Data Clustering Method for Very Large Databases (1996) Tian Zhang, Raghu Ramakrishnan, Miron Livny
- CURE: An efficient clustering algorithm for large databases Sudipto Guha, Rajeev Rastogi, and Kyuseok Shim
- ROCK: A Robust Clustering Algorithm for Categorical Attributes Sudipto Guha, Rajeev Rastogi, and Kyuseok Shim
- CHAMELEON: A Hierarchical Clustering Algorithm Using Dynamic Modeling George Karypis, Eui-Hong (Sam) Han, Vipin Kumar
- A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise (DBSCAN) Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu
- OPTICS: Ordering Points To Identify the Clustering Structure Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel, Jörg Sander
- STING: A Statistical Information Grid Approach to Spatial Data Mining Wang, Yang, Muntz
- An Efficient Approach to Clustering in Large Multimedia Databases with Noise (DENCLUE) Hinneburg, Keim
- Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications (CLIQUE) (1998) Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopulos, Prabhakar Raghavan
- MAFIA: Efficient and scalable subspace clustering for very large datasets S. Goil, H. Nagesh, A. Choudhary

Association Rules
Mining sequential and structured data
- Discovery of frequent episodes in event sequences Heikki Mannila, Hannu Toivonen, and A. Inkeri Verkamo. Data Mining and Knowledge Discovery 1(3): 259 - 289, November 1997.
- Jaak Vilo Discovering Frequent Patterns from Strings. Technical Report C-1998-9 (pp. 20) May 1998. Department of Computer Science, University of Helsinki.
- Mining Sequential Patterns: Generalizations and Performance Improvements Ramakrishnan Srikant, Rakesh Agrawal (1996)
- Mining Sequential Patterns Rakesh Agrawal, Ramakrishnan Srikant (1995)
- An Apriori-based Algorithm for Mining Frequent Substructures from Graph Data Akihiro Inokuchi, Takashi Washio and Hiroshi Motoda
- gSpan: Graph-Based Substructure Pattern Mining. Xifeng Yan, Jiawei Han. UIUC Technical Report, UIUCDCS-R-2002-2296, 2002.
- Frequent Free Tree Discovery in Graph Data Ulrich Rückert, Stefan Kramer In: Special Track on Data Mining, ACM Symposium on Applied Computing (SAC2004), 2004.

Bayesian networks
- Slides Learning of Bayesian networks (Examples)
- Learning Probabilistic Networks Paul J Krause
Additional themes