MINERIA DE DADES (MD) / DATA MINING (DM)


Academic year 2012/2013, 1st semester


Last modification: November 15, 2012

NEWS

The document for the 3rd practical work has been updated; please consult the Racó.

Important dates



General information


Lecturers: Lluís Belanche Muñoz (LSI dept.), Karina Gibert (EIO dept.)

Offices: Omega-326 (Lluís), C5-219 (Karina)

Office hours: Wednesdays and Fridays, 12h to 14h (Lluís); please give notice before coming!

Have a look at the Course Information


Handouts

Slides, R documents and R files:

  • DM Introduction
  • Visualization Complementary
  • A first session in R
  • Credsco profiling in R
  • DM Profiling
  • Credsco visualisation
  • Slides on Clustering
  • Introduction to 'arules'
  • Decision trees for Credsco
  • Slides on Visualization
  • The 'MASS' package
  • Credsco clustering
  • Slides on Decision trees
  • The 'knnflex' package
  • Logistic regression for Credsco
  • Slides on Linear Models
  • The 'nnet' package
  • Example R file for MBA
  • Slides on Logistic Regression
  • The 'e1071' package
  • Example R file for LDA and QDA + Relief
  • Slides on Association rules
  • SVMs in R
  • Example R file for Naïve Bayes and kNN
  • Slides on Bayes classifiers (I)
  • NNET Labo
  • Slides on Bayes classifiers (II)
  • Example R file for SVMs
  • Slides on Bayes classifiers (II) aux (*)
  • Feature Selection Labo
  • Slides on Naïve Bayes
  • LDA/QDA Labo
  • Slides on kNN
  • Slides on kNN aux (*)
  • Slides on ANNs (I)
  • Slides on ANNs (II)
  • Slides on ANNs (III)
  • Slides on ANNs (IV)
  • Slides on ANNs (V)
  • Slides on SVMs


    (*) I would like to thank Ricardo Gutiérrez Osuna for his excellent course notes on Pattern Classification


    Applets




    Additional R stuff


    Materials for the practical work

    The first and second practical works use the ALPHA data set and consist of three steps:

    1. statistical analysis, visualization and profiling
    2. cluster analysis
    3. association analysis

    The third practical work uses a real-world data set of your choice and consists of three steps:

    1. statistical analysis and pre-processing
    2. feature selection and/or extraction
    3. modeling and prediction

    Other data sets:

    Various


    Page maintained by Lluís A. Belanche
