One line of SOCO's research concerns probabilistic unsupervised models for clustering and visualizing high-dimensional data. Specifically, we are interested in latent, non-linear manifold learning models for data visualization and clustering, and in matrix factorization methods for blind source separation. These are applied to real-world problems in clinical medicine, ecology, and business processes, including collaborations with the Neural Computation Group at Liverpool John Moores University (UK) and the GABRMN and Systems Pharmacology and Bioinformatics research groups at Universitat Autònoma de Barcelona (UAB).
SOCO has carried out work on Feature Selection (FS) for both supervised and unsupervised models. For supervised models, an innovative software tool for FS was developed after an exhaustive review of methods in the Machine Learning literature. It includes a synthetic data generator, an algorithm simulator, and an automatic evaluator of result quality. SOCO is currently working on a simulator for sequential algorithms. FS techniques are also being developed for supervised Neural Networks and Support Vector Machines. For unsupervised methods, we are currently tackling the less common problem of feature selection in data clustering with mixture models. One goal of this type of FS is to reconcile the assessment of feature relevance with improving the interpretability of clustering results through visualization.
SOCO approaches data and knowledge visualization as a problem in which artificial pattern recognition complements the natural pattern recognition performed by the human visual system. As such, visualization is a key element of exploratory data mining. Our work focuses on latent variable models, probabilistic self-organizing systems, and Gaussian Processes.
The first five lines describe methodologies with a similar goal: solving complex problems that cannot be solved efficiently by traditional computational methods (hard computing). Feature Selection and Extraction deals with the data dimensionality reduction problems that arise in all the previous lines. Finally, Pattern Recognition and Computer Vision covers a huge application field in which AI systems usually perform worse than humans, and in which soft computing techniques have great potential. Relatedly, Data and Knowledge Visualization concerns the ways in which Soft Computing complements human vision in exploratory Data Mining problems.
Please follow the links for further details.