For our research in Pattern Recognition and Image Processing, visit the PRIP page.
For our research in biometrics, visit the Biometrics page.
Large Scale Kernel-Based Data Clustering
Kernel-based clustering algorithms achieve better performance on real-world data than Euclidean
distance-based clustering algorithms, but pose two important challenges: (i) they do not scale well in terms of runtime and
memory, i.e. their complexity is quadratic in the number of data instances, which makes them inefficient for large data
sets containing millions of data points, and (ii) the choice of the kernel function is critical to the performance of the
algorithm. In this project, we aim to develop efficient schemes that reduce the complexity of these clustering algorithms and learn
appropriate kernel functions from the data. We employ matrix approximation techniques based on randomization to achieve speedup and
reduce the memory requirements of kernel-based clustering. We evaluate the efficiency of our techniques in the domains of object
categorization and document clustering.
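
As a rough illustration of how randomization can avoid forming the full n x n kernel matrix, the sketch below uses a Nyström approximation with uniformly sampled landmark points and then runs k-means on the resulting low-dimensional embedding. This is one common randomized approximation and is not necessarily the scheme developed in this project; the RBF kernel, the function names, the value of gamma, and the toy data are all assumptions made for the example.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # Squared distances via ||a-b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_embedding(X, m, gamma, rng):
    # Randomized step: sample m landmark points uniformly without replacement.
    idx = rng.choice(X.shape[0], size=m, replace=False)
    landmarks = X[idx]
    C = rbf_kernel(X, landmarks, gamma)          # n x m, never n x n
    W = rbf_kernel(landmarks, landmarks, gamma)  # m x m
    # Pseudo-inverse square root of W from its eigendecomposition.
    vals, vecs = np.linalg.eigh(W)
    keep = vals > 1e-8 * vals.max()
    W_inv_sqrt = vecs[:, keep] / np.sqrt(vals[keep])
    return C @ W_inv_sqrt                        # Z with Z @ Z.T approximating K

def kmeans(Z, k, n_iter=50):
    # Deterministic farthest-point initialization, then Lloyd iterations.
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([((Z - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(2)
        labels = d.argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(0)
    return labels

# Toy data (hypothetical): two well-separated Gaussian blobs of 100 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(100, 2)),
               rng.normal(10.0, 0.5, size=(100, 2))])
Z = nystrom_embedding(X, m=20, gamma=0.05, rng=rng)
labels = kmeans(Z, k=2)
```

Memory here grows with n x m rather than n x n, which is the point of the approximation; the quality of the clustering depends on how well the sampled landmarks cover the data.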

Crowdclustering 
One of the main challenges in data clustering is defining an appropriate similarity measure between two objects. Crowdclustering addresses this challenge by
defining pairwise similarity from manual annotations obtained through
crowdsourcing.
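
To make the idea of annotation-derived similarity concrete, here is a minimal sketch that assumes each worker sees a subset of the objects and assigns each one a group label of their own choosing. The similarity of two objects is taken to be the fraction of workers who saw both and placed them in the same group. This counting rule, the data layout, and the name `crowd_similarity` are assumptions for illustration, not the model used in the project.

```python
import numpy as np

def crowd_similarity(n_objects, annotations):
    # annotations: one dict per worker, mapping object index -> that worker's group label.
    same = np.zeros((n_objects, n_objects))
    seen = np.zeros((n_objects, n_objects))
    for groups in annotations:
        items = list(groups)
        for i in items:
            for j in items:
                seen[i, j] += 1
                if groups[i] == groups[j]:
                    same[i, j] += 1
    # Pairs never seen together by any worker get similarity 0.
    return np.divide(same, seen, out=np.zeros_like(same), where=seen > 0)

# Hypothetical annotations from three workers over four objects.
annotations = [
    {0: "a", 1: "a", 2: "b"},
    {0: "x", 1: "y", 3: "y"},
    {1: "p", 2: "q", 3: "q"},
]
S = crowd_similarity(4, annotations)
```

Group labels are only compared within a single worker, so workers do not need to agree on a shared vocabulary.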


Semi-supervised Boosting


Model-Based Clustering

Multi-objective Data Clustering

Cluster Ensembles 
Combining multiple classifiers has achieved great success in supervised classification and is becoming a standard technique in pattern recognition. However, little has been done to explore how to combine data partitions generated by different clustering algorithms. The following papers investigate different issues in combining the outputs of multiple clustering algorithms.
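
One simple way to combine partitions, sketched here under stated assumptions, is a co-association matrix: for every pair of points, record the fraction of partitions that place them in the same cluster, then merge points whose co-association exceeds a threshold. This is an illustrative approach and is not claimed to be the method of any particular paper listed here; the function names, the threshold of 0.5, and the toy partitions are hypothetical.

```python
import numpy as np

def co_association(partitions):
    # partitions: array-like of shape (n_partitions, n_points) holding cluster labels.
    P = np.asarray(partitions)
    # Entry (i, j) is the fraction of partitions that put points i and j together.
    return (P[:, :, None] == P[:, None, :]).mean(axis=0)

def consensus_labels(C, threshold=0.5):
    # Connected components of the graph with an edge wherever C > threshold.
    n = C.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in np.flatnonzero(C[u] > threshold):
                if labels[v] == -1:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels

# Three hypothetical partitions of five points that disagree about point 2.
partitions = [
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1],
]
C = co_association(partitions)
consensus = consensus_labels(C)
```

Because only label equality within each partition is used, the partitions may come from different algorithms with different numbers of clusters and arbitrary label names.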

Dimensionality Reduction 

Semi-supervised Clustering

Feature Selection in Unsupervised Learning 

Other Clustering Related Papers 

Recent Theses 
