Research in Data Clustering

Recent Publications

J. Yi, L. Zhang, R. Jin, Q. Qian, A. K. Jain, "Semi-supervised Clustering by Input Pattern Assisted Pairwise Similarity Matrix Completion", ICML, 2013.

J. Yi, T. Yang, R. Jin, A. K. Jain, "Robust Ensemble Clustering by Matrix Completion", ICDM, Brussels, Belgium, Dec 10-13, 2012.

R. Chitta, R. Jin, A. K. Jain, "Efficient Kernel Clustering using Random Fourier Features", ICDM, Brussels, Belgium, Dec 10-13, 2012.

J. Yi, R. Jin, A. K. Jain, S. Jain, Y. Tang, "Semi-Crowdsourced Clustering: Generalizing Crowd Labeling by Robust Distance Metric Learning", NIPS, Lake Tahoe, NV, Dec 3-6, 2012.

J. Yi, R. Jin, A. K. Jain, S. Jain, "Crowdclustering with Sparse Pairwise Labels: A Matrix Completion Approach", HCOMP, Toronto, Canada, July 22-23, 2012.

R. Chitta, R. Jin, T. C. Havens, A. K. Jain, "Approximate Kernel k-means: solution to Large Scale Kernel Clustering", KDD, San Diego, CA, Aug 21-24, 2011.

T. C. Havens, R. Chitta, A. K. Jain, R. Jin, "Speedup of Fuzzy and Possibilistic Kernel c-Means for Large-Scale Clustering", Int. Conf. Fuzzy Systems, Taipei, Taiwan, June 27-30, 2011.

T. Yang, R. Jin, A. K. Jain, Y. Zhou, W. Tong, "Unsupervised Transfer Classification: Application to Text Categorization", KDD, Washington, DC, July 25-28, 2010.

T. Yang, R. Jin, A. K. Jain, "Learning from Noisy Side Information by Generalized Maximum Entropy Model", ICML, Haifa, Israel, June 21-24, 2010.

P. Mallapragada, R. Jin and A. K. Jain, "Non-parametric Mixture Models for Clustering", SSSPR, Izmir, Turkey, August 18-20, 2010.

A. Lourenço, A. L. N. Fred and A. K. Jain, "On the Scalability of Evidence Accumulation Clustering", ICPR, Istanbul, Turkey, August 23-26, 2010.

S. S. Bucak, P. K. Mallapragada, R. Jin and A. K. Jain, "Efficient Multi-label Ranking for Multi-class Learning: Application to Object Recognition", Proceedings of the 12th IEEE International Conference on Computer Vision (ICCV 2009), 2009.

Pavan K. Mallapragada, Rong Jin, and A. K. Jain, "Online Visual Vocabulary Pruning Using Pairwise Constraints", CVPR, San Francisco, 2010.

A. K. Jain, "Data Clustering: 50 Years Beyond K-Means", Pattern Recognition Letters, Vol. 31, No. 8, pp. 651-666, 2010.

Anil K. Jain, "Data Clustering: 50 Years Beyond K-Means", Technical Report TR-CSE-09-11. (Published in Pattern Recognition Letters, 2010.)

J-E. Lee, R. Jin and A. K. Jain, "Rank-based Distance Metric Learning: An Application to Image Retrieval", CVPR, June 2008.

H. Valizadegan, R. Jin, Anil K. Jain, "Semi-supervised Boosting for Multi-Class Classification", to appear in 19th European Conference on Machine Learning (ECML 2008), Antwerp, Belgium, September 15-19, 2008.

Pavan K. Mallapragada, Rong Jin, A. K. Jain, Yi Liu, "SemiBoost: Boosting for Semi-supervised Learning", Transactions on Pattern Analysis and Machine Intelligence (to appear).

A. L. N. Fred and A. K. Jain, Cluster Validation Using a Probabilistic Attributed Graph, in Proc. of International Conference on Pattern Recognition (ICPR), Tampa, December, 2008.

P. K. Mallapragada, R. Jin and A. K. Jain, Active Query Selection for Semi-Supervised Clustering, in Proc. of International Conference on Pattern Recognition (ICPR), Tampa, December, 2008.

T. Lange, M. Law, A. K. Jain and M. Buhmann, Clustering with Constraints: A Mean-field Approximation Perspective, in Constrained Clustering, S. Basu, I. Davidson and K. L. Wagstaff (eds.), CRC Press, 2008.

Y. Liu, R. Jin and A. K. Jain, BoostCluster: Boosting Clustering by Pairwise Constraints, in Proc. 13th International Conference on Knowledge Discovery and Data Mining (KDD), pp. 450-459, San Jose, USA, August 2007.

Recent Talks

Data Clustering: 50 Years Beyond K-means, SDM 2010 Workshop on Clustering: Theory and applications, May 1, 2010 (slides)
King Sun Fu Lecture, "Data Clustering: 50 Years Beyond K-means", ICPR, Dec 8, 2008 (slides, paper) (Biography of Prof. King Sun Fu)
Data Clustering: 50 Years Beyond K-means, ECML, Sept. 2008 (slides, video)

Books and Surveys

A. K. Jain, "Data Clustering: 50 Years Beyond K-Means", Pattern Recognition Letters, Vol. 31, No. 8, pp. 651-666, 2010.
A. K. Jain, M. N. Murty and P. J. Flynn, Data Clustering: A Review, ACM Computing Surveys, Nov 1999.
A. K. Jain and R. C. Dubes. Algorithms for Clustering Data, Prentice Hall, 1988.
(This book is out of print, but it can be downloaded for free by following the hyperlink above.)

Publications by Topic

Software for Download

	M. Figueiredo, A. K. Jain, "Unsupervised Learning of Finite Mixture Models", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 3, March 2002, pp. 381-396. (Matlab code) (abstract at IEEE Xplore)
	M. Law, M. A. T. Figueiredo, A. K. Jain, "Simultaneous Feature Selection and Clustering Using Mixture Models", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 26, No. 9, pp. 1154-1166, September 2004. (IEEE Xplore) (Matlab code)