Tag Archives: clustering

Yinyang K-Means: A Drop-In Replacement of the Classic K-Means

This week, Yufei Ding, Yue Zhao, Xipeng Shen, Madanlal Musuvathi, and Todd Mytkowicz will be presenting Yinyang K-means at the 2015 International Conference on Machine Learning.

The algorithm guarantees the same results as traditional K-means, but it produces them about an order of magnitude faster.

The abstract and a PDF of the paper are available at Yinyang K-Means: A Drop-In Replacement of the Classic K-Means with Consistent Speedup.
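For readers who want a reminder of the baseline being replaced, here is a minimal sketch of classic (Lloyd's) K-means in plain Python. It is not the Yinyang algorithm and is not taken from the paper; the function names, the random initialization, and the toy data are all assumptions made for illustration.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iters=100, seed=0):
    """Classic (Lloyd's) K-means on a list of tuples; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: pick k input points as initial centers
    labels = [None] * len(points)
    for _ in range(max_iters):
        # Assignment step: every point is compared against every center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each center to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(points, k=2)
```

The assignment step is where the classic algorithm spends most of its time, since it computes a distance from every point to every center on every iteration. A drop-in replacement, as described above, would be expected to return the same centers and labels from the same starting centers while doing less of that work; this sketch does not attempt any such optimization.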

Cloudera Machine Learning Slides

A very nice slide deck from Jeff Hammerbacher of Cloudera. It covers k-means clustering and some enhancements.

Free Textbook: Mining of Massive Datasets

A few professors from Stanford University have released version 1.1 of their textbook, Mining of Massive Datasets. The book grew out of materials used for a couple of Stanford computer science classes, including large-scale data mining and web mining. It looks excellent and focuses on analyzing data at a large scale, which some people would call big data. Below is a list of some of the topics covered in the textbook.

  • data mining
  • map-reduce
  • clustering
  • recommender systems
  • and more

The book is free to download and is also available from Cambridge University Press.

Machine Learning: Algorithms that Produce Clusters | Architects Zone

Machine Learning: Algorithms that Produce Clusters | Architects Zone.

The above article provides a nice, brief overview of five clustering algorithms.

  1. K-Means
  2. Hierarchical Clustering
  3. Fuzzy C-Means
  4. Multi-Gaussian with Expectation-Maximization
  5. Density-based Clustering

This goes well with a previous post about 6 Machine Learning Algorithms.