Berlin Buzzwords 2013: Dan Filimon - Clustering of Real-time Data at Scale #bbuzz

At last year's Buzzwords, it was reported that the Apache Mahout project would soon have a new kind of clustering algorithm that promised extraordinary speed. Since then, that promise has been fulfilled. The new algorithm is extraordinarily fast, possibly the fastest production clustering algorithm available. It also has many unusual characteristics that can make clustering applicable in new ways.

This talk is a progress report on this new kind of clustering. I will describe the theory behind the algorithm and how it provides high-quality clustering with only a single pass through the data. Mostly, however, I will focus on the algorithm's practical results.
