Variable clustering

With Nicolas Verzelen (INRA)

Variable clustering: optimal bounds and a convex approach

The problem of variable clustering is that of grouping similar components of a p-dimensional vector X = (X_1, …, X_p) into a partition G, and of estimating these groups from n independent copies of X. Although K-means is a natural strategy for this problem, I will explain why it cannot lead to perfect cluster recovery. I will then introduce a correction that can be viewed as a penalized convex relaxation of K-means. The clusters estimated by this method are shown to recover the partition G at a minimax-optimal cluster separation rate.
