This talk concerns a family of regression problems in a semi-supervised setting. The task is to assign real-valued labels to a set of n sample points, provided a small training subset of N labelled points. A goal of semi-supervised learning is to take advantage of the (geometric) structure provided by the large number of unlabelled data when assigning labels. In this talk the geometry is represented by the random geometric graph model with connection radius r(n). The framework is to consider objective functions which reward the regularity of the estimator function and impose or reward the agreement with the training data, more specifically we will consider discrete p-Laplacian regularization.
The talk concerns the asymptotic behaviour in the limit where the number of unlabelled points increases while the number of training points remains fixed. The results are to uncover a delicate interplay between the regularizing nature of the functionals considered and the nonlocality inherent to the graph constructions. I will give almost optimal ranges on the scaling of r(n) for the asymptotic consistency to hold. For standard approaches used thus far there is a restrictive upper bound on how quickly r(n) must converge to zero as n goes to infinity. I will present a new model which overcomes this restriction. It is as simple as the standard models, but converges as soon as r(n) → 0.
This is joint work with Dejan Slepcev (CMU).