One of the many new challenges for data analysis in the information age is the growing concern for privacy protection. A particularly fruitful approach to data protection that has recently received a lot of attention is the notion of 'local differential privacy'. The idea is that each data-providing individual releases only a randomly perturbed version of their original data, where the randomization mechanism is required to satisfy a precise privacy definition.
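For orientation, one common way to state the privacy requirement in the non-interactive case is given below. The notation is an assumption and is not taken from the abstract: Q denotes the randomization mechanism (a Markov kernel from the original data space to the space of released values) and α > 0 is the privacy level.

```latex
% Q is \alpha-locally differentially private if, for every measurable set A
% of released values and every pair of original data points x, x',
\[
  Q(A \mid x) \;\le\; e^{\alpha}\, Q(A \mid x') .
\]
% Smaller \alpha forces the output distributions for different inputs to be
% closer together, i.e. a stronger privacy guarantee.
```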
In this talk, we discuss the impact of a local differential privacy guarantee on the quality of statistical estimation. In this setup, the objective is not only to find an optimal estimation procedure that efficiently recovers information from the privatized observations, but also to devise a privatization mechanism that best facilitates subsequent estimation while respecting the required privacy provisions. In the general context of estimating linear functionals of the unknown true data-generating distribution, we characterize the minimax rate of private estimation in terms of a certain modulus of continuity of the functional to be estimated, and we provide a construction of minimax-rate-optimal privatization mechanisms. Somewhat surprisingly, it can be shown that simple sample means of appropriately randomized observations are always optimal for estimating linear functionals. Our analysis also allows for a quantification of the price of local differential privacy in terms of loss of statistical accuracy. This price appears to be highly problem-dependent.
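To make the idea of "sample means of appropriately randomized observations" concrete, here is a minimal sketch using classical randomized response for binary data. This is an illustrative stand-in, not the construction presented in the talk; the function names `privatize` and `estimate_mean` and the parameter `alpha` (privacy level, assumed strictly positive) are hypothetical. The functional being estimated is the proportion of ones, which is linear in the data distribution.

```python
import math
import random


def privatize(x, alpha, rng):
    """Release a randomized version of a binary value x in {0, 1}.

    Illustrative randomized response (an assumption, not the talk's mechanism):
    the true bit is kept with probability p = e^alpha / (1 + e^alpha) and
    flipped otherwise, so the likelihood ratio of any output under two
    different inputs is at most e^alpha. Requires alpha > 0.
    """
    p = math.exp(alpha) / (1.0 + math.exp(alpha))
    z = x if rng.random() < p else 1 - x
    # Affine rescaling of the noisy bit so that E[output | x] = x.
    return (z - (1.0 - p)) / (2.0 * p - 1.0)


def estimate_mean(released):
    """Plain sample mean of the released (already rescaled) values."""
    return sum(released) / len(released)


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic data for illustration only: bits with true proportion 0.3.
    data = [1 if rng.random() < 0.3 else 0 for _ in range(100_000)]
    released = [privatize(x, alpha=1.0, rng=rng) for x in data]
    print("private estimate of the proportion:", estimate_mean(released))
```

Because the rescaling is done on the individual's side, the analyst only averages the released values. The variance of each released value grows as `alpha` shrinks, which is one simple way to see a cost of privacy in this toy setting.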