Principal component analysis (PCA) is a ubiquitous method in statistics, machine learning and applied mathematics. PCA has been well studied and used, mostly in the homoskedastic noise case.
In this talk, we consider PCA in the setting where the noise is heteroskedastic, which arises naturally in a range of applications. We propose an algorithm called DIALECT for heteroskedastic PCA and establish its optimality. A key technical step is a deterministic robust perturbation analysis, which may be of independent interest. We will also discuss some applications in the analysis of high-dimensional data, including heteroskedastic matrix SVD, community detection in the bipartite stochastic block model, and noisy matrix completion.