Emerging from the optimal transportation problem, and owing to their favorable geometric properties, Wasserstein distances have recently attracted considerable attention from the machine learning and signal processing communities. They have been used in supervised, semi-supervised, and unsupervised learning problems, as well as in domain adaptation and transfer learning. However, applying Wasserstein distances to high-dimensional probability measures is often hindered by their high computational cost. Sliced-Wasserstein (SW) distances, on the other hand, have qualitative properties similar to those of Wasserstein distances but are significantly cheaper to compute. This computational simplicity has motivated recent work to use SW distances as a substitute for Wasserstein distances. In this presentation, I will first review the mathematical concepts behind Sliced-Wasserstein distances. I will then introduce a new class of distances, denoted Generalized Sliced-Wasserstein (GSW) distances, which extends the linear slicing used in SW distances to general non-linear slicing of probability measures. Finally, I will review various applications of SW and GSW distances in deep generative modeling and transfer learning.
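To make the computational simplicity concrete: the SW distance projects both measures onto random one-dimensional directions, where the optimal transport problem has a closed-form solution obtained by sorting, and averages over directions. The sketch below is a minimal Monte Carlo estimator in NumPy under these standard assumptions; the function name and parameters are illustrative, not part of the talk.

```python
import numpy as np

def sliced_wasserstein(X, Y, n_projections=200, p=2, rng=None):
    """Monte Carlo estimate of the Sliced-Wasserstein distance between two
    empirical measures with equal numbers of samples (illustrative sketch).

    X, Y: (n, d) arrays of samples from the two distributions.
    """
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    # Draw random slicing directions uniformly on the unit sphere.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Linear slicing: project both sample sets onto each direction.
    X_proj = X @ theta.T  # shape (n, n_projections)
    Y_proj = Y @ theta.T
    # In 1D, the p-Wasserstein distance between equal-size empirical
    # measures reduces to matching sorted samples (order statistics).
    X_sorted = np.sort(X_proj, axis=0)
    Y_sorted = np.sort(Y_proj, axis=0)
    per_slice = np.mean(np.abs(X_sorted - Y_sorted) ** p, axis=0)
    # Average the per-slice costs over directions, then take the p-th root.
    return np.mean(per_slice) ** (1.0 / p)
```

GSW distances follow the same recipe but replace the linear projection `X @ theta.T` with a non-linear defining function of the samples and the direction.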