At the heart of modern machine learning (ML) is the approximation of high-dimensional functions. Traditional approaches, such as approximation by piecewise polynomials, wavelets, or other linear combinations of fixed basis functions, suffer from the curse of dimensionality (CoD). We will present a mathematical perspective on ML, focusing on the CoD. We will discuss three major topics: approximation theory and error analysis of modern ML models, the dynamics and qualitative behavior of gradient descent algorithms, and ML from a continuous viewpoint. We will see that at the continuous level, ML can be formulated as a series of reasonably nice variational and PDE-like problems. Modern ML models and algorithms, such as the random feature model, two-layer neural networks, and residual neural networks, can all be viewed as particular discretizations of such continuous problems. We will also present a framework suited to analyzing ML models and algorithms in high dimensions, and present results that are free of the CoD. Finally, we will discuss the fundamental reasons behind the success of modern ML, as well as the subtleties and mysteries that remain to be understood.
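As a hedged illustration of the discretization viewpoint, consider the following sketch. The notation is an assumption introduced here, not taken from the text above: $\sigma$ is an activation function, $\pi$ a probability distribution over weights $w$, $a$ a coefficient function, and $\mu$ the input distribution. A target function written as an expectation can be discretized by Monte Carlo sampling, and the resulting squared error decays like $1/m$ with no dimension in the exponent.

```latex
% Continuous (integral) representation of the target function
\[
  f(x) \;=\; \int a(w)\,\sigma(w^{\top}x)\,\pi(dw)
       \;=\; \mathbb{E}_{w\sim\pi}\!\left[a(w)\,\sigma(w^{\top}x)\right].
\]

% Monte Carlo discretization with m i.i.d. samples w_1,\dots,w_m \sim \pi
\[
  f_m(x) \;=\; \frac{1}{m}\sum_{j=1}^{m} a(w_j)\,\sigma(w_j^{\top}x).
\]

% Mean squared error: a variance term divided by m
\[
  \mathbb{E}\,\|f - f_m\|_{L^2(\mu)}^{2}
  \;=\; \frac{1}{m}\int \operatorname{Var}_{w\sim\pi}\!\left[a(w)\,\sigma(w^{\top}x)\right]\mu(dx)
  \;\le\; \frac{1}{m}\,\mathbb{E}_{w\sim\pi}\!\left[a(w)^{2}\!\int \sigma(w^{\top}x)^{2}\,\mu(dx)\right].
\]
```

Under these assumptions, keeping the samples $w_j$ fixed and fitting only the coefficients gives a model of random feature type, while also fitting the $w_j$ gives a two-layer neural network. By contrast, approximation with $m$ fixed basis functions of functions having $\alpha$ derivatives in dimension $d$ typically achieves error on the order of $m^{-\alpha/d}$, which degrades as $d$ grows.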