Seismic tomography has been a vital tool for probing the Earth's internal structure and enhancing our knowledge of dynamical processes in the Earth's crust and mantle. While tomographic techniques differ in the data types utilized (e.g., body vs. surface waves), data sensitivity (ray vs. finite-frequency approximations), and choices of model parameterization and regularization, most global mantle tomographic models agree well at long wavelengths, owing to the presence and typical dimensions of cold, subducted oceanic lithosphere and hot, ascending mantle plumes (e.g., in the central Pacific and Africa). Structures at relatively small length scales remain controversial, though, as will be discussed in this paper, they are becoming increasingly resolvable with fast-expanding global and regional seismic networks and improved forward modeling and inversion techniques. This review paper aims to provide an overview of classical tomography methods and of key debates pertaining to the resolution of mantle tomographic models, and to highlight recent theoretical and computational advances in forward-modeling methods that have spearheaded developments in the accurate computation of sensitivity kernels and in adjoint tomography. The first part of the paper is devoted to traditional traveltime and waveform tomography. While these approaches established a firm foundation for global and regional seismic tomography, data coverage and the use of approximate sensitivity kernels have remained key limiting factors in the resolution of the targeted structures. In comparison to classical tomography, adjoint tomography takes advantage of full 3D numerical simulations in forward modeling and, in many ways, revolutionizes the seismic imaging of heterogeneous structures with strong velocity contrasts. For this reason, this review provides details of the implementation, resolution, and potential challenges of adjoint tomography.
Further discussions of techniques that are presently popular in seismic array analysis, such as noise correlation functions, receiver functions, and inverse scattering imaging, and of the adaptation of adjoint tomography to these different datasets, highlight the promising future of seismic tomography.