Structural learning, a method to estimate the parameters for discrete energy minimization, has been proven to be effective in solving computer vision problems, especially in 3D scene parsing. As the complexity of the models increases, structural learning algorithms turn to approximate inference to retain tractability. Unfortunately, such methods often fail because the approximation can be arbitrarily poor. In this work, we propose a method to overcome this limitation through exploiting the properties of the joint problem of training time inference and learning. With the help of the learning framework, we transform the inapproximable inference problem into a polynomial time solvable one, thereby enabling tractable exact inference while still allowing an arbitrary graph structure and full potential interactions. Our learning algorithm is guaranteed to return a solution with a bounded error to the global optimal within the feasible parameter space. We demonstrate the effectiveness of this method on two point cloud scene parsing datasets. Our approach runs much faster and solves a problem that is intractable for previous, well-known approaches.