This paper proposes a process that uses two cameras to obtain three-dimensional (3D) information about a target object for human tracking. The human detection and tracking results from the two cameras are integrated to obtain the 3D information, with OpenPose used for human detection. In conventional stereo-camera processing, a range image of the entire scene is acquired as precisely as possible and is then processed. However, this approach suffers from problems such as incorrect matching and the computational cost of the calibration process. A new stereo vision framework is proposed to cope with these problems. The effectiveness of the proposed framework and method is verified through target-tracking experiments.
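As a rough illustration of how per-camera detections could be integrated into 3D information without a dense range image, the sketch below triangulates a single body keypoint seen in both views. It is not the framework proposed in this paper: it assumes an ideal rectified, parallel stereo pair, and the names `StereoRig` and `triangulate_keypoint`, along with all parameter values, are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class StereoRig(NamedTuple):
    """Hypothetical parameters of an ideal rectified, parallel stereo pair."""
    focal_px: float    # focal length in pixels (assumed equal for both cameras)
    baseline_m: float  # distance between the two camera centres, in metres
    cx: float          # principal point, x (pixels)
    cy: float          # principal point, y (pixels)


def triangulate_keypoint(
    rig: StereoRig,
    left_uv: Tuple[float, float],
    right_uv: Tuple[float, float],
) -> Optional[Tuple[float, float, float]]:
    """Return (X, Y, Z) in metres in the left-camera frame, or None.

    left_uv and right_uv are pixel coordinates of the same body keypoint
    (for example, the neck) detected independently in each image.
    """
    u_left, v_left = left_uv
    u_right, _ = right_uv

    # In a rectified pair the keypoint shifts only horizontally.
    disparity = u_left - u_right
    if disparity <= 0:
        # Non-positive disparity means the match is invalid or at infinity.
        return None

    z = rig.focal_px * rig.baseline_m / disparity
    x = (u_left - rig.cx) * z / rig.focal_px
    y = (v_left - rig.cy) * z / rig.focal_px
    return (x, y, z)


if __name__ == "__main__":
    rig = StereoRig(focal_px=800.0, baseline_m=0.1, cx=320.0, cy=240.0)
    print(triangulate_keypoint(rig, (400.0, 240.0), (360.0, 240.0)))
```

Because only a handful of keypoints per person are triangulated, the correspondence is given by the keypoint identity rather than by a search over the whole image, which is one plausible way to sidestep dense matching.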