Multi-object tracking systems often consist of a combination of a detector, a short term linker, a re-identification feature extractor and a solver that takes the output from these separate components and makes a final prediction. Differently, this work aims to unify all these in a single tracking system. Towards this, we propose Siamese Track-RCNN, a two stage detect-and-track framework which consists of three functional branches: (1) the detection branch localizes object instances; (2) the Siamese-based track branch estimates the object motion and (3) the object re-identification branch re-activates the previously terminated tracks when they re-emerge. We test our tracking system on two popular datasets of the MOTChallenge. Siamese Track-RCNN achieves significantly higher results than the state-of-the-art, while also being much more efficient, thanks to its unified design.