We consider the control of heterogeneous players repeatedly playing an anti-coordination network game. In an anti-coordination game, each player has an incentive to choose an action that differs from those of its neighbors. At each round of play, players take actions according to a learning algorithm that mimics the iterated elimination of strictly dominated strategies. We show that the learning dynamics may fail to reach anti-coordination in certain scenarios. We formulate an optimization problem whose objective is to reach maximum anti-coordination while minimizing the number of players to control. We consider both static and dynamic control policy formulations. Relating the problem to a minimum vertex cover problem on bipartite networks, we develop a feasible dynamic policy that is efficient to compute. Solving for optimal policies on benchmark networks shows that the vertex-cover-based policy can be a loose upper bound on the required control effort when there is potential to exploit cascades caused by the learning dynamics of uncontrolled players. We propose an algorithm that finds feasible, though possibly suboptimal, policies by sequentially adding players to control according to their cascade potential. Numerical experiments on random networks show that the cascade-based algorithm can lower the control effort significantly compared to simpler control schemes.
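For intuition on why a vertex-cover-based policy is efficient to compute on bipartite networks, the sketch below finds a minimum vertex cover using the standard construction from König's theorem: compute a maximum matching, then take the alternating-path reachable set. It is a generic illustration and not necessarily the construction used for the policy described here. The names `max_bipartite_matching`, `min_vertex_cover`, `left`, and `adj` are hypothetical, and the code assumes that left and right node labels are distinct.

```python
from collections import deque


def max_bipartite_matching(left, adj):
    """Maximum matching via augmenting paths (Kuhn's algorithm).

    left: iterable of left-side nodes (assumed labels, distinct from right side).
    adj:  dict mapping each left node to an iterable of right-side neighbors.
    Returns a dict mapping each matched right node to its left partner.
    """
    match_right = {}

    def try_augment(u, seen):
        # Try to match u, re-routing earlier matches if needed.
        for v in adj.get(u, ()):
            if v in seen:
                continue
            seen.add(v)
            if v not in match_right or try_augment(match_right[v], seen):
                match_right[v] = u
                return True
        return False

    for u in left:
        try_augment(u, set())
    return match_right


def min_vertex_cover(left, adj):
    """Minimum vertex cover of a bipartite graph via Konig's theorem."""
    left = list(left)
    match_right = max_bipartite_matching(left, adj)
    matched_left = set(match_right.values())

    # Alternating BFS starting from unmatched left nodes:
    # left -> right along any edge, right -> left along matching edges.
    visited_left = {u for u in left if u not in matched_left}
    visited_right = set()
    queue = deque(visited_left)
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v in visited_right:
                continue
            visited_right.add(v)
            w = match_right.get(v)
            if w is not None and w not in visited_left:
                visited_left.add(w)
                queue.append(w)

    # Cover = unreached left nodes plus reached right nodes.
    return (set(left) - visited_left) | visited_right


# Toy network (an assumption for illustration): edges a-x, b-x, b-y, c-y.
left = ["a", "b", "c"]
adj = {"a": ["x"], "b": ["x", "y"], "c": ["y"]}
print(sorted(min_vertex_cover(left, adj)))  # ['x', 'y']
```

In this toy network, covering the two right-side nodes touches every edge, whereas any cover built from left-side nodes alone would need all three.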