Online controlled experiments (also known as A/B tests) have been regarded as the gold standard for decision-making at large data-driven companies over the last few decades. The most common A/B testing frameworks use the "average treatment effect" (ATE) as the test statistic. However, it remains difficult to improve the power of detecting ATE while controlling the "false discovery rate" (FDR) at a predetermined level. One of the most popular FDR-control algorithms is the Benjamini-Hochberg (BH) method, but the BH method is only known to control FDR under restrictive positive dependence assumptions, and with a conservative bound. In this paper, we propose statistical methods that can systematically and accurately identify ATE, and we demonstrate, using both simulation and real-world experimentation data, that they work robustly, keeping FDR controlled at a low level while achieving higher power. Moreover, we discuss the scalability problem in detail and compare our paradigm with other, more recent FDR control methods, such as knockoffs and the AdaPT procedure.
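
For readers unfamiliar with the BH baseline mentioned above, the following is a minimal sketch of the standard Benjamini-Hochberg step-up procedure, not the method proposed in this paper. The function name `benjamini_hochberg`, its parameters, and the example p-values are illustrative assumptions.

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Illustrative BH step-up procedure (hypothetical helper).

    Returns a boolean array marking which hypotheses are rejected
    at target FDR level `alpha`.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    reject = np.zeros(m, dtype=bool)
    if m == 0:
        return reject

    # Sort p-values and compare the i-th smallest to alpha * i / m.
    order = np.argsort(p)
    sorted_p = p[order]
    thresholds = alpha * np.arange(1, m + 1) / m

    # Step-up rule: find the largest rank k whose p-value is under its
    # threshold, then reject every hypothesis with rank <= k.
    below = np.nonzero(sorted_p <= thresholds)[0]
    if below.size > 0:
        k = below[-1]
        reject[order[: k + 1]] = True
    return reject

# Made-up p-values, one per experiment metric.
example_p = [0.01, 0.04, 0.03, 0.035]
rejected = benjamini_hochberg(example_p, alpha=0.05)
```

Because the rule is step-up, a p-value that misses its own threshold (0.03 at rank 2 here) is still rejected when a larger p-value at a higher rank meets its threshold.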