This work focuses on synthesizing human poses from human-level text descriptions. We propose a model that is based on a conditional generative adversarial network. It is designed to generate 2D human poses conditioned on human-written text descriptions. The model is trained and evaluated using the COCO dataset, which consists of images capturing complex everyday scenes with various human poses. We show through qualitative and quantitative results that the model is capable of synthesizing plausible poses matching the given text, indicating that it is possible to generate poses that are consistent with the given semantic features, especially for actions with distinctive poses.