This paper describes an attention-gating recurrent self-organising map approach for emergent speech representation. Inspired by evidence from human cognitive processing, the architecture combines two main neural components. The first component, the attention-gating mechanism, uses actor-critic learning to perform selective attention towards speech. Through this selective attention approach, the attention-gating mechanism controls access to working memory processing. The second component, the recurrent self-organising map memory, develops a temporal-distributed representation of speech using phone-like structures. Representing speech in terms of phonetic features in an emergent self-organised fashion, according to research on child cognitive development, recreates the approach found in infants. Using this representational approach, in a fashion similar to infants, should improve the performance of automatic recognition systems through aiding speech segmentation and fast word learning.