Oligonucleotide primers have been used to generate a cDNA library covering the entire tobacco mosaic virus (TMV) RNA sequence. Analysis of these clones has enabled us to complete the viral RNA sequence and to study its variability within a viral population. The positive strand coding sequence starts 69 nucleotides from the 5' end with a reading frame for a protein of Mr 125,941 and terminates with UAG. Readthrough of this terminator would give rise to a protein of Mr 183,253. Overlapping the terminal five codons of this readthrough reading frame is a second reading frame coding for a protein of Mr 29,987. This gene terminates two nucleotides before the initiator codon of the coat protein gene. Potential signal sequences responsible for the capping and synthesis of the coat protein and Mr 29,987 protein mRNAs have been identified. Similar sequences within these reading frames may be used in the expression of sets of proteins that share COOH-terminal sequences.
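The relationship between the Mr 125,941 reading frame and its Mr 183,253 readthrough product can be illustrated with a minimal sketch. The sequence below is a short hypothetical toy RNA, not TMV sequence, and the names `find_orf` and `readthrough` are assumptions introduced only for illustration. The sketch scans one frame from an AUG and reports where translation would end, with and without suppression of a leaky UAG terminator.

```python
# Toy illustration of terminator readthrough in a single reading frame.
# The sequence and all names here are hypothetical; this is not TMV data.

STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_orf(rna, start, readthrough=0):
    """Return (start, end, stop_codon) for the frame opening at `start`.

    `start` is the 0-based index of the AUG initiator.
    `end` is the index just past the terminating stop codon.
    `readthrough` is the number of in-frame UAG codons to read through
    (treated as sense codons) before a stop is honoured.
    Returns None if no terminating stop codon is found in frame.
    """
    if rna[start:start + 3] != "AUG":
        raise ValueError("frame must begin with an AUG initiator")
    skipped = 0
    # Step through complete in-frame codons only.
    for pos in range(start, len(rna) - 2, 3):
        codon = rna[pos:pos + 3]
        if codon in STOP_CODONS:
            if codon == "UAG" and skipped < readthrough:
                skipped += 1  # leaky terminator: keep translating
                continue
            return start, pos + 3, codon
    return None


# Hypothetical toy RNA: 2-nt leader, AUG GCU UAG CAA GGC UAA, 2-nt trailer.
toy = "GG" + "AUG" + "GCU" + "UAG" + "CAA" + "GGC" + "UAA" + "CC"

short_product = find_orf(toy, 2)                # stops at the first UAG
long_product = find_orf(toy, 2, readthrough=1)  # reads through that UAG

print(short_product)  # (2, 11, 'UAG')
print(long_product)   # (2, 20, 'UAA')
```

Both products share the same initiator and differ only in where they terminate, which is the arrangement described for the two large proteins above. Real readthrough is a partial, context-dependent event, so the `readthrough` counter is a simplification.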