Structural analysis of SARS-CoV-2 and prediction of the human interactome
Abstract
Specific elements of viral genomes regulate interactions within host cells. Here, we calculated the secondary structure content of >2500 coronaviruses and computed >100,000 human protein interactions with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). We found that the 3′ and 5′ ends are the most structured elements in the viral genome and that the 5′ end has the strongest propensity to associate with human proteins. The domain encompassing nucleotides 23000-24000 is highly conserved at both the sequence and structural levels, while the region upstream varies significantly. These two sequences code for a domain of the viral Spike (S) protein that interacts with the human receptor angiotensin-converting enzyme 2 (ACE2) and has the potential to bind sialic acids. Our predictions indicate that the first 1000 nucleotides at the 5′ end can interact with proteins involved in viral RNA processing, such as double-stranded RNA-specific editases and ATP-dependent RNA helicases, in addition to other high-confidence candidate partners. These interactions, previously reported to be implicated in HIV as well, reveal important information on host-virus interactions. The list of transcriptional and post-transcriptional elements recruited by the SARS-CoV-2 genome provides clues to the biological pathways associated with gene expression changes in human cells.
- Publication: arXiv e-prints
- Pub Date: March 2020
- DOI: 10.48550/arXiv.2003.13655
- arXiv: arXiv:2003.13655
- Bibcode: 2020arXiv200313655V
- Keywords: Quantitative Biology - Biomolecules; Quantitative Biology - Genomics; Quantitative Biology - Molecular Networks
- E-Print: 30 pages, 4 figures