Genome Sequencing for Epidemiology, Surveillance and Molecular Contact Tracing in COVID-19



The declaration of COVID-19 as a global pandemic has created an acute need for tools for epidemiology, surveillance and molecular contact tracing. Molecular case detection has depended heavily on RT-PCR-based methods, which target specific regions in the genome of SARS-CoV-2, the virus that causes COVID-19. While RT-PCR methods offer modest sensitivity and high specificity for case detection, they provide no opportunity to understand the molecular evolution of the virus.

Whole genome sequencing of SARS-CoV-2 is now widely used across the globe to understand the unique characteristics of the virus and its origin. This is possible because the virus is estimated to have a modest mutation rate. To date, over 10,000 whole genome sequences of SARS-CoV-2 have been deposited in public databases. Genome sequences provide the following additional opportunities and obviate some of the limitations of RT-PCR-based assays.
  • Genome sequencing does not depend on specific primers, and therefore provides an opportunity to identify the pathogen. In fact, the unique virus that causes COVID-19 was identified through genome sequencing (Wu, F., Zhao, S., Yu, B. et al. 2020).
  • It allows the evolution of SARS-CoV-2 to be understood. The evolution of the virus is important for understanding its unique properties, including pathogenicity, which are of clinical and epidemiological relevance and matter for public policy.
  • The mutation rate of the virus allows effective tracing of its origins through molecular phylogenetics. To date, over 10 clades of the virus have been characterized, with strong geographical affinities, enabling the identification of index cases and of contacts in clusters of patients, especially in community infections. See Figure 1 for the transmission of the different clades of SARS-CoV-2 across geographies.
  • The genome map allows us to understand the genetic variants in critical genomic regions targeted by diagnostic primers (see Genome Watchpost) as well as in immunogenic sites, with implications for the development of better diagnostics and better vaccines.
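To make the idea of molecular contact tracing concrete, the following is a minimal sketch, not the method used by any particular surveillance pipeline. It assumes the genomes have already been aligned to the same length, counts the positions at which two genomes differ, and ranks pairs of cases from most to least similar. The function names (snp_distance, closest_pairs) and the short toy sequences are hypothetical and chosen only for illustration; real analyses use full-length genomes and build phylogenetic trees rather than relying on raw pairwise counts.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Characters treated as "no information" when comparing aligned genomes.
AMBIGUOUS = {"N", "-"}


def snp_distance(a: str, b: str) -> int:
    """Count positions where two aligned sequences differ.

    Positions where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x not in AMBIGUOUS and y not in AMBIGUOUS
    )


def closest_pairs(samples: Dict[str, str]) -> List[Tuple[str, str, int]]:
    """Return all sample pairs sorted by increasing SNP distance."""
    pairs = [
        (p, q, snp_distance(samples[p], samples[q]))
        for p, q in combinations(sorted(samples), 2)
    ]
    return sorted(pairs, key=lambda item: item[2])


# Toy, made-up aligned fragments (hypothetical data, not real viral sequences).
toy_samples = {
    "case_A": "ATGCTAGCTAGGCTA",
    "case_B": "ATGCTAGCTAGGCTA",
    "case_C": "ATGCTTGCTAGGCTA",
    "case_D": "ATGCTTGCTAGACNA",
}

if __name__ == "__main__":
    for first, second, distance in closest_pairs(toy_samples):
        print(f"{first} vs {second}: {distance} differing site(s)")
```

In this toy data, case_A and case_B are identical, which would be consistent with (though not proof of) a direct transmission link, while case_D shares one mutation with case_C and carries an additional one, the kind of nested pattern that phylogenetic methods use to group genomes into clades.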



Figure 1. SARS-CoV-2 lineages and their global transmission across geographies. Image from Nextstrain, retrieved 26/04/2020.

The advent of high-throughput next generation sequencing (NGS) technologies has made it possible to multiplex large numbers of viral genomes for sequencing. Coupled with standard pipelines for analysis and assembly, this provides a unique opportunity to offer a scalable and affordable solution for epidemiology and contact tracing across the country. Sentinel surveillance using whole genome sequencing could provide insights into the contacts and origins of disease in community infection, thereby providing the much-needed data to contain the spread of the disease.

References

Wu, F., Zhao, S., Yu, B. et al. A new coronavirus associated with human respiratory disease in China. Nature 579, 265–269 (2020). https://doi.org/10.1038/s41586-020-2008-3

COVID-19 Genome Watchpost https://bit.ly/c19genomewatch
The COVID-19 Genome Watchpost provides concise summaries of the analysis of SARS-CoV-2 genomes from around the world.

COVID-19 Genomepedia https://bit.ly/c19genomes 
The COVID-19 Genomepedia is a comprehensive searchable resource on SARS-CoV-2 genomes available in the public domain.

COVID-19 Open Research, Data and Resources
More about the initiative can be found at http://vinodscaria.rnabiology.org/covid-19


About the Author
Vinod Scaria is a Scientist at the CSIR Institute of Genomics and Integrative Biology (CSIR-IGIB). All views expressed are personal. The author can be contacted at vinods@igib.in 
