Analyzing the impact of genomic variants in SARS-CoV-2 on molecular diagnostic assays

Abhinav Jain
22nd November 2020


Coronavirus Disease 2019 (COVID-19) originated in Wuhan, China, and rapidly emerged as a global pandemic affecting millions of people worldwide. It is estimated that about 80% of individuals affected by COVID-19 are asymptomatic or have mild symptoms but act as carriers and spread the infection to other individuals[1]. This has necessitated screening with high sensitivity and specificity, even for individuals with a low viral load. Reverse transcription polymerase chain reaction (RT-PCR) is considered the gold standard and is widely used for the diagnosis and screening of COVID-19 patients[2,3]. The assay amplifies specific loci in the SARS-CoV-2 genome, identified by complementary oligonucleotide primers, through a polymerase chain reaction (PCR); the readout is obtained through fluorescence-labeled probes that hybridize to the amplified copies of DNA.

Like all viruses, SARS-CoV-2 has been continuously evolving through the accumulation of genetic mutations[4]. Our previous analysis estimated the mutation rate at around 1.64 × 10⁻³ variants per site per year[5]. While the consequences of these variants depend on where they occur, variants at the primer or probe binding sites on the viral genome could potentially impact the efficiency of the assay[6]. Additionally, analysis of the genome and its variants could provide insights for designing newer primers and probes in regions relatively devoid of variants.
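
To give a feel for what this per-site rate means in practice, the short sketch below scales it to the whole genome and to a single oligonucleotide binding site. It is only back-of-the-envelope arithmetic: the genome length of 29,903 nucleotides (the Wuhan-Hu-1 reference, NC_045512.2) and the 24-nucleotide site length are assumptions that do not come from this article, and the function name is purely illustrative.

```python
# Back-of-the-envelope scaling of the per-site rate quoted above.
RATE_PER_SITE_PER_YEAR = 1.64e-3   # from the article
GENOME_LENGTH_NT = 29_903          # assumption: Wuhan-Hu-1 reference length (NC_045512.2)


def expected_changes(length_nt: int, years: float = 1.0,
                     rate: float = RATE_PER_SITE_PER_YEAR) -> float:
    """Expected number of variant sites in a region of `length_nt` over `years`."""
    return rate * length_nt * years


per_genome_per_year = expected_changes(GENOME_LENGTH_NT)
per_site_24nt_per_year = expected_changes(24)  # assumption: a 24-nt binding site

print(f"whole genome: {per_genome_per_year:.1f} per year")
print(f"24-nt binding site: {per_site_24nt_per_year:.3f} per year")
```

Under these assumptions the rate works out to roughly 49 variant sites per genome per year, and about 0.04 per year for a single 24-nucleotide binding site. Any one lineage is therefore unlikely to change at a given binding site, but the chance of seeing such changes grows with the number of circulating lineages and the number of assays in use.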

In a recent study[7] from our group, we analysed variants from 45,830 high-quality SARS-CoV-2 genome isolates from around the globe (4,779 from Asia, 25,091 from Europe, 859 from Africa, 12,949 from North America, 791 from South America and 1,361 from Australia) that were deposited by multiple laboratories in the GISAID database (https://www.gisaid.org/)[8]. We catalogued a total of 132 primer or probe sequences that were widely used for RT-PCR testing. Undergraduate students at St Joseph’s College, Bengaluru also contributed to this activity, highlighting a participatory approach towards involving students in research. This compendium was created as part of the COVID-19 Open Research initiative at our lab.
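
The per-continent genome counts above are the natural denominators when asking how common a variant is in each region. The sketch below shows one way such frequencies could be computed and checked against a cut-off; the 1% default mirrors the threshold used later in this article, while the carrier counts and function names are made up for illustration and are not taken from the study.

```python
# Genomes per continent, as reported in the article.
GENOMES_PER_CONTINENT = {
    "Asia": 4_779,
    "Europe": 25_091,
    "Africa": 859,
    "North America": 12_949,
    "South America": 791,
    "Australia": 1_361,
}


def variant_frequencies(carriers: dict) -> dict:
    """Fraction of genomes in each continent that carry a given variant."""
    return {
        continent: carriers.get(continent, 0) / total
        for continent, total in GENOMES_PER_CONTINENT.items()
    }


def is_common(carriers: dict, threshold: float = 0.01) -> bool:
    """True if the variant reaches `threshold` in at least one continent."""
    return any(f >= threshold for f in variant_frequencies(carriers).values())


# The continent totals add up to the 45,830 genomes analysed.
total_genomes = sum(GENOMES_PER_CONTINENT.values())

# Hypothetical carrier counts for one variant (illustrative only).
example_carriers = {"Asia": 12, "Europe": 300, "Africa": 2}
print(total_genomes, is_common(example_carriers))
```

Computing the frequency per continent rather than globally keeps a variant that is common in a sparsely sampled region from being diluted by the much larger European and North American collections.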

The SARS-CoV-2 genomic variants were mapped onto these 132 primer or probe sequences. The mapped variants were then evaluated for their potential impact on RT-PCR efficiency by calculating the melting temperature (Tm) and Gibbs free energy (ΔG°37) of primer or probe binding. In the analysis, a total of 5,862 unique genetic variants mapped to the 132 primer or probe sequences. Of these, 29 variants had a frequency ≥ 1% in at least one of the six continents from which SARS-CoV-2 was isolated. A total of 27 primers/probes had variants present in ≥ 1% of SARS-CoV-2 genome isolates, of which 11 target the N gene and are approved by regulatory agencies. These agencies are the World Health Organization (WHO) and the US and China Centers for Disease Control and Prevention (CDC).
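
As an illustration of the melting temperature and free-energy calculation mentioned above, here is a minimal sketch of a nearest-neighbour estimate for a perfectly matched DNA duplex. It is not necessarily the method used in the study: it assumes the unified nearest-neighbour parameters of SantaLucia (1998), a total oligonucleotide concentration of 0.25 µM, 1 M NaCl with no salt correction, and a non-self-complementary oligonucleotide. The function name `duplex_thermo` is hypothetical. Modelling the mismatch that a variant introduces would need additional mismatch parameters that are not included here.

```python
import math

# Assumed parameters: unified nearest-neighbour values (SantaLucia, 1998).
# Dinucleotide step (5'->3') -> (dH in kcal/mol, dS in cal/(mol*K)).
NN = {
    "AA": (-7.9, -22.2), "TT": (-7.9, -22.2),
    "AT": (-7.2, -20.4),
    "TA": (-7.2, -21.3),
    "CA": (-8.5, -22.7), "TG": (-8.5, -22.7),
    "GT": (-8.4, -22.4), "AC": (-8.4, -22.4),
    "CT": (-7.8, -21.0), "AG": (-7.8, -21.0),
    "GA": (-8.2, -22.2), "TC": (-8.2, -22.2),
    "CG": (-10.6, -27.2),
    "GC": (-9.8, -24.4),
    "GG": (-8.0, -19.9), "CC": (-8.0, -19.9),
}
R = 1.987  # gas constant in cal/(mol*K)


def duplex_thermo(seq: str, oligo_conc_molar: float = 2.5e-7):
    """Return (dG at 37 C in kcal/mol, Tm in C) for seq bound to its exact complement."""
    seq = seq.upper()
    dh, ds = 0.0, 0.0
    # Initiation terms depend on the base pair at each end of the duplex.
    for end in (seq[0], seq[-1]):
        if end in "GC":
            dh, ds = dh + 0.1, ds - 2.8
        else:
            dh, ds = dh + 2.3, ds + 4.1
    # Add the stacking contribution of every dinucleotide step.
    for i in range(len(seq) - 1):
        step_dh, step_ds = NN[seq[i:i + 2]]
        dh, ds = dh + step_dh, ds + step_ds
    dg37 = dh - 310.15 * ds / 1000.0
    # Two-state melting temperature, non-self-complementary duplex, no salt correction.
    tm = dh * 1000.0 / (ds + R * math.log(oligo_conc_molar / 4.0)) - 273.15
    return dg37, tm


# US CDC N1 probe sequence, as listed in this article.
dg, tm = duplex_thermo("ACCCCGCATTACGTTTGGTGGACC")
print(f"dG37 = {dg:.1f} kcal/mol, Tm = {tm:.1f} C")
```

A more negative ΔG°37 and a higher Tm indicate a more stable primer-template duplex; a variant that weakens binding would move both values in the opposite direction. Established tools apply salt and mismatch corrections that this sketch leaves out, so its absolute numbers should be treated as rough.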

These include four primers/probes from the US Centers for Disease Control and Prevention (CDC) panel, all located in the N gene: ACCCCGCATTACGTTTGGTGGACC (N1 probe), GCGCGACATTCCGAAGAA (N2 Reverse), TTACAAACATTGGCCGCAAA (N2 Forward) and GGGAGCCTTGAATACACCAAAA (N3 Forward), with cumulative variant frequencies of 2.2%, 1%, 1% and 1.14% respectively[9]. A pair of primers from the China CDC, 2019-nCoV-NFP “GGGGAACTTCTCCTGCTAGAAT” and 2019-nCoV-NRP “CAGACATTTTGCTCTCAAGCTG”, had cumulative variant frequencies of 93.5% and 1.45% respectively[10]. The remaining five primer/probe sequences were approved by the WHO: NIID_2019-nCOV_N_P2 “ATGTCGCGCATTGGCATGGA”, “CAAGCCTCTTCTCGTTCCTC”, N_Sarbeco_F1 “CACATTGGCACCCGCAATC”, N_Sarbeco_R1 “GAGGAACGAGAAGAGGCTTG” and HKU-NP “GCAAATTGTGCAATTTGCGG”, with cumulative variant frequencies of 1%, 1.5%, 1.1%, 1.5% and 1% respectively[11–13].
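
Because reverse primers are written 5'→3' on the strand opposite to the reference genome, mapping variants onto the sequences listed above means checking both orientations. The sketch below illustrates the idea with a reverse-complement helper and an exact-match search on a toy reference. The reference string, the coordinates and the function names are made up for illustration and are not part of the study. It also shows that the unnamed WHO sequence listed above is the exact reverse complement of N_Sarbeco_R1, which suggests that the two entries describe the same binding site.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def locate(oligo: str, reference: str):
    """Find an oligo on either strand of `reference` by exact matching.

    Returns (start, end, strand) with 0-based, end-exclusive coordinates on the
    forward strand, or None if there is no exact match.
    """
    for strand, query in (("+", oligo.upper()), ("-", reverse_complement(oligo))):
        start = reference.find(query)
        if start != -1:
            return start, start + len(query), strand
    return None


def overlaps(variant_pos: int, site) -> bool:
    """True if a 0-based variant position falls inside a located binding site."""
    start, end, _ = site
    return start <= variant_pos < end


n_sarbeco_r1 = "GAGGAACGAGAAGAGGCTTG"
unnamed_who_oligo = "CAAGCCTCTTCTCGTTCCTC"  # listed without a name in the article
print(reverse_complement(n_sarbeco_r1) == unnamed_who_oligo)  # True

# Toy reference (made up): arbitrary flanks around the forward-strand target of R1.
toy_reference = "ACGTACGTAC" + reverse_complement(n_sarbeco_r1) + "TTGACCA"
site = locate(n_sarbeco_r1, toy_reference)
print(site, overlaps(15, site))  # (10, 30, '-') True
```

A real pipeline would search the full reference genome and tolerate mismatches, since exact matching fails as soon as the reference itself differs from the primer. The orientation handling, however, stays the same.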

These findings could provide insights towards evidence-based approval of diagnostic assays. For example, regulatory agencies could monitor the frequency of variants at primer or probe sites that impact the efficiency of the assays. Resources like IndiCoV (http://clingen.igib.res.in/indicov/), which systematically compiles genetic variants and annotations in Indian SARS-CoV-2 isolates, could potentially aid this.

The findings could also be significant for clinicians. In clinical settings where a false-negative RT-PCR test is suspected, revalidation with an alternate primer/probe set could provide a solution. Alternatively, more sensitive assays such as sequencing could be used. Finally, this opens up the possibility of designing better diagnostic assays that target regions of the genome unlikely to harbour frequent variants.

It is also important to note that more specific assays for the detection of SARS-CoV-2 are on the anvil, including sequencing-based approaches like COVIDseq. While not recommended for general surveillance, these assays could be used to confirm inconclusive results. They could also be used for surveillance of individuals who, if they test false negative, could potentially infect a large number of people, such as clinicians and public-facing personnel like the police. The same applies to people who, for example, travel by air and are likely to remain in a confined space, and are therefore likely to infect a large number of people.


References
1. WHO COVID-19 Situation Report 46 (2020). https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200306-sitrep-46-covid-19.pdf?sfvrsn=96b04adf_4
2. Shen, M. et al. Recent advances and perspectives of nucleic acid detection for coronavirus. J Pharm Anal (2020) doi:10.1016/j.jpha.2020.02.010.
3. Noh, J. Y. et al. Correction to: Simultaneous detection of severe acute respiratory syndrome, Middle East respiratory syndrome, and related bat coronaviruses by real-time reverse transcription PCR. Arch. Virol. 163, 819 (2018).
4. Li, X. et al. Evolutionary history, potential intermediate animal host, and cross-species analyses of SARS-CoV-2. J. Med. Virol. 92, 602–611 (2020).
5. Banu, S. et al. A distinct phylogenetic cluster of Indian SARS-CoV-2 isolates. Open Forum Infectious Diseases (2020).
6. Yang, J.-R. et al. Newly emerging mutations in the matrix genes of the human influenza A(H1N1)pdm09 and A(H3N2) viruses reduce the detection sensitivity of real-time reverse transcription-PCR. J. Clin. Microbiol. 52, 76–82 (2014).
8. Shu, Y. & McCauley, J. GISAID: Global initiative on sharing all influenza data - from vision to reality. Euro Surveill. 22, (2017).
9. Lu, X. et al. US CDC Real-Time Reverse Transcription PCR Panel for Detection of Severe Acute Respiratory Syndrome Coronavirus 2. Emerg. Infect. Dis. 26(8) (2020). doi:10.3201/eid2608.201246.
10. WHO in-house assays: summary table of available protocols (2020). https://www.who.int/docs/default-source/coronaviruse/whoinhouseassays.pdf?sfvrsn=de3a76aa_2.
11. Detection of 2019 novel coronavirus (2019-nCoV) in suspected human cases by RT-PCR. https://www.who.int/docs/default-source/coronaviruse/peiris-protocol-16-1-20.pdf?sfvrsn=af1aac73_4.
12. Diagnostic detection of Wuhan coronavirus 2019 by real-time RT-PCR. https://www.who.int/docs/default-source/coronaviruse/wuhan-virus-assay-v1991527e5122341d99287a1b17c111902.pdf?sfvrsn=d381fc88_2.
13. Detection of second case of 2019-nCoV infection in Japan. https://www.who.int/docs/default-source/coronaviruse/method-niid-20200123-2.pdf?sfvrsn=fbf75320_7.





Abhinav Jain is a graduate student at CSIR-IGIB. All opinions expressed are personal and do not reflect the opinions of his employers or associated organisations. The author can be reached at @Abhinav_Jain_19 on Twitter.
