India to launch its 1st human genome cataloguing project

India Has a Higher Share of High-Quality Genomes than the Global Average

BENGALURU: An analysis of the world’s largest repository of genome sequences shows that India has a greater percentage of SARS-CoV-2 genomes considered “high quality” than the global average. India also has a greater percentage of sequences with patient data.
Data from GISAID, a global science initiative that provides open access to genomic data of influenza viruses and the novel coronavirus, shows that more than 2.9 lakh SARS-CoV-2 genomes have been sequenced worldwide, including 4,238 from India.
Of the total, 99.7% are of human origin (100% in India), and 74% of the human-origin genomes are of ‘high quality’. For India, that figure is 80%. Also, 32% of sequences from India carry crucial patient data, compared to just 6% globally.
In fact, while only 1.5% of the total genomes on the global database are from India, nearly 8% of sequences with patient data are Indian.
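The two shares quoted above can be cross-checked from the article's own figures. The sketch below is only a rough check: it rounds "more than 2.9 lakh" to 290,000 and assumes the 6% global rate applies to the whole database, India included. The function and constant names are illustrative.

```python
# Figures quoted in the article (2.9 lakh = 290,000; rounded, an assumption).
GLOBAL_GENOMES = 290_000
INDIA_GENOMES = 4_238
GLOBAL_PATIENT_DATA_RATE = 0.06  # share of all sequences with patient data
INDIA_PATIENT_DATA_RATE = 0.32   # share of Indian sequences with patient data


def india_shares():
    """Return India's share of all genomes and of genomes with patient data."""
    share_of_genomes = INDIA_GENOMES / GLOBAL_GENOMES
    india_annotated = INDIA_GENOMES * INDIA_PATIENT_DATA_RATE
    global_annotated = GLOBAL_GENOMES * GLOBAL_PATIENT_DATA_RATE
    share_of_annotated = india_annotated / global_annotated
    return share_of_genomes, share_of_annotated


genome_share, annotated_share = india_shares()
print(f"India's share of all genomes: {genome_share:.1%}")
print(f"India's share of genomes with patient data: {annotated_share:.1%}")
```

With these rounded inputs the results come out close to the reported 1.5% and "nearly 8%".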
This specific data analysis was done by Prof SS Vasan, who leads Covid-19 research at CSIRO, Australia. He told TOI: “It’s commendable that India is uploading a greater percentage of high quality coronavirus genomes. This is a great starting point from where India can lead by example.”
The data also shows that globally, the virus is sequenced in roughly one out of 270 cases, compared to roughly one out of 2,400 cases in India. “It shouldn’t be a problem if India sequences fewer genomes if the quality continues to be high, there is sufficient annotation, and no bias in deciding which samples are chosen for sequencing,” Vasan said.
He said it would, however, be prudent to sequence all imported cases.
Dr Giridhara Babu, a member of the ICMR task force on research and surveillance, said that while big data is key to understanding the virus better, the findings gain significance only if a mutation actually changes the character of the virus.
“There are more than 4,000 mutations across the world, but they become significant if the mutation changes the amino acid that makes the virus behave differently as we’ve found in the UK and South Africa,” Babu said.
Dr V Ravi, member, DBT expert committee on Covid vaccine, while stating that there can be no end to wanting genetic sequences, said India has been doing well given the constraints.
“Sequencing one genome can cost between Rs 10,000 and Rs 12,000 if done in bulk and Rs 25,000 individually. Also, it is a high-skill job. I think we’ve been doing well,” Dr Ravi said.
Each genome sequence has about 30,000 characters, or letters, and two broad parameters determine whether a sequence is considered high quality. “One, at least 29,000 out of 30,000 letters must be sequenced and two, less than 1% of the sequences are ambiguous,” Vasan said.
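The two parameters Vasan describes can be written as a simple check. This is a minimal sketch, not GISAID's actual implementation: it assumes "ambiguous" refers to letters other than A, C, G and T (for example N), and that the 1% limit applies to the letters of a single sequence. The function and constant names are hypothetical.

```python
MIN_SEQUENCED_LETTERS = 29_000   # threshold quoted in the article
MAX_AMBIGUOUS_FRACTION = 0.01    # "less than 1%" quoted in the article
UNAMBIGUOUS_LETTERS = set("ACGT")  # assumption: anything else is ambiguous


def is_high_quality(sequence: str) -> bool:
    """Apply the two broad quality parameters to one genome sequence."""
    seq = sequence.upper()
    # Parameter one: at least 29,000 letters must be present.
    if len(seq) < MIN_SEQUENCED_LETTERS:
        return False
    # Parameter two: fewer than 1% of the letters may be ambiguous.
    ambiguous = sum(1 for letter in seq if letter not in UNAMBIGUOUS_LETTERS)
    return ambiguous / len(seq) < MAX_AMBIGUOUS_FRACTION


# Synthetic examples only, not real genomes.
complete = "ACGT" * 7_500               # 30,000 letters, none ambiguous
too_short = "ACGT" * 7_000              # 28,000 letters
too_noisy = "ACGT" * 7_400 + "N" * 400  # 30,000 letters, about 1.3% ambiguous
print(is_high_quality(complete), is_high_quality(too_short), is_high_quality(too_noisy))
```

A sequence has to pass both checks, so a full-length genome with too many ambiguous letters is still rejected.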
He further said that sufficient annotation (de-identified patient metadata such as gender, age and comorbidities) and unbiased data are also key to drawing good conclusions from such data.
Indian sequences have been uploaded from Andhra Pradesh, Assam, Bihar, Delhi, Gujarat, Haryana, Karnataka, Ladakh, Madhya Pradesh, Maharashtra, Odisha, Punjab, Rajasthan, Tamil Nadu, Telangana, UP, Uttarakhand, West Bengal and J&K.


Source: PIB