Targeted Genomic Newborn Screening vs Whole Genome Sequencing

Every human genome consists of approximately 3 billion base pairs (bp), or nucleotides, which include approximately 20,000 different protein-coding genes. Most of the human genome is non-coding: about 98% of it does not code for functional proteins. That means only approximately 2% of the human genome codes for protein, and this coding fraction is where most of the DNA variants currently known to cause human disease are found.

Rapid sequencing of an individual’s entire genome has only become possible with the development of next-generation DNA sequencing (NGS) technologies. Whole genome sequencing (WGS) aims to obtain data from all 3 billion bases of the genome, but most of the utility of WGS for finding DNA variants that cause highly penetrant monogenic inherited disease lies in the 1 to 2% of the genome that is coding. Often, laboratories spend time and money to do WGS and then use computer algorithms to restrict their analysis to the coding regions only.
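As an illustration of what restricting analysis to the coding regions can look like computationally, here is a minimal sketch. It is not any particular laboratory's pipeline: the exon coordinates, the variant list, and the function names are all hypothetical, and a real workflow would load exon intervals from a gene annotation file rather than hard-code them.

```python
from bisect import bisect_right

# Hypothetical exon intervals per chromosome: (start, end), 1-based inclusive,
# sorted by start and non-overlapping. These coordinates are made up.
EXONS = {
    "chr1": [(100, 300), (1_000, 1_200)],
    "chr2": [(500, 650)],
}


def in_coding_region(chrom, pos, exons=EXONS):
    """Return True if position `pos` on `chrom` falls inside a listed exon."""
    intervals = exons.get(chrom, [])
    starts = [start for start, _ in intervals]
    # Find the last exon whose start is <= pos, then check its end.
    i = bisect_right(starts, pos) - 1
    return i >= 0 and intervals[i][0] <= pos <= intervals[i][1]


def restrict_to_coding(variants, exons=EXONS):
    """Keep only the (chrom, pos) variants that lie within an exon."""
    return [v for v in variants if in_coding_region(v[0], v[1], exons)]


# Hypothetical variant calls from a whole genome run.
variants = [("chr1", 150), ("chr1", 700), ("chr2", 650), ("chr3", 10)]
print(restrict_to_coding(variants))
```

In this toy example only the variants at chr1:150 and chr2:650 survive the filter. The variants outside the listed exons were still sequenced and stored before being set aside, which is the inefficiency described above.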

Because much of the non-coding sequence has no known function, one approach to reducing the cost and computational time of WGS is to sequence only the coding regions (exons) of the genome, which mostly determine whether a protein will work properly. To do this, the coding regions of the genome are isolated by PCR or probe hybridisation, and only those regions are sequenced, in an approach called whole exome sequencing (WES). Each gene is made up of around 9 exons on average, and most exons are around 200 bp in length, so WES still generates a massive amount of data: roughly 180,000 exons, or on the order of 30 million base pairs of raw sequencing data, to completely sequence one exome at 1X coverage (every base is sequenced once). Typically, 20 to 30X coverage is required for accurate variant detection using NGS, so in practice every individual sequenced will generate 600 to 900 million base pairs of information to analyse and store. Processing, long-term storage, and backup of such large volumes of data are a considerable hidden cost associated with WES and WGS.
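The arithmetic above can be reproduced in a few lines. The sketch below uses only the rounded averages quoted in this section (20,000 genes, about 9 exons per gene, about 200 bp per exon, 20 to 30X coverage); the function and variable names are illustrative. Multiplying the rounded averages gives about 36 million bp at 1X, somewhat above the roughly 30 million bp quoted above, which is to be expected when every input is an approximation.

```python
# Back-of-envelope estimate of exome sequencing data volume.
# All inputs are the rounded averages quoted in the text, not measured values.

GENES = 20_000        # approximate number of protein-coding genes
EXONS_PER_GENE = 9    # average number of exons per gene
EXON_LENGTH_BP = 200  # typical exon length in base pairs


def exome_size_bp(genes=GENES, exons_per_gene=EXONS_PER_GENE,
                  exon_length_bp=EXON_LENGTH_BP):
    """Return (number of exons, total exome length in bp) at 1X coverage."""
    exons = genes * exons_per_gene
    return exons, exons * exon_length_bp


def sequenced_bases(target_bp, coverage):
    """Total bases generated when every target base is read `coverage` times."""
    return target_bp * coverage


exons, exome_bp = exome_size_bp()
print(f"exons: {exons:,}")
print(f"exome at 1X: {exome_bp:,} bp")
for coverage in (20, 30):
    print(f"{coverage}X coverage: {sequenced_bases(exome_bp, coverage):,} bp")
```

Using the 30 million bp figure instead, `sequenced_bases(30_000_000, 20)` gives the 600 million base pairs per individual mentioned above.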

The largest study to date of whole genome newborn screening is the BabySeq project, in which about 120 newborns had whole genome sequencing. There have been no prospective trials of whole exome genomic newborn screening to date.

Targeted Gene Sequencing Advantages 

A more scalable and cost-effective alternative to WES and WGS is targeted gene sequencing (TGS), in which a small panel of fewer than 200 to 400 genes is sequenced. TGS can reduce the information burden for laboratories by up to 100-fold and can provide significant cost savings for long-term storage of genetic information. The TGS methodology is similar to WES, except that instead of isolating every gene in the genome for analysis, a carefully selected panel of high clinical utility genes is chosen for sequencing. Because not every gene is sequenced, some ethical concerns can be side-stepped, such as sequencing genes with poor disease associations or analysing genes associated with untreatable, adult-onset, low-penetrance disorders. High clinical utility genes associated with highly penetrant monogenic disorders can be enriched and sequenced, and the data managed, at considerably lower cost by TGS than by WES or WGS. Moreover, Genepath has shown that TGS may be a more sensitive method than WGS and WES for finding disease-causing variants in some genes. In addition, Genepath has performed targeted gene sequencing on 2,552 newborns and demonstrated that this technology could be implemented in current newborn screening today.
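To see where a reduction of up to 100-fold can come from, the sketch below compares the size of the sequencing target for a 200 to 400 gene panel with the whole exome and the whole genome. It assumes, purely for illustration, that panel genes have the same average structure quoted earlier (about 9 exons of about 200 bp each) and that data volume scales linearly with target size at equal coverage; the names are hypothetical and real panels will differ.

```python
# Rough comparison of sequencing target sizes for a gene panel, WES and WGS.
# Inputs are the rounded averages from the text; results are estimates only.

EXONS_PER_GENE = 9
EXON_LENGTH_BP = 200
EXOME_GENES = 20_000
GENOME_BP = 3_000_000_000


def coding_target_bp(n_genes):
    """Approximate coding bases covered when sequencing `n_genes` genes."""
    return n_genes * EXONS_PER_GENE * EXON_LENGTH_BP


def fold_reduction(larger_bp, smaller_bp):
    """How many times smaller the second target is than the first."""
    return larger_bp / smaller_bp


exome_bp = coding_target_bp(EXOME_GENES)
for panel_genes in (200, 400):
    panel_bp = coding_target_bp(panel_genes)
    print(
        f"{panel_genes}-gene panel: {panel_bp:,} bp, "
        f"{fold_reduction(exome_bp, panel_bp):.0f}-fold smaller than WES, "
        f"{fold_reduction(GENOME_BP, panel_bp):,.0f}-fold smaller than WGS"
    )
```

Under these assumptions a 200-gene panel covers about 100 times fewer bases than a whole exome and a 400-gene panel about 50 times fewer, which is consistent with the "up to 100-fold" figure above. The difference relative to a whole genome is larger still.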

Alex Davidson