Human genomes vary quite a bit from individual to individual. These differences include single nucleotide changes, or “spelling mistakes” in the DNA sequence, but even more variation comes from structural variants, which include additions, deletions and rearrangements of large segments of DNA. A recent study used multiple advanced technologies to dive deeper than ever before to comprehensively characterize the structural variants present in three families.
Genome sequencing has become much faster, more accurate and less expensive over the past decade. As a result, more and more human genomes are being sequenced, and our knowledge of what those sequences actually mean for function and disease is growing rapidly.
It has also become clear that the more we learn, the more we appreciate just how little we still know about human genomics. While a linear sequence of ATGCs looks tidy, our genomes are actually dynamic entities that harbor considerable differences between individuals – differences that can alter traits contributing to both normal function and disease.
The genetic difference between individuals contributes to our individuality. These differences include millions of single nucleotide variants – where, for example, one person may have an A, another may have a C at a given position. There are also hundreds of thousands of structural variants (SVs). SVs include segments of DNA that are inserted into or deleted from the genome, segments that are duplicated, and segments that are inverted. SVs are more difficult to identify than single nucleotide variants, and hence it has been unclear just how many SVs really exist in a given human genome.
Now a paper entitled, “Multi-platform discovery of haplotype-resolved structural variation in human genomes,” published in Nature Communications, delves deeper into individual genomic differences than ever before.
The work involved a large international team of researchers from the Human Genome Structural Variation Consortium (HGSVC), including David Porubsky, Diana Spierings, Victor Guryev and Peter Lansdorp from the European Research Institute for the Biology of Ageing (ERIBA). They used a full suite of genomic technologies to extensively analyze the genomes of three family trios (parents and child). The technologies used include long-read, short-read, and strand-specific sequencing technologies, optical mapping and multiple computer algorithms for SV detection. The results present the most comprehensive catalog of SVs to date in the children’s genomes, including information on which set of parental chromosomes each SV was present on.
In summary, the researchers identified an average of 818,054 small insertions and deletions (genomic alterations that each affected less than 50 bases of DNA) and 27,622 SVs (genomic alterations that affected 50 bases or more of DNA) per genome. Remarkably, they also found an average of 156 inversions per genome, many of which intersected with genomic regions associated with genetic disease syndromes. Most of these inversions were only identified using the single cell Strand-seq method developed by the Lansdorp group. The more than 20 million nucleotides present in these newly identified inversions are in stark contrast to the much better studied 4-5 million “single nucleotide polymorphic (SNP)” nucleotides. The researchers found that more than 100,000 variants per individual are actually missed by routine sequencing technologies and commonly-used computer algorithms. For example, 83% of the insertions identified were missed by standard short-read-calling algorithms. In fact, the true numbers of SVs in a given human genome appears to be three- to seven-fold more than most studies typically identify.
Hence, SVs constitute a large amount of genetic variation not commonly captured by current genome sequencing technologies and analytical methods. This implies that the contribution of SVs to human disease has not yet been well-quantified and the expanded SV repertoire can help identify new genetic associations to diseases and improved diagnostic yields in future genetic tests.
About the European Research Institute for the Biology of Ageing (ERIBA)
The ERIBA is a newly established institute at the University Medical Center Groningen (UMCG). The mission of the ERIBA is to better understand what causes ageing. Our studies are focused on the mechanisms that result in loss of cells with age and the decline in the function of old cells and tissues. We aim to develop novel strategies to prevent or combat age-related disease and to provide evidence-based recommendations for healthy ageing. For more information, please visit www.eriba.umcg.nl.