The sequencing of all 3 billion base pairs of the human genome propelled the field of bioinformatics into prominence. A blend of molecular biology and computer science, this area of inquiry is both exciting and essential. If you’d like to explore what’s involved, this article covers the basics and answers some salient questions.
Exploring the Double Helix
As you might imagine, the sheer volume of data derived from the Human Genome Project was immense. It required both new methods of data storage and new ways of accessing and exploring that data. Even this may be too simple a picture of informatics as applied to biological data. To shed light on what is involved in acquiring and collating such massive data sets, Luscombe, Greenbaum, and Gerstein define the field as,
“Analysis…[of] three types of large data sets…macromolecular structures, genome sequences, and the results of functional genomic experiments…‘relationship data’ from metabolic pathways, taxonomy trees, and protein-protein interaction networks. [It] employs…computational techniques including…database design and data mining, macromolecular geometry, phylogenetic tree construction, prediction of protein structure and function, gene finding, and expression data clustering.”
Such a diverse and intricate field requires individuals who specialize in genetics and gene sequencing, as well as those who understand how to cull, group, and store the resulting data for future use and study. Bioinformatics provides the necessary structure for a nuanced exploration of gene-based therapies and medical treatments, a greater understanding of the roots of heritable conditions, and the potential to explore not only our own beginnings, but also those of other plant and animal species.
Origins of Understanding
While the field has grown rapidly since the completion of the Human Genome Project and related endeavors, the term itself has several decades of history behind it. Paulien Hogeweg and Ben Hesper coined it in 1970 while studying how computer science might be applied to biological systems. The creation of algorithms and software designed to examine, correlate, and group biological data, particularly data on genetic, macromolecular, and metabolic processes, formed a new hybrid of computer science and biology.
This approach allowed researchers to address complex problems in human medicine that could be studied in no other way, simply because of the overabundance of data and the infinitesimal scale of the material being observed.
Bioinformatics brings statistical analysis and computer programming together, since the data sets in question are massive, unwieldy, and far too large for anyone to analyze unaided. With this tool set at their disposal, researchers can analyze complex and extensive genome sequences, better understand how peptides are synthesized and used, and predict the structures that proteins form at the molecular level.
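To give a flavor of what analyzing a sequence can look like in practice, here is a minimal Python sketch. It is an illustration only, not a method described by the researchers cited here: the function names `gc_content` and `kmer_counts` and the example sequence are made up for this article, and real pipelines typically rely on dedicated libraries and far larger inputs.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0  # avoid dividing by zero on an empty sequence
    return sum(base in "GC" for base in seq) / len(seq)


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping substring of length k in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Made-up DNA fragment, used purely for illustration.
example = "ATGCGCGATTAGC"
print(f"GC content: {gc_content(example):.2f}")
print("Most common 2-mers:", kmer_counts(example, 2).most_common(3))
```

Simple summaries like these are often a first step before heavier statistical analysis, because they reduce a long string of letters to numbers that can be compared across samples.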
Bioinformatics also allows scientists to construct intergenomic maps, which permit comparison and exploration of variation between species. This is perhaps one of the more novel and potentially beneficial purposes of the field: to understand how humans acquired specific sequences, construct a “family tree” of life on Earth, and explore potential gene therapies not previously imagined. As such, the field of bioinformatics is crucial to a deeper understanding of the building blocks of our very lives, and serves as a major component in the fight against debilitating genetic disorders previously considered fatal.
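The idea of comparing sequences between species can be sketched in a few lines of Python. This is a deliberately simplified illustration: the species names and sequences are invented, the sequences are assumed to be already aligned and of equal length, and `percent_identity` is a hypothetical helper rather than a function from any particular library. Real comparative genomics uses alignment algorithms and evolutionary models well beyond this.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# Invented, pre-aligned toy sequences; not real genomic data.
toy_sequences = {
    "species_a": "ACGTACGT",
    "species_b": "ACGTACGA",
    "species_c": "TCGAACGA",
}

# Compare every pair of species once.
for (name1, seq1), (name2, seq2) in combinations(toy_sequences.items(), 2):
    print(f"{name1} vs {name2}: {percent_identity(seq1, seq2):.1f}% identical")
```

Pairs with higher identity are, loosely speaking, candidates for sitting closer together on a “family tree,” which is the intuition behind phylogenetic tree construction.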