The completion of the sequencing of all 3 billion base pairs of the human genome has propelled the field of bioinformatics to prominence. A blend of molecular biology and computer science, this area of inquiry is both exciting and essential. If you’d like to explore what’s involved, this article provides the basic information and answers some salient questions.
Exploring the Double Helix
As you might imagine, the volume of data derived from the Human Genome Project was immense. Handling it required both new methods of data storage and new ways of accessing and exploring that data. Even this may be too simple an understanding of informatics as applied to biological data. To shed light on what is involved in acquiring and collating such massive data sets, Luscombe, Greenbaum, and Gerstein define the field as:
“Analysis…[of] three types of large data sets…macromolecular structures, genome sequences, and the results of functional genomic experiments…‘relationship data’ from metabolic pathways, taxonomy trees, and protein-protein interaction networks. [It] employs…computational techniques including…database design and data mining, macromolecular geometry, phylogenetic tree construction, prediction of protein structure and function, gene finding, and expression data clustering.”
Such a diverse and intricate field requires individuals who specialize in genetics and gene sequencing. They also need to understand how to cull, group, and store the resulting data for future use and study. Bioinformatics provides the necessary structure for:
- a nuanced exploration of gene-based therapies and medical treatments
- a greater understanding of the roots of heritable conditions
- an exploration of not only our own beginnings, but also those of other species of plants and animals
Origins of Understanding
The field has grown at an unprecedented rate since the completion of the Human Genome Project and related endeavors. However, the term has a few decades of history behind it. Paulien Hogeweg coined the term in the 1970s while studying how computer science might be applied to the study of biological systems. She created algorithms and software specifically geared to examine, correlate, and group biological data, in particular data on genetic, macromolecular, and metabolic processes. A new hybrid of biology and computer science was formed.
This approach allowed researchers to address complex problems in human medicine that could be studied in no other way. Why? Simply because of the overabundance of data and the infinitesimal scale of the material being observed.
Bioinformatics brings statistical analysis and computer programming together, since the data sets in question are massive and unwieldy. They are far too large for unaided human memory and attention to analyze effectively. With this tool set, researchers can:
- analyze complex and extensive genome sequences
- better understand the synthesis and employment of peptides
- predict the structures that proteins form at the molecular level
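To make the first of these tasks concrete, here is a minimal Python sketch of two very basic sequence statistics: GC content (the fraction of bases that are G or C) and k-mer counts (how often each short substring occurs). The function names and the toy sequence are illustrative assumptions, not part of any particular bioinformatics package, and real genome analysis relies on far more sophisticated tools.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping substring of length k in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Made-up toy sequence, used only to show how the functions are called.
toy_sequence = "ATGCGCGATATGC"
print(gc_content(toy_sequence))
print(kmer_counts(toy_sequence, 3).most_common(2))
```

Even simple summaries like these become impractical to compute by hand once a sequence runs to millions or billions of bases, which is why the work falls to software.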
The field also allows scientists to construct intergenomic maps, which permit comparison and exploration of variation between species. This is perhaps one of its more novel and potentially beneficial purposes:
- to understand how humans acquired specific sequences
- to construct a “family tree” of life on Earth
- to explore potential gene therapies not previously imagined
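As a rough illustration of how comparison between species can feed a “family tree,” the Python sketch below computes a simple distance between aligned sequences (the fraction of positions that differ) and finds the most similar pair. The species labels and sequences are invented for the example, and the helper names are assumptions. Actual phylogenetic tree construction uses proper alignment and evolutionary models, so treat this only as a first step toward the idea.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(a: str, b: str) -> float:
    """Fraction of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def closest_pair(seqs: Dict[str, str]) -> Tuple[str, str]:
    """Return the two labels whose sequences are most similar."""
    return min(
        combinations(seqs, 2),
        key=lambda pair: p_distance(seqs[pair[0]], seqs[pair[1]]),
    )


# Hypothetical, made-up sequences; not real genomic data.
toy = {
    "species_a": "ACGTACGTAC",
    "species_b": "ACGTACGTTC",
    "species_c": "TCGAACGGTC",
}
print(closest_pair(toy))
```

Repeatedly grouping the closest pair and recomputing distances is the intuition behind distance-based tree-building methods, although this sketch stops at the first grouping.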
As such, the field of bioinformatics is crucial to a deeper understanding of the building blocks of our very lives. It serves as a major component in the fight against debilitating genetic disorders previously considered fatal.