The Most Influential Scientists in the Development of Medical Informatics (13): Margaret Belle Dayhoff
Margaret Belle (Oakley) Dayhoff (1925-1983) was an American physical chemist and a pioneer in the field of bioinformatics. She dedicated her career to applying evolving computational technologies to advances in biology and medicine, most notably by creating protein and nucleic acid databases and the tools to interrogate them. Dayhoff graduated from New York University in 1945 with a bachelor of arts and earned a PhD in quantum chemistry at Columbia University in 1948. She was a research assistant at the Rockefeller Institute from 1948 to 1951 and served as associate director of the National Biomedical Research Foundation in Washington, DC, from 1960.
Dr. Dayhoff was widely known in the scientific community for establishing a large computer database of protein structures and as the author of the Atlas of Protein Sequence and Structure, a multivolume reference work. The Atlas, first published in 1965, set out to collect all known protein sequences and was subsequently republished in several editions. It led to the Protein Information Resource database of protein sequences, which was developed by her group. That database and the parallel effort by Walter Goad, which led to the GenBank database of nucleic acid sequences, are the twin origins of the modern databases of molecular sequences. The Atlas was organized by gene families, and she is regarded as a pioneer in their recognition.
Her approach to proteins was always determinedly evolutionary, and her work is used in genetic engineering and medical research. As a noted archivist of proteins, Dr. Dayhoff contributed to the understanding of the evolutionary process by developing evolutionary “trees” based on correlations between proteins and living organisms. She and her staff made several discoveries, including one indicating that certain genes normally found in most body tissue cells are closely related to genes found in many cancer cells.
She did postdoctoral studies at the Rockefeller Institute (now Rockefeller University) and the University of Maryland, and joined the newly established National Biomedical Research Foundation in 1959. She was the first woman to hold office in the Biophysical Society. She originated one of the first substitution matrices, the point accepted mutation (PAM) matrix. She also developed the one-letter code used for amino acids, an attempt to reduce the size of the data files describing amino acid sequences in an era of punch-card computing.
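The saving that the one-letter code offers can be illustrated with a minimal Python sketch. The table below lists the standard one-letter symbols for the 20 common amino acids. The function name `to_one_letter`, the hyphen-separated input format, and the sample peptide are assumptions made for this illustration and do not come from Dayhoff's own software.

```python
# Standard one-letter symbols for the 20 common amino acids,
# keyed by their three-letter abbreviations.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq: str, sep: str = "-") -> str:
    """Convert a separator-delimited three-letter sequence to one-letter code.

    Hypothetical helper for illustration; raises KeyError on unknown residues.
    """
    residues = three_letter_seq.split(sep)
    return "".join(THREE_TO_ONE[r.capitalize()] for r in residues)


if __name__ == "__main__":
    peptide = "Met-Lys-Trp-Val-Thr-Phe"  # arbitrary example sequence
    compact = to_one_letter(peptide)
    print(compact)
    # Compare the number of characters needed by each notation.
    print(len(peptide), "characters versus", len(compact))
```

Each residue takes one character instead of three plus a separator, which is the kind of reduction that mattered when sequences were stored on punch cards.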