Structural Biology Symposium

Summer 99

Protein structure in the 21st Century
Helen M. Berman (Rutgers University) described how the new "Research Collaboratory for Structural Bioinformatics" intends to bring protein structure data storage into the 21st century. The original Protein Data Bank (PDB) was established in 1971 at Brookhaven National Labs by those who realized that data could no longer be casually passed from one crystallographer to another. The PDB has since grown from 7 to over 9000 structures. With new methods for rapid and reliable data processing, the collaborating groups are committed to establishing uniform data formats, expanding the query and reporting capabilities, and providing efficient cross-links to other data. Many of these goals will not be simple to achieve; Helen noted that the mmCIF dictionary, started in 1980 and anticipated to be a short publication, was finally published in book form last year. There is also a new validation server that allows one to assess structure quality (an expanded Procheck). Helen was also eager to have a database that would contain more of the raw data used to calculate protein structures. To check out the "new" PDB, visit the website http://www.rcsb.org/.

The consortium may soon have a whole lot more structures to compile. Joel Berendzen (Los Alamos National Laboratory) presented the Structural Genomics approach to the determination of protein structures, based on high-throughput, parallel and low-cost methods. The goal of most structure determinations is to achieve very high resolution so as to answer specific questions about the function of the protein and its interaction with ligands. The goals of structural genomics, to classify proteins and obtain a more complete database of protein structure motifs, can be satisfied with less accurate structures. Of course, the major bottleneck to standardizing crystallization techniques will be high-throughput purification of the diverse proteins. Their group is part of a consortium to analyze the recently determined genome sequence of Pyrobaculum aerophilum, a microaerobic archaeon which thrives at 103 °C. They have developed a rapid screening program to identify suitably soluble proteins by fusing coding sequences to that of the green fluorescent protein. In their assay a green colony will form only when the fusion protein is soluble (i.e., properly folded). They estimate that structures for 8% of the genome can be determined rapidly if they limit their work to those proteins that can be solubly expressed at 37 °C in E. coli (of which about 45% should crystallize). Once they obtain crystals, structure determination can be made more efficient by completely automating the interpretation of the electron density map and the structural refinement.

The first product of the work was a structure for the bacterial initiation factor 5a, which is involved in initiating the translation of mRNA (the human analogue is a cofactor for HIV rev). Although there was no apparent sequence identity, the fold of the C-terminal domain was identical to that of the E. coli cold shock protein A.

 


While the current cost at Los Alamos is about $50,000 per structure, which is probably a good deal lower than that for conventionally determined structures, one can expect that full implementation of their high-throughput methods will reduce even this figure. Joel pointed out that the whole human genome project still costs less than one atomic submarine. A rational approach to structure design can prove to be the most cost-efficient way to define protein targets for new drugs and to attack pathogens more specifically.

Due respect for posters
The poster sessions were made especially lively this year thanks to the introduction of the Beckman Coulter Awards to recognize outstanding research by students and post-doctoral fellows. Jonathan W. Neidigh of the University of Washington, Seattle, received first prize ($400) for the poster "Designing a 20 residue protein". The author, in cooperation with Matthew Fesinmayer and Niels Andersen, incorporated a "Trp cage" (a hydrophobic cluster of Phe, Trp and Pro side chains) into a peptide that folds in water to the desired structure (characterized by NMR and CD). The second prize ($250) was awarded to Simon Lovell, Duke University, for the poster "Crystallographic map fitting made (a little) easier" (co-authors J. Michael Word, Jane S. Richardson and David C. Richardson), which outlined new tools, including the Clash program, that can be used to identify mistakes in crystal structures. The third prize ($150) went to Jin-Quan Luo, UTMB, for "The crystal structure of the PexB/Dps A DNA-binding and protecting E. coli protein" (co-authors Mark A. White, Deqian Liu, Robert O. Fox). The following authors received honorable mention: Mitch Mitchell (The fast, the slow and the metal: a story of the Serratia and I-PpoI endonucleases and the role of magnesium in their active sites), Maria Jezewska (Mammalian DNA repair polymerase β binds ssDNA using two different binding modes), Larisa Kosynkina (Automatic structure determination of homologous proteins from NMR spectra), M.R. Ferguson (Tight binding sequences for an SH3 domain selected by phage display), Mingli Yang (Jaz, a novel zinc finger protein having a double stranded RNA binding ability required for nucleolar localization), and Hong Pan (Probing the basis for allosteric behaviour in dihydrofolate reductase using an ensemble-based description of the native state).

Catherine H. Schein
