W0343

Data-driven Analysis and Optimization of Protein Crystal Screens. Matthew S. Kimber, Francois Vallee, Simon Houston, Alexander Necakov, Dinesh Christendat, Alexei Savchenko, Cheryl H. Arrowsmith, Masoud Vedadi, Mark Gerstein, Aled M. Edwards, Affinium Pharmaceuticals, 12th Floor, North Tower, 100 University Ave., Toronto, ON M5J 1V6 CANADA.

Protein crystallization is a major bottleneck in X-ray crystallography. Because the principles that govern protein crystallization are too poorly understood to be strongly predictive, the most common crystallization strategy is to screen a wide variety of solution conditions to identify the few that will support crystal nucleation and growth. We tested the hypothesis that more efficient crystallization strategies could be formulated by extracting useful patterns and correlations from the large datasets of crystallization trials created in structural proteomics projects. An extensive database of crystallization behavior (representing 755 different proteins purified under uniform conditions and crystallized under the widely used Jancarik and Kim screen) was populated and analyzed. Of these proteins, 45 % formed crystals. Data mining identified the conditions that crystallize the most proteins, revealed that many conditions are highly correlated in their behavior, and showed that the crystallization success rate depends markedly on the organism from which the proteins derive. Of the proteins that crystallized in a 48-condition experiment, 60 % could be crystallized in as few as 6 conditions and 94 % in 24 conditions. Considering the full range of information from crystal screening trials allows one to design screens that are maximally productive while consuming minimal resources, and also suggests further useful conditions for extending existing screens.
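
The abstract does not say how the reduced 6- and 24-condition screens were selected. As an illustration only, the selection can be framed as a set-cover problem and approximated greedily: repeatedly add the condition that crystallizes the most proteins not yet covered. The sketch below assumes this framing; the function name `greedy_screen`, the condition labels, and the toy hit table are hypothetical and are not data from the study.

```python
from typing import Dict, List, Set, Tuple


def greedy_screen(
    hits: Dict[str, Set[str]], max_conditions: int
) -> Tuple[List[str], float]:
    """Greedy set-cover sketch (an assumed approach, not the authors' method).

    hits maps a condition label to the set of proteins that crystallized in it.
    Returns the chosen conditions and the fraction of crystallizable proteins
    (those with a hit in at least one condition) that they cover.
    """
    all_proteins: Set[str] = set().union(*hits.values())
    covered: Set[str] = set()
    chosen: List[str] = []

    while len(chosen) < max_conditions and covered != all_proteins:
        # Pick the condition adding the most new proteins; sorting the labels
        # makes tie-breaking deterministic (alphabetically first label wins).
        best = max(sorted(hits), key=lambda c: len(hits[c] - covered))
        gain = hits[best] - covered
        if not gain:
            break  # no remaining condition adds a new protein
        chosen.append(best)
        covered |= gain

    fraction = len(covered) / len(all_proteins) if all_proteins else 0.0
    return chosen, fraction


# Hypothetical toy hit table: 4 conditions, 6 crystallizable proteins.
toy_hits = {
    "cond_A": {"p1", "p2", "p3"},
    "cond_B": {"p3", "p4"},
    "cond_C": {"p4", "p5", "p6"},
    "cond_D": {"p1", "p6"},
}

if __name__ == "__main__":
    print(greedy_screen(toy_hits, 1))
    print(greedy_screen(toy_hits, 2))
```

Because conditions with highly correlated behavior hit largely the same proteins, a greedy pass of this kind tends to skip redundant conditions after the first of each correlated group is chosen. Greedy selection is an approximation and is not guaranteed to find the smallest possible screen.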