W0003

Automated Error Detection and Error Correction for Protein Crystallography. John Badger, Jorg Hendle, Chuck Kissinger, Structural GenomiX, Inc., 10505 Roselle St., San Diego CA 92121.

Tests in which we have applied automated building methods to a large set of experimentally phased maps show that, with medium to high resolution data, it is usually possible to obtain 75-90% of the final structure prior to interactive model-building. In applying protein crystallography to drug discovery cycles, we are also rapidly generating large numbers of data sets for closely related co-crystal structures, which differ in the bound ligand and in the exact conformation of the surrounding residues. Together, these developments shift the emphasis and effort in protein structure determination to a 'finalization process' in which models are completed and validated.

Our previous work on reliable error detection (Badger & Hendle, Acta Cryst. D58, 284-291, 2002) has been updated and extended as a result of quality control activities on new structures entering our crystallographic database. Software for automatically refitting incorrect portions of the protein model has been developed to facilitate rapid structure completion. Trials with very inaccurate structures suggest that this methodology also increases the radius of convergence of standard refinement procedures, in some cases outperforming approaches based on simulated annealing.

Errors and inaccuracies in bound ligand conformations may be minimized using automated density fitting procedures that select the best-fitting conformer from a set of low energy conformations. Examples show that, by incorporating anomalously scattering atoms in ligands, it is often possible to definitively locate and orient small molecules in protein crystals in cases where interpretations based on normal scattering are ambiguous. Reliable generation of refinement restraints is accomplished by atom typing using fully hydrogenated models built from SMILES representations.
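
The conformer selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the observed map and the density calculated from each candidate conformer have already been sampled at the same grid points around the ligand site, and it uses a plain Pearson correlation as the fit score. The function names and the sample values are hypothetical.

```python
from math import sqrt


def correlation(observed, calculated):
    """Pearson correlation between two equal-length lists of density samples."""
    if len(observed) != len(calculated) or len(observed) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_c = sum(calculated) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(observed, calculated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_c = sum((c - mean_c) ** 2 for c in calculated)
    if var_o == 0 or var_c == 0:
        return 0.0  # flat density carries no information about the fit
    return cov / sqrt(var_o * var_c)


def select_best_conformer(observed, candidates):
    """Return (name, score) of the candidate whose density best matches the map.

    candidates maps a conformer name to its calculated density samples
    (an assumed data layout, chosen only for this illustration).
    """
    if not candidates:
        raise ValueError("no candidate conformers supplied")
    scores = {name: correlation(observed, calc) for name, calc in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Made-up density samples at five grid points around a ligand site.
observed = [0.1, 0.8, 1.2, 0.9, 0.2]
candidates = {
    "conf_a": [0.0, 0.7, 1.0, 0.8, 0.1],
    "conf_b": [1.0, 0.2, 0.1, 0.3, 0.9],
}
best, score = select_best_conformer(observed, candidates)
print(best, round(score, 3))
```

In this toy input the first candidate follows the shape of the observed density and is selected. A production procedure would also need to generate the low energy conformers and place each one in the map, which this sketch leaves out.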