W0095
Automatic Classification of Protein Crystallization Screens on 1536-well Plates.
I. Jurisica (1), C. Cumbaa (1), A. Lauricella (2), N. Fehrman (2), C. Veatch (2), R. Collins (2), J. Luft (2), G. DeTitta (2).
(1) Ontario Cancer Inst./PMH, 610 University Ave., Toronto, ON M5S2M9; (2) Hauptman-Woodward Institute, 73 High St., Buffalo, NY 14203-1196.
Utilizing high-throughput protein crystallization screening
will help to eliminate protein crystallization as a bottleneck in modern
structural biology. The challenge is systematic and automated computational
analysis of the resulting data deluge.
Our technique for automatic classification of microbatch
protein crystallization experiments on 1536-well plates addresses the analysis
problems introduced at the sub-microlitre scale, including non-uniform lighting
and irregular droplet boundaries.
Image segmentation separates the droplet from the
well, using a loopy Bayes net with a two-layered grid topology. The resulting images
are analyzed to extract a 23-element feature vector from each droplet's contents,
using the Radon transform for straight-edge features and a set of correlation
filters for microcrystalline features. Images are then classified by
linear discriminant analysis on these feature vectors. Currently, the system
automatically classifies images into crystal, clear, and precipitate
categories.
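To illustrate why the Radon transform responds to straight edges, the following is a minimal sketch and not the implementation used in this work. It assumes a tiny binary image stored as nested lists; the function names (radon_projection, straight_edge_feature), the 15-degree angle step, and the "strongest peak" feature are all hypothetical choices made for illustration. A straight line collapses into a single bin of the projection taken perpendicular to it, so the projection peak is high at that angle and low elsewhere.

```python
import math

def radon_projection(image, theta_deg):
    """Discrete Radon projection: sum intensities over lines with normal angle theta."""
    h, w = len(image), len(image[0])
    theta = math.radians(theta_deg)
    c, s = math.cos(theta), math.sin(theta)
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0           # image centre
    half = int(math.ceil(math.hypot(w, h) / 2.0))   # largest possible |rho|
    bins = [0.0] * (2 * half + 1)
    for y in range(h):
        for x in range(w):
            # Signed distance of the pixel from the centre along the normal.
            rho = (x - cx) * c + (y - cy) * s
            bins[int(round(rho)) + half] += image[y][x]
    return bins

def straight_edge_feature(image, angles=range(0, 180, 15)):
    """Return (strongest projection peak, angle in degrees at which it occurs)."""
    best_peak, best_angle = 0.0, None
    for a in angles:
        peak = max(radon_projection(image, a))
        if peak > best_peak:
            best_peak, best_angle = peak, a
    return best_peak, best_angle

# Hypothetical 9x9 test image containing one vertical edge, 7 pixels long.
image = [[0] * 9 for _ in range(9)]
for y in range(1, 8):
    image[y][6] = 1

print(straight_edge_feature(image))  # prints (7.0, 0)
```

In a full pipeline, a peak statistic of this kind would be only one of the 23 feature-vector elements; how the actual features are derived from the transform is not specified here.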
We compared the results of our automatic protein
crystallization image classification with those of a human expert on 18 plates
(27648 images). Using the human-labeled images as ground truth, our method
classifies images with 89% accuracy and a ROC score of 0.875. This result
compares well with the experimental repeatability rate assessed at
87%.
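The two reported metrics can be computed as sketched below. This is an illustration under stated assumptions: "ROC score" is taken here to mean the area under the ROC curve for crystal versus non-crystal, which the abstract does not state explicitly, and the eight labels and scores are invented toy data, not results from the study. The only figure taken from the text is the image count, 18 plates of 1536 wells.

```python
def accuracy(truth, predicted):
    """Fraction of images whose predicted class matches the human label."""
    matches = sum(1 for t, p in zip(truth, predicted) if t == p)
    return matches / len(truth)

def roc_auc(is_positive, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Consistency check on the reported data set size.
assert 18 * 1536 == 27648

# Invented toy data (hypothetical): human labels, predictions, crystal scores.
truth = ["crystal", "clear", "precipitate", "crystal",
         "clear", "precipitate", "clear", "crystal"]
predicted = ["crystal", "clear", "precipitate", "clear",
             "clear", "precipitate", "crystal", "crystal"]
crystal_score = [0.9, 0.2, 0.3, 0.4, 0.1, 0.35, 0.6, 0.8]

print(accuracy(truth, predicted))                               # prints 0.75
print(roc_auc([t == "crystal" for t in truth], crystal_score))  # 14/15, about 0.933
```

Accuracy depends on the chosen decision threshold, whereas the area under the ROC curve summarizes ranking quality across all thresholds, which is one reason to report both.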
Two findings stand out from this validated
analysis. First, the accuracy depends on the number of crystals on a given
plate. Second, false positives and false negatives follow distinct patterns.
False positives are drops with particles that look like microcrystals, or
wrinkles in the skin that resemble crystal edges. False negatives are crystals
too fine for detection or crystals without straight edges.
A characterization of these misclassifications suggests
directions for improving the method. An important new extension will integrate data
mining of historical information to increase specificity while keeping
sensitivity high.