Retro X-ray: Big data boost

  • Published: May 15, 2015
  • Author: David Bradley
  • Channels: X-ray Spectrometry

X-ray boost

New Tool Shrinks Big Data in Biology Studies at SLAC's X-ray Laser (Credit: Stanford University)

Inspired by a 1970s method, Stanford scientists have found a way to go back and improve X-ray results.

There is currently a great deal of interest in the potential of X-ray free electron lasers (XFELs) for determining macromolecular crystal structures that are difficult to solve using current synchrotron sources. But what about the collections of thousands to millions of diffraction images that were obtained with conventional technology prior to XFEL studies? Are they to go to waste simply because of limitations in the data-processing methods? Of course not. A team led by Stanford researchers has developed new software that takes on this big data problem for X-ray laser experiments. It lets crystallographers work with far fewer samples and less computing time, so that new results can be analysed rapidly and old data can be given a fresh analysis.

Plucking data

Understanding biology at the molecular level relies on gargantuan efforts to unravel the structures of proteins, nucleic acids and the other molecules of life. These biological macromolecules are complicated pieces of molecular machinery, and their analysis generates huge quantities of data that usually require big research teams to handle.

William Weis, professor and chair of Structural Biology, and Axel Brunger, professor and chair of Molecular and Cellular Physiology at Stanford, both members of the Photon Science faculty at SLAC, have with colleagues created Prime, a software tool that will be publicly available. The application allows crystallographers to analyse partial X-ray diffraction patterns and produce a more complete picture of a given structure. When applied to a whole set of data, this can reveal new structural details, the team says.

“We have reduced the required amount of diffraction data that’s needed to get a clearer picture of crystal structures and the time it takes to get a full structure of a biomolecule,” explains Brunger. “This is especially important because LCLS is in such high demand,” he adds. Fewer than one in four experimental proposals at the Linac Coherent Light Source (LCLS), the X-ray laser at SLAC, can be approved.

Many of the initial experiments carried out at LCLS consumed millions of samples in the form of microscopic crystallised biomolecules, producing vast quantities of data and demanding supercomputing power for their analysis. LCLS experiments usually have teams of dozens of collaborators from research centres across the globe behind them, with data scientists in particular demand. The team has applied Prime to earlier LCLS diffraction data and obtained better three-dimensional maps of the electron density in myoglobin, a well-known and well-studied protein. As a proof of principle, they also generated a higher-quality map of a bacterial enzyme from just 100 diffraction images plucked at random from a full data set. Details are reported in a recent issue of the journal eLife.

Prime focus

Prime, which is short for “post-refinement and merging,” could allow researchers to compress the time taken for many X-ray experiments from several days to just a few hours, if not minutes. This should considerably expand the experimental capacity of LCLS by reducing the data deluge from putatively thousands of terabytes to just gigabytes. Prime could also make many sophisticated experiments more accessible to researchers who do not have specialist expertise in data analysis and interpretation.

“Some LCLS experiments had required a tremendous amount of sample, and that was a huge limitation,” explains Weis. “It restricted a large number of experiments from even being attempted. With Prime, you don’t need as much redundant data,” Weis adds. This should prove useful for studying membrane proteins, for example. These are popular targets for new drug development but can be challenging to produce in large quantities, not least because they are notoriously reluctant to yield good-quality crystals in the first place.

The practice of reanalysing old data with new techniques has gained momentum across many fields with the increasing supply of big data and computing power. Reanalysis has been particularly popular in particle physics, where experiments can produce massive data sets and virtual “needles in the haystack,” in the form of rare particle events, can be the key to new discoveries.

Prime’s creators were inspired by a data-processing technique for diffraction data developed in the 1970s for synchrotron X-ray sources. It allowed researchers to map the structure of virus samples that were otherwise difficult to study by compiling and analysing a collection of incomplete diffraction data sets from individual crystals. Those partial data sets were compared to other data sets in order to obtain more complete data and refine the results.
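The idea of comparing incomplete data sets against each other to build a more complete one can be illustrated with a toy example. The Python sketch below is not Prime's algorithm: it is a minimal illustration, assuming each crystal yields intensities for only some reflections, each on its own unknown scale, and it ignores the partiality and geometry modelling that real post-refinement involves. All function names and numbers are hypothetical.

```python
from collections import defaultdict


def merge(datasets, scales):
    """Average the scaled intensities of every reflection seen in any data set."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for data, k in zip(datasets, scales):
        for hkl, intensity in data.items():
            sums[hkl] += k * intensity
            counts[hkl] += 1
    return {hkl: sums[hkl] / counts[hkl] for hkl in sums}


def refine_scales(datasets, reference):
    """Least-squares scale per data set: minimise sum((k * I - I_ref) ** 2)."""
    scales = []
    for data in datasets:
        shared = [hkl for hkl in data if hkl in reference]
        num = sum(data[hkl] * reference[hkl] for hkl in shared)
        den = sum(data[hkl] ** 2 for hkl in shared)
        scales.append(num / den if den > 0 else 1.0)
    return scales


def scale_and_merge(datasets, cycles=30):
    """Alternate between scaling each data set to the merged set and re-merging."""
    scales = [1.0] * len(datasets)
    merged = merge(datasets, scales)
    for _ in range(cycles):
        scales = refine_scales(datasets, merged)
        merged = merge(datasets, scales)
    return merged, scales


# Hypothetical data: two crystals, each covering only part of the reflections.
# The second crystal was recorded at half the scale of the first.
observations = [
    {(1, 0, 0): 100.0, (0, 1, 0): 50.0},
    {(0, 1, 0): 25.0, (0, 0, 1): 10.0},
]
merged, scales = scale_and_merge(observations)

# The merged set covers all three reflections, and the relative scale
# scales[1] / scales[0] approaches 2 as the cycles proceed.
print(sorted(merged), scales[1] / scales[0])
```

Only relative intensities are meaningful here, because the overall scale of the merged set is arbitrary. The overlap between data sets (the shared reflection in this example) is what lets them be placed on a common scale.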

“Even though the principal ideas were developed in the 1970s, this particular application required us to rewrite everything,” Brunger points out, because of the unique properties of LCLS. In many biomolecular crystal experiments at LCLS, for example, the crystals are tumbling randomly when hit by X-rays, rather than being individually and precisely rotated in the beam as they are in a synchrotron experiment. Brunger and Weis say that several teams have already expressed interest in reanalysing past diffraction data from LCLS experiments with Prime, which could lead to new structural insights.

In addition to the Stanford and SLAC team members, colleagues from the Howard Hughes Medical Institute at Stanford, Lawrence Berkeley National Laboratory and Janelia Research Campus worked on Prime.

Related Links

eLife 2015, online: "Enabling X-ray free electron laser crystallography for challenging biological systems from a limited number of crystals"

Article by David Bradley

The views represented in this article are solely those of the author and do not necessarily represent those of John Wiley and Sons, Ltd.

Copyright © 2017 John Wiley & Sons, Inc. All Rights Reserved