Wednesday, June 19, 2013

SLAC National Accelerator Laboratory creates big data (Why is that not surprising?)

I have visited the Stanford Linear Accelerator Center (SLAC) several times over the last few years, but I never once realized how much data this famous facility generates or what it does with that data. So I could not contain my excitement when I saw the following post. As a starting point, here is an overview from the same post of how much data SLAC currently holds.
(LCLS stands for SLAC's Linac Coherent Light Source, an X-ray laser.)

Data at SLAC Infographic

Where does all this data come from?

"With its detectors collecting information on atomic- and molecular-scale phenomena measured in quadrillionths of a second, LCLS stores data at a rate and scale comparable to experiments at the world's most powerful particle collider, the Large Hadron Collider in Europe."

How is the data consumed?

"Some teams choose to store and analyze their data at SLAC, while others transfer data over high-speed scientific computing networks. Larger teams may bring their own data experts to LCLS for experiments.

There is a push to improve the user interface to make LCLS data tools more accessible to scientists, offer more real-time data during experiments, train staff to work more closely with users on learning the data systems and continue to work toward common data standards."

Upcoming Talk

The post does not cover in depth the technology used to store and manage the data. The good news is that more details will be presented at the National User Facility Organization annual meeting, to be held June 19-21 at Lawrence Berkeley National Laboratory. SLAC's Amedeo Perazzo is scheduled to present a talk on "Data Management at the LCLS" at 1 p.m. on June 20. I am hoping that he will talk about the technology aspects.