Part of the O'Reilly News mission is to dig deeper into stories like the Large Hadron Collider (LHC) at CERN and get a more concrete sense of the technology behind the story. Everyone seems to know what the LHC is and that it is going to be switched on later this year, and many of us watched the amazing presentation by Brian Cox at TED 2008. Yet, most of the information you find about the experiment has to be distilled for consumption by the general public. To use an Anthropology term it has been fetishized. Everyone "knows" that the LHC is going to answer age-old mysteries about the structure of matter and space-time, but few have a grasp of the concrete experiments and esoteric science behind the general-audience news stories. When NPR or the network news reports on a particle accelerator it is reporting it as a quasi-religious artifact - it is awe-inspiring "magic". We wanted to try to cover this story from a technology perspective, make it more concrete for a technical audience, and, in doing so, uncover some of the interesting stories other news outlets might have missed. Did you know that the main analysis package used at CERN is freely available, more than 10 years old, and covered by an LGPL license? Do you know how many CPUs make 1 MSI2K? What do you do when your experiment generates 2 GB every ten seconds?
1. The Large Hadron Collider: Computing on a Massive Scale
CERN has already demonstrated an ability to dramatically affect computing - the World Wide Web was created by Tim Berners-Lee (and Robert Cailliau) to support the documentation required for CERN operations. As you'll see in this article, the data processing requirements of the ATLAS and CMS experiments at CERN's LHC push the envelope of modern day computing and force scientists and engineers to create new software and hardward to address the unique requirements at the leading edge of science. There's an availability and network engineering challenge that dwarfs anything you'll ever work on, and there are people working on systems on a scale familiar only to people who happen to work at Google (or the secret caverns of the National Security Agency). There are, no doubt, other unintended consequences of the systems which are about to be turn on as this, the largest scientific experiment in history is turned on.
When the LHC is turned on, it will be more than just a 27-km wide particle accelerator buried 100m deep in Geneva colliding protons. When the LHC is running it will be colliding millions of protons per second, and all of this data will need to be captured and processed by a world-wide grid of computing resources. Correction, by a massively awe-inspiring and mind-numbingly vast array of computing resources grouped into various tiers so large that it is measured in units like Petabytes and MSI2K.
1.1 Tier-0: Geneva
The detectors in the Compact Muon Solenoid (CMS) are sensitive enough to capture the tiniest sub-atomic particles. The detectors will capture anywhere from 2 to 30 proton-proton interactions per event snapshot and they will be generating anywhere from 100 to 200 event snapshots per second. The detector will be creating 2 GB of data every 10 seconds stored in what is called the Tier-0 data center in Geneva. The Tier-0 data center is going to make heavy use of tape, and one slide deck from 2005 states that a Tier-0 data center needs 0.5 PB of disk storage, CPU capacity of 4.6 MSI2K, and a WAN with a capacity greater than 5 Gbps. Once the data is collected it is streamed to seven Tier-1 data centers which take on much of the responsibility for maintaining the data.
What does MSI2K stand for? "Mega SPECint 2000". SPECint 2000 is a standard measure of the power of a CPU. For an in depth explanation see Wikipedia. If we assume a 2 x 3.0 GHz Xeon CPU is 2.3 KSI2K, then it would take about 430 of those CPUs to equal 1 MSI2K. 4.6 MSI2K is going to involve thousands of CPUs dedicated to data extraction and analysis.
1.2 Tier-1: Fermilab (US), RAL (UK), GridKa, others
This raw data must then be analyzed to identify different particles and "jets" (collections of particles associated with interactions). After the raw data is analyzed and reconstructed it is then archived in Tier-1 data centers which are distributed throughout the world (such as Fermilab in Chicago). CMS Twiki Page on Tier-1 Data Centers says that the annual requirements for Tier-1 data center are 2.2 Petabytes of storage (yes, Petabytes) and each Tier-1 data center needs to be able about to handle 200 MB/s from Tier-0 (Geneva) which works out to something like a 2.5 Gbps dedicated line used only for LHC experimental data (some documents suggest that a Tier-1 data center needs > 10 Gbps as it also has to support connections to multiple Tier-2 data centers). A Tier-1 data center also needs to dedicate about 2.5 MSI2K (~1000 high-end CPUs) to the data analysis and extraction computing effort and maintain 1.2 Petabytes of disk storage and 2.8 Petabytes of tape storage. It looks like Tier-1 data centers are going to act as the archive and central collaboration hubs for an even larger number of Tier-2 data centers.
2.5 Gbps across the Atlantic? I can't even get Comcast to come fix my broken cable modem. How's this going to work? There is a project called DataTAG which aims to create an advanced, high-performance data link for research between the EU and the US. Participating organizations are laboratories, universities, and networks like Internet2 which already offer 10 Gbps network connections to research universities and organizations.
1.3 Tier-2 Data Centers
According to a recent newsletter from Fermilab there are over one hundred Tier-2 data centers. When you finally hear about some huge breakthrough in particle physics it will be because someone ran an analysis at a Tier-2 data center that analyzed millions (or billions) of particle interactions and identified some events that fit a theory or a model. A Tier-2 data center needs at least a couple hundred Terabytes, just shy of 1 MSI2K, and something like 500 Mbps sustained to support operations.
1.4 Most Distributed Scientific Computing System Ever
When you hear that they've finally flipped the switch, you'll have an idea as to the heavy computing that is going on every single second. This isn't just a 27-km ring in Geneva smashing protons together, this is the most complex scientific computing system to date. For more information about the CMS Computing Model see CMS Computing Model on the CMS Twiki.
2. What's in this Data?
We've discussed the architecture and organization of the computing resources, what about the data that is being stored and analyzed. For clues about the data format and storage medium we can look to the web to provide us with clues. I found the following talk titled CMS 'AOD' Model Presentation from March 2007. In this talk, Lista disusses the CMS Event Data Model (EDM) which talks about accessing data from a CMS Event. In this presentation, you'll see some technical specifics. On slide four, you'll see the statement Events are written using POOL with ROOT as underlying technology.
It appears that POOL and ROOT are two custom projects for the CMS (Compact Muon Solenoid) project at the LHC. It also looks like many of these projects are open source and freely available.
2.1 Tracking Down POOL and ROOT
A quick Google search for "LHC POOL ROOT" will bring up various references one of which is a paper published in IEEE Transactions on Nuclear Science. Typical of most LHC-related papers in peer reviewed journals, this paper has greater than ten authors. The Chytracek, et al. paper is entitled "POOL Development Status and Production Experience", and this three-year old paper has the following abstract:
The pool of persistent objects for LHC (POOL) project, part of the large Hadron collider (LHC) computing grid (LCG), is now entering its third year of active development. POOL provides the baseline persistency framework for three LHC experiments. It is based on a strict component model, insulating experiment software from a variety of storage technologies. This paper gives a brief overview of the POOL architecture, its main design principles and the experience gained with integration into LHC experiment frameworks. It also presents recent developments in the POOL works areas of relational database abstraction and object storage into relational database management systems (RDBMS) systems.
From the POOL project's site, you can retrieve an earlier version of the same paper. Which contains the following excerpt:
The main development efforts of the POOL team are currently focused towards the definition and implementation of a technologically neutral mechanism for accessing relational databases. One of the ultimate aims of this new development line is the extension of the object storage capabilities of POOL in order to be able to use relational database technologies, complementing the usage of the existing object streaming technology.
2.2 POOL: ORM for Experimental Data and Analysis
POOL project's homepage at CERN states that POOL stands for "Pool of Persistent Objects for LHC", and it appears to be a way to serialize/deserialize objects to various relational databases. It appears to be analogous to something like Hibernate or ActiveRecord, and it has been integrated with a few of the experiments at CERN including CMS and ATLAS two major detectors involved in the LHC. Pool is written in C++.
Browsing the CVS repository for POOL you'll see that there are "catalogs" for MySQL, Oracle, PyFile, XML. There is some example code: TinyEventWriter.cpp looks like a contrived example, but it does seem to be something of a model for systems that need to record the output of a calorimeter.
From the CMS Computing Model which was written in 2004, and from some PDF slides from ROOT project's homepage at CERN says that ROOT is "An Object-Oriented Data Analysis Framework", and I don't think you could get any more vauge and non-specific than a description like this. It is covered under an LGPL v2.1 license. If you take a look at the Introduction from the ROOT user's guide, you will find the following (direct quote) "ROOT is an object-oriented framework aimed at solving the data analysis challenges of high-energy physics.". You will also find this introduction of ROOT as a decade old piece of open-source software designed to address the challenges of Terabyte-scale data analysis developed in a "Bazaar style".
In the mid 1990's, René Brun and Fons Rademakers had many years of experience developing interactive tools and simulation packages. They had lead successful projects such as PAW, PIAF, and GEANT, and they knew the twenty-year-old FORTRAN libraries had reached their limits. Although still very popular, these tools could not scale up to the challenges offered by the Large Hadron Collider, where the data is a few orders of magnitude larger than anything seen before.
...ROOT was developed in the context of the NA49 experiment at CERN. NA49 has generated an impressive
amount of data, around 10 Terabytes per run. This rate provided the ideal environment to develop and test the next generation data analysis...
One cannot mention ROOT without mentioning CINT, its C++ interpreter. CINT was created by Masa Goto in
Japan. It is an independent product, which ROOT is using for the command line and script processor.
ROOT was, and still is, developed in the "Bazaar style", a term from the book "The Cathedral and the Bazaar" by Eric S. Raymond. It means a liberal, informal development style that heavily relies on the diverse and deep talent of the user community. The result is that physicists developed ROOT for themselves; this made it specific, appropriate, useful, and over time refined and very powerful. The development of ROOT is a continuous conversation between users and developers with the line between the two blurring at times and the users becoming co-developers.
When it comes to storing and mining large amount of data, physics plows the way with its Terabytes, but other fields and industry follow close behind as they acquiring more and more data over time. They are ready to use the true and tested technologies physics has invented. In this way, other fields and industries have found ROOT useful and they have started to use it also.
In summary, ROOT appears to be an analysis framework specific to High Energy Physics which allows physicists to write simple macros to do things like Draw a Feynman Diagram (What's a Feynman diagram? See Wikipedia.) It can also be used to write an analysis program that will "search for invisible Higgs boson". The syntax of a ROOT macro is C++, and it appears that you can run ROOT as an interactive shell or you can execute macros.
2.4 The CMS Wiki: Data Formats
The CMS Twiki instance has a wealth of information about the experiment and links to every possible piece of information you'd want to know about this experiment. There is a page that contains information and an overall description of the different formats of data available: Data Formats Work Book. From this page, the DAQ-RAW data is processed into RECO data and AOD data which contain data about specifc particles. From my brief interview with Brian Cox (see next article), there are a number of analysis packages written in both C++ and Fortran which need to be applied to this RAW and RECO data to identify particles and jets that result from the interaction of particles.
Once the data has been analyzed, it is archived and then researchers use the resources of the massive, world-wide computing grid associated with the LHC to identify trends and candidate interactions.
While the majority of people reading this article are not going to go out and start running KTJet++ on a multi-Terabyte collection of experimental data, you do have a better general sense of the amount of computing power that sits behind an experiment like the Large Hadron Collider. I've come away with the impression that, while the technology behind the detector and the collider itself is important, computing plays a central role to the success of the experiment. In the interview that follows this article, Brian Cox compares the LHC to a CPU wth a clock frequency of 40 MHz. Given what we've learned in this article, I'd call the collider the CPU of a massively collaborative, world-wide grid computing infrastructure.
Tim: I'm going to go on record right now and predict that this thing is going to tell us that we've known the answer all along. 42.
PHOTO CREDIT: The picture of the Compact Muon Solenoid is a picture from Flickr covered under the Creative Commons Attribution, No Derivative 2.0 license. It was taken by the user cyclequark, a very talented photographer and owner of a very interesting technology blog.