High energy physics experiment may benefit car, aircraft navigation
Ten or 15 years from now, if auto manufacturers begin offering guidance systems that allow your car to drive you automatically to and from work, you may have an ambitious new computer science project at the Fermi National Accelerator Laboratory in Batavia, Ill., to thank. A research group of physicists, computer scientists and electrical engineers from Vanderbilt, Syracuse University, the University of Pittsburgh, Fermilab and the University of Illinois, Urbana-Champaign has just received a grant of $4.98 million to develop an advanced computer system capable of scanning terabytes (thousands of billions of bytes) of information produced by the detector in a new high energy physics experiment, called BTeV. Not only must the system identify the exceedingly rare interactions that interest the physicists, but it must also be exceptionally reliable and easy to maintain and upgrade.
The five-year grant is part of a $156 million program in information technology research announced by the National Science Foundation. According to NSF, the purpose of the awards is "to preserve America's position as the world leader of computer science and its applications." Agency managers specifically selected projects that they felt could have commercial applications. Possible commercial applications for a system that is fast, highly reliable and easy to maintain and upgrade include autonomous vehicle navigation, air traffic control systems, global weather monitoring and disaster early warning systems, satellite-based surveillance, highly available Internet services, computer vision systems, and turbine engine and rocket motor monitoring.
BTeV will use Fermilab's Tevatron, the world's highest-energy particle accelerator and collider, to produce collisions between counter-rotating beams of protons and antiprotons traveling at nearly the speed of light. When two of these particles collide they produce a shower of subatomic particles. Among these daughter particles are a small number of B-particles that the physicists want to study. They are interested in B-particles because they display a subtle property, called charge-parity violation, that may explain why the visible universe is filled with normal matter but appears to contain no antimatter, its electrical mirror image, even though current theory predicts that matter and antimatter should have been created in equal amounts by the Big Bang.
During the 10-year experiment, the Tevatron will produce about 15 million collisions per second. Scientists are building an elaborate array of detectors that surround the interaction point. The array will allow them to determine the trajectories, energies and identities of the daughter particles produced in each interaction, information that allows them to reconstruct what happened during the subatomic collision.
"Unfortunately, out of these billions upon billions of interactions, only an exceedingly small fraction contain interesting physics," said Paul Sheldon. He is an associate professor of physics at Vanderbilt and head of the project to solve the technical problems presented by the "trigger system" that will be responsible for automatically identifying and recording potentially interesting subatomic events. In previous particle physics experiments, physicists have used relatively crude "triggers," but the volume and rate of data that the BTeV detector will produce render previous approaches unworkable. If the scientists were to try to record all the data coming from the detector, it would fill 15 of the largest commercially available hard drives every second.
Because of the intimidating volume of information involved, the physicists turned to computer scientists and electrical engineers to help them design the system. "High energy physicists believe that they can solve any problem themselves, so this is one of the few times that we have turned to colleagues in computer science and electrical engineering for help," Sheldon said. According to Ted Bapty, a research assistant professor at Vanderbilt's Institute for Software Integrated Systems (ISIS) who is heading up the design and modeling aspects of the effort, "the sheer computational horsepower that the physicists require makes this a great engineering problem."
The detection system will be made up of three layers of hardware: millions of individual sensors, connected to a few thousand specialized processors called DSPs, similar to those used in cell phones, connected in turn to a parallel supercomputer constructed out of 5,000 to 10,000 commodity PCs. By comparison, ISIS recently completed an adaptive computing project for the Defense Advanced Research Projects Agency that involved creating a system with 10 to 100 processors.
"To create this trigger system," said Bapty, "we must come up with new solutions to a number of general problems in the fields of computer science and engineering, and we must implement these theoretical approaches in tools that can be applied to building real systems."
First, the system must be capable of processing the flood of detector data as fast as it comes in. This translates into enough processing power to handle a trillion instructions per second. The only computers capable of such throughput achieve it by distributing the data to banks of processors that operate in parallel. The trigger system software also must tie together a mixture of off-the-shelf processors and specialty chips that are connected by a high-speed network, which adds a further degree of complexity.
The physicists also want a system that is ultra-reliable. They are spending millions of dollars to investigate B-particles, and they don't want to miss a single one of the rare events that they are looking for because the detection system is malfunctioning. The traditional approach to getting this degree of reliability from a computer system is triple redundancy. In applications like the control of the space shuttle, where reliability is paramount, designers have put in three identical computer systems that process the same data and then "vote" on what action should be taken. The system acts when two or more of the computers agree. If one of the computers begins disagreeing with the others too often, it is considered defective and is replaced.
But this approach was far too expensive for the physicists' budget. Instead, the system designers plan to include about 25 percent excess capacity distributed throughout the system. Under normal conditions, each processor will be operating at only 75 percent of its full capacity. The designers will also make the system dynamically reconfigurable.
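The two reliability strategies contrasted above can be illustrated with a short sketch. It is not the BTeV design: the function names are hypothetical, the voting function simply returns whatever a strict majority of replicas report, and the capacity function assumes that load is uniform and can be spread evenly across surviving processors, which a real trigger system would not guarantee.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value a strict majority of replicas agree on, else None.

    With three replicas this is the "two or more agree" rule.
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count * 2 > len(outputs) else None


def max_tolerable_failures(n_processors: int, utilization: float) -> int:
    """Largest number of failed processors whose work the survivors can absorb.

    Assumption (illustrative only): every processor carries the same load,
    and that load can be redistributed evenly. Total work is
    n_processors * utilization, and survivors can each run at 100 percent,
    so we need n_processors * utilization <= n_processors - failed.
    """
    if n_processors <= 0:
        raise ValueError("n_processors must be positive")
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    # Small epsilon guards against floating-point round-off just below an integer.
    return int(n_processors * (1.0 - utilization) + 1e-9)


def utilization_after_failures(n_processors: int, utilization: float, failed: int) -> float:
    """Per-processor utilization once the failed units' work is spread evenly."""
    survivors = n_processors - failed
    if survivors <= 0:
        raise ValueError("no surviving processors")
    return n_processors * utilization / survivors
```

Under these simplifying assumptions, a farm of 5,000 PCs running at 75 percent could in principle lose up to 1,250 machines before the survivors were saturated, whereas triple redundancy would require three complete copies of the farm.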
When a bank of processors or a network connection fails, the system will automatically sense that failure and reroute the data so that the processing continues. Because of the size and complexity of the system, physicists expect failures and faults to occur daily. Due to the high data rate, time is of the essence when responding to problems, so the system must be "self aware" enough to sense such failures and respond as quickly and with as little human intervention as possible. In technical terms, the system must be fault tolerant and fault adaptive.
Another requirement is that the system be easily upgradeable. If, after the experiment has been running for five years, the physicists want to replace some of the PCs with newer and more powerful models, they should be able to do so without rewriting thousands of lines of computer code. Similarly, if they want to swap one kind of detector panel for another, they should be able to do so with relative ease. In technical terms, the system must be evolvable.
"Currently, no tools exist for building large, reliable, high-performance systems," said Bapty. "So the first thing we must do is create the proper tools. It is an extremely challenging and interesting problem and, once we have done it, the tools can be applied to many scientific and commercial problems."
If the researchers can demonstrate the ultra-high speed, performance and reliability required to count and analyze billions upon billions of high energy particles, then autonomous control systems for automobiles are just one of a number of possible commercial applications. Such computer systems might be used for everything from improving the speed and reliability of Internet traffic at high-demand sites to increasing the productivity of factories. They also might find application in monitoring systems for critical resources, such as water supplies, the researchers speculate.
Besides Bapty and Sheldon, the other principal investigators on the grant are Jae Oh, assistant professor of computer science at Syracuse University; Ruth Pordes, associate head of the computing division at Fermilab; and Mike Haney, a research engineer in the department of physics at the University of Illinois, Urbana-Champaign.