EventIndex20131212


ABSTRACT


The ATLAS experiment at the LHC collected huge amounts of data during
its first data-taking run. These data are mostly records of particle
interactions and the response they produce in the detectors. Each such
record is an “Event”, and events come both from real collisions and from
simulation. Events are processed in several stages on the Grid
distributed computing network, and the results of this “data production”
are stored in computing centers around the world. These data,
corresponding to billions of events, have to be cataloged in several
different ways to serve the many use cases and search criteria of the
various physics analysis groups.
The current catalog, which is based on an Oracle database, has difficulty
providing the necessary functionality and cannot scale to the
dramatically increased amount of data expected in the upcoming Run-II at
the LHC. A new EventIndex system is therefore being created, based on
modern NoSQL technologies that scale naturally and provide good
performance, robustness and ease of use. It will be implemented in the
Hadoop environment, which builds on designs published by Google and is
currently used by companies dealing with the largest data volumes, such
as Yahoo and Amazon.
       The EventIndex will contain only the basic information needed to select
events for analysis: the event ID, online trigger information and
references to all files containing the event. An event record will be
created when RAW data are recorded from the detector or HITS are produced
by Monte Carlo generation, and new information will be added with every
processing cycle, so that the whole history of the event can be tracked.
The system has to be ready by mid-2014 so that it can pass tests at the
end of next year and be ready for the beginning of Run-II in 2015.
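
As a rough illustration of the record described above, the following Python
sketch models one index entry in memory. It is not the EventIndex schema: the
class name, the field names, the (run number, event number) form of the event
ID, the file names and the "DERIVED" stage label are all assumptions made for
this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class EventRecord:
    """Hypothetical index entry; all names here are illustrative only."""

    # Assumed form of the event ID: (run number, event number).
    event_id: Tuple[int, int]
    # Online trigger information, kept as plain labels in this sketch.
    trigger: List[str] = field(default_factory=list)
    # Processing stage (e.g. "RAW") -> files that contain this event.
    file_refs: Dict[str, List[str]] = field(default_factory=dict)

    def add_reference(self, stage: str, file_name: str) -> None:
        """Note that a processing cycle wrote this event to file_name."""
        files = self.file_refs.setdefault(stage, [])
        if file_name not in files:
            files.append(file_name)

    def history(self) -> List[str]:
        """Return the stages in the order they were first recorded."""
        return list(self.file_refs)


# The record is created when the RAW data are written ...
record = EventRecord(event_id=(1, 42), trigger=["example_trigger"])
record.add_reference("RAW", "raw_file_0001")
# ... and extended by a later processing cycle (placeholder stage name).
record.add_reference("DERIVED", "derived_file_0007")
print(record.history())
```

Keeping only identifiers and file references, rather than the event payload,
is what keeps each entry small enough to index billions of events.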
     The EventIndex project is divided into four major tasks: core
architecture, data collection and storage, query services, and
functional tests and operation. At present the third task, query
services, is the one most in need of manpower. It does not require deep
physics knowledge, so it would make a good project for informatics
students, for example.