IceCube
IceCube: Cracking the Cosmic Code
Server Room In the Dark

Data Storage

The IceCube project generates a large amount of experimental data even in its early stages of completion, and the volume increases dramatically as more detector strings are deployed. Simulated data is generated in parallel for testing data analysis software and can be more extensive than the experimental data. The large total quantity makes storage, access, and backup challenging.

Disk Storage

Two models of disk arrays form the primary storage for the project: the NexSan SATABeast and the Apple Xserve RAID.

Each SATABeast contains 42 SATA hard drives of 500 GB each, divided into eight RAID 5 arrays of five disks each, with the remaining two drives kept as spares that are used automatically if any disk fails. Total capacity for each disk enclosure (4U in height) configured this way is 14.5 TB (14500 GB).

RAID Arrays

Each Xserve RAID contains 14 ATA hard drives of 500 GB each, divided into two RAID 5 arrays of seven disks each. Total capacity for each enclosure (3U in height) is 5.5 TB (5500 GB).
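The capacities quoted for the two enclosures can be roughly reproduced from the disk counts. The sketch below assumes that each RAID 5 array gives up one disk's worth of space to parity, that the spare drives contribute no capacity, and that the quoted "TB" figures are in binary units (2^40 bytes) while the drives are sold in decimal gigabytes. The last point is an interpretation, not something the text states, and the function names are made up for this example.

```python
def raid5_usable_gb(disks_per_array: int, disk_gb: float) -> float:
    """Usable decimal GB of one RAID 5 array (one disk's worth holds parity)."""
    if disks_per_array < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (disks_per_array - 1) * disk_gb


def enclosure_usable_tib(arrays: int, disks_per_array: int, disk_gb: float = 500.0) -> float:
    """Usable capacity of an enclosure in binary terabytes (2**40 bytes).

    Assumption: drive sizes are decimal (1 GB = 10**9 bytes) and spares add nothing.
    """
    total_bytes = arrays * raid5_usable_gb(disks_per_array, disk_gb) * 10**9
    return total_bytes / 2**40


# SATABeast: eight arrays of five disks (two further disks are spares).
satabeast_tib = enclosure_usable_tib(arrays=8, disks_per_array=5)
# Xserve RAID: two arrays of seven disks.
xserve_tib = enclosure_usable_tib(arrays=2, disks_per_array=7)

print(f"SATABeast:   {satabeast_tib:.2f} TiB")
print(f"Xserve RAID: {xserve_tib:.2f} TiB")
```

Under these assumptions the results come out close to the 14.5 TB and 5.5 TB figures given in the text; filesystem overhead would reduce them slightly further.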

The cost per terabyte is very similar between the two models, with the Xserve RAID coming in slightly cheaper but taking up more rack space per terabyte.

As of January 2007, the data center has five SATABeasts and ten Xserve RAIDs.
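As a rough illustration, the per-enclosure figures above can be combined into a total capacity and a storage density per rack unit. This sketch uses only the quoted capacities, heights, and enclosure counts; the variable names are illustrative, and the totals ignore any filesystem overhead.

```python
# Quoted figures per enclosure: (capacity in TB, height in rack units, count).
ENCLOSURES = {
    "SATABeast": (14.5, 4, 5),
    "Xserve RAID": (5.5, 3, 10),
}

total_tb = 0.0
total_units = 0
density = {}  # TB per rack unit, by model

for model, (capacity_tb, height_u, count) in ENCLOSURES.items():
    total_tb += capacity_tb * count
    total_units += height_u * count
    density[model] = capacity_tb / height_u

print(f"Total capacity: {total_tb} TB in {total_units}U")
for model, tb_per_u in density.items():
    print(f"{model}: {tb_per_u:.2f} TB per rack unit")
```

The density comparison shows why the Xserve RAID, although shorter per enclosure, needs roughly twice the rack space for the same amount of storage.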

Disk Servers

The disk storage is managed by HP Proliant DL 385 servers (two dual-core AMD Opteron 280 2.4GHz processors each, 9GB RAM) running RedHat Enterprise Linux. The servers operate in pairs, with each able to access its partner's storage in addition to its own in case of failure.
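The pairing scheme can be pictured with a small model. This is only a sketch of the idea, not the actual failover mechanism used in the data center: the server names and the `PARTNER` table are hypothetical, and real failover is handled by the clustering software rather than by a lookup like this.

```python
from typing import Optional, Set

# Hypothetical server pairs; the names are illustrative only.
PARTNER = {"ds1": "ds2", "ds2": "ds1", "ds3": "ds4", "ds4": "ds3"}


def serving_server(owner: str, healthy: Set[str]) -> Optional[str]:
    """Return the server that serves storage normally owned by `owner`.

    The owner serves its own storage while it is healthy; otherwise its
    partner takes over. If both are down, the storage is unavailable (None).
    """
    if owner in healthy:
        return owner
    partner = PARTNER[owner]
    return partner if partner in healthy else None


# ds1 has failed, so ds2 serves both its own storage and ds1's.
print(serving_server("ds1", healthy={"ds2", "ds3", "ds4"}))
```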

Storage and Data Networks

The servers and disk storage connect via a storage network, which allows multiple servers to access the same storage in case of hardware failure or changing needs.

As of January 2007, the storage network consists of two Cisco MDS 9120 switches with twenty 2Gb links each and one Cisco MDS 9140 switch with forty 2Gb links.

Cluster and Tape Library

Each server also has three gigabit ethernet jacks for three separate conventional networks: one public network for communication with client computers, one private network for communication between the data servers, and one private network for communication with a computation cluster (which filters the experimental data, generates simulated data, and produces most of the load on the data storage).

Filesystem

To hide the complexities of the servers and disk storage from the end users and client computers, we use IBRIX clustering software to combine the multiple servers and storage into two filesystems (one for experimental data, one for simulated data). An additional DL 385 server acts as a manager, not accessing the data directly but monitoring and controlling the individual data servers.

Backup

While the disk storage does allow for some disk failure without data loss, the data is also backed up to tape for protection against other forms of loss, such as software or human error changing or deleting data. Tapes are also rotated offsite for disaster recovery.

One DL 385 server has read-only access to the storage network and backs up data to a SpectraLogic T950 tape library using the Time Navigator software by Atempo. As of October 2006, the library is configured with four SAIT tape drives and capacity for 250 tapes, though it can be expanded to sixteen drives and 950 tapes in its current enclosure, and beyond that with additional enclosures (each approximately the size of a 42U rack).

Currently the library backs up 3 TB a day to two copies (one kept in the library, one rotated offsite), which means a full backup of the data takes about a month. However, the backup software monitors and backs up any changed data on a nightly basis, which luckily is a small fraction of the total.
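The relationship between backup rate and full-backup time is simple division, sketched below. The 3 TB per day rate comes from the text; the 90 TB data volume is a hypothetical figure chosen only because it corresponds to "about a month" at that rate, and the sketch assumes 3 TB per day is the amount of source data covered each day, with both tape copies written in that time.

```python
def full_backup_days(data_tb: float, rate_tb_per_day: float = 3.0) -> float:
    """Days needed to back up `data_tb` terabytes at a steady daily rate."""
    if rate_tb_per_day <= 0:
        raise ValueError("backup rate must be positive")
    return data_tb / rate_tb_per_day


# Hypothetical data volume; the text gives the rate but not the total backed up.
hypothetical_data_tb = 90.0
print(full_backup_days(hypothetical_data_tb))  # 30.0 days at 3 TB per day
```

Because a full pass takes weeks, backing up only the changed data each night keeps the tape copies reasonably current between full backups.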