Redundant Array Of Independent Disks (RAID) - NetwaxLab

Tuesday, November 11, 2014

Redundant Array Of Independent Disks (RAID)

RAID (originally redundant array of inexpensive disks; now commonly redundant array of independent disks) is a data storage virtualization technology that combines multiple small, inexpensive disk drives into an array of disk drives which yields performance exceeding that of a Single Large Expensive Drive (SLED). Additionally, this array of drives appears to the computer as a single logical storage unit or drive.

In general, RAID implementations also improve the I/O performance of storage systems by storing data across multiple HDDs. RAID controllers bring together several physical hard disks to form virtual hard disks that are faster and more fault-tolerant than individual physical hard disks.
In most disk subsystems there is a controller between the connection ports and the hard disks. The controller can significantly increase the data availability and data access performance with the aid of a so-called RAID procedure.

The need for RAID can be summarized in two points given below. The two keywords are Redundant and Array:

  • An array of multiple disks accessed in parallel will give greater throughput than a single disk.
  • Redundant data on multiple disks provides fault tolerance.
With a single hard disk, you cannot protect yourself against the costs of a disk failure.

With multiple disks and a suitable redundancy scheme, your system can stay up and running when a disk fails.
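The two points above can be pictured with a toy model. The Python sketch below is only an illustration, not how a real RAID implementation works internally: disks are modeled as in-memory byte buffers, and the names `stripe`, `unstripe`, and `MirroredPair` are hypothetical.

```python
# Toy model: "disks" are in-memory byte strings, not real block devices.

def stripe(data: bytes, n_disks: int, chunk: int = 4) -> list[bytes]:
    """Split data into fixed-size chunks and deal them round-robin across n_disks."""
    disks = [bytearray() for _ in range(n_disks)]
    for i in range(0, len(data), chunk):
        disks[(i // chunk) % n_disks] += data[i:i + chunk]
    return [bytes(d) for d in disks]


def unstripe(disks: list[bytes], total_len: int, chunk: int = 4) -> bytes:
    """Reassemble the original data by reading chunks back in round-robin order."""
    out = bytearray()
    offsets = [0] * len(disks)
    d = 0
    while len(out) < total_len:
        piece = disks[d][offsets[d]:offsets[d] + chunk]
        if not piece:
            raise ValueError("disks hold less data than total_len")
        out += piece
        offsets[d] += chunk
        d = (d + 1) % len(disks)
    return bytes(out)


class MirroredPair:
    """Two copies of the same data; reads succeed while at least one copy survives."""

    def __init__(self, data: bytes):
        self.copies = [data, data]      # redundant data on two "disks"

    def fail(self, index: int) -> None:
        self.copies[index] = None       # simulate a disk failure

    def read(self) -> bytes:
        for copy in self.copies:
            if copy is not None:
                return copy
        raise IOError("all mirrors failed")
```

In this model, each striped disk holds only a fraction of the data, so the disks could in principle be read in parallel, while the mirrored pair keeps answering reads after `fail(0)`.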

HISTORY


The term "RAID" was invented by David Patterson, Garth A. Gibson, and Randy Katz at the University of California, Berkeley in 1987. In their June 1988 paper "A Case for Redundant Arrays of Inexpensive Disks (RAID)", presented at the SIGMOD conference, they argued that the top performing mainframe disk drives of the time could be beaten on performance by an array of the inexpensive drives that had been developed for the growing personal computer market. Although failures would rise in proportion to the number of drives, by configuring for redundancy, the reliability of an array could far exceed that of any large single drive.
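The reliability argument can be made concrete with a rough calculation. The sketch below assumes independent drive failures at a constant rate and a repair time much shorter than the drive lifetime, which is a simplification. The function names are hypothetical, and the formulas are common textbook approximations rather than figures taken from the paper.

```python
def array_mttf_no_redundancy(disk_mttf_hours: float, n_disks: int) -> float:
    """Without redundancy any single failure loses data, so the array's
    mean time to failure shrinks by a factor of n_disks."""
    return disk_mttf_hours / n_disks


def mirrored_pair_mttf(disk_mttf_hours: float, repair_hours: float) -> float:
    """Approximation for a two-disk mirror: data is lost only if the second
    disk fails while the first is still being replaced and rebuilt."""
    return disk_mttf_hours ** 2 / (2 * repair_hours)
```

Under these assumptions, adding drives without redundancy lowers the mean time to failure, whereas a mirrored pair with a short repair window lasts far longer than a single drive.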

Although the terminology was not yet in use, each of the five RAID levels named in the paper was well established in the art before the paper's publication, for example:

  • Around 1983, DEC began shipping subsystem mirrored RA8X disk drives (now known as RAID 1) as part of its HSC50 subsystem.
  • Around 1988, the Thinking Machines DataVault used error correction codes (now known as RAID 2) in an array of disk drives. A similar approach was used in the early 1960s on the IBM 353.
  • In 1977, Norman Ken Ouchi at IBM filed a patent disclosing what was subsequently named RAID 4.
  • In 1986, Clark et al. at IBM filed a patent disclosing what was subsequently named RAID 5.

Industry RAID manufacturers later tended to interpret the acronym as standing for "redundant array of independent disks".

Possible Approaches to RAID


There are two types of RAID approaches, hardware and software:

1. Software RAID


Software RAID uses host-based software to provide RAID functions. It is implemented at the operating-system level and does not use a dedicated hardware controller to manage the RAID array.

  • The MD driver in the Linux kernel is an example of a RAID solution that is completely hardware independent.
  • The Linux MD driver currently supports RAID levels 0/1/4/5 plus linear mode.
  • Under Solaris, Solstice DiskSuite and VERITAS Volume Manager offer RAID 0/1 and 5.
  • Adaptec's AAA-RAID controllers are another example. They have no RAID functionality whatsoever on the controller and depend on external drivers to provide all RAID functionality.
  • They are basically just multiple AHA2940 controllers integrated on one card. Linux detects them as AHA2940 and treats them accordingly.
  • Every OS needs its own special driver for this type of RAID solution, which is error-prone and not very compatible.
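To give a sense of the work that host-based software takes on, the sketch below computes the XOR parity used by parity-based levels such as RAID 4 and RAID 5 and rebuilds a lost block from the survivors. It is a minimal illustration and not the Linux MD driver's code; `xor_blocks` and `rebuild` are hypothetical helper names.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("blocks must be non-empty and equally sized")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving: list[bytes], parity: bytes) -> bytes:
    """Recover the single missing data block: XOR of the parity and all survivors."""
    return xor_blocks(surviving + [parity])


# Three data blocks plus one parity block, as on a four-disk array.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)

# Pretend the disk holding data[1] failed and rebuild its contents.
recovered = rebuild([data[0], data[2]], parity)
assert recovered == data[1]
```

In a software array, this XOR work runs on the host CPU for every write and every rebuild.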

2. Hardware RAID


In hardware RAID implementations, a specialized hardware controller is implemented either on the host or on the array. These implementations vary in the way the storage array interacts with the host.

Controller card RAID is a host-based hardware RAID implementation in which a specialized RAID controller is installed in the host and the HDDs are connected to it.

The hardware-based system manages the RAID subsystem independently from the host and presents only a single disk per RAID array to the host. This way the host does not have to be aware of the RAID subsystem(s).
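One way to picture "a single disk per RAID array" is as an address translation performed by the controller. The sketch below assumes a simple striped layout without parity; `locate_block` and its parameters are hypothetical and do not describe any vendor's firmware interface.

```python
def locate_block(lba: int, n_disks: int, stripe_blocks: int) -> tuple[int, int]:
    """Map a logical block address, as seen by the host, to
    (disk index, block offset on that disk) for a striped array
    with stripe_blocks blocks per stripe unit."""
    stripe_unit = lba // stripe_blocks   # which stripe unit the block falls in
    within = lba % stripe_blocks         # position inside that unit
    disk = stripe_unit % n_disks         # units are dealt round-robin across disks
    row = stripe_unit // n_disks         # full rows of units that precede it
    return disk, row * stripe_blocks + within
```

The host only ever issues requests in terms of `lba`. Which physical drive serves the request stays hidden behind the controller.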

  • The controller based hardware solution

DPT's SCSI controllers are a good example of a controller-based RAID solution.

The intelligent controller manages the RAID subsystem independently from the host. The advantage over an external SCSI-to-SCSI RAID subsystem is that the controller can span the RAID subsystem over multiple SCSI channels, which removes the limiting factor of external RAID solutions: the transfer rate over the SCSI bus.

  • The external hardware solution (SCSI-to-SCSI RAID)

An external RAID box moves all RAID handling "intelligence" into a controller that sits in the external disk subsystem. The whole subsystem is connected to the host via a normal SCSI controller and appears to the host as one or more disks.

This solution has a drawback compared to the controller-based solution: the single SCSI channel creates a bottleneck.

Newer technologies like Fibre Channel can ease this problem, especially if they allow multiple channels to be trunked into a Storage Area Network.

Four SCSI drives can already completely flood a parallel SCSI bus, since the average transfer size is around 4 KB and the command transfer overhead, which even in Ultra SCSI is still done asynchronously, takes most of the bus time.
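The bus-saturation argument can be explored with a back-of-the-envelope model. The sketch below is an assumption-laden illustration: it treats each I/O as a fixed command overhead plus a data phase, and every parameter (I/O rate, bus bandwidth, overhead) is a hypothetical input to be filled in, not a measured value.

```python
def bus_time_per_io(transfer_bytes: int, bus_bytes_per_sec: float,
                    command_overhead_sec: float) -> float:
    """Bus time one I/O occupies: fixed command overhead plus the data phase."""
    return command_overhead_sec + transfer_bytes / bus_bytes_per_sec


def bus_utilization(n_drives: int, ios_per_sec_per_drive: float,
                    transfer_bytes: int, bus_bytes_per_sec: float,
                    command_overhead_sec: float) -> float:
    """Fraction of each second the shared bus is busy; 1.0 or more means saturated."""
    per_io = bus_time_per_io(transfer_bytes, bus_bytes_per_sec, command_overhead_sec)
    return n_drives * ios_per_sec_per_drive * per_io
```

With small transfers the data phase is short, so the fixed overhead term dominates `bus_time_per_io`. A faster bus clock alone therefore does little to relieve a shared channel in this model.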

Hardware vs. Software RAID

[Figure: A RAID controller]

Just like any other application, software-based arrays occupy host system memory, consume CPU cycles and are operating system dependent. By contending with other applications that are running concurrently for host CPU cycles and memory, software-based arrays degrade overall server performance. Also, unlike hardware-based arrays, the performance of a software-based array is directly dependent on server CPU performance and load.

Except for the array functionality, hardware-based RAID schemes have very little in common with software-based implementations. Since the host CPU can execute user applications while the array adapter's processor simultaneously executes the array functions, the result is true hardware multi-tasking. Hardware arrays also do not occupy any host system memory, nor are they operating system dependent.

[Figure: RAID controller connected to physical disks]

Hardware arrays are also highly fault tolerant. Since the array logic is based in hardware, software is NOT required to boot. Some software arrays, however, will fail to boot if the boot drive in the array fails. For example, an array implemented in software can only be functional when the array software has been read from the disks and is memory-resident. What happens if the server can't load the array software because the disk that contains the fault tolerant software has failed? Software-based implementations commonly require a separate boot drive, which is NOT included in the array.

 

Standard RAID Levels

  • RAID 0
  • RAID 1
  • RAID 2
  • RAID 3
  • RAID 4
  • RAID 5
  • RAID 6