
TIP Setting up a mirror without synchronisation

RAID-1 volumes are often created to be used as a new disk, generally considered blank. The actual initial contents of the disks are therefore not very relevant, since one only needs to know that the data written after the creation of the volume, in particular the filesystem, can be accessed later.

One might therefore wonder about the point of synchronising both disks at creation time. Why care whether the contents are identical on zones of the volume that we know will only be read after we have written to them?

Fortunately, this synchronisation phase can be avoided by passing the --assume-clean option to mdadm. However, this option can lead to surprises in cases where the initial data will be read (for instance if a filesystem is already present on the physical disks), which is why it isn't enabled by default.
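
As an illustration only (reusing the /dev/sdg2 and /dev/sdh devices of the example below, and assuming they hold no data worth preserving), such a mirror could be created without the initial synchronisation along these lines:

# mdadm --create /dev/md1 --level=1 --raid-devices=2 --assume-clean /dev/sdg2 /dev/sdh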

Now let's see what happens when one of the elements of the RAID-1 array fails. mdadm, in particular its --fail option, allows simulating such a disk failure:

# mdadm /dev/md1 --fail /dev/sdh
mdadm: set /dev/sdh faulty in /dev/md1
# mdadm --detail /dev/md1
/dev/md1:
[...]
    Update Time : Thu Sep 30 15:45:50 2010
          State : active, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 1
  Spare Devices : 0

           Name : squeeze:1  (local to host squeeze)
           UUID : 20a8419b:41612750:b9171cfe:00d9a432
         Events : 35

    Number   Major   Minor   RaidDevice State
       0       8       98        0      active sync   /dev/sdg2
       1       0        0        1      removed
       2       8      112        -      faulty spare  /dev/sdh

The contents of the volume are still accessible (and, if it's mounted, the applications don't notice a thing), but data safety is no longer assured: should the sdg disk fail in turn, the data would be lost. We want to avoid that risk, so we'll replace the failed disk with a new one, sdi:

# mdadm /dev/md1 --add /dev/sdi
mdadm: added /dev/sdi
# mdadm --detail /dev/md1
/dev/md1:
[...]
   Raid Devices : 2
  Total Devices : 3
    Persistence : Superblock is persistent

    Update Time : Thu Sep 30 15:52:29 2010
          State : active, degraded, recovering
 Active Devices : 1
Working Devices : 2
 Failed Devices : 1
  Spare Devices : 1

 Rebuild Status : 45% complete

           Name : squeeze:1  (local to host squeeze)
           UUID : 20a8419b:41612750:b9171cfe:00d9a432
         Events : 53

    Number   Major   Minor   RaidDevice State
       0       8       98        0      active sync   /dev/sdg2
       3       8      128        1      spare rebuilding   /dev/sdi
       2       8      112        -      faulty spare   /dev/sdh
[...]
[...]
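
While the rebuild runs, its progress can also be followed from /proc/mdstat, which the md driver keeps up to date (a quick check, not part of the transcript above):

# cat /proc/mdstat

The degraded mirror typically appears there with a [2/1] [U_] status and, while the recovery is running, a progress indicator with an estimated finish time.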

# mdadm --detail /dev/md1
/dev/md1:
[...]
    Update Time : Thu Sep 30 15:52:35 2010
          State : active
 Active Devices : 2
Working Devices : 2
 Failed Devices : 1
  Spare Devices : 0

           Name : squeeze:1  (local to host squeeze)
           UUID : 20a8419b:41612750:b9171cfe:00d9a432
         Events : 71

    Number   Major   Minor   RaidDevice State
       0       8       98        0      active sync   /dev/sdg2
       1       8      128        1      active sync   /dev/sdi
       2       8      112        -      faulty spare   /dev/sdh

Here again, the kernel automatically triggers a reconstruction phase during which the volume, although still accessible, is in a degraded mode. Once the reconstruction is over, the RAID array is back to a normal state. One can then tell the system that the sdh disk is about to be removed from the array, so as to end up with a classical RAID mirror on two disks:

# mdadm /dev/md1 --remove /dev/sdh
mdadm: hot removed /dev/sdh from /dev/md1
# mdadm --detail /dev/md1
/dev/md1:
[...]
    Number   Major   Minor   RaidDevice State
       0       8       98        0      active sync   /dev/sdg2
       1       8      128        1      active sync   /dev/sdi

From then on, the drive can be physically removed when the server is next switched off, or even hot-removed when the hardware configuration allows hot-swap. Such configurations include some SCSI controllers, most SATA disks, and external drives operating on USB or Firewire.

12.1.1.3. Backing up the Configuration

Most of the meta-data concerning RAID volumes are saved directly on the disks that make up these arrays, so that the kernel can detect the arrays and their components and assemble them automatically when the system starts up. However, backing up this configuration is encouraged, because this detection isn't fail-proof, and it is only to be expected that it will fail precisely in sensitive circumstances.

In our example, if the sdh disk failure had been real (instead of simulated) and the system had been restarted without removing this sdh disk, the disk could have started working again after being probed during the reboot. The kernel would then have three physical elements, each claiming to contain half of the same RAID volume.

Another source of confusion arises when RAID volumes from two servers are consolidated onto one server only. If these arrays were running normally before the disks were moved, the kernel would be able to detect and reassemble the pairs properly; but if the moved disks had been aggregated into an md1 on the old server, and the new server already has an md1, one of the mirrors would be renamed.
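
A simple way to keep such a backup, assuming the /etc/mdadm/mdadm.conf file used by Debian's mdadm package, is to append the ARRAY lines produced by mdadm --detail --scan to that file, after reviewing them (the ARRAY line below is only a sketch of what the scan returns for our example volume):

# mdadm --detail --scan
ARRAY /dev/md1 metadata=1.2 name=squeeze:1 UUID=20a8419b:41612750:b9171cfe:00d9a432
# mdadm --detail --scan >> /etc/mdadm/mdadm.conf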