NOTE Why three RAID-1 volumes?
We could have set up one RAID-1 volume only, to serve as a physical volume for vg_raid. Why create three of them, then?
The rationale for the first split (md0 vs. the others) is about data safety: data written to both elements of a RAID-1 mirror are exactly the same, and it is therefore possible to bypass the RAID layer and mount one of the disks directly. In case of a kernel bug, for instance, or if the LVM metadata become corrupted, it is still possible to boot a minimal system to access critical data such as the layout of disks in the RAID and LVM volumes; the metadata can then be reconstructed and the files can be accessed again, so that the system can be brought back to its nominal state.
The rationale for the second split (md1 vs. md2) is less clear-cut, and has more to do with acknowledging that the future is uncertain. When the workstation is first assembled, the exact storage requirements are not necessarily known with perfect precision; they can also evolve over time. In our case, we can't know in advance the actual storage space requirements for video rushes and complete video clips. If one particular clip needs a very large amount of rushes, and the VG dedicated to redundant data is less than halfway full, we can re-use some of its unneeded space. We can remove one of the physical volumes, say md2, from vg_raid and either assign it to vg_bulk directly (if the expected duration of the operation is short enough that we can live with the temporary drop in performance), or undo the RAID setup on md2 and integrate its components sda6 and sdc6 into the bulk VG (which then grows by 200 GB instead of 100 GB, since the two partitions no longer mirror each other); the lv_rushes logical volume can then be grown according to requirements.
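The 200 GB versus 100 GB figures above follow from simple arithmetic, which the following minimal sketch spells out. It assumes that sda6 and sdc6 are 100 GB each (as the figures in the text imply) and ignores the small overhead of RAID and LVM metadata; the function names are purely illustrative.

```python
# Illustrative capacity arithmetic for re-assigning md2 (sizes in GB).
# Assumption: sda6 and sdc6 are 100 GB each; metadata overhead is ignored.
PARTITION_GB = 100


def raid1_capacity(member_sizes):
    """A RAID-1 mirror offers the capacity of its smallest member."""
    return min(member_sizes)


def plain_capacity(member_sizes):
    """Without mirroring, each partition contributes its full size."""
    return sum(member_sizes)


members = [PARTITION_GB, PARTITION_GB]  # sda6 and sdc6

# Option 1: md2 is moved to vg_bulk as it is, still mirrored.
gain_keeping_mirror = raid1_capacity(members)
# Option 2: the mirror is undone and both partitions join vg_bulk.
gain_breaking_mirror = plain_capacity(members)

print(f"md2 kept as a mirror: vg_bulk grows by {gain_keeping_mirror} GB")
print(f"md2 dismantled:       vg_bulk grows by {gain_breaking_mirror} GB")
```

The second option doubles the space gained, at the price of losing redundancy for whatever ends up stored on those two partitions.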
12.2. Virtualization
Virtualization is one of the major advances in computing in recent years. The term covers various abstractions and techniques for simulating virtual computers with a variable degree of independence from the actual hardware. One physical server can then host several systems working at the same time and in isolation. Applications are many, and often derive from this isolation: test environments with varying configurations for instance, or separation of hosted services across different virtual machines for security.
There are multiple virtualization solutions, each with its own pros and cons. This book will focus on Xen, LXC, and KVM, but other noteworthy implementations include the following:
QEMU is a software emulator for a full computer; performance is far from the speed one could achieve running natively, but this allows running unmodified or experimental operating systems on the emulated hardware. It also allows emulating a different hardware architecture: for instance, an i386 system can emulate an ARM computer. QEMU is free software.
→ http://www.qemu.org/
Bochs is another free virtual machine, but it only emulates the i386 architecture.
VMware is a proprietary virtual machine; being one of the oldest out there, it's also one of the most widely known. It works on principles similar to QEMU. VMware offers advanced features such as snapshotting a running virtual machine.
→ http://www.vmware.com/
VirtualBox is a virtual machine that is mostly free software (although some extra components are under a proprietary license). Although younger than VMware and restricted to the i386 and amd64 architectures, it shows promise; it already allows snapshotting, for instance. VirtualBox has been part of Debian since Lenny.
→ http://www.virtualbox.org/
12.2.1. Xen
Xen is a “paravirtualization” solution. It introduces a thin abstraction layer, called a “hypervisor”, between the hardware and the upper systems; this acts as a referee that controls access to hardware from the virtual machines. However, it only handles a few of the instructions; the rest are directly executed by the hardware on behalf of the systems. The main advantage is that performance is not degraded, and systems run close to native speed; the drawback is that the kernels of the operating systems one wishes to use on a Xen hypervisor need to be adapted to run on Xen.
Let's spend some time on terms. The hypervisor is the lowest layer, which runs directly on the hardware, even below the kernel. This hypervisor can split the rest of the software across several domains, which can be seen as so many virtual machines. One of these domains (the first one that gets started) is known as dom0, and has a special role, since only this domain can control the hypervisor and the execution of other domains. These other domains are known as domU. In other words, and from a user point of view, the dom0 matches the “host” of other virtualization systems, while a domU can be seen as a “guest”.
CULTURE Xen and the various versions of Linux
Xen was initially developed as a set of patches that lived outside the official tree and were not integrated into the Linux kernel. At the same time, several upcoming virtualization systems (including KVM) required some generic virtualization-related functions to facilitate their integration, and the Linux kernel gained this set of functions (known as the paravirt_ops or pv_ops interface). Since the Xen patches were duplicating some of the functionality of this interface, they couldn't be accepted officially.
XenSource, the company behind Xen, therefore had to port Xen to this new framework, so that the Xen patches could be merged into the official Linux kernel. That meant rewriting a lot of code, and although XenSource soon had a working version based on the paravirt_ops interface, the patches were only progressively merged into the official kernel. The merge was completed in Linux 3.0.
→ http://wiki.xensource.com/xenwiki/XenParavirtOps
Although Squeeze is based on version 2.6.32 of the Linux kernel, a version including the Xen patches from XenSource is also available in the linux-image-2.6-xen-686 and linux-image-2.6-xen-amd64 packages. This distribution-specific patching means that the available feature set depends on the distribution; discrepancies in the versions of the code, or even integration of code still under development into some distributions, also mean differences in the supported features. This problem should be greatly reduced now that Xen has been officially merged into Linux.
→ http://wiki.xen.org/xenwiki/XenKernelFeatures
Using Xen under Debian requires three components:
NOTE Architectures compatible with Xen
Xen is currently only available for the i386 and amd64 architectures. Moreover, it uses processor instructions that haven't always been provided in all i386-class computers. Note that most of the Pentium-class (or better) processors made after 2001 will work, so this restriction won't apply to very many situations.
CULTURE Xen and non-Linux kernels
Xen requires modifications to all the operating systems one wants to run on it; not all kernels have the same level of maturity in this regard. Many are fully functional, both as dom0 and domU: Linux 2.6 (as patched by Debian) and 3.0, NetBSD 4.0 and later, and OpenSolaris. Others, such as OpenBSD 4.0, FreeBSD 8 and Plan 9, only work as a domU.
However, if Xen can rely on the hardware functions dedicated to virtualization (which are only present in more recent processors), even non-modified operating systems can run as domU (including Windows).
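As a rough illustration of how one might check for these hardware functions on a Linux system, the sketch below looks for the vmx (Intel VT-x) and svm (AMD-V) flags in /proc/cpuinfo. The function name is hypothetical, and the check only reports what the running kernel exposes; it says nothing about whether a given Xen version will actually use the feature.

```python
def hardware_virtualization(cpuinfo_text):
    """Return the virtualization flags found in /proc/cpuinfo-style text.

    "vmx" denotes Intel VT-x and "svm" denotes AMD-V.
    """
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            found |= {"vmx", "svm"} & set(value.split())
    return found


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as cpuinfo:
            flags = hardware_virtualization(cpuinfo.read())
    except OSError:
        # No /proc/cpuinfo available (non-Linux system, restricted environment).
        flags = set()
    if flags:
        print("Hardware virtualization flags:", ", ".join(sorted(flags)))
    else:
        print("No vmx/svm flag found; only modified kernels could run as domU.")
```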
The hypervisor itself. Depending on the available hardware, the appropriate package will be either xen-hypervisor-4.0-i386 or xen-hypervisor-4.0-amd64.
A kernel with the appropriate patches allowing it to work on that hypervisor. In the 2.6.32 case relevant to Squeeze, the available hardware will dictate the choice among the various available xen-linux-system-2.6.32-5-xen-* packages.