|Developer(s)|Community project, supported by Parallels, Inc.|
|License|GNU GPL v2|
OpenVZ (Open Virtuozzo) is an operating system-level virtualization technology based on the Linux kernel and operating system. OpenVZ allows a physical server to run multiple isolated operating system instances, known as containers, Virtual Private Servers (VPSs), or Virtual Environments (VEs). It is similar to FreeBSD jails and Solaris Zones.
OpenVZ compared to other virtualization technologies
OpenVZ is not true machine virtualization but containerization, much like FreeBSD jails. Technologies such as VMware and Xen are more flexible in that they virtualize the entire machine and can run multiple operating systems and different kernel versions. OpenVZ uses a single patched Linux kernel and can therefore run only Linux; all containers share the host's architecture and kernel version. Because it does not carry the overhead of a true hypervisor, however, it is very fast and efficient. The disadvantage of this approach is the single kernel: all guests must function with the same kernel version that the host uses.
The advantage, in turn, is that memory allocation is soft: memory not used by one virtual environment can be used by others or for disk caching. OpenVZ uses a common file system, so each virtual environment is just a directory of files isolated using chroot; newer versions of OpenVZ also allow a container to have its own file system. A container can therefore be cloned by copying its directory tree, creating a configuration file for the new container, and starting it.
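As a concrete illustration of such file-level cloning, here is a minimal shell sketch. It assumes the default OpenVZ layout (/vz/private for container data, /etc/vz/conf for configuration files) and hypothetical container IDs 101 and 102; paths and IDs vary between installations.

```sh
# Clone stopped container 101 into a new container 102 (IDs are hypothetical).
vzctl stop 101
cp -a /vz/private/101 /vz/private/102             # copy the container's file tree
cp /etc/vz/conf/101.conf /etc/vz/conf/102.conf    # duplicate its configuration
# Give the clone its own IP address so it does not conflict with the original.
vzctl set 102 --ipdel all --ipadd 192.168.0.102 --save
vzctl start 102
```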
The OpenVZ kernel is a Linux kernel, modified to add support for OpenVZ containers. The modified kernel provides virtualization, isolation, resource management, and checkpointing. As of vzctl 4.0, OpenVZ can work with unpatched Linux 3.x kernels, with a reduced feature set.
Virtualization and isolation
Each container is a separate entity, and behaves largely as a physical server would. Each has its own:
- Process tree
  - A container sees only its own processes (starting from init). PIDs are virtualized, so that the init PID is 1, as it should be.
- Virtual network device, which allows a container to have its own IP addresses, as well as its own set of netfilter (iptables) and routing rules.
- Devices: if needed, any container can be granted access to real devices such as network interfaces, serial ports, and disk partitions.
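This isolation can be seen with the standard vzctl tool; a brief sketch, in which the container ID and the OS template name are assumptions (a matching template must already be installed on the host):

```sh
# Create and start a container, then observe its private process tree.
vzctl create 101 --ostemplate centos-6-x86_64   # template name is an assumption
vzctl set 101 --ipadd 192.168.0.101 --save      # give the container its own IP address
vzctl start 101
vzctl exec 101 ps ax    # shows only the container's own processes, with init as PID 1
```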
Resource management
OpenVZ resource management consists of four components: two-level disk quota, fair CPU scheduler, disk I/O scheduler, and user beancounters. These resources can be changed during container runtime, eliminating the need to reboot.
Two-level disk quota
Each container can have its own disk quotas, measured in terms of disk blocks and inodes (roughly, the number of files). Within the container, it is possible to use standard tools to set UNIX per-user and per-group disk quotas.
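First-level (per-container) quotas are set from the host with vzctl; a sketch, using a hypothetical container ID (each value is a barrier:limit pair, and older vzctl versions expect the sizes in kilobyte blocks rather than with G suffixes):

```sh
vzctl set 101 --diskspace 10G:11G --save          # disk blocks: soft and hard limit
vzctl set 101 --diskinodes 200000:220000 --save   # inodes: soft and hard limit
```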
Fair CPU scheduler
The CPU scheduler in OpenVZ is a two-level implementation of a fair-share scheduling strategy.
On the first level, the scheduler decides which container to give the CPU time slice to, based on per-container cpuunits values. On the second level, the standard Linux scheduler decides which process to run within that container, using standard Linux process priorities.
It is possible to set different cpuunits values for different containers; real CPU time is then distributed proportionally to these values.
Strict limits, such as 10% of total CPU time, are also possible.
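Both mechanisms are controlled through vzctl; a sketch with hypothetical container IDs and values:

```sh
vzctl set 101 --cpuunits 2000 --save   # relative share: twice the weight of container 102
vzctl set 102 --cpuunits 1000 --save
vzctl set 102 --cpulimit 10 --save     # strict cap: at most 10% of total CPU time
```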
I/O scheduler
Each container is assigned an I/O priority, and the scheduler distributes the available I/O bandwidth according to the priorities assigned. Thus no single container can saturate an I/O channel.
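The priority is set per container with vzctl; a one-line sketch, with a hypothetical container ID (priorities range from 0, lowest, to 7, highest):

```sh
vzctl set 101 --ioprio 6 --save   # give container 101 an above-default I/O priority
```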
User Beancounters
User Beancounters is a set of per-container counters, limits, and guarantees. There is a set of about 20 parameters, meant to cover all aspects of container operation and to prevent any single container from monopolizing system resources.
These resources primarily consist of memory and various in-kernel objects such as IPC shared memory segments, and network buffers. Each resource can be seen from /proc/user_beancounters and has five values associated with it: current usage, maximum usage (for the lifetime of a container), barrier, limit, and fail counter. The meaning of barrier and limit is parameter-dependent; in short, those can be thought of as a soft limit and a hard limit. If any resource hits the limit, the fail counter for it is increased. This allows the owner to detect problems by monitoring /proc/user_beancounters in the container.
| Parameter | Description |
|---|---|
| lockedpages | Memory not allowed to be swapped out (locked with the mlock() system call), in pages. |
| shmpages | Total size of shared memory (including IPC, shared anonymous mappings, and tmpfs objects) allocated by the processes of a particular VPS, in pages. |
| privvmpages | Size of private (or potentially private) memory allocated by an application. Memory that is always shared among different applications is not included in this resource parameter. |
| numfile | Number of files opened by all VPS processes. |
| numflock | Number of file locks created by all VPS processes. |
| numpty | Number of pseudo-terminals (e.g. an ssh session, screen, or xterm). |
| numsiginfo | Number of siginfo structures (essentially, this parameter limits the size of the signal delivery queue). |
| dcachesize | Total size of dentry and inode structures locked in memory. |
| physpages | Total size of RAM used by the VPS processes. This is currently an accounting-only parameter; it shows the RAM usage of the VPS. For memory pages used by several different VPSs (mappings of shared libraries, for example), only the corresponding fraction of a page is charged to each VPS. The sum of physpages usage across all VPSs corresponds to the total number of pages used in the system by all the accounted users. |
| numiptent | Number of IP packet filtering (netfilter) entries. |
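Beancounters can be inspected and adjusted at runtime; a sketch with a hypothetical container ID and values (each parameter takes a barrier:limit pair):

```sh
vzctl exec 101 cat /proc/user_beancounters        # inspect usage, limits, and fail counters
vzctl set 101 --privvmpages 262144:278528 --save  # raise barrier and limit (in 4 KiB pages)
```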
Checkpointing and live migration
A live migration and checkpointing feature was released for OpenVZ in the middle of April 2006. It makes it possible to move a container from one physical server to another without shutting it down. The process is known as checkpointing: a container is frozen and its whole state is saved to a file on disk. This file can then be transferred to another machine, where the container can be unfrozen (restored); the delay is roughly a few seconds. Because state is usually preserved completely, this pause may appear to be an ordinary computational delay.
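A sketch of manual checkpoint/restore and of one-step live migration with the standard tools; the host name, dump path, and container ID are hypothetical:

```sh
# Manual checkpoint and restore:
vzctl chkpnt 101 --dumpfile /tmp/ct101.dump   # freeze the container and dump its state
# ...copy the dump file and the container's private area to the destination, then there:
vzctl restore 101 --dumpfile /tmp/ct101.dump

# Or, in one step from the source host:
vzmigrate --online dest.example.com 101
```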
OpenVZ distinct features
The virtualization overhead observed in OpenVZ is minimal. Because each container requires so little overhead, a given physical server can host a larger number of containers, so long as their aggregate computational demand does not exceed the physically available resources.
An administrator (i.e. root) of an OpenVZ physical server (also known as a hardware node or host system) can see all the running processes and files of all the containers on the system, which makes mass management convenient. Some fixes (such as a kernel update) automatically affect all containers, while other changes can simply be "pushed" to all containers by a simple shell script.
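Such a push could look like the following sketch, which loops over all running containers on the host (the command being executed is a placeholder):

```sh
# Run a command in every running container on this host.
for CT in $(vzlist -H -o ctid); do
    vzctl exec "$CT" 'date'   # placeholder: substitute the change to be applied
done
```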
Limitations
By default, OpenVZ restricts container access to real physical devices (thus making a container hardware-independent). An OpenVZ administrator can enable container access to various real devices, such as disk drives, USB ports, PCI devices, or physical network cards.
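Such access is granted from the host with vzctl; a sketch with hypothetical device names, PCI address, and container ID (the --pci_add option exists only in newer vzctl versions):

```sh
vzctl set 101 --devnodes sdb:rw --save        # allow read/write access to /dev/sdb
vzctl set 101 --netdev_add eth1 --save        # move the physical NIC eth1 into the container
vzctl set 101 --pci_add 0000:00:1f.2 --save   # delegate a PCI device (address is hypothetical)
```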
/dev/loopN is often restricted in deployments (as loop devices use kernel threads, which might pose a security issue), which restricts the ability to mount disk images. A workaround is to use FUSE.
OpenVZ is limited to providing only those VPN technologies that are based on PPP (such as PPTP/L2TP) and TUN/TAP. IPsec has been supported inside containers since kernel 2.6.32.