1. Overview
vSphere’s live migration (vMotion) allows you to move an entire running virtual machine from one physical server to another, without downtime. The virtual machine retains its network identity and connections, ensuring a seamless migration process. The virtual machine’s active memory and precise execution state are transferred over a high-speed network (the vMotion network), allowing the virtual machine to switch from running on the source vSphere host to the destination vSphere host. This entire process can take less than two seconds on a gigabit Ethernet network, depending on the workload.
VMware vSphere® 5.1 vMotion presents a unified and integrated migration architecture that migrates live virtual machines, including their memory and storage, between vSphere hosts without any requirement for shared storage. When a virtual machine needs to be migrated between hosts that do not share storage, vMotion copies the data across the vMotion network using some of the Storage vMotion code. This shared-nothing live migration feature offers administrators significantly more simplicity and flexibility in managing and moving virtual machines across their virtual infrastructures compared to the traditional vMotion and Storage vMotion migration solutions.
Prior to vSphere 5.1, vMotion and Storage vMotion worked together to migrate the entire state of a virtual machine, including its memory and storage, across vSphere hosts. However, this live-migration solution was limited to hosts that shared a common set of datastores. In addition, migrating an entire virtual machine required two separate operations, for example a vMotion followed by a Storage vMotion, or vice versa.
2. Configuration
2.1 Enable vMotion
Enable vMotion on a VMkernel interface of the virtual switch through the vSphere Client for each host.
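In scripted environments the same step can be done through the vSphere API. The following is a minimal pyVmomi sketch, assuming a vCenter at vcenter.example.com, an ESXi host named esx01.example.com, and an existing VMkernel interface vmk1 on the vMotion network (all names and credentials are placeholders):

```python
import ssl
from pyVim.connect import SmartConnect
from pyVmomi import vim

# Connect to vCenter (hostname and credentials are placeholders).
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="secret",
                  sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()

# Locate the ESXi host object by name.
host_view = content.viewManager.CreateContainerView(content.rootFolder,
                                                    [vim.HostSystem], True)
host = next(h for h in host_view.view if h.name == "esx01.example.com")

# Mark the existing VMkernel interface vmk1 for vMotion traffic.
host.configManager.virtualNicManager.SelectVnicForNicType("vmotion", "vmk1")
```

Later sketches in this document reuse the si/content/host objects from this snippet.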
2.2 Set up shared datastore (prior to vSphere 5.1)
The entire state of a virtual machine is encapsulated by a set of files stored on shared storage such as a Fibre Channel or iSCSI Storage Area Network (SAN) or Network Attached Storage (NAS). As a cluster file system, VMFS lets multiple ESX/ESXi hosts access the same VMFS datastore concurrently. VMFS provides distributed locking for your virtual machine files, so that your virtual machines can operate safely in a SAN environment where multiple ESX/ESXi hosts share the same VMFS datastore.
Disk files for a virtual machine are not relocated when it is migrated using vMotion. Any RDMs remain as RDMs when the virtual machine is registered to another host. That is, no changes to the virtual machine itself are made.
The actual setup work amounts to LUN masking on the array side to make the devices visible to the ESX hosts, followed by a rescan of the storage adapters. vSphere then automatically discovers the datastores on the newly detected LUNs.
For RDM devices (either physical or virtual mode), make sure they are presented with the exact same LUN ID (address) to each of the hosts sharing the devices; otherwise the migration will report the following error due to mismatched VML identifiers (see VMware KB 1016210): “Virtual Disk ‘X’ is a mapped direct access LUN that is not accessible.”
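A mismatch of this kind can be spotted ahead of time by comparing the LUN number each host reports for the same device. The sketch below continues the pyVmomi session from section 2.1 and is only a rough illustration (it assumes every host exposes its config.storageDevice information):

```python
# 'content' is the ServiceInstance content from the sketch in section 2.1.
host_view = content.viewManager.CreateContainerView(content.rootFolder,
                                                    [vim.HostSystem], True)
hosts = list(host_view.view)

def lun_numbers(host):
    """Map each device's canonical name (e.g. naa.xxxx) to the LUN number this host sees."""
    key_to_name = {lun.key: lun.canonicalName
                   for lun in host.config.storageDevice.scsiLun}
    numbers = {}
    for adapter in host.config.storageDevice.scsiTopology.adapter:
        for target in adapter.target:
            for lun in target.lun:
                numbers[key_to_name[lun.scsiLun]] = lun.lun
    return numbers

per_host = {h.name: lun_numbers(h) for h in hosts}
shared = set.intersection(*(set(m) for m in per_host.values()))
for device in sorted(shared):
    ids = {name: mapping[device] for name, mapping in per_host.items()}
    if len(set(ids.values())) > 1:
        print(f"LUN ID mismatch for {device}: {ids}")
```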
3. How it works
Ref: VMware vSphere® 5.1 vMotion Architecture, Performance and Best Practices
The active memory and precise execution state of the virtual machine is rapidly transferred over a high speed network (vMotion network), allowing the virtual machine to instantaneously switch from running on the source ESX host to the destination ESX host.
3.1 Migration of Virtual Machine’s Memory
http://blogs.vmware.com/vsphere/2011/02/vmotion-whats-going-on-under-the-covers.html
3.1.1 vMotion steps (iterative memory copy approach, aka “Pre-copy”), at a high level (vSphere 4.1):
1) A shadow VM is created on the destination host.
2) Each memory page is copied from the source to the destination via the vMotion network. This is known as the preCopy.
3) Another pass is performed over the VM’s memory, copying any pages that changed during the last preCopy iteration.
4) This iterative memory copying continues until no changed pages (outstanding to-be-copied pages) remain.
5) The VM is stunned on the source and resumed on the destination.
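The following toy Python sketch is only an illustration of this pre-copy loop, not VMware code; the callbacks (copy_to_destination, dirtied_since, stun, resume) are made-up placeholders for the real machinery:

```python
import time

def precopy_migrate(all_pages, dirtied_since, copy_to_destination, stun, resume):
    """Illustrative iterative pre-copy loop.

    all_pages            -- initial set of guest page numbers
    dirtied_since(t)     -- pages the guest modified since timestamp t
    copy_to_destination  -- ships a batch of pages over the vMotion network
    """
    to_copy = set(all_pages)                       # pass 1: copy everything
    while to_copy:
        pass_start = time.time()
        copy_to_destination(to_copy)               # copy the current batch
        to_copy = set(dirtied_since(pass_start))   # next pass: only pages re-dirtied meanwhile
    stun()      # no outstanding pages remain: quiesce the VM on the source
    resume()    # and resume it on the destination host
```

Section 4 below covers what happens when this loop cannot converge because pages are dirtied faster than they can be sent.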
3.1.2 Unified vMotion of memory in vSphere 5.1: (Insight Into vMotion: Architectures, Performance, Best Practices, and Futures)
Both Traditional vMotion and vMotion in vSphere 5.1 use the same iterative copy approach (aka memory “Pre-copy”)
- Trace and copy the memory in an iterative fashion
- Copy the modified memory during subsequent iterations
vMotion in vSphere 5.1 does more than Traditional vMotion
- Needs to handle ongoing IO writes to disk
- The memory copy process works concurrently with the IO mirroring process (after the bulk copy phase)
- Both memory and disk changes are sent over the same vMotion network (a toy sketch of this follows)
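Conceptually, the unified migration multiplexes two producers, the iterative memory pre-copy and the mirror of ongoing guest disk writes, over the same vMotion connection. The toy sketch below is only an illustration of that idea, not VMware code; memory_pages, disk_writes, and transmit are made-up placeholders:

```python
import queue
import threading

def unified_send(memory_pages, disk_writes, transmit):
    """Toy illustration: memory pre-copy and disk write mirroring share one channel."""
    channel = queue.Queue()                        # stands in for the single vMotion connection
    done = object()                                # sentinel marking the end of a stream

    def memory_precopy():
        for page in memory_pages:                  # iterative memory copy
            channel.put(("mem", page))
        channel.put(done)

    def io_mirror():
        for block in disk_writes:                  # guest writes mirrored after the bulk copy
            channel.put(("disk", block))
        channel.put(done)

    threading.Thread(target=memory_precopy).start()
    threading.Thread(target=io_mirror).start()

    finished = 0
    while finished < 2:
        item = channel.get()                       # both streams interleave on one network path
        if item is done:
            finished += 1
        else:
            transmit(*item)
```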
3.1.3 Atomic switch-over:
vSphere 5.1 vMotion guarantees atomic switch-over between source and destination hosts by ensuring both memory and disk state of the virtual machine are in lock-step before switch-over, and fails back to source host and source disks in the event of any unexpected failure during disk or memory copy.
Note that, to minimize overhead, the storage network is leveraged if both the source and destination hosts have access to the destination datastore. For instance, if a virtual machine resides on a local datastore and needs to be copied to a datastore located on a SAN, vMotion will use the storage network to which the source host is connected. In essence, a Storage vMotion is used to avoid vMotion network utilization and additional host CPU cycles. Also, if both the source and destination hosts are connected to the destination datastore and the datastore is located on an array with VAAI enabled, Storage vMotion will offload the copy process to the array.
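The copy-path choice described above can be summarized as a small decision function. This is only an illustrative sketch of the behavior described in this section; the parameter names are made up:

```python
def choose_disk_copy_path(both_hosts_see_destination_datastore, vaai_enabled):
    """Which path carries the virtual disk blocks (illustrative sketch)."""
    if both_hosts_see_destination_datastore:
        if vaai_enabled:
            return "array offload (VAAI)"            # the array performs the copy itself
        return "storage network (Storage vMotion)"   # spares the vMotion network and host CPU
    return "vMotion network (shared-nothing copy)"
```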
3.2 Migration of Virtual Device State
Like traditional vMotion, vSphere 5.1 vMotion leverages vSphere’s ability to serialize the virtual device state of the virtual machine, which is generally less than 8MB in size. In some cases it can grow in excess of 128MB, but even that can be transferred very quickly over a high-speed network.
3.3 Migration of External Network Connections
The vSphere virtual networking architecture makes it very easy to preserve existing networking connections even after a virtual machine is migrated to a different physical host. Because each virtual Ethernet network interface card (vNIC) has its own MAC address that is independent of the physical NIC’s MAC address, virtual machines can be migrated between hosts and still keep their networking connections alive, as long as both the source and destination hosts are on the same subnet. The migration is done efficiently so the networking connections do not time out. After the virtual machine is migrated, the destination ESXi host sends out a RARP packet to the physical network switch, thereby ensuring that the switch updates its tables with the new switch port location of the migrated virtual machine. This migration is completely transparent to the remote clients of the virtual machine.
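For reference, the RARP announcement is an ordinary Ethernet broadcast carrying the virtual machine’s MAC address. The sketch below builds such a frame with the Python standard library purely as an illustration of the format; the MAC address is a placeholder, and actually putting it on the wire would need a raw socket and root privileges:

```python
import struct

def build_rarp_announcement(vm_mac):
    """Build a broadcast RARP frame announcing vm_mac (illustrative sketch)."""
    mac = bytes.fromhex(vm_mac.replace(":", ""))
    ether = b"\xff" * 6 + mac + struct.pack("!H", 0x8035)   # dst broadcast, src VM MAC, EtherType RARP
    rarp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 3)        # Ethernet/IPv4, 6/4 byte addrs, op 3 = reverse request
    rarp += mac + b"\x00" * 4                               # sender MAC, sender IP 0.0.0.0
    rarp += mac + b"\x00" * 4                               # target MAC, target IP 0.0.0.0
    return ether + rarp

frame = build_rarp_announcement("00:50:56:9a:bc:de")        # placeholder VMware-style MAC
print(len(frame), "bytes")                                  # 14-byte header + 28-byte payload = 42
```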
See http://frankdenneman.nl/2012/09/07/vsphere-5-1-vmotion-deepdive/ for more details.
4. Convergence issue
In most circumstances the iterative copy works very well, assuming the host is able to transmit memory pages over the vMotion network faster than the VM can dirty new pages.
However, in the rare event that the VM is dirtying memory pages faster than vMotion can send them, it is possible to get into a situation where the preCopy will not be able to converge.
When the preCopy cannot converge, vMotion needs to decide whether to fail the vMotion or to proceed with the switchover to the destination anyway. It makes this decision by estimating the time required to transmit all the remaining outstanding pages. By default, if this time is below 100 seconds, vMotion will proceed with the switchover; if it would take more than 100 seconds, the vMotion will fail (time out) with no impact on the VM.
In the event the VM passes the 100-second check, vMotion will stun the source and start running on the destination. While the destination runs, the source transmits the remaining pages to the destination using the “quick resume” capability introduced with vSphere 4.1.
The default 100 seconds can be tuned via vmotion.maxSwitchoverSeconds.
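The switchover decision can be summarized in a few lines. The following is only an illustrative sketch of the logic described above, not actual vMotion code, and the names are made up:

```python
def switchover_decision(outstanding_bytes, transmit_bytes_per_sec,
                        max_switchover_seconds=100):
    """Estimate whether the remaining pages can be sent in time (illustrative sketch).

    max_switchover_seconds mirrors the vmotion.maxSwitchoverSeconds default of 100.
    """
    estimated_seconds = outstanding_bytes / transmit_bytes_per_sec
    if estimated_seconds <= max_switchover_seconds:
        return "proceed with switchover"   # stun the source, resume on the destination
    return "fail the vMotion (timeout); the VM keeps running on the source"
```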
In summary:
- You should be able to vMotion any workload as long as it is dirtying memory pages at a rate that is less than your vMotion network transmit rate.
- vMotion will only transfer the VM to the destination if it is certain that it can complete the memory copy.
- If vMotion cannot complete the memory copy it will fail with no impact to the running VM.
4.1 Quick Resume (vSphere 4.1)
http://www.yellow-bricks.com/2011/04/13/vmotion-and-quick-resume/
Quick Resume is a feature that allows you to vMotion a virtual machine that has a high memory page change rate. When the change rate of the memory pages exceeds the capabilities of your network infrastructure, you could end up in a scenario where vMotioning the virtual machine would fail because the change rate makes a switch-over impossible. With Quick Resume this has changed.
Quick Resume enables the source virtual machine to be stunned while the destination virtual machine starts before all pages have been copied. However, as the virtual machine is already running at the destination, it could attempt to touch (read or write) a page that hasn’t been copied yet. In that case Quick Resume requests the page from the source so the guest can complete the action, while the remaining memory pages are continuously copied until all pages are migrated.
But what if the network were to fail at that point? Wouldn’t you end up with a destination virtual machine that can no longer access certain memory pages because they are “living” remotely? Just like Storage IO Control, vMotion leverages shared storage. A special file is created when Quick Resume is used, and this file is basically used as a backup buffer. If the network fails, this file allows the migration to complete. The file is typically in the order of just a couple of MBs. Besides being used as a buffer for transferring the memory pages, it also enables bi-directional communication between the two hosts, allowing the vMotion to complete as though the network hadn’t failed. Is that cool or what?
The typical question that immediately arises is whether this will impact performance. It is good to realize that without Quick Resume, vMotioning virtual machines with large, active memory would be difficult: the switch-over time could potentially be too long and lead to a temporary loss of connection with the virtual machine. Although Quick Resume will impact performance when pages that have not yet been copied are accessed, the benefit of being able to vMotion very large virtual machines with minimal impact far outweighs this temporary increase in memory access time.
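The behavior described above amounts to demand-paging the not-yet-copied pages from the source while the background copy finishes. A toy Python sketch of the idea (illustration only, not VMware code):

```python
def read_page(page_number, local_pages, remaining, fetch_from_source):
    """Quick Resume style page access on the destination (illustrative sketch).

    local_pages       -- pages already copied to the destination {page: data}
    remaining         -- page numbers still owned by the source
    fetch_from_source -- pulls one page over the vMotion network, or via the
                         shared-storage backup buffer file if the network fails
    """
    if page_number in remaining:                 # guest touched a page not copied yet
        local_pages[page_number] = fetch_from_source(page_number)
        remaining.discard(page_number)
    return local_pages[page_number]
```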
4.2 Stun During Page Send (SDPS, vSphere 5.0)
http://www.yellow-bricks.com/2011/07/20/vsphere-50-vmotion-enhancements/
Quick Resume was a vSphere 4.1 vMotion enhancement; in vSphere 5.0 it is replaced with Stun During Page Send (SDPS), also often referred to as “Slowdown During Page Send”, a feature that slows down the vCPUs of the virtual machine being vMotioned. Simply said, vMotion tracks the rate at which the guest pages are changed, or as the engineers prefer to call it, “dirtied”. The rate at which this occurs is compared to the vMotion transmission rate. If the rate at which the pages are dirtied exceeds the transmission rate, the source vCPUs will be placed in a sleep state to decrease the rate at which pages are dirtied and to allow the vMotion process to complete. It is good to know that the vCPUs will only be put to sleep for a few milliseconds at a time at most. SDPS injects frequent, tiny sleeps, disrupting the virtual machine’s workload just enough to guarantee that vMotion can keep up with the memory page change rate and complete successfully and non-disruptively. You could say that, thanks to SDPS, you can vMotion any type of workload regardless of how aggressive it is.
It is important to realize that SDPS only slows down a virtual machine in the cases where the memory page change rate would have previously caused a vMotion to fail.
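The SDPS decision itself is simple to express. The sketch below only illustrates the idea; the sampling of the rates and the sleep duration are made-up placeholders:

```python
import time

def sdps_throttle(page_dirty_rate, vmotion_transmit_rate, microsleep_ms=1):
    """If pages are dirtied faster than they can be sent, briefly sleep the vCPU (illustrative)."""
    if page_dirty_rate > vmotion_transmit_rate:
        time.sleep(microsleep_ms / 1000.0)   # a few milliseconds at most, per the text above
        return True                          # workload slowed just enough for vMotion to keep up
    return False
```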
This technology is also what enables the increase in accepted latency for long-distance vMotion. Pre-vSphere 5.0, the maximum supported latency for vMotion was 5ms. As you can imagine, this restricted many customers from enabling cross-site clusters. As of vSphere 5.0, the maximum supported latency has been doubled to 10ms for environments using Enterprise Plus. This should allow more customers to enable DRS between sites when all the required infrastructure components, such as shared storage, are available.
4.3 Multi-NIC vMotion Support (vSphere 5.0)
http://www.yellow-bricks.com/2011/07/20/vsphere-50-vmotion-enhancements/
One of the most substantial and visible changes is the multi-NIC vMotion capability. vMotion is now capable of using multiple NICs concurrently to decrease the time a vMotion takes; even a single vMotion can leverage all of the configured vMotion NICs. Prior to vSphere 5.0, only a single NIC was used by a vMotion-enabled VMkernel interface. Enabling multiple NICs for your vMotion-enabled VMkernel interfaces removes some of the bandwidth/throughput constraints associated with large and memory-active virtual machines. The following list shows the currently supported maximum number of NICs for multi-NIC vMotion (a short configuration sketch follows the list):
- 1GbE – 16 NICs supported
- 10GbE – 4 NICs supported
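Configuring this amounts to tagging additional VMkernel interfaces for vMotion traffic. Continuing the pyVmomi sketch from section 2.1 (the vmk names are placeholders, and each interface is assumed to already exist and to be backed by its own physical uplink):

```python
# 'host' is the vim.HostSystem object from the sketch in section 2.1.
for vmk in ("vmk1", "vmk2"):       # placeholder VMkernel interfaces, one per physical uplink
    host.configManager.virtualNicManager.SelectVnicForNicType("vmotion", vmk)
```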
5. Metro vMotion
http://www.yellow-bricks.com/2011/08/03/vsphere-5-metro-vmotion/
As noted in section 4.2, Metro vMotion in vSphere 5.0 raises the supported round-trip latency on the vMotion network from 5ms to 10ms for Enterprise Plus environments, making cross-site vMotion configurations supportable.
6. Configuration Requirements
A number of specific virtual machine configurations can prevent migration with vMotion (a partial pre-check sketch follows the list):
- You cannot use migration with vMotion to migrate virtual machines using raw disks for clustering purposes.
- You cannot use migration with vMotion to migrate a virtual machine that uses a virtual device backed by a device that is not accessible on the destination host. (For example, you cannot migrate a virtual machine with a CD drive backed by the physical CD drive on the source host.) Disconnect these devices before migrating the virtual machine.
- You cannot use migration with vMotion to migrate a virtual machine that uses a virtual device backed by a device on the client computer. Disconnect these devices before migrating the virtual machine.
- You cannot vMotion across datacenters, even when they are managed by a single vCenter Server.
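Some of these restrictions can be checked up front. The pyVmomi sketch below flags connected removable devices (CD-ROM/floppy); the VM name is a placeholder, and this is only a partial pre-check, not a full vMotion compatibility test:

```python
# 'content' is the ServiceInstance content from the sketch in section 2.1; the VM name is a placeholder.
vm_view = content.viewManager.CreateContainerView(content.rootFolder,
                                                  [vim.VirtualMachine], True)
vm = next(v for v in vm_view.view if v.name == "app-vm-01")

removable = (vim.vm.device.VirtualCdrom, vim.vm.device.VirtualFloppy)
for dev in vm.config.hardware.device:
    if isinstance(dev, removable) and dev.connectable and dev.connectable.connected:
        print(f"Connected removable device may block vMotion: {dev.deviceInfo.label} "
              f"(backing: {type(dev.backing).__name__})")
```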
7. vMotion futures
- vMotion across vCenter Servers
- vMotion across VDCs (vCloud Director Virtual Datacenters)
- Long-distance vMotion (between datacenters)
