
Introduction

The following guide explores best practices and configurations for using INFINIDAT InfiniBox Active-Active Replication with VMware vSphere Metro Storage Cluster (vMSC). InfiniBox Active-Active Replication was introduced in the 5.0 release; together with VMware vSphere it provides a highly resilient solution that protects application data and availability and allows extremely fast recovery even in the event of a full site failure. 

Revision

Last updated on: July 29, 2019

Target Audience

This document is intended for storage, system and VMware administrators who plan to deploy or manage InfiniBox Active-Active replication in a VMware vSphere Metro Storage Cluster configuration.

The authors of this document assume that the reader is familiar with the following:

  • InfiniBox storage resources and Active-Active replication.
  • VMware vSphere, vCenter Server and High Availability (vSphere HA) solutions.

For more information and assistance with INFINIDAT InfiniBox, please visit support.infinidat.com.

Before You Begin

Prior to setting up InfiniBox Active-Active replication with VMware vSphere Metro Storage Cluster (vMSC), it is advised to read the INFINIDAT InfiniBox documentation for Active-Active replication and the VMware vSphere® Metro Storage Cluster Recommended Practices guide.

Terminology

  • Active-Active volume: a volume that is undergoing Active-Active (A-A) replication.
  • Peers: a pair of volumes in an Active-Active replication relationship; also referred to as "peers".
  • Active-Active datastore: a datastore that resides on an Active-Active volume.
  • ALUA: Asymmetric Logical Unit Access. 

Prerequisites 

  • Two InfiniBox systems running release 5.0 or later.
  • FC connectivity (I/O).
  • Ethernet connectivity between the InfiniBox systems (replication).
  • A maximum of 5ms RTT latency between the InfiniBox systems.
  • For additional vSphere Metro Storage Cluster requirements, please refer to the VMware documentation.

Supported vSphere versions 

See an up-to-date list on the INFINIDAT Interoperability Matrix website.

Solution Overview

Deploying VMware vSphere Metro Storage Cluster with INFINIDAT InfiniBox Active-Active replication provides a highly available and resilient solution for protecting application availability and data, with minimal performance impact. 

Introduction to InfiniBox Active-Active replication

InfiniBox Active-Active replication provides zero RPO and zero RTO, enabling mission-critical business services to keep operating even through a complete site failure:

  • A symmetric synchronous replication solution that allows applications to be geographically clustered.
  • Fully integrated into InfiniBox, allowing simple management of applications spread across data centers.

Introduction to VMware vSphere Metro Storage

A VMware vSphere Metro Storage Cluster configuration is a specific storage configuration that combines replication with array-based clustering. These solutions are typically deployed in environments where the distance between data centers is limited, often metropolitan or campus environments.

  • The primary benefit of a stretched cluster model is that it enables fully active and workload-balanced data centers to be used to their full potential and it allows for an extremely fast recovery in the event of a host or even full site failure.
  • vSphere ESXi servers are in a single vCenter cluster and can be spread across sites (separate data centers or geographic areas). 

Uniform / Non-uniform Host Access types

vMSC solutions are classified into two distinct types. These categories are based on a fundamental difference in how the vSphere hosts access the storage systems, which influences design considerations.

  • Uniform host access - vSphere hosts on both sites are all connected to the storage systems across both sites. LUN paths presented to vSphere hosts are stretched across the sites.

  • Non-uniform host access - vSphere hosts at each site are connected only to the local storage system at the same site. LUN paths presented to vSphere hosts from storage nodes are limited to the local site.

InfiniBox Active-Active replication is supported with both uniform and non-uniform host access types. 

ESXi hosts and Active-Active Volumes Relationships

ESXi hosts identify both peers as the same storage device. 

Stretched Storage Architecture

With a stretched storage architecture there are two InfiniBox storage arrays linked in an Active-Active relationship, one on Site A and another on Site B. The ESXi hosts on both sites can be connected to both InfiniBox systems or only to the local one (uniform / non-uniform). When an Active-Active datastore is provisioned to the cluster, the ESXi hosts identify both peers of the Active-Active volume as the same datastore and device. It is possible to read and write simultaneously through both peers, while all writes are synchronously replicated between the InfiniBox systems.
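As a rough illustration, the identification behavior can be sketched in Python: ESXi groups paths by the device's SCSI identity, and since both Active-Active peers report the same identity, the paths collapse into a single storage device. This is a simplified model only; the NAA identifier below is a hypothetical example value.

```python
# Sketch only: ESXi groups paths by SCSI identity; both Active-Active peers
# report the same identity, so they collapse into one storage device.
# The NAA identifier below is a hypothetical example value.
paths = [
    {"array": "InfiniBox-Site-A", "naa_id": "naa.6742b0f000001234"},
    {"array": "InfiniBox-Site-B", "naa_id": "naa.6742b0f000001234"},
]

# Group paths by the device identifier, as the host's path claiming does.
devices = {}
for path in paths:
    devices.setdefault(path["naa_id"], []).append(path["array"])

# Both peers resolve to a single device with paths through both arrays.
assert len(devices) == 1
assert sorted(devices["naa.6742b0f000001234"]) == ["InfiniBox-Site-A", "InfiniBox-Site-B"]
```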

Preparing the vSphere and InfiniBox Environment

ESXi hosts should be installed and configured with a vSphere HA cluster.

Using INFINIDAT Host PowerTools for VMware

It is highly recommended to install Host PowerTools for VMware in every vSphere vCenter Server environment. Host PowerTools for VMware provides:

  • Ease-of-use.
  • Storage and host automation.
  • Best practices validation.

For more information on how to install and use Host PowerTools for VMware, see: Host PowerTools for VMware.

Setting InfiniBox Best Practices for vSphere

Validate that the vSphere cluster is configured according to the InfiniBox best practices for vSphere.

InfiniBox Active-Active volumes are supported by ESXi native multipath software.

Preparing ESXi hosts to work with Active-Active Volumes

When using Active-Active volumes in a vSphere environment, the ESXi host objects on InfiniBox must be set to the "ESXi" type.

  • "ESXi" type - makes InfiniBox issue a PDL SCSI sense response to the host in case a mapped Active-Active peer is no longer synchronized (cannot serve R/W I/O, becomes "offline").
  • If the host type is not set to "ESXi", vSphere HA cannot properly detect that a mapped Active-Active peer is no longer synchronized ("offline").
    • Therefore, in a non-uniform environment, vSphere HA will not try to recover affected VMs that are running on hosts which can access only the "offline" peer.
    • In a uniform environment, although all hosts should have access to both peers, setting this is still required for proper functioning of the ESXi servers.
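The effect of the host type can be sketched as follows. This is a simplified model of the behavior described above, not actual InfiniBox code; the "Generic" and "NOT_DETECTED" labels are hypothetical.

```python
def array_response(host_type: str, peer_synchronized: bool) -> str:
    """Simplified model: with the "ESXi" host type, an unsynchronized
    ("offline") peer returns a PDL SCSI sense, which vSphere HA can act on;
    with any other host type the condition is not surfaced as PDL."""
    if peer_synchronized:
        return "OK"
    return "PDL" if host_type == "ESXi" else "NOT_DETECTED"

assert array_response("ESXi", peer_synchronized=True) == "OK"
assert array_response("ESXi", peer_synchronized=False) == "PDL"
# Without the "ESXi" type, vSphere HA cannot detect the "offline" peer.
assert array_response("Generic", peer_synchronized=False) == "NOT_DETECTED"
```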

Setting the ESXi host objects on InfiniBox to the "ESXi" type is further discussed in the "Setting up the ESXi Hosts on InfiniBox" section below.

Active-Active link between the InfiniBox systems

The InfiniBox systems should be configured with a replication link.

Configuring vMSC and Active-Active Datastores

Prior to configuring vMSC and InfiniBox Active-Active datastores, ensure that the environment is configured according to the requirements in the "Preparing the vSphere and InfiniBox Environment" section above.

Follow the instructions in the following chapters based on the desired host access configuration (uniform / non-uniform).

  • The vMSC configuration depends on the host access type.
  • If uniform access is desired, FC connectivity between the sites is required (used for I/O) in addition to the Ethernet connectivity used for replication.
  • This guide explores the configuration process mostly using the InfiniBox web GUI.

vMSC with Uniform Host Access

When configuring uniform host access, the ESXi hosts can access the same datastore through both InfiniBox systems - the system that exists on the same site (local) and the remote system on the other site. 

  • Typically, the datastore paths to the remote system are less optimal than the paths to the local system, due to the extra travel between the sites, which adds latency.
    • The InfiniBox system can intelligently hint the ESXi hosts about which paths are optimal for serving I/O. This is further discussed later on. 
  • FC connectivity between the sites is required.

Setting up the ESXi Hosts on InfiniBox

Connect all the ESXi hosts in the vSphere cluster to both InfiniBox systems. 

  • For each host it is recommended to have one path to each InfiniBox node (the nodes of both systems) from two initiators. (12 paths in total)
  • The hosts can access both InfiniBox systems and will be able to see paths to the datastore from both systems.
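The recommended path counts follow from simple arithmetic: an InfiniBox system has three nodes, and each host connects with two initiators, one path per node. A quick sketch of that arithmetic (counts taken from the recommendations in this guide):

```python
# Path-count arithmetic behind the recommendations in this guide.
INFINIBOX_NODES_PER_SYSTEM = 3
INITIATORS_PER_HOST = 2

def recommended_paths(visible_systems: int) -> int:
    # One path per node from each initiator, for every system the host sees.
    return INFINIBOX_NODES_PER_SYSTEM * INITIATORS_PER_HOST * visible_systems

assert recommended_paths(visible_systems=2) == 12  # uniform: both systems
assert recommended_paths(visible_systems=1) == 6   # non-uniform: local system only
```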

The steps below describe how to create (register) the vSphere cluster on InfiniBox.

The simplest method to register a uniform vSphere cluster is by using Host PowerTools for VMware. For instructions see: Registering ESXi hosts and clusters.

  • If Host PowerTools for VMware is used to register the vSphere cluster, sections "Creating InfiniBox Hosts objects" and "Creating InfiniBox Cluster object" can be skipped.
  • Make sure that the vSphere cluster is registered on both systems.

Creating InfiniBox Hosts objects

Once all hosts are connected (zoned), a corresponding InfiniBox host object should be created for each of the ESXi hosts, on both systems. 

In order to create the InfiniBox hosts objects:

  1. Login to one of the InfiniBox systems using the management console.
  2. Create a new host object. 
    • Add the corresponding host's ports (initiators).
  3. Set the created host object type to "ESXi".
    • This is settable only using InfiniShell. Use the following command: "host.set_host_type host=<esxi-host-name> host_type=ESXi"

    • When creating the hosts using InfiniShell, it is possible to set the host type on creation using the "host_type=ESXi" argument. 
  4. Repeat steps 2-3 until all the ESXi hosts are created. 
  5. Once all hosts are created, repeat the steps above on the other system.

Upon completion, all the vSphere cluster's hosts should exist as InfiniBox host objects on both systems.

Creating InfiniBox Cluster object

Once all hosts are created, add them to an InfiniBox cluster object.

  • The InfiniBox cluster object aggregates host objects and makes it possible to map volumes to all of the cluster's hosts at once. 

In order to create the InfiniBox cluster object:

  1. Select the "Host & Cluster" icon on the left bar.
  2. Click the "Create" button and select "Cluster".
  3. Add all the previously created hosts to the cluster. 
  4. Repeat steps on the other system.

Upon completion, all the vSphere host objects should be added to an InfiniBox cluster object on both systems.

Setting the optimized InfiniBox system for each host

The InfiniBox system can use ALUA to intelligently hint the ESXi hosts about which paths are optimal for serving I/O to each volume.

  • ALUA is a standard for identifying path prioritization between the storage and hosts; it enables the initiators to query the target about path attributes, such as the path's ALUA state.

This setting is controlled by an InfiniBox host object option, which sets the host's "Optimized / Non-Optimized" setting.

  • By default host objects are created as "Optimized".
  • The InfiniBox system hints the ESXi hosts by setting the ALUA state of the host's mapped volume paths to "Optimized / Non-Optimized".

Setting this properly is crucial when configuring vMSC with uniform host access, as the ESXi hosts are also presented with datastore paths from the remote InfiniBox system, which are typically less optimal. 

  • This is not required for a non-uniform configuration.

InfiniBox Host Objects Optimized/Non-Optimized Configuration 

Configure the InfiniBox host objects as follows:

  • On InfiniBox - Site A:
    • Ensure that the hosts which are located on Site A are set to "Optimized".
    • The hosts that are located on Site B should be set to "Non-Optimized".
  • On InfiniBox - Site B:
    • Ensure that the hosts which are located on Site B are set to "Optimized".
    • The hosts that are located on Site A should be set to "Non-Optimized".
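The rule above reduces to a simple symmetry: a host is "Optimized" on the InfiniBox system in its own site and "Non-Optimized" on the other. A minimal sketch of that rule (site labels are illustrative):

```python
def alua_setting(host_site: str, system_site: str) -> str:
    # A host is "Optimized" on its local system, "Non-Optimized" on the remote one.
    return "Optimized" if host_site == system_site else "Non-Optimized"

# On the InfiniBox in Site A, a Site A host is Optimized and a Site B host is not.
assert alua_setting(host_site="Site A", system_site="Site A") == "Optimized"
assert alua_setting(host_site="Site B", system_site="Site A") == "Non-Optimized"
# And symmetrically on the InfiniBox in Site B.
assert alua_setting(host_site="Site B", system_site="Site B") == "Optimized"
assert alua_setting(host_site="Site A", system_site="Site B") == "Non-Optimized"
```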

Setting the InfiniBox Hosts to Optimized / Non-Optimized 

The simplest method to set a vSphere host to Optimized/Non-Optimized is by using Host PowerTools for VMware.

In order to set a host to "Optimized / Non-Optimized":

  1. Login to the InfiniBox system using the management console.
  2. Select the "Host & Cluster" icon on the left bar.
  3. Right click on a Host object.
  4. Select "Modify Host".
  5. Set the "Path ALUA state" to the proper option. 
  • Ensure proper configuration for all ESXi hosts on both InfiniBox systems. 


vMSC with Non-uniform Host Access

When configuring non-uniform host access, the ESXi hosts on each site can access the storage only through the local InfiniBox system - the system that exists on the same site (local).

Setting up the ESXi Hosts on InfiniBox

Connect the ESXi hosts in each site only to the local InfiniBox system. 

  • For each host it is recommended to have one path to each InfiniBox node from two initiators. (6 paths in total)
  • The hosts on each site can access only the local InfiniBox system and therefore will be able to see paths to the datastore only from the local system.

Creating InfiniBox Hosts objects

After the hosts are connected, in order to be able to provision storage to the vSphere cluster, a corresponding InfiniBox host object must be created for each of the ESXi hosts.

  • Each host object contains the host’s initiators ports, which then can be mapped to a volume.

In order to create the InfiniBox hosts objects:

  1. Login to the InfiniBox system on Site A using the management console.
  2. Create a new host object. 
  3. Set the created host object type to "ESXi".
    • This is settable only using InfiniShell. 

      • Use the following command: "host.set_host_type host=<esxi-host-name> host_type=ESXi"

    • When creating the hosts using InfiniShell it is possible to set the host type on creation, using the "host_type=ESXi" argument
  4. Repeat steps 2-3 until all the ESXi hosts that reside on the same site are created. 
  5. Repeat the steps on the InfiniBox system in Site B.

Upon completion, all the hosts in Site A should exist on the system in Site A and all the hosts in Site B should exist on the system in Site B.


Creating InfiniBox Cluster object

Once all hosts are created, add them to an InfiniBox cluster object. The InfiniBox cluster object aggregates host objects and makes it possible to map volumes to all of the cluster's hosts at once. 

In order to create the InfiniBox cluster object:

  1. Login to the InfiniBox system on Site A using the management console.
  2. Select the "Host & Cluster" icon on the left bar.
  3. Click the "Create" button and select "Cluster".
  4. Add all the previously created hosts to the cluster. 
  5. Repeat steps on the system in Site B.

Upon completion, all the vSphere host objects should be added to an InfiniBox cluster object on both systems.

  • Each InfiniBox cluster represents the hosts which reside on the same site. 


Provisioning Active-Active Datastores

If virtual machines are designed to run simultaneously in both sites, it is advised to provision at least two Active-Active datastores.

Virtual machines in each site should reside on an Active-Active datastore whose preferred peer is set on the local InfiniBox system (the system on the same site).

  1. Login to one of the InfiniBox systems using the management console.
  2. Create two new volumes.
  3. Configure Active-Active replication to the remote system on one of the previously created volumes.
    • Keep the Preferred system option as Local.
    • Upon success, an Active-Active replication is set and a volume peer is created on the remote system.
  4. Configure Active-Active replication to the remote system also on the other volume. 
    • This time set the Preferred system option to Remote.

  5. Upon success, two Active-Active replications are created. 
    • For one, "System A" is set as preferred; for the other, "System B".
  6. Map the two Active-Active volumes to the previously created InfiniBox cluster object.

Mapping the other peer(s)

  1. Login to the other InfiniBox system using the management console.
  2. Map the Active-Active volume peers to the previously created InfiniBox cluster object.
    • It is recommended to map each peer using the same LUN ID in both systems. 
  3. Login to the vSphere Web Client.
  4. Perform a storage rescan on the vSphere cluster.

The Active-Active volumes are now presented to each ESXi host (in both sites) from both InfiniBox systems (uniform access).

Ensuring proper volumes access

Ensure that the hosts properly see the paths to the mapped Active-Active volume.

  • The path states and the number of paths to the Active-Active volume depend on the host access type. 
  • Optimized paths are presented as "Active (I/O)"; these are the paths to the local InfiniBox system. 
    • As long as "Active (I/O)" paths are available, all R/W I/O will go through these paths, which provide optimal performance.
  • Non-optimized paths are presented as "Active"; these are the paths to the InfiniBox system in the other site. 
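The path-selection behavior described above can be sketched as follows. This is a simplified model; the actual selection is performed by the ESXi native multipathing software.

```python
def paths_serving_io(paths: list) -> list:
    """Simplified model: all R/W I/O uses the ALUA-optimized "Active (I/O)"
    paths as long as any exist; non-optimized "Active" paths are used only
    when no optimized path remains."""
    optimized = [p for p in paths if p["state"] == "Active (I/O)"]
    if optimized:
        return optimized
    return [p for p in paths if p["state"] == "Active"]

paths = [
    {"system": "local", "state": "Active (I/O)"},
    {"system": "remote", "state": "Active"},
]
# While an optimized path is available, only the local paths serve I/O.
assert {p["system"] for p in paths_serving_io(paths)} == {"local"}

# If all optimized paths go down, I/O fails over to the non-optimized paths.
paths[0]["state"] = "Dead"
assert {p["system"] for p in paths_serving_io(paths)} == {"remote"}
```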

Non-uniform Host Access 

It is recommended that each host have one path to each of its local InfiniBox nodes from two initiators (six paths in total).

  • In the screenshot below, an InfiniBox Active-Active volume (storage device) is mapped to an ESXi host. 
  • The Active-Active volume is presented to the host only from the local system; therefore all the paths are in "Active (I/O)" status, which indicates they are optimized. 

Uniform Host Access 

It is recommended that each host have one path to each InfiniBox node (the nodes of both systems) from two initiators (twelve paths in total, six to each system).

  • In the screenshot below, an InfiniBox Active-Active volume (storage device) is mapped to an ESXi host from both systems.
    • The six "Active (I/O)" paths are the optimized paths to the local InfiniBox system, where the ESXi host is set as "Optimized".
    • The other "Active" paths are non-optimized paths to the remote InfiniBox system and will be used only in case all the optimized paths are down.

Creating VMFS

Create a VMFS Datastore over the Active-Active volume (storage device). 

  • Locate the vSphere cluster "New Datastore..." wizard:

  • Select the Active-Active volume LUN and walk through the wizard. 

Upon success, the created datastore should be mounted on all hosts. 

The same Active-Active datastore can serve reads and writes through both systems simultaneously.

vSphere Cluster Configuration

The following configuration is recommended for the vMSC vSphere cluster:

vSphere Availability 

Validate that vSphere HA is turned on ("TURN ON vSphere HA" is checked).

  • Allows automatic virtual machine fail-over in the case of a failure. 
  • It is advised to follow VMware best practices for vSphere HA configuration. For more information please refer to VMware documentation.

Failures and Responses

Ensure that "Enable Host Monitoring" is enabled. 

  • Host Failure Response: Restart VMs.
  • Datastore with PDL: Power off and restart VMs.
  • Datastore with APD: Power off and restart VMs - conservative or aggressive restart policy. 

VM/Host Groups and Rules

vSphere HA enables automatic virtual machines fail-over by restarting the virtual machines of a failed host on another host that can access the datastore.

  • Thus, a virtual machine can reside on a host in Site A and, in case of a failure, be recovered (restarted) on a host in Site B.

Therefore, it is recommended and beneficial to configure vSphere HA to choose the preferred hosts on which to restart recovered VMs - typically hosts on the same site.

  • This is done using VM/Host rules. 

The vSphere HA fail-over and recovery process is no different when used with InfiniBox Active-Active replication.

VM/Host Groups

First, create groups for the Hosts and VMs.

  • Typically two Host groups, each group contains the hosts that reside on the same site.
  • VM groups can be created as desired. 
    • In this example we also create two VM groups: one for VMs that should reside on Site A and another for Site B. 

VM/Host Rules

Once the Hosts and VMs groups are created, they can be associated with VM/Host rules.

  • In the following example, two rules are created to set VMs in group "VM Group Site A" to reside on hosts in group "Hosts Group Site A".
    • It is recommended to use the “should” rule, so that in case there are no hosts available in the associated group, the rule can be broken.

Then another rule is created to set VMs in group "VM Group Site B" to reside on hosts in group "Hosts Group Site B".

Virtual Machines and Datastore Placement

As previously discussed, the virtual machines in each site should reside on an Active-Active datastore whose preferred peer is set on the local InfiniBox system (the system on the same site).

  • The virtual machines in group A (VMs running on hosts in Site A) should reside on the Active-Active datastore whose peer on System A is set as preferred (the peer on the InfiniBox in Site A).
  • The virtual machines in group B (VMs running on hosts in Site B) should reside on the Active-Active datastore whose peer on System B is set as preferred (the peer on the InfiniBox in Site B).

vMSC with Active-Active Datastores Configuration Diagram

Uniform Host Access

Non-uniform Host Access

Failure Scenarios and Expected Response

The following chapter outlines how InfiniBox Active-Active replication and the vSphere environment behave in case of a failure. Failures can be caused by two main factors:

  • Failure of the storage array, the SAN fabric connectivity or the replication link. 
  • Failure of ESXi host(s). 

ESXi Host Failures

When an ESXi host failure is detected by vSphere HA, its virtual machines will be recovered and restarted on other ESXi hosts in the vSphere cluster. This is the typical vSphere HA response, regardless of the specific vMSC host access configuration type (uniform or non-uniform).

Multiple Hosts Failure 

In case all the ESXi hosts in a specific site fail, vSphere HA can quickly recover and restart the failed virtual machines on the ESXi hosts in the remote site.

  • This high level of resiliency is achieved because the datastores are undergoing InfiniBox Active-Active replication (stretched) and are presented to the ESXi hosts on both sites.

Storage Array Failures

In case of a complete storage array or SAN fabric failure, the vSphere HA response depends on the vMSC host access configuration (uniform / non-uniform).

Complete Storage Failures on Non-uniform Configuration 

In a non-uniform configuration, the InfiniBox system in each site presents each Active-Active datastore peer only to the local ESXi hosts that reside in the same site.

  • Each host can only see paths to the local Active-Active peer (the peer that resides on the InfiniBox system in the same site).

Therefore, if for any reason an InfiniBox system becomes inaccessible to the ESXi hosts in a specific site, vSphere would need to recover the failed virtual machines on the hosts in the remote site. The following failure scenarios in a specific site would lead to that result: 

  • Loss of all SAN fabric connectivity.
  • Forcibly unmapping an Active-Active peer on InfiniBox while virtual machines that reside on that peer are powered on. 
  • Failure of the InfiniBox system.

Failure Scenario Example

The following example explores the scenario of access loss to a peer while virtual machines that reside on that peer are powered on. Environment configuration:

  • 2 InfiniBox systems:
    • System-Site-A: in Site A.
    • System-Site-B: in Site B.
  • 2 Active-Active datastores. 
    • "Active-Active_Datastore1": preferred system set to the system in Site A.
      • Resides on an Active-Active volume named "Active-Active_Datastore1".
    • "Active-Active_Datastore2": preferred system set to the system in Site B.
      • Resides on an Active-Active volume named "Active-Active_Datastore2".
  • 4 hosts, in 2 sites (Site A/B), divided into two host groups:
    • Host Group Site A: 2 Hosts 
    • Host Group Site B: 2 Hosts
  • Non-uniform connection:
    • Hosts in site A can only see the peers on the InfiniBox system in Site A.
    • Hosts in site B can only see the peers on the InfiniBox system in Site B.
  • 8 VMs, divided into two VM groups:
    • VM Group Site A: 4 VMs 
    • VM Group Site B: 4 VMs
  • 2 VM/Host rules:
    • VMs that are members of the "VM Group Site A" should run on hosts that are members of the "Host Group Site A".
    • VMs that are members of the "VM Group Site B" should run on hosts that are members of the "Host Group Site B".
  • VMs in group "VM Group Site A" reside on "Active-Active_Datastore1". 
  • VMs in group "VM Group Site B" reside on "Active-Active_Datastore2". 

All virtual machines are powered on. 

  • VMs housed on hosts in Site A reside on datastore "Active-Active_Datastore1".

  • VMs housed on hosts in Site B reside on datastore "Active-Active_Datastore2".

Active-Active replication is set.

Failure

System-Site-B experienced a failure which caused a loss of access to the local Active-Active peers (the Site B peers of both Active-Active volumes).

Results

Initially:

  • The virtual machines housed on hosts in Site B (residing on "Active-Active_Datastore2") become inaccessible. 
    • The virtual machines on Site B are expected to be inaccessible until the APD timeout is reached (if APD timeout is enabled).
  • Virtual machines housed on hosts in Site A (residing on "Active-Active_Datastore1") continue to run without any interruption.
  • The datastore view from the Site B hosts would show that both datastores are inaccessible. 
    • This is due to the loss of access to the local system.
  • The datastore view from the Site A hosts would show that both datastores are still available.
    • This is because the Active-Active datastores' peers on Site A are still accessible to the hosts in Site A through the local InfiniBox system.
  • When the APD timeout is reached (if APD timeout is enabled), vSphere HA will shut down the inaccessible VMs and recover (restart) them on hosts in Site A, which can still access the datastore through the local peer.
    • The recovered VMs will be back online and powered on on Site A hosts.

Once the failure of the InfiniBox system in Site B is resolved, the peers will be re-synchronized automatically and the datastores will be accessible again to the hosts in Site B.

It is recommended to ensure that the VMs are migrated back to hosts in Site B.

Complete Loss of Access to a Storage Array on Uniform Configuration 

With a uniform configuration, the InfiniBox system in each site presents its Active-Active datastore peer to the ESXi hosts in both sites.

  • All the ESXi hosts can access Active-Active datastores through both the local and remote peers (and can see paths to both systems). 

Therefore, a uniform configuration provides an even greater level of availability for virtual machines, as it can sustain a complete loss of access to an InfiniBox system. 

Accordingly, if for any reason one of the InfiniBox systems becomes inaccessible to the ESXi hosts in a specific site while the other InfiniBox system is still accessible, the virtual machines will keep running on their hosts non-disruptively.

  • Thus, in this case, the downtime until the failed VMs are restarted on the remote hosts by vSphere HA is spared. 

In case both InfiniBox systems become inaccessible for all the ESXi hosts in a specific site, the vSphere HA behavior will be similar to the one explored in the non-uniform section.

Failure Scenario Example

Environment configuration:

  • 2 InfiniBox systems:
    • System-Site-A: in Site A.
    • System-Site-B: in Site B.
  • 2 Active-Active datastores. 
    • "Active-Active_Datastore1": preferred system set to the system in Site A.
      • Resides on an Active-Active volume named "Active-Active_Datastore1".
    • "Active-Active_Datastore2": preferred system set to the system in Site B.
      • Resides on an Active-Active volume named "Active-Active_Datastore2".
  • 4 hosts, in 2 sites (Site A/B), divided into two host groups:
    • Host Group Site A: 2 Hosts 
    • Host Group Site B: 2 Hosts
  • Uniform connection:
    • Hosts in site A can access both Active-Active peers on the local and remote InfiniBox systems.
      • The hosts are set as "optimized" on the system in Site A and as "non-optimized" on the system in Site B.
    • Hosts in site B can access both Active-Active peers on the local and remote InfiniBox systems.
      • The hosts are set as "optimized" on the system in Site B and as "non-optimized" on the system in Site A.
    • Paths are set to optimized for the local peer and non-optimized for the remote peer respectively. 
  • 8 VMs, divided into two VM groups:
    • VM Group Site A: 4 VMs 
    • VM Group Site B: 4 VMs
  • 2 VM/Host rules:
    • VMs that are members of the "VM Group Site A" should run on hosts that are members of the "Host Group Site A".
    • VMs that are members of the "VM Group Site B" should run on hosts that are members of the "Host Group Site B".
  • VMs in group "VM Group Site A" reside on "Active-Active_Datastore1". 
  • VMs in group "VM Group Site B" reside on "Active-Active_Datastore2". 

All VMs that are housed on hosts in Site A and reside on datastore "Active-Active_Datastore1" are powered on. 

All VMs that are housed on hosts in Site B and reside on datastore "Active-Active_Datastore2" are powered on. 

Active-Active replication is set.

As each host can access both peers, it can see paths to both systems. 

  • The paths to the local peer are set to optimized, while the paths to the remote peer are set to non-optimized. 
  • Looking at the "Connectivity and multipathing" view shows the state of each path.
  • In this example there are 6 paths to each InfiniBox system, 12 paths in total.
  • Paths in the "Active (I/O)" state are the optimized paths to the local peer.
    • As long as "Active (I/O)" paths are available, all R/W I/O will go through these paths, which provide optimal performance. 
    • The "Active" (non-optimized) paths, which are the paths to the remote peer, will be used only in case all optimized paths are gone. 

The following example explores a scenario of complete access loss to an InfiniBox system in a specific site, while all virtual machines are powered on.

Failure

System-Site-B experienced a failure which caused a loss of access to the local Active-Active peers (the Site B peers of both Active-Active volumes).

Results

All the virtual machines on both sites stay online.

Looking at the datastore view from the hosts in Site B would show that both datastores are still accessible. 

  • This is because they can still access the peer volume on the remote system.

Looking at the "Connectivity and multipathing" view would show that the paths to the peer on Site B are marked as "Dead" - they cannot serve read or write I/O. 

Until the failure on the InfiniBox system in Site B is resolved, the VMs that are running on hosts in Site B use the "non-optimized" paths, which perform I/O directly against the remote InfiniBox system.

  • This causes relatively higher latency.
  • In cases where the failure on Site B is expected to persist for a long period of time, consider migrating the VMs that are running on hosts in Site B to the hosts in Site A, in order to avoid the added inter-site I/O travel. 

Once the failure on the InfiniBox system in Site B is resolved, the peers will be re-synchronized and accessible again to the hosts on both sites. vSphere will then automatically return to using only the optimized paths to each of the datastores.

How InfiniBox Handles Failures

InfiniBox has two mechanisms to handle failures for Active-Active replication: the Witness and the preferred system.

  • If an InfiniBox system becomes unavailable, e.g. power outage of the entire site, the peer system will provide access to all the volumes. 
  • If the replication link between the systems fails, then datastores will continue to serve I/O on one of the systems. Each datastore has a property in InfiniBox that defines its preferred-system, which will remain online. 

InfiniBox Witness

The witness is an arbitrator entity residing in a third site (separate from the two InfiniBox systems involved in Active-Active replication) that acts as a quorum in case of replication link failures. The witness is a lightweight, stateless piece of software deployed as a VM. 

  • If the witness is down or inaccessible, a replication link failure will result in the InfiniBox systems keeping volumes online according to their preferred-system settings.

Preferred system 

Each volume that is undergoing Active-Active replication has a definition for preferred system, which the witness uses to make correct decisions. 

  • If the witness is not available to the systems, the decision on which side stays active will be made per replica, based on the preferred system.

Storage Failover 

InfiniBox Active-Active replication failover is fully automatic and does not require any storage administrator intervention.

Storage Replication Resynchronization and Recovery

InfiniBox Active-Active recovery is completely automatic; no storage administrator intervention is necessary to trigger a re-sync and recover replication.

If the InfiniBox systems get disconnected, the replication internally falls back to async mode. Once the connectivity between the systems recovers, synchronization jobs start replicating the missing data to the lagging system. During this time, from the disconnection and throughout the re-sync, the Active-Active volumes on the synchronized system serve I/O operations, while the remote side remains in a lagging state until all data is synchronized between the volumes. 

  • Once the volumes are nearly in sync, they will smoothly transition to Sync replication mode, with no I/O disruption. The host paths to the lagging side will be automatically restored, allowing the hosts to perform I/O operations through both systems again.
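The recovery flow above can be summarized as a small state machine. This is an illustrative simplification with made-up state and event names, not the actual InfiniBox implementation.

```python
# Illustrative state machine for the recovery flow described above.
# State and event names are hypothetical labels, not InfiniBox terminology.
TRANSITIONS = {
    ("SYNCHRONIZED", "link_down"): "ASYNC_FALLBACK",   # internal fallback to async
    ("ASYNC_FALLBACK", "link_up"): "RESYNCING",        # sync jobs copy missing data
    ("RESYNCING", "nearly_in_sync"): "SYNCHRONIZED",   # seamless return to sync mode
}

def next_state(state: str, event: str) -> str:
    # Unknown events leave the state unchanged.
    return TRANSITIONS.get((state, event), state)

# A full disconnect/recover cycle returns the replica to sync mode.
state = "SYNCHRONIZED"
for event in ("link_down", "link_up", "nearly_in_sync"):
    state = next_state(state, event)
assert state == "SYNCHRONIZED"
```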

InfiniBox Components Failures  

The following table describes the InfiniBox storage accessibility in different failure scenarios when using Active-Active replication:

Scenario | InfiniBox System-A | InfiniBox System-B | Replication Link | Witness | Active-Active Volumes Access
Optimal | UP | UP | UP | UP | Volumes are available through both systems
Witness is down | UP | UP | UP | Down | Volumes are available through both systems
Replication Link is down | UP | UP | Down | UP | Volumes are available through the preferred system
System-A is down | Down | UP | UP | UP | Volumes are available through System-B
Both systems are down | Down | Down | N/A | N/A | Volumes are not available

*Assuming that the Active-Active replicas were in a "Synchronized" state and the systems were in witness resiliency at the moment of failure.
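The accessibility matrix can also be encoded as a small function, under the same assumptions as the footnote above (replicas Synchronized and in witness resiliency at the moment of failure). This is an illustrative sketch; the System-B-down case is filled in by symmetry with the System-A-down row.

```python
def volumes_access(system_a_up: bool, system_b_up: bool,
                   link_up: bool, witness_up: bool) -> str:
    # Encodes the accessibility table above (illustrative sketch; assumes
    # replicas were Synchronized and in witness resiliency at failure time).
    if not system_a_up and not system_b_up:
        return "not available"
    if not system_a_up:
        return "available through System-B"
    if not system_b_up:
        return "available through System-A"  # symmetric case, by analogy
    if not link_up:
        return "available through the preferred system"
    # Link up: the witness state alone does not affect availability.
    return "available through both systems"

assert volumes_access(True, True, True, True) == "available through both systems"
assert volumes_access(True, True, True, False) == "available through both systems"
assert volumes_access(True, True, False, True) == "available through the preferred system"
assert volumes_access(False, True, True, True) == "available through System-B"
assert volumes_access(False, False, False, False) == "not available"
```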
