Thursday, May 6, 2010

What is MPIO and Best Practices of MPIO configuration

I have received numerous requests to cover Multipath I/O (MPIO) on my blog, so I decided to write about how to configure it and the best practices for MPIO configuration. Before we jump into those, let me define MPIO for starters: MPIO provides the logical facility for routing I/O over redundant hardware paths connecting a server to storage. These redundant hardware paths are made up of components such as cabling, Host Bus Adapters (HBAs), switches, storage controllers and possibly even power. MPIO solutions logically manage these redundant connections so that I/O requests can be rerouted if a component along one path fails. MPIO is a Microsoft-provided framework that allows storage providers to develop multipath solutions containing the hardware-specific information needed to optimize connectivity with their storage arrays. These modules are called Device Specific Modules (DSMs). Windows Server 2008 and later releases ship with an integrated DSM, MSDSM.sys, which can be used if the storage array supports either an Active/Active or an Asymmetric Logical Unit Access (ALUA) configuration.

So let's say I have 2 redundant paths to my LUN. If I do not use MPIO software (that is, enable the MPIO feature in Server Manager and configure MPIO), the OS will see 2 LUNs in Device Manager and Disk Management (diskmgmt.msc), though actually only one LUN is provisioned and it is simply visible through 2 different storage paths. If I start writing to these 2 LUNs, it will be a problem for the NTFS file system, hence the need for MPIO software. The MPIO software prevents data corruption by ensuring correct handling of the driver associated with a single device that is visible to the operating system through multiple paths. Data corruption is likely because when an operating system believes two separate paths lead to two separate storage volumes, it does not enforce any serialization or prevent any cache conflicts. Consider what would happen if NTFS tried to initialize its journal log twice on a single volume. The vendor-based DSM (which offers more functionality, since the vendor knows its storage and has more control and features) or the Microsoft DSM provides failover and load-balancing capability for these redundant storage paths and makes sure that the OS sees only one device, i.e. a pseudo device.

So now that we know why we need MPIO, let's take the next step. Enable the MPIO feature, reboot, and then go to Start > Administrative Tools > MPIO. The Discover Multi-Paths tab runs an algorithm for every device instance present on the system and determines whether multiple instances actually represent the same LUN (through different paths). For such devices, their hardware IDs are presented to the admin so they can be added to MPIO (they will get MSDSM support). The MPIO Devices tab shows the hardware IDs of devices that will be managed by MPIO whenever they are present. The decision is based on the hardware ID (i.e. the Vendor+Product string) matching one that MPIO maintains in its MPIOSupportedDeviceList (something every DSM specifies in its INF at installation time).
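To make the discovery step a little more concrete, here is a minimal Python sketch of the idea: group device instances by a unique LUN identifier and report the hardware ID of any LUN that shows up more than once. This is only an illustration, not how the MPIO control panel is implemented. The field names (`serial`, `vendor`, `product`), the sample devices and the plain concatenation used for the hardware ID are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical device instances as the OS might see them before MPIO is configured.
# The field names and values are illustrative only, not a Windows API.
instances = [
    {"instance": "Disk1", "vendor": "VENDORA", "product": "ARRAY1", "serial": "6001"},
    {"instance": "Disk2", "vendor": "VENDORA", "product": "ARRAY1", "serial": "6001"},
    {"instance": "Disk3", "vendor": "LOCAL", "product": "BOOTDISK", "serial": "0001"},
]


def discover_multipath(instances):
    """Return hardware IDs of LUNs that are visible through more than one path."""
    by_serial = defaultdict(list)
    for inst in instances:
        # Instances sharing a serial are assumed to be the same LUN on different paths.
        by_serial[inst["serial"]].append(inst)

    hardware_ids = set()
    for group in by_serial.values():
        if len(group) > 1:
            first = group[0]
            # Hardware ID modelled as the vendor string followed by the product string.
            hardware_ids.add(first["vendor"] + first["product"])
    return sorted(hardware_ids)


candidates = discover_multipath(instances)
print(candidates)  # ['VENDORAARRAY1']
```

Here Disk1 and Disk2 share a serial, so their hardware ID is offered as a multipath candidate, while the single-path boot disk is left alone.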


Many storage arrays that are active/active and SPC-3 compliant will work with the Microsoft MPIO DSM. Some storage array partners also provide their own DSMs to use with the Microsoft MPIO architecture. These DSMs may be installed using the DSM Install tab in the MPIO properties. When a vendor package is run on Windows Server 2008, the operating system allows the vendor DSM to be installed but prevents the package from updating the MPIO binaries already present in Windows Server 2008. The end result is that the core MPIO binaries are preserved and the third-party DSM is added. Remember that you do not need to install the MSDSM driver, since it already ships with the OS.

Very cool. So let's go to Device Manager and see the changes.

For my local boot device, Device Manager shows a SCSI device (multipath is not enabled for this disk, and it is not managed by DSM software). My other LUNs, which I have assigned to the server via multiple paths, show up as Multi-Path Disk Devices. Right-click any one of them and go to the MPIO tab, and you will see the DSM name. This is the DSM that has a handle on the disk, in my case MSDSM.sys. This is an easy way to identify which DSM is managing a disk when you have multiple DSMs on a server, especially when LUNs are assigned from different storage arrays. You can also see other details, including the MPIO policy, which we will talk about later.

PathVerifyEnabled: This flag enables path verification by MPIO on all paths every N seconds (where N depends on the value set in PathVerificationPeriod).

PathVerificationPeriod: This setting is used to indicate the periodicity (in seconds) with which MPIO has been requested to perform path verification. This field is only honored if PathVerifyEnabled is TRUE.

RetryCount: This setting specifies the number of times a failed I/O is retried if the DSM determines that a failing request must be retried. This is invoked when DsmInterpretError() returns Retry = TRUE. The default setting is 3.

PDORemovePeriod: This setting controls the amount of time (in seconds) that the multipath pseudo-LUN will remain in system memory, even after losing all paths to the device.

So let's see what really happens in the MPIO stack and how the device driver stack discovers, enumerates and groups the physical devices and device paths into a logical set (assuming a scenario where a new device is being presented to the server):

  1. New device arrives.
  2. PnP manager detects the arrival of this device.
  3. MPIO driver stack is notified of this device arrival (it will take further action if it is a supported MPIO device).
  4. MPIO driver stack creates a pseudo device for this physical device.
  5. MPIO driver walks through all the available DSMs to find out which vendor-specific DSM can claim this device. After a DSM claims a device, it is associated only with the DSM that claimed it.
  6. The MPIO driver, along with the DSM, makes sure the path to this device is connected, active, and ready for I/O.

If a new path for this same device arrives, MPIO then works with the DSM to determine whether this device is the same as any other claimed device. It then groups this physical path for the same device into a logical set called a multipath group.
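The arrival sequence above can be sketched in a few lines of Python. This is a simplified model under my own assumptions, not the real driver logic: the `Dsm` and `MpioStack` classes, the LUN identifiers and the hardware ID strings are all hypothetical, and the pseudo device and the multipath group are folded into a single dictionary entry.

```python
class Dsm:
    """Hypothetical stand-in for a DSM: it claims devices whose hardware ID it lists."""

    def __init__(self, name, supported_ids):
        self.name = name
        self.supported_ids = set(supported_ids)

    def claims(self, hardware_id):
        return hardware_id in self.supported_ids


class MpioStack:
    def __init__(self, dsms):
        # Order matters: the first DSM that claims a device keeps it.
        self.dsms = dsms
        self.groups = {}  # LUN id -> {"dsm": name, "paths": [...]}

    def device_arrived(self, lun_id, hardware_id, path):
        group = self.groups.get(lun_id)
        if group is not None:
            # Same LUN seen through a new path: add it to the multipath group.
            group["paths"].append(path)
            return group
        for dsm in self.dsms:
            if dsm.claims(hardware_id):
                group = {"dsm": dsm.name, "paths": [path]}
                self.groups[lun_id] = group
                return group
        return None  # not a supported MPIO device, so MPIO takes no action


# The vendor DSM is asked before the Microsoft DSM in this sketch.
stack = MpioStack([
    Dsm("VendorDsm", ["VENDORA ARRAY1"]),
    Dsm("MSDSM", ["VENDORA ARRAY1", "VENDORB ARRAY2"]),
])
stack.device_arrived("lun-1", "VENDORA ARRAY1", "path-A")
group = stack.device_arrived("lun-1", "VENDORA ARRAY1", "path-B")
print(group)  # {'dsm': 'VendorDsm', 'paths': ['path-A', 'path-B']}
```

Note how the second arrival does not trigger another walk through the DSMs: the LUN is already claimed, so the new path simply joins the existing group.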

MPIO has different failover and load-balancing options, and these may differ between MSDSM and a vendor-based DSM.

Failover, where no load balancing is performed. The DSM will specify a primary path and a set of standby paths. The primary path is used for processing device requests. If the primary path fails, one of the standby paths will be used. Any one of the available paths could be used as the primary path, and the remaining paths will be used as standby paths.

Note: With an array that supports ALUA, paths will typically be referred to as Active/Optimized and Active/Unoptimized rather than only as a primary path.

Failback is the ability to dedicate I/O to a designated preferred path whenever it is operational. If the preferred path fails, I/O will be directed to an alternate path, but it will automatically switch back to the preferred path, with some DSM assistance, when that path becomes operational again.
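Failover and failback can be illustrated together with a small Python sketch. The function name `select_failover_path`, the path labels and the idea of passing in a set of healthy paths are assumptions for illustration only. A real DSM tracks path state itself.

```python
def select_failover_path(paths, healthy, preferred=None):
    """Pick one path: the preferred one if healthy (failback), else the first healthy path."""
    if preferred is not None and preferred in healthy:
        return preferred
    for path in paths:
        if path in healthy:
            return path
    return None  # every path is down


paths = ["A", "B", "C"]

# All paths healthy: the preferred path carries the I/O.
print(select_failover_path(paths, {"A", "B", "C"}, preferred="A"))  # A

# Preferred path fails: I/O moves to a standby path.
print(select_failover_path(paths, {"B", "C"}, preferred="A"))  # B

# Preferred path recovers: I/O fails back to it.
print(select_failover_path(paths, {"A", "B", "C"}, preferred="A"))  # A
```

Without a preferred path the same function behaves as plain failover: it stays on the first healthy path and never load balances.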

Round Robin Load balancing where the DSM will use all available paths for I/O in a balanced, round robin fashion. This is the default policy chosen when the storage controller follows the true Active-Active model and the management application does not explicitly choose a load balancing policy.
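As a rough illustration of round robin, the Python sketch below simply cycles through the available paths in order. The path labels are made up, and the sketch ignores path failures to keep the core idea visible.

```python
from itertools import cycle


def round_robin(paths):
    """Return an iterator that hands out the paths in turn, forever."""
    return cycle(paths)


selector = round_robin(["A", "B", "C"])
order = [next(selector) for _ in range(5)]
print(order)  # ['A', 'B', 'C', 'A', 'B']
```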

Round Robin with subset of paths is load balancing wherein the application will specify a set of paths to be used in round robin fashion, and a set of standby paths. The DSM will use paths from the primary pool for processing requests as long as at least one of those paths is available. The DSM will use a standby path only when all the primary paths fail. For example, given 4 paths (A, B, C, and D), A, B, and C are listed as primary paths and D is the standby path. The DSM will choose a path from A, B, and C in round robin fashion as long as at least one of them is available. If all three fail, the DSM will start using D, the standby path. If A, B, or C becomes available again, the DSM will stop using D and switch to the available paths among A, B, and C.
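The A, B, C, D example can be walked through with a short Python sketch. The class name `RoundRobinSubset` and the way path health is passed in as a set are my own assumptions for the illustration. It is not the DSM's actual data structure.

```python
class RoundRobinSubset:
    def __init__(self, primary, standby):
        self.primary = list(primary)
        self.standby = list(standby)
        self._next = 0  # running counter used to rotate through the current pool

    def pick(self, healthy):
        # Use primary paths while at least one is healthy; otherwise fall back to standby.
        pool = [p for p in self.primary if p in healthy]
        if not pool:
            pool = [p for p in self.standby if p in healthy]
        if not pool:
            return None
        path = pool[self._next % len(pool)]
        self._next += 1
        return path


policy = RoundRobinSubset(primary=["A", "B", "C"], standby=["D"])

all_up = {"A", "B", "C", "D"}
first_four = [policy.pick(all_up) for _ in range(4)]
print(first_four)  # ['A', 'B', 'C', 'A']

# All primary paths fail: the standby path D takes over.
on_standby = policy.pick({"D"})
print(on_standby)  # D

# B comes back: the DSM stops using D and returns to the primary pool.
after_recovery = policy.pick({"B", "D"})
print(after_recovery)  # B
```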

Dynamic Least Queue Depth Load Balancing where the DSM will route I/O to the path with the least number of outstanding requests.
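A minimal sketch of least queue depth in Python, assuming we already know how many requests are outstanding on each path (the counts below are invented for the example):

```python
def least_queue_depth(outstanding):
    """Return the path with the fewest outstanding requests."""
    return min(outstanding, key=outstanding.get)


# Hypothetical number of in-flight requests per path.
outstanding = {"A": 4, "B": 1, "C": 7}

chosen = least_queue_depth(outstanding)
print(chosen)  # B

# Dispatching a request to the chosen path increases its queue depth.
outstanding[chosen] += 1
```

Because the counts change with every request sent and completed, the chosen path shifts dynamically with the load, unlike the fixed rotation of round robin.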

Weighted Path load balancing where a weight is assigned to each path; the weight indicates the relative priority of a given path. The larger the number the lower the priority. The DSM will choose a path, among the available paths, with least weight.
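Weighted paths can be sketched the same way: among the healthy paths, pick the one with the smallest weight. The weights and path labels here are hypothetical values chosen for the example.

```python
def weighted_path(weights, healthy):
    """Return the healthy path with the lowest weight (lowest weight = highest priority)."""
    candidates = {path: weight for path, weight in weights.items() if path in healthy}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


# Hypothetical weights: A has the highest priority, C the lowest.
weights = {"A": 0, "B": 10, "C": 20}

print(weighted_path(weights, {"A", "B", "C"}))  # A
print(weighted_path(weights, {"B", "C"}))       # B, because A is unavailable
```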

Not all errors result in failover to a new path. Some errors are temporary and can be recovered using a recovery routine in the DSM; if recovery is successful, MPIO is notified and path validity is checked to verify that the path can be used again to transmit I/O requests. When a fatal error occurs, the path is invalidated and a new path is selected. The I/O is resubmitted on this new path without requiring the application layer to resubmit the data.
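The difference between a retryable error and a fatal one can be sketched as below. This is a simplified model under stated assumptions: the `send` callback and its `"ok"` / `"retry"` / `"fatal"` return values are invented for the example, and I am assuming that a path is abandoned once its retries are exhausted. The `retry_count` default of 3 mirrors the RetryCount default mentioned earlier.

```python
def submit_io(request, paths, send, retry_count=3):
    """Try each path in order. send(path, request) returns "ok", "retry" or "fatal"."""
    for path in paths:
        attempts = 0
        while True:
            status = send(path, request)
            if status == "ok":
                return path
            if status == "retry" and attempts < retry_count:
                attempts += 1
                continue  # temporary error: retry on the same path
            break  # fatal error or retries exhausted: give up on this path
    raise IOError("all paths failed")


# Fake transport: path A hits one temporary error, then fails fatally; path B works.
calls = []


def fake_send(path, request):
    calls.append(path)
    if path == "A":
        return "retry" if calls.count("A") == 1 else "fatal"
    return "ok"


used = submit_io("write-1", ["A", "B"], fake_send)
print(used)   # B
print(calls)  # ['A', 'A', 'B']
```

The caller hands over the request once and gets back the path that finally served it, which mirrors the point that the application layer never has to resubmit the data.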

OK, so now let's go back to the MPIO settings and see how a vendor or Microsoft DSM selects its device list. For a vendor-based DSM, the list is in the INF file used during installation of the vendor DSM driver, and all the supported devices present on the server will be seen in the registry too. So on a server where I have LUNs assigned from multiple different storage arrays, I will have multiple vendor DSMs installed. As soon as a new storage object arrives, the MPIO driver walks through all the available DSMs to find out which vendor-specific DSM can claim this device (the Microsoft DSM is contacted last in the list of DSM providers). After a DSM claims a device, it is associated only with the DSM that claimed it. The problem is that if you have 2 DSM drivers that both support the device, either one of them can put a handle on the disk and the other one will not even get a chance.

That is why it is strongly recommended that the hardware ID for a specific disk device be configured so that it is associated with only one DSM in the services key. This helps ensure that the device can only be claimed by the desired DSM, and avoids a situation where a disk is not claimed by the desired DSM because multiple DSMs are able to support it. The same check may also be used to determine which DSM a disk drive could be associated with for troubleshooting purposes.

Warning: Removing hardware IDs from DSMs that are not required should be performed by editing the registry keys directly, as removing a device from the MPIO GUI will result in the hardware ID being removed from both registry locations listed below and prevent the device from being claimed by the desired DSM.

When configuring a disk device to be managed by MPIO for multipath access, the hardware ID for the disk device is required to be present in two different locations in registry in order to be claimed by MPIO and the DSM managing connection to the device. These two locations are:

HKLM\System\CurrentControlSet\Control\MPDEV\MPIOSupportedDeviceList
AND
HKLM\System\CurrentControlSet\Services\\Parameters\DsmSupportedDeviceList
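To show why a hardware ID should appear under only one DSM, here is a small Python sketch of the check. It does not read the registry. The two lists are modelled as plain Python data, and the DSM names and hardware ID strings are hypothetical.

```python
# Hypothetical contents of the two registry lists, modelled as plain Python data.
mpio_supported = {"VENDORA ARRAY1", "VENDORB ARRAY2"}
dsm_supported = {
    "VendorADsm": {"VENDORA ARRAY1"},
    "MSDSM": {"VENDORA ARRAY1", "VENDORB ARRAY2"},
}


def dsms_listing(hardware_id, mpio_supported, dsm_supported):
    """Return the DSMs that list the ID, or [] if MPIO itself does not list it."""
    if hardware_id not in mpio_supported:
        return []
    return sorted(name for name, ids in dsm_supported.items() if hardware_id in ids)


def check(hardware_id):
    owners = dsms_listing(hardware_id, mpio_supported, dsm_supported)
    if not owners:
        return "not claimable"
    if len(owners) > 1:
        # More than one DSM lists the ID, so which one claims the disk is not guaranteed.
        return "ambiguous: " + ", ".join(owners)
    return "claimed by " + owners[0]


print(check("VENDORA ARRAY1"))  # ambiguous: MSDSM, VendorADsm
print(check("VENDORB ARRAY2"))  # claimed by MSDSM
print(check("UNKNOWN DEVICE"))  # not claimable
```

In this made-up configuration the first ID is the one to fix: it should be left under only the DSM you actually want managing that array.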



Last but not least, a very nice way of seeing all the settings in one place is the configuration.log that you can capture from the Configuration Snapshot tab in the MPIO properties.


You can also configure MPIO with iSCSI-provisioned LUNs, and the inbox MSDSM will take care of the LUNs if you have not installed a vendor-based DSM from your iSCSI storage array vendor. You can also use the command-line tool mpclaim for configuration, especially when you are on Server Core.


Maybe in a future blog post I will try to cover more on iSCSI boot capabilities. You can also capture an MPIO trace, but you will need to open a support call with MS PSS to get it analyzed. Please refer to the MPIO step-by-step guide for more information. Thanks for your time, and I hope you found this article useful.

GAURAV ANAND

17 comments:

  1. Disk Properties - MPIO Tab

    I have two path IDs. How do I decipher the "Path ID"? I know my two different paths and which one is optimized vs. un-optimized, but I can't figure out how to tie those to what Windows calls a "Path ID".

  2. Should I get this MPIO Tab in Windows 2003?

  3. Dear,

    Congratulations on the post!

    I have one question.

    What is the best-practice load balance policy for a file server running Windows 2008 R2?

    Fail Over Only
    Round Robin
    Round Robin with Subset
    Least Queue Depth
    Weighted Paths

    Thanks a lot.

  4. Anonymous: If this is an iSCSI SAN: The Path IDs you see in the MPIO tab match the ones found in: iSCSI Initiator - Targets - select a connected target - Properties - Devices - MPIO. You should see the same Path IDs. Click on each one and select Details.

    Fabio Macchia: The policy depends on the array. The manufacturer will tell you which one is supported.


    GAURAV ANAND: Nice article! Congrats.
