At work we have an IBM BladeCenter chassis with several HS20 blades running RHEL4 connected to an EMC Clariion CX700 storage array. These blades have no internal storage, so they Boot-from-SAN and their root volume lives on the SAN as well. Back when these systems were running RHEL3 I had hacked up a mkinitrd based on a whitepaper from EMC that would include PowerPath within the initial ramdisk so that the root volume was also protected from path failures. This worked reasonably well for many years, but eventually we decided to migrate to RHEL4 and move the systems to dm-multipath. Unfortunately dm-multipath over root is not supported by RHEL4 (I've been told that this is not an issue in RHEL5 but haven't tested it yet), so I set off to once again "do my own thing". How hard could it be anyway?
(Updated Sep 22nd, 2007)
I've updated the mkinitrd.mp script to add support for x86-64 systems. Many thanks go to Markku Tavasti for reporting this bug, providing lots of logs for debugging, pointing out the obvious omission of the 64-bit libraries, and finally testing the updated script. The updated script also includes more verbose output to assist in debugging multipath setup failures during the initrd process.
RHEL5 Info -- I've received quite a number of queries regarding RHEL5 and root-on-multipath. I finally found a few minutes to test RHEL5 and its root-on-multipath support. While it's much easier to install RHEL5 on a multipath system using the "linux mpath" option during the initial install, my current opinion is that RHEL5 is not ready for primetime when it comes to root on multipath using only the built-in functionality. Yes, it will work, but at this time I believe that there are some serious issues and limitations surrounding the support of root on multipath with RHEL5. Look for a full writeup soon.
*** Please note: The document below probably still needs cleanup, but I've had enough success reports to feel pretty confident that it is a reasonable guide to getting root-on-multipath working with RHEL4 assuming the user has a decent knowledge of Linux admin and is willing to put a little effort into understanding how linux multipath works. Please continue to feel free to report any errors and/or corrections. ***
The Goal
The goal is to get a RHEL4 system with no local storage to do Boot-from-SAN and then mount the root volume over native dm-multipath. This will hopefully allow the system to be robust against path failures. Using RHEL4 native functionality it is quite easy to get dm-multipath working on non-root volumes, but root isn't a possibility. This is OK for systems that have a local RAID disk to boot from, but doesn't work well for blades and other Boot-from-SAN configurations.
The Gotchas
OK, before I get started I'll confess to a few things. First, this is probably not a "Howto" in the strict sense of the word. Expressed more accurately, this is a quick outline of what I did to get dm-multipath working on the root volume of my RHEL4 systems running on IBM HS20 blades against our EMC Clariion CX700 storage array at work. Expect to do some tweaking to get this to work on your own system.
Second, I can't take much credit for most of this. In the spirit of open source I basically just did a search on Google and eventually found a mailing list archive with enough instructions to get my system working. Most of the credit for this should probably go to Darrly Dixon and his posting on the dm-devel list in Sept of 2006. You can read his posting here. Still, his instructions only described what it took to get it working by hand, and I wasn't interested in reproducing all of those steps every time a new kernel came out, so I set out to create a mkinitrd script that would create a working initrd for multipath on root.
Third, the steps presented here are a HACK at best. The mkinitrd.mp script is hardcoded to copy all dm-multipath modules whether you need them or not, and I'm sure it's overkill. Also, the system copies several files which are not static binaries thus I was forced to copy libc.so.6 and ld-linux.so.2 into the initrd as well. This creates a bloated initrd, but it was quick and easy, and seems to work, so I didn't really care.
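To see why libc.so.6 and ld-linux.so.2 end up in the initrd, it helps to look at what the dynamically linked binaries actually need. The following is only a rough sketch, not what mkinitrd.mp does internally: the function names are made up for illustration, and /sbin/multipath and /sbin/kpartx in the usage comment are assumed locations for the multipath tools.

```shell
#!/bin/sh
# Extract absolute library paths from ldd-style output on stdin.
# Handles both "libfoo.so => /path/libfoo.so (0x...)" lines and
# bare loader lines such as "/lib/ld-linux.so.2 (0x...)".
parse_ldd() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }
         $1 ~ /^\//               { print $1 }'
}

# List the shared libraries a set of binaries needs (sorted, unique).
# Binaries that ldd cannot read are silently skipped.
lib_deps() {
    for bin in "$@"; do
        ldd "$bin" 2>/dev/null | parse_ldd
    done | sort -u
}

# Example (assumed binary locations):
#   lib_deps /sbin/multipath /sbin/kpartx
```

Anything this prints would have to be present inside the initrd for those binaries to run before the real root is mounted.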
Fourth (wow, these things just keep coming don't they), I've tested this on our setups and it seems to work well, but we use simple partitions for our root volumes, not LVM. I don't actually think the steps below do anything that will break root on LVM, and actually, it may even be easier in that case, but I've never tested it, and probably never will, so if your system is installed with an LVM root then you'll almost certainly need some tweaks to this document and procedure.
The Process
- You need to start off with a working, bootable system. In other words, RHEL4 should be booting from a SAN and the root volume should be mounted from the SAN, just without a dm-multipath device. In our case we typically just install the OS as if it were on a local drive, on /dev/sda, and we usually end up with a layout as follows:
/dev/sda1 -- /boot
/dev/sda2 -- /
/dev/sda3 -- swap
/dev/sda5 -- /var
/dev/sda6 -- /usr
Now your system probably won't be laid out exactly like this, but these steps are based on this layout, so if your system is different you need to make the appropriate changes as you work through these steps.
- Configure your multipath.conf file. I'm not going into the details of the multipath.conf file here; if you don't understand RHEL4 multipath setup, this isn't the right HOWTO for you, but the Red Hat KB has a decent quickstart here to get you started. If your root volume is set up on an LVM device you may not need to do anything other than what's in the KB article before you proceed with the rest of the steps here.
Unfortunately, running the multipath command typically won't set up multipath on the root device, since the /dev/sda device is already busy (it's mounted). You'll almost certainly want some type of friendly name for your multipath device anyway (well, at least I do). Yes, you can use the "user_friendly_names yes" option, but a device called /dev/mapper/mpath0 isn't much better than /dev/mapper/3600601f9600e000066d6f5b17e9fd811, as I still don't know which device /dev/mapper/mpath0 is, so I usually create alias entries in my multipath.conf file. To do this you will need to know the WWID of your root device (in my case /dev/sda). The easiest way to find this out is to run the following command:
# scsi_id -p 0x83 -g -s /block/sda
3600601f9600e000066d6f5b17e9fd811
Then I simply add a section to my multipath.conf file that looks like the following:
multipaths {
multipath {
wwid 3600601f9600e000066d6f5b17e9fd811
alias os
}
}
So now, instead of /dev/mapper/3600601f9600e000066d6f5b17e9fd811 my dm-multipath device will be called /dev/mapper/os and thus my volumes will be changed as follows:
/dev/mapper/os1 -- /boot
/dev/mapper/os2 -- /
/dev/mapper/os3 -- swap
/dev/mapper/os5 -- /var
/dev/mapper/os6 -- /usr
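If you have several LUNs and several paths, it can be handy to run the WWID lookup from this step across every sd device and group the results, so you can see which devices are just different paths to the same LUN. This is a minimal sketch using the same RHEL4-era scsi_id invocation shown above (newer releases use different options); the function names are my own and nothing here is part of mkinitrd.mp.

```shell
#!/bin/sh
# Print "device wwid" for every sd disk found in sysfs, using the
# scsi_id syntax from the step above. Devices that return no WWID
# are skipped.
list_wwids() {
    for dev in /sys/block/sd*; do
        [ -e "$dev" ] || continue
        name=${dev##*/}
        wwid=$(scsi_id -p 0x83 -g -s "/block/$name" 2>/dev/null) || continue
        echo "$name $wwid"
    done
}

# Group "device wwid" lines from stdin by WWID: one output line per LUN,
# listing every device (path) that reported that WWID.
group_by_wwid() {
    awk '{ paths[$2] = (paths[$2] ? paths[$2] " " : "") $1 }
         END { for (w in paths) print w ": " paths[w] }' | sort
}

# On the live system:
#   list_wwids | group_by_wwid
```

The WWID that lists your root disk (sda in my case) is the one to put in the multipaths section.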
- OK, so now hopefully you have a working multipath.conf config. Now we need to make a change to the udev rules. The easiest thing to do is simply replace the /etc/udev/rules.d/40-multipath.rules file with the one I have here. You'll probably want to make a backup of the original rules just in case. Also, note that if you upgrade the device-mapper-multipath RPM this will likely overwrite this file.
NOTE: I actually think I only need the modified udev rules on the initrd image so I could probably just copy it from a custom location, or even just create it within my mkinitrd.mp script below. I'll try to test that soon.
- Now simply download my customized mkinitrd file and place it in a good spot (mine is in /usr/local/sbin). To build a dm-multipath capable initrd simply run this script instead of the normal mkinitrd. As an example I used a command like this:
# mkinitrd.mp /boot/initrd-2.6.9-42.0.3.ELsmp-mp.img 2.6.9-42.0.3.ELsmp
I always make an initrd called something slightly different from the normal name so that I can have two grub boot entries, one with the standard initrd image (without multipath) and one with the dm-multipath enabled initrd.
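Since this has to be repeated for every new kernel, the naming convention can be wrapped in a couple of small helpers. This is just a sketch: the function names and the MKINITRD_MP variable are invented for illustration, and the default path assumes the script was installed in /usr/local/sbin as described above.

```shell
#!/bin/sh
# Derive the "-mp" initrd path for a kernel version, following the
# naming used in the example command above.
mp_initrd_path() {
    echo "/boot/initrd-$1-mp.img"
}

# Build a multipath-capable initrd for a kernel version (defaults to
# the running kernel). MKINITRD_MP is a hypothetical override for the
# location of the customized script.
build_mp_initrd() {
    kver=${1:-$(uname -r)}
    "${MKINITRD_MP:-/usr/local/sbin/mkinitrd.mp}" "$(mp_initrd_path "$kver")" "$kver"
}

# Example:
#   build_mp_initrd 2.6.9-42.0.3.ELsmp
```

Setting MKINITRD_MP=echo is a cheap way to preview the command that would be run without building anything.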
- Modify your /etc/fstab to mount the new dm-multipath devices (i.e. replace /dev/sda2 with /dev/mapper/os2)
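The fstab change can be previewed with a small sed filter before touching the real file. A minimal sketch, assuming your fstab refers to the partitions by device name; entries that use LABEL= are left untouched and would need to be changed by hand. The function name and the /tmp output path are only illustrative.

```shell
#!/bin/sh
# Rewrite /dev/<disk>N at the start of a line to /dev/mapper/<alias>N.
# Reads fstab-style text on stdin and writes the result to stdout;
# nothing is modified in place.
fstab_to_mpath() {
    disk=$1     # e.g. sda
    alias=$2    # e.g. os
    sed "s|^/dev/${disk}\([0-9][0-9]*\)\([[:space:]]\)|/dev/mapper/${alias}\1\2|"
}

# Example: write a candidate fstab for review, then copy it into place
# by hand once it looks right.
#   fstab_to_mpath sda os < /etc/fstab > /tmp/fstab.mp
```

Comparing the result against the original with diff before installing it is a good idea, since a wrong root entry means an unbootable system.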
- Modify your /boot/grub/grub.conf file to call the new initrd image. I usually just duplicate one of the working grub.conf entries and then modify it to use the new initrd image. An example from one of my systems is here:
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
default=1
title Red Hat Enterprise Linux ES (2.6.9-42.0.3.ELsmp)
root (hd0,0)
kernel /vmlinuz-2.6.9-42.0.3.ELsmp ro root=/dev/sda2
initrd /initrd-2.6.9-42.0.3.ELsmp.img
title Red Hat Enterprise Linux ES w/dm-multipath-on-root (2.6.9-42.0.3.ELsmp)
root (hd0,0)
kernel /vmlinuz-2.6.9-42.0.3.ELsmp ro root=/dev/mapper/os2
initrd /initrd-2.6.9-42.0.3.ELsmp-mp.img
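A quick sanity check before rebooting is to list the root= argument of every kernel line in grub.conf and confirm that the multipath entry points at the alias you defined in multipath.conf. This is a small sketch with a made-up function name; it only reads text and changes nothing.

```shell
#!/bin/sh
# Print the value of each root= argument found on "kernel" lines of
# grub.conf-style text read from stdin. Lines such as "root (hd0,0)"
# are ignored because their first word is not "kernel".
grub_root_args() {
    awk '$1 == "kernel" {
             for (i = 2; i <= NF; i++)
                 if ($i ~ /^root=/) print substr($i, 6)
         }'
}

# Example:
#   grub_root_args < /boot/grub/grub.conf
```

With the two entries above you would expect one plain /dev/sda2 line and one /dev/mapper line.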
- So now, reboot and good luck. If everything works your system will boot with the new initrd and load dm-multipath before the root volume is mounted. The new root volume should be mounted via the dm-multipath device (in my example /dev/mapper/os2) and you can go about testing the system to see if it survives path failures.
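After the reboot, a first check is whether root really is mounted from a /dev/mapper device. The sketch below parses the output of the mount command; the function names are invented for illustration. I use mount rather than /proc/mounts here because the latter may show the root device under a generic name.

```shell
#!/bin/sh
# Print the device mounted on / from `mount`-style output on stdin
# (lines of the form "<device> on <mountpoint> type <fs> (<opts>)").
root_device() {
    awk '$2 == "on" && $3 == "/" { print $1 }'
}

# Report whether root is on a device-mapper device; returns non-zero
# if it is not. Reads `mount`-style output on stdin.
check_root() {
    dev=$(root_device)
    case "$dev" in
        /dev/mapper/*) echo "root on multipath: $dev" ;;
        *)             echo "root NOT on multipath: $dev"; return 1 ;;
    esac
}

# On the live system:
#   mount | check_root
#   multipath -ll     # then inspect the state of each path by hand
```

This only confirms where root is mounted from; it says nothing about whether failover works, so pulling a path and watching the system is still the real test.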
Friday, March 23. 2007 at 06:05
I would like to test the script. Where's the link?
Friday, March 23. 2007 at 10:31
Thanks for pointing out how difficult that link was to see. I knew it was there and still had a difficult time finding it.
Thanks,
Tom
Sunday, May 20. 2007 at 12:27
Could you put the original document you used on RHEL3 also online?
Monday, May 21. 2007 at 14:59
Later,
Tom
Monday, May 21. 2007 at 20:26
Have you started to look at the changes needed for Red Hat 5 (CentOS 5)? The mkinitrd script has had major changes.
If possible, we would like to jump to version 5 sooner rather than later.
Thanks for all your hard work on this project. We really appreciate your sharing this information with us.
Monday, May 21. 2007 at 22:47
It was widely reported/rumored that RHEL5 was going to support this functionality by default but even if not I expect that the changes would be minimal. I have booted Fedora Core 6 from the SAN with multipath and RHEL5 is a pretty close cousin to FC6.
Hopefully I'll have time in the next few weeks, but if you beat me to it be sure to post your results.
Later,
Tom
Tuesday, May 22. 2007 at 08:35
Friday, July 27. 2007 at 16:35
I'm trying to get DM to work with my Hitachi USP600. We're using the IBM HS21 and booting from the SAN. I've followed all of your instructions for implementing multipath for root, but can't seem to get DM to create /dev/mapper/mpath0p1.
I've got two LUNs assigned to the server, and the second (non-root) one is recognized and created as /dev/mapper/mpath1p1. The driver is loading on boot as directed from the new kernel img. Any troubleshooting tips you could provide?
Thanks,
Edward Moscardini
Friday, July 27. 2007 at 23:00
You've hit the trickiest part of the conversion, and you're right, you cannot create the mpath device when the system is already booted from it.
You need to edit the multipath.conf file and give it an alias for the root LUN as my example shows. Then you need to rebuild your initrd and modify your grub.conf to attempt to mount the multipathed volume.
If you can't get it to work you can email me your multipath.conf and grub.conf and I can see if anything looks obviously wrong.
Unfortunately I'm out of town for the next 10 days or so with only limited connectivity, so my response might be slow.
Good luck.
Later,
Tom
Thursday, September 13. 2007 at 06:38
But after the reboot it doesn't boot ... it searches for /, swap ... all my partitions ...
So after installing RHEL 5 with the command "linux mpath", where can I modify the policy of mpath?
THANKS
Saturday, September 22. 2007 at 17:25
Once again, RHEL5 is, IMO, not ready for primetime with regards to multipath on root. The current support is very crude and prone to failure.
Later,
Tom
Monday, December 17. 2007 at 05:34
We're running a few HP blade servers booting from SAN (HP EVA 8000) with 5.1, but before making it mainstream production I would like to have your opinion with regards to your primetime statement.
Cheers,
Andre
Monday, December 17. 2007 at 13:07
In looking at the mkinitrd script in RHEL5 it appears that Red Hat has implemented a true, dynamic multipath setup within the initrd image similar to what my hacked script did for RHEL4. I believe this corrects the major issue with regards to a very fragile boot process and makes the Boot-from-SAN configuration fairly robust.
That being said, I haven't tested it so I can't say for sure. It certainly looks like they "do the right thing" in 5.1, so I think you would be OK, but I'd test it as much as possible, including booting with failed paths and, if your array allows, changing the LUN IDs to simulate changes in device discovery order, etc.
Let me know how it works.
Friday, July 11. 2008 at 17:11
We are actually trying to do the exact same thing with our SAN environment. The only difference is that we're using Dell servers and an FC4700-2. We've hit a little bit of a road block, and we could use some help. Email me at gabe_zschach83@yahoo.com.
Thanks,
Gabe
HIS INC
Wednesday, January 30. 2008 at 08:06
The mkinitrd is able to load dm-multipath and dm-round-robin in the initrd in this release... However when booting all goes well to the point where root is switched...it simply can't find the mapped device...
When building the initrd with the -v option I see the modules getting loaded into the initrd, as well as one path map, that's one of the other luns I zoned in... The path map of the root device however is not loaded...
Both the grub.conf line and the fstab root entry point to the multipath device but it isn't available...
Did you do something special in your initrd to load the path of the root multipath device as well ?