Saturday, September 27, 2008

Power Vs (LVM Vs VxVm). Score 2-0. Power Wins :(

We had an unplanned, unscheduled power outage recently, and as a result some servers went down, some storage controllers went down, some switches went down and some local disks went bad.

Chaos, you may think.
It was actually much more challenging than that, and every such situation at work is a good learning experience. Once the storage controllers, the switches and power to the servers were restored, we were left with this problem.

There is one imaging server each for the HP-UX, Solaris and AIX environments (NOT imaging as in pictures/photos; this is a hard-disk imaging environment). You could also call it a network-based installation setup for each hardware architecture. All these servers store their disk images on a centralised archive server (which, in our case, was the same machine as the HP-UX imaging server). The storage came from an MSA, with one 2 TB LUN presented to this host.

The server wouldn't boot from the local disk (which was mirrored) after the power outage. The kernel would just dump core and reboot, endlessly. My colleagues and I were not great HP-UX kernel debuggers, so we took these alternate steps:

1. Boot from the mirrored disk (although we knew that a mirror doesn't give data redundancy: if the data on one disk is corrupted, the corruption is almost simultaneously mirrored to the other) - FAILED. The problem could be hardware (memory, CPU... although the LEDs were all green) or OS related.
2. Since the server (an rp*) had 3 slots for SCSI disks, we attached a disk from another server on which HP-UX (version immaterial at this point) was installed, and booted the failed server from this disk. System booted: the hardware is OK. What could be so wrong with the OS that it dumps core this badly??? And a power outage caused this?!?
3. We didn't know the OS environment on the disk that had gone bad (version, file system type, partition information).

The core options we had were:
HP-UX 11.23 or 11.31
LVM or VxVM

That's a combination of four.

And if you mess up the metadata (filesystem data), you are totally lost. Not just the 2 TB of data; the configs and other data on the local disk (which had no backup ;) ) would simply be wiped out. So the strategy was to recover the local disk data (somehow) first, and then recover the data on the MSA.

A quick look at the docs said LVM and VxVM fully co-exist and are "aware" of each other. Good news, and also bad. Good, because since they are aware of each other they would _probably_ not tamper with each other's metadata when you try to access the disks. Bad, because since they co-exist, the number of combinations we had to try was still four.

With fingers crossed, and since the third disk that had currently booted the server was running 11.31 with LVM, I thought I would try the LVM method first.

Run SAM, go to the Disks and Filesystems menu, and locate your disk. Note down the hardware path.
Go back to the Volume Groups menu, and choose "Import Volume Group".
You quickly get a message saying that no LVM data was found on the disk. Phew.
This could mean disk corruption, OR that it is a VxVM disk.

Try the command-line tools that SAM just used, with the force options: vgscan and vgimport. You get the same error.
OK, so it's a VxVM disk. (Trying to be optimistic.)
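The command-line attempt looked roughly like this (a sketch only; the device name, volume group name and minor number here are placeholders, not the actual ones from our server):

```shell
# Scan attached disks for LVM volume group information
vgscan -v

# Try to import a VG from the suspect disk. On HP-UX you first create
# the VG directory and group device node by hand (vg01, the minor
# number and c2t1d0 are all placeholder values).
mkdir -p /dev/vg01
mknod /dev/vg01/group c 64 0x010000
vgimport -v /dev/vg01 /dev/dsk/c2t1d0
# Fails with "No LVM information found" (or similar) if the disk is
# corrupt or belongs to VxVM
```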

Apparently, VxVM software is installed as part of every HP-UX installation (maybe only internally for us, since VxVM is a licensed tool), but it is not activated unless you use it for the root dg (disk group). VxVM and LVM have different terminologies:

LVM / VxVM
Physical Volume / VxVM Disk
Logical Volume / Volume
Volume Group / Disk Group
Physical Extent / Subdisk
LVM Metadata / Private Region
Unused Physical Extents / Free Space

LVM, as you can see, is what most of us are exposed to (thanks to Linux and traditional Unix filesystems), so you know the terminology and the equivalents. Now you need to learn the software itself (VxVM).

It's quite easy: just run vxinstall and use all the default options. VxVM has a web-based client as a front-end; the tool is called "vea" (Veritas Enterprise Administrator). vea showed just the physical disks, nothing else; all the other entities (DGs, volumes) were empty. Right-click on your disk and choose "Recover Disk" (with crossed fingers, of course). And voila, the disk groups were imported. It was a VxVM disk after all. The root, swap and stand volumes were shown as detected, but corrupted. Okay. So you run fsck on the disk, correct all the VxFS errors, and mount the root volume. Your config data and other data are all there.
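On the command line, the same recovery would look something like the following sketch (we actually did it through vea; the disk group, volume and mount point names here are placeholders):

```shell
# One-time VxVM initialisation, accepting the defaults
vxinstall

# List the disks VxVM can see, including disk groups on deported disks
vxdisk -o alldgs list

# Import the disk group found on the recovered disk and start its
# volumes (rootdg01 is a placeholder name)
vxdg -C import rootdg01
vxvol -g rootdg01 startall

# Check and repair the VxFS filesystem on the root volume, then mount it
fsck -F vxfs -y /dev/vx/rdsk/rootdg01/rootvol
mount -F vxfs /dev/vx/dsk/rootdg01/rootvol /mount-point
```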

Now for the data on the MSA. Right-click on the MSA disk and choose "Recover Disk", just like earlier. And bam! I got a disk group that showed 2 TB of free space!!! Where is all the data?!

Here is where some concepts learnt in college help. You know, almost certainly, that you have only messed up the metadata and not the actual filesystem data, because the disk was only initialised to be of some type (a VxVM disk), not formatted.

You go back to your root disk and look at /mount-point/etc/fstab (damn, why didn't I look at it before!!!). It said, very promptly, that the disk was of LVM type (device names of VxVM volumes look like /dev/vx/dg-name/volume, whereas LVM has /dev/vg-name/lvol-names). Who would think that the MSA would be configured as LVM while the root disk was VxVM??? Confirm it by running "strings /mount-point/etc/lvmtab": the disk was sitting pretty in there with the VG data.
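The check itself is just two commands, run against the recovered root volume (/mount-point stands in for wherever you have mounted it):

```shell
# The fstab device paths give the game away:
#   LVM:  /dev/vgNN/lvolN
#   VxVM: /dev/vx/dsk/<dg>/<volume>
grep '^/dev/' /mount-point/etc/fstab

# lvmtab is a binary file, but strings pulls out the VG and PV names
strings /mount-point/etc/lvmtab
```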

OK. Now deactivate the disk using vea and delete all the VxVM data on it. Go back to SAM and choose "Import Volume Group". As expected, the LVM data was lost, so SAM asks: do you want to create a new volume group instead??? (Hell, no!)
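From the command line, removing the bogus VxVM metadata would be roughly this (disk group, disk and device names are placeholders; be very sure you have the right disk before running anything like this):

```shell
# Deport the wrongly-created disk group so VxVM releases the disk
vxdg deport msadg01

# Remove the disk from VxVM control and wipe its private region
vxdisk rm msadisk01
/etc/vx/bin/vxdiskunsetup -C c4t0d0
```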

Since I had access to the root disk, I navigated to /mount-point/etc/lvmconf/ and looked for the .conf file that holds the VG configuration backup.
Using that with the vgcfgrestore and vgimport commands, we could restore the VG config. But the superblock was also corrupted (fsck said so), so we had to point fsck at the alternate superblock at block 8192 and its successive copies (I forget the exact block count). fsck again, mount the lvol, and phew, the data was all there!
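The LVM side of the recovery, sketched (vg01, the minor number and the device names are placeholders; the alternate-superblock step assumes an HFS filesystem, which is what the -b option of fsck applies to):

```shell
# Restore the LVM metadata onto the physical volume from the config
# backup found on the recovered root disk
vgcfgrestore -n /dev/vg01 -f /mount-point/etc/lvmconf/vg01.conf /dev/rdsk/c4t0d0

# Recreate the VG device node, import and activate the volume group
mkdir -p /dev/vg01
mknod /dev/vg01/group c 64 0x010000
vgimport -v /dev/vg01 /dev/dsk/c4t0d0
vgchange -a y /dev/vg01

# The primary superblock was corrupt, so use the alternate copy at
# block 8192, then mount the logical volume
fsck -F hfs -b 8192 /dev/vg01/rlvol1
mount /dev/vg01/lvol1 /archive
```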

Now for the anti-climax:
Even though we could recover all the data, we couldn't boot off the local disk that was dumping core. VxVM would hang during boot-up, right after initialising the disks. We had to set up the imaging server again (we also tried the "last_install" kernel, and that wouldn't boot either). A similar thing happened recently on an x86 VM (running on ESX 3.5), which was also a critical server for us. Even though fsck gave no errors, and we could access all the data when the disk was mounted as an external disk on another VM, we could not boot from it. And that's when I gave up and created the setup once again from scratch :( using the data recovered externally. And to top it all, what has a power outage got to do with any of this???

PS: most of this might not be clear due to the language and the sequence in which I have put it. Leave your questions as comments.


Shantanu said...

These experiences relate to the kick a doctor gets when he does a bypass operation (no way am I comparing the engineering field with a perceived divine profession).

Well yes... there is some jargon here which is technically specific.

A basic question: now, hypothetically, if the Kaveri basin dries up and the company decides to get onto the nuclear bandwagon, then during the switch is the system still prone to the problem you faced?

Also, with my limited googling skills, I was not able to locate "sam".
Please explain madi :)

Girish said... As for SAM (System Administration Manager): it's like smitty on AIX or "smc" on Solaris. Menu-driven admin tasks can be done through it; it essentially runs the command-line tools in the background and gives you the results...

I guess we can only associate a risk factor with power problems. MOST machines that day came up without any problems; about 1% of machines (out of 2k servers) complained of boot problems, and if you are unlucky, one of those can be a critical setup too!