Using alt_disk_copy: Cloning rootvg
Most system administrators have experienced the following scenario:
- A failed ML upgrade.
- It's getting to the end of the day.
- You cannot fix it.
- It's too late to get it resolved by third-party support.
- You need to back out.
Typically, this situation requires a rootvg restore, whether from a tape mksysb
or over a network boot. There is no doubt it is a pain! If you use alt_disk_copy
to take a copy of your rootvg beforehand, recovering rootvg to its pre-upgrade
state takes only as long as a reboot. This article demonstrates how to implement
alt_disk_copy when applying an AIX upgrade and how to recover rootvg.
alt_disk_copy can also be used for testing two different versions of AIX:
upgrade one disk and boot off it, and when you need to go back to the other
version, boot off the other disk instead. It is also often used to clone rootvg
to a spare disk as a regular online backup, and as a tool for migrating rootvg
to new hardware.
This article focuses on a typical two-disk software mirror set-up for rootvg.
However, alt_disk_copy is not restricted to this two-disk set-up; the same
principles apply to set-ups with more than two mirrored disks.
The alt_disk utilities consist of the following commands:
- alt_disk_copy performs disk cloning.
- alt_rootvg_op performs maintenance operations on the cloned rootvg.
- alt_disk_mksysb performs a mksysb copy.
This demonstration does not discuss alt_disk_mksysb.
The filesets required for the alt commands are:
bos.alt_disk_install.boot_images
bos.alt_disk_install.rte
bos.msg.en_US.alt_disk_install.rte
|
Because the alt_disk_copy command copies the currently running rootvg to another
disk, make sure every file system you want cloned is mounted: alt_disk_copy only
copies the file systems in rootvg that are currently mounted. There is no need
to stop processes before running alt_disk_copy. However, the copy can take some
time and is taken from a running system, so it is best done at a quiet time,
such as lunchtime or in the evening. Once the copy has completed, you will be
presented with two rootvg volume groups:
rootvg
altinst_rootvg
|
where altinst_rootvg is the cloned rootvg, which is inactive (varied off). The
cloned rootvg has all its logical volumes prefixed with 'alt_'. The boot list is
also changed to boot off altinst_rootvg; AIX assumes you will want to boot off
the clone rather than the real rootvg. If the system is now rebooted, then when
it comes back up the original rootvg will have become:
old_rootvg
|
The original altinst_rootvg becomes:
rootvg
|
If you decide to reboot off the old_rootvg, when the system comes back up, the
old_rootvg becomes:
rootvg
|
The rootvg you booted away from (the clone) becomes:
altinst_rootvg
|
Do not worry about the renaming of the original and cloned rootvg. I will
demonstrate this shortly.
After a successful upgrade, the cloned rootvg can be destroyed using
alt_rootvg_op and its disk mirrored back into rootvg. If the upgrade has gone
disastrously, there is no real problem: take a snapshot for third-party support,
then boot off the good rootvg. For users logging in, it is business as normal.
When support responds with a fix, reboot off the upgraded rootvg during off-line
hours and apply it. There is no need to go through the time-consuming task of
re-applying the upgrade, because it is already on that disk. Test the upgrade,
and if all is OK, destroy the remaining cloned rootvg and mirror its disk back
in.
Do not use importvg or exportvg on the cloned rootvg; use the alt commands
instead.
You can mount the file systems of the cloned rootvg by waking up the disk using
alt_rootvg_op. Do whatever work is required on the cloned file systems (for
example, fix a patch or a link, or gather information for third-party support),
then put the disk back to sleep, which also unmounts the file systems.
When cloning, you can exclude certain directories by listing them in the file
/etc/exclude.rootvg. Each entry should start with the characters '^./'. The '^'
anchors the search to the beginning of the line, and './' means relative to the
current directory. This is advisable because alt_disk_copy uses grep to search
for each string, and an unanchored entry could match paths you did not intend.
So, provide the full pathname prefixed with '^.'. For example, to exclude the
following directories:
/home/reps
/opt/installs
|
insert the following into the /etc/exclude.rootvg file:
^./home/reps
^./opt/installs
|
Make sure there are no empty lines after the last entry.
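The entry format described above can be generated rather than typed by hand. The following is a minimal sketch only; the function name make_exclude_entries is hypothetical and is not part of the alt_disk utilities. It prints one '^.'-prefixed entry per absolute path and skips anything that is not an absolute path:

```shell
#!/bin/sh
# Hypothetical helper (not an AIX command): print exclude-list entries for
# the given absolute directory paths, each prefixed with '^.'.
make_exclude_entries() {
    for dir in "$@"; do
        case "$dir" in
            /*) printf '^.%s\n' "$dir" ;;                      # e.g. ^./home/reps
            *)  echo "skipping non-absolute path: $dir" >&2 ;;  # warn, print nothing
        esac
    done
}

# Preview the entries for the two directories used in the example above.
make_exclude_entries /home/reps /opt/installs
```

Once the preview looks right, the output could be redirected into /etc/exclude.rootvg. Because each entry is written with a single trailing newline, no empty line is left after the last entry.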
Let's now go through a typical clone. Assume you have a two-disk software mirror
of rootvg (hdisk0 and hdisk1), and that you are going to apply an ML upgrade (or
an application upgrade, if the application is installed in rootvg) on this
system. I will demonstrate one way to clone the disk and, after a successful
upgrade, bring the disk back into rootvg and re-mirror. I will also demonstrate
the actions you can take if the upgrade fails.
Before unmirroring rootvg, take some time to ensure that it is correctly
mirrored and has no stale LVs, because if it does, unmirrorvg will fail. Of
course, you could always use migratepv to move the missing LVs across if
unmirrorvg fails. A simple way to check that you are mirrored is to issue the
command:
lsvg -l rootvg
|
For each row of data output, check that the value in the PPs column is double
that in the LPs column.
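The PPs-versus-LPs comparison can be automated. This is a sketch under assumptions: the function name check_mirrored is hypothetical, and it assumes the usual lsvg -l layout of a volume group line, a header line, and then rows whose third and fourth fields are LPs and PPs.

```shell
#!/bin/sh
# Hypothetical helper: read 'lsvg -l rootvg' output on stdin and report every
# logical volume whose PPs value is not exactly twice its LPs value.
# Exit status is 0 when all rows look mirrored, 1 otherwise.
check_mirrored() {
    awk '
        NR <= 2 { next }        # skip the "rootvg:" line and the column header
        $4 != 2 * $3 {          # assumed layout: $3 = LPs, $4 = PPs
            print $1 ": LPs=" $3 " PPs=" $4
            bad = 1
        }
        END { exit bad + 0 }
    '
}
```

It would be used as lsvg -l rootvg | check_mirrored. Any logical volume that is deliberately left unmirrored (a dedicated dump device, for instance) would also be reported, so treat the output as a prompt to look closer rather than as a verdict.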
Another way to check that you are mirrored is to run lspv -l <hdiskx> for each
disk and compare the output to make sure there are entries for each LV on both
disks.
Next, issue the bosboot command. I personally always do this before rebooting or
performing disk operations involving rootvg; it is a good habit to have:
# bosboot -a
bosboot: Boot image is 35803 512 byte blocks.
|
A listing of the disks being used for this demonstration is as follows:
# lspv
hdisk0 0041a97b0622ef7f rootvg active
hdisk1 00452f0b2b1ec84c rootvg active
|
Next, unmirror rootvg and take the disk that is going to be used for cloning out
of rootvg. This demonstration uses hdisk1 to clone rootvg, so issue the
unmirrorvg command:
# unmirrorvg rootvg hdisk1
0516-1246 rmlvcopy: If hd5 is the boot logical volume, please run 'chpv -c <disk name>'
as root user to clear the boot record and avoid a potential boot
off an old boot image that may reside on the disk from which this
logical volume is moved/removed.
0516-1804 chvg: The quorum change takes effect immediately.
0516-1144 unmirrorvg: rootvg successfully unmirrored, user should perform
bosboot of system to reinitialize boot records. Then, user must modify
bootlist to just include: hdisk0.
|
Next, take hdisk1 out of rootvg in readiness for the cloning:
# reducevg rootvg hdisk1
|
Confirm that the disk is no longer assigned to any volume group:
# lspv
hdisk0 0041a97b0622ef7f rootvg active
hdisk1 00452f0b2b1ec84c None
|
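The same confirmation can be scripted as a guard before cloning. This is a sketch only; vg_of_disk and disk_is_free are hypothetical helper names, and the parsing assumes the lspv layout shown above (disk name, PVID, volume group, optional state).

```shell
#!/bin/sh
# Hypothetical helper: read 'lspv' output on stdin and print the volume group
# recorded against the named disk ("None" when the disk is unassigned).
vg_of_disk() {
    awk -v disk="$1" '$1 == disk { print $3 }'
}

# Hypothetical helper: succeed only when the named disk is listed and its
# volume group column reads "None".
disk_is_free() {
    [ "$(vg_of_disk "$1")" = "None" ]
}
```

For example, lspv | disk_is_free hdisk1 && echo "hdisk1 is ready for cloning" prints the message only when hdisk1 has been removed from every volume group.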
Now you are ready to issue the alt_disk_copy. Simply supply hdisk1 as a
parameter to the command. The basic format is:
alt_disk_copy -d <hdisk to clone rootvg>
|
To use an exclude list, the basic format is:
alt_disk_copy -e /etc/exclude.rootvg -d <hdisk to clone rootvg>
|
The following output from the alt_disk_copy command has been truncated:
# alt_disk_copy -d hdisk1
Calling mkszfile to create new /image.data file.
Checking disk sizes.
Creating cloned rootvg volume group and associated logical volumes.
Creating logical volume alt_hd5
Creating logical volume alt_hd6
Creating logical volume alt_hd8
Creating logical volume alt_hd4
Creating logical volume alt_hd2
Creating logical volume alt_hd9var
Creating logical volume alt_hd3
Creating logical volume alt_hd1
Creating logical volume alt_hd10opt
Creating /alt_inst/ file system.
Creating /alt_inst/home file system.
Creating /alt_inst/opt file system.
Creating /alt_inst/tmp file system.
…......
…......
for backup and restore into the alternate file system...
Backing-up the rootvg files and restoring them to the
alternate file system...
Modifying ODM on cloned disk.
Building boot image on cloned disk.
forced unmount of /alt_inst/var
forced unmount of /alt_inst/usr
forced unmount of /alt_inst/tmp
forced unmount of /alt_inst/opt
forced unmount of /alt_inst/home
…..
…..
Changing logical volume names in volume group descriptor area.
Fixing LV control blocks...
Fixing file system superblocks...
Bootlist is set to the boot disk: hdisk1
|
At this stage, you now have a cloned rootvg called altinst_rootvg. Notice in the
previous output that alt_disk_copy has changed the bootlist to boot off the
cloned rootvg, which is now on hdisk1.
# lspv
hdisk0 0041a97b0622ef7f rootvg active
hdisk1 00452f0b2b1ec84c altinst_rootvg
|
This can be confirmed by issuing the bootlist command:
# bootlist -m normal -o
hdisk1 blv=hd5
|
At this point the ML upgrade can be installed, after which the system must be
rebooted. For this demonstration, the ML upgrade is installed on the real rootvg
(that is, hdisk0), so change the bootlist back to hdisk0 now, because you want
the system to come up running the new upgrade.
# bootlist -m normal hdisk0
|
Confirm the change of the bootlist:
# bootlist -m normal -o
hdisk0 blv=hd5
|
Next, install the ML upgrade, then reboot. After rebooting, the system presents
the following rootvg and cloned rootvg. As can be seen, neither volume group has
been renamed, because we booted off the real rootvg (hdisk0):
# lspv
hdisk0 0041a97b0622ef7f rootvg active
hdisk1 00452f0b2b1ec84c altinst_rootvg
|
Next, let's assume the upgrade has gone OK and the support users and the systems
administrator have signed it off with no issues found. The cloned rootvg can now
be destroyed, and the disk brought back into rootvg for mirroring. Use the
alt_rootvg_op command with the -X flag to destroy the cloned rootvg. The basic
format is:
alt_rootvg_op -X <cloned rootvg to destroy>
# alt_rootvg_op -X altinst_rootvg
Bootlist is set to the boot disk: hdisk0
|
Next, extend rootvg to bring hdisk1 back in, and then mirror onto the disk:
# extendvg -f rootvg hdisk1
# mirrorvg rootvg hdisk1
0516-1804 chvg: The quorum change takes effect immediately.
0516-1126 mirrorvg: rootvg successfully mirrored, user should perform
bosboot of system to initialize boot records. Then, user must modify
bootlist to include: hdisk0 hdisk1.
|
Change the bootlist to include both disks and run bosboot:
# bootlist -m normal -o hdisk0 hdisk1
hdisk0 blv=hd5
hdisk1
# bosboot -a
bosboot: Boot image is 35803 512 byte blocks.
# bootlist -m normal -o
hdisk0 blv=hd5
hdisk1 blv=hd5
|
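The clean-up sequence above is the same every time, so it can help to print it for review before typing anything. The sketch below only prints commands and runs none of them; print_remirror_plan is a hypothetical name, and the bosboot-then-bootlist ordering follows the mirrorvg message rather than anything the tools enforce.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the clean-up commands used once an
# upgrade has been signed off.
#   $1 = disk holding the running rootvg
#   $2 = disk holding the clone to destroy and re-mirror
#   $3 = name of the cloned volume group (defaults to altinst_rootvg)
print_remirror_plan() {
    boot_disk=$1
    clone_disk=$2
    clone_vg=${3:-altinst_rootvg}
    cat <<EOF
alt_rootvg_op -X $clone_vg
extendvg -f rootvg $clone_disk
mirrorvg rootvg $clone_disk
bosboot -a
bootlist -m normal $boot_disk $clone_disk
EOF
}

# The case demonstrated above: running on hdisk0, clone on hdisk1.
print_remirror_plan hdisk0 hdisk1
```

Printing the plan first gives a chance to check the disk names against lspv before anything destructive is run.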
For this demonstration, that's it: mission accomplished. The upgrade is
installed with no issues. The system is operational. That's pretty much how
alt_disk_copy works if all goes OK. But what if the upgrade fails? What options
do you have? Let's look at that next.
Let's now assume you have just installed the ML upgrade and rebooted, and issues
have been found with the running AIX system. Remember, you currently have the
disks in the following state:
# lspv
hdisk0 0041a97b0622ef7f rootvg active
hdisk1 00452f0b2b1ec84c altinst_rootvg
|
At this point, take a snapshot of the running system, ready for the call that
you will undoubtedly log with third-party support. Taking stock of the current
situation, you have:
- rootvg: with post-upgrade issues.
- altinst_rootvg: with a good pre-upgrade copy.
To get back to the pre-upgrade state, simply change the bootlist to boot off
hdisk1 (altinst_rootvg), then reboot. It's that simple:
# bootlist -m normal -o hdisk1
hdisk1 blv=hd5
# bootlist -m normal -o
hdisk1 blv=hd5
# shutdown -Fr
|
After the reboot, you will be presented with the following rootvg disks:
# lspv
hdisk0 0041a97b0622ef7f old_rootvg
hdisk1 00452f0b2b1ec84c rootvg active
|
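Because the volume group names swap around depending on which disk was booted, a small helper that labels each disk can save confusion. This is a sketch only; describe_rootvg_state is a hypothetical name, and the labels simply restate the naming behaviour described in this article.

```shell
#!/bin/sh
# Hypothetical helper: read 'lspv' output on stdin and label each disk that
# belongs to rootvg or to one of the alternate rootvg names used in this article.
describe_rootvg_state() {
    awk '
        $3 == "rootvg"         { print $1 ": rootvg (running copy)" }
        $3 == "altinst_rootvg" { print $1 ": altinst_rootvg (alternate copy)" }
        $3 == "old_rootvg"     { print $1 ": old_rootvg (copy the system was booted away from)" }
    '
}
```

Running lspv | describe_rootvg_state after each reboot shows at a glance which disk the system came up on. A clone that has been renamed with alt_rootvg_op -v would not be recognised by this sketch.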
Next, issue a bosboot and confirm the bootlist:
# bosboot -a
bosboot: Boot image is 35803 512 byte blocks.
# bootlist -m normal -o
hdisk1 blv=hd5
|
The system is now back to the pre-upgrade state.
At a convenient time agreed with the end users, and with the information
provided by third-party support, you can then boot off the disk with the failed
ML upgrade (hdisk0) and apply a fix that might solve the issue. Change the
bootlist to boot off hdisk0 (old_rootvg) and reboot:
# bootlist -m normal -o hdisk0
# shutdown -Fr
|
After the reboot, in readiness to apply the fix, you will be presented with the
following rootvg disks:
# lspv
hdisk0 0041a97b0622ef7f rootvg active
hdisk1 00452f0b2b1ec84c altinst_rootvg
|
Next, assume the fix has been applied, or the instructions on how to fix it have
been carried out, and the system is now operational again.
After the system has been tested and signed off, bring hdisk1 back in using the
commands described earlier:
# alt_rootvg_op -X altinst_rootvg
Bootlist is set to the boot disk: hdisk0
# extendvg -f rootvg hdisk1
# mirrorvg rootvg hdisk1
# bootlist -m normal -o hdisk0 hdisk1
hdisk0 blv=hd5
hdisk1
# bosboot -a
bosboot: Boot image is 35803 512 byte blocks.
# bootlist -m normal -o
hdisk0 blv=hd5
hdisk1 blv=hd5
# lspv
hdisk0 0041a97b0622ef7f rootvg active
hdisk1 00452f0b2b1ec84c rootvg active
|
Within a cloned rootvg environment, you can wake up the cloned rootvg to make it
active. All file systems from the cloned rootvg are then mounted. This is quite
useful because you keep a good running system while mounting the file systems
from the cloned rootvg for further investigation or file modification. When a
cloned rootvg is woken up, it is renamed to:
altinst_rootvg
|
Do not issue a reboot while the cloned rootvg file systems are still mounted,
because unexpected results can occur. You can also rename a cloned rootvg, which
is useful when you have more than one cloned rootvg.
Assume you have the disks in the following state:
# lspv
hdisk0 0041a97b0622ef7f old_rootvg
hdisk1 00452f0b2b1ec84c rootvg active
|
To wake up a disk, the basic format is:
alt_rootvg_op -W -d <hdisk>
|
Let's now wake up old_rootvg (hdisk0):
# alt_rootvg_op -W -d hdisk0
Waking up old_rootvg volume group ...
|
Checking the state of the disks, you can see that old_rootvg has been renamed to
altinst_rootvg and is now active:
# lspv
hdisk0 0041a97b0622ef7f altinst_rootvg active
hdisk1 00452f0b2b1ec84c rootvg active
|
The cloned file systems have been mounted under /alt_inst:
# df -m
Filesystem        MB blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4             128.00    102.31   21%     2659    11% /
/dev/hd2            1968.00    111.64   95%    40407    58% /usr
/dev/hd9var          112.00     77.82   31%      485     3% /var
/dev/hd3              96.00     69.88   28%      330     3% /tmp
/dev/hd1             208.00    118.27   44%     1987     7% /home
/proc                     -         -     -        -      - /proc
/dev/hd10opt        1712.00   1445.83   16%     6984     3% /opt
/dev/alt_hd4         128.00    102.16   21%     2645    11% /alt_inst
/dev/alt_hd1         208.00     33.64   84%     1987    21% /alt_inst/home
/dev/alt_hd10opt    1712.00   1445.77   16%     6984     3% /alt_inst/opt
/dev/alt_hd3          96.00     72.38   25%      335     2% /alt_inst/tmp
/dev/alt_hd2        1968.00    100.32   95%    40407    59% /alt_inst/usr
/dev/alt_hd9var      112.00     77.53   31%      477     3% /alt_inst/var
|
At this point you can access the cloned file systems, so file modification or
further investigation can be carried out on the cloned rootvg. Once these tasks
are complete, put the cloned rootvg to sleep and, in the same operation, rebuild
the boot image on that disk. The basic format of the command is:
alt_rootvg_op -S -t <hdisk>
|
Let's now put the altinst_rootvg to sleep:
# alt_rootvg_op -S -t hdisk0
Putting volume group altinst_rootvg to sleep ...
Building boot image on cloned disk.
forced unmount of /alt_inst/var
forced unmount of /alt_inst/usr
forced unmount of /alt_inst/tmp
forced unmount of /alt_inst/opt
forced unmount of /alt_inst/home
forced unmount of /alt_inst
forced unmount of /alt_inst
Fixing LV control blocks...
Fixing file system superblocks...
|
The current state of the disks is now:
# lspv
hdisk0 0041a97b0622ef7f altinst_rootvg
hdisk1 00452f0b2b1ec84c rootvg active
|
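Since a reboot with the cloned file systems still mounted is to be avoided, it can be worth wrapping the wake-up and put-to-sleep steps so the second always follows the first. The sketch below makes several assumptions: with_clone_awake and the ALT_OP variable are hypothetical names, ALT_OP exists only so the sequence can be dry-run with a stand-in command, and the -W and -S invocations are copied from the forms shown in this article.

```shell
#!/bin/sh
# ALT_OP is a hypothetical override hook; by default the real command is used.
ALT_OP=${ALT_OP:-alt_rootvg_op}

# Hypothetical helper: wake the clone on the given disk, run the supplied
# command, then put the clone back to sleep even if that command failed.
# Returns the supplied command's exit status.
with_clone_awake() {
    disk=$1
    shift
    "$ALT_OP" -W -d "$disk" || return 1   # wake up: mounts under /alt_inst
    "$@"                                  # the work to do on the cloned file systems
    status=$?
    "$ALT_OP" -S -t "$disk" || return 1   # put to sleep, as invoked in this article
    return "$status"
}
```

A typical call might be with_clone_awake hdisk0 ls /alt_inst/var, which lists a directory on the clone and leaves it asleep again afterwards.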
From the above demonstration, you can see the cloned rootvg name stayed the
same: altinst_rootvg.
It is sometimes good to go back to the original naming of the disks to save
confusion, especially if you have more than one cloned disk. So rename
altinst_rootvg back to old_rootvg. The basic format is:
alt_rootvg_op -v <new cloned rootvg name> -d <hdisk>
|
So in this example, you would issue:
# alt_rootvg_op -v old_rootvg -d hdisk0
# lspv
hdisk0 0041a97b0622ef7f old_rootvg
hdisk1 00452f0b2b1ec84c rootvg active
|
Of course, you could rename the cloned rootvg to something more meaningful, if
so desired.
# alt_rootvg_op -v bad_rootvg -d hdisk0
# lspv
hdisk0 0041a97b0622ef7f bad_rootvg
hdisk1 00452f0b2b1ec84c rootvg active
|
You cannot rename a cloned rootvg to altinst_rootvg; it is a reserved name.
From this point, what you do next depends on whether the fix succeeded; either
way, use the commands described earlier.
If the fix worked on hdisk0 (old_rootvg), then run with the new ML version.
Set the bootlist so that the system will boot off hdisk0:
# bootlist -m normal -o hdisk0
|
Reboot:
# shutdown -Fr
|
Destroy the cloned rootvg on hdisk1 (because we rebooted off old_rootvg, the
previous rootvg on hdisk1 has become altinst_rootvg):
# alt_rootvg_op -X altinst_rootvg
|
Bring hdisk1 into rootvg for mirroring:
# extendvg -f rootvg hdisk1
# mirrorvg rootvg hdisk1
# bosboot -a
# bootlist -m normal -o hdisk0 hdisk1
|
If the fix did not work, then stay at the same ML version, and fix another day.
Set the bootlist so that the system will boot off hdisk1:
# bootlist -m normal -o hdisk1
|
Destroy the cloned rootvg (old_rootvg) on hdisk0:
# alt_rootvg_op -X old_rootvg
|
Bring hdisk0 into rootvg for mirroring:
# extendvg -f rootvg hdisk0
# mirrorvg rootvg hdisk0
# bosboot -a
# bootlist -m normal -o hdisk0 hdisk1
|
This article showed that the alt_disk commands offer a quick way to recover
rootvg if things go wrong during an AIX upgrade, and how to mount the cloned
rootvg file systems on a running system. They can also provide a path for
migrating a rootvg disk to other hardware, and are very useful for keeping two
different versions of AIX installed when testing migration procedures.