Use lkdev For Oracle ASM Disks and VSCSI Disks – But Use with Caution!

Sep 29, 2015


The AIX Logical Volume Manager (LVM) makes the management of disks, volume groups, and logical volumes very easy, especially when compared to many of the other UNIX and Linux volume managers. However, there are several instances where LVM passes control of the disks to another entity, and in these cases the disks can appear to be unused (which could lead to them being mistakenly changed or removed) when in fact they are in use. I will discuss two of these scenarios today: Oracle ASM disks on AIX servers and Virtual SCSI disks on VIO Servers. I’ll show how to use the “lkdev” command to make the disk usage obvious and to prevent the problems caused by unintentionally changing or removing disks that are in use, and I’ll provide some warnings about dangers to look out for.

Oracle ASM Disks

Oracle offers a disk management option called ASM which it claims can offer some performance advantages over using the AIX LVM to manage the disks. I don’t want to discuss the pros and cons of using ASM disks – I’ll just offer a brief overview of how it works and how the disks look from the AIX perspective, and how the lkdev command can help.

To use ASM, you start by allocating and discovering disk storage just like you would for regular LVM usage. You then use the “mknod” command to establish a link between the hdisk and a new device name of your choosing (using the disk’s major/minor numbers to establish the link). You then grant the oracle userid or group write access to the new device name. At that point, Oracle can access and manage the disk using the new device name. The problem is that when a system administrator issues the “lspv” command, it looks like the hdisk is unused, which could lead to an uninformed admin accidentally changing or removing it.

Let’s look at a brief example with a hypothetical AIX server which initially only has a single rootvg disk:

# lspv
NAME            PVID                                VG              STATUS
hdisk0          00f66e57f9ceb0ac                    rootvg          active

The SAN admin allocates 2 new disks, and after discovering them with cfgmgr, you have:

# lspv
NAME            PVID                                VG              STATUS
hdisk0          00f66e57f9ceb0ac                    rootvg          active
hdisk1          none                                None
hdisk2          none                                None

On hdisk1, you use LVM commands to create a volume group called “oraclevg”, after which you have:

# lspv
NAME            PVID                                VG              STATUS
hdisk0          00f66e57f9ceb0ac                    rootvg          active
hdisk1          00f66e57fa1796bc                    oraclevg        active
hdisk2          none                                None

Now, with hdisk2 you use the “mknod” command to create a new ASM device called /dev/asm_disk2, and grant access to the oracle userid. Note that hdisk2 has major/minor numbers = 19, 18:

# mknod /dev/asm_disk2 c 19 18
# chown oracle:dba /dev/asm_disk2
# chmod 660 /dev/asm_disk2
# ls -l /dev/* | grep "19, 18"
crw-rw----    1 oracle   dba          19, 18 Sep  8 15:50 /dev/asm_disk2
brw-------    1 root     system       19, 18 Nov 22 2014  /dev/hdisk2
crw-------    1 root     system       19, 18 Nov 22 2014  /dev/rhdisk2
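
If you would rather not read the major/minor numbers by eye, they can be pulled out of the “ls -l” listing. The following is only a sketch: the function name dev_numbers is my own, and it assumes the AIX “ls -l” layout shown above, where the major number (with a trailing comma) is the fifth field and the minor number is the sixth.

```shell
# Print "major minor" for each `ls -l` device line read from stdin.
# Assumes the layout shown above: field 5 is "19," and field 6 is "18".
dev_numbers() {
    awk '{ sub(/,$/, "", $5); print $5, $6 }'
}

# Hypothetical usage on a live system (not executed here):
#   set -- $(ls -l /dev/rhdisk2 | dev_numbers)
#   mknod /dev/asm_disk2 c "$1" "$2"
```

Checking the result against the listing before running mknod is still a good idea, since a wrong major/minor pair would point the new device name at a different disk.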

Oracle now has access to hdisk2 – however, if you use “lspv”, it still appears to be unused:

# lspv
NAME            PVID                                VG              STATUS
hdisk0          00f66e57f9ceb0ac                    rootvg          active
hdisk1          00f66e57fa1796bc                    oraclevg        active
hdisk2          none                                None

The “lkdev” command can help you in two ways:

  1. It “locks” the disk in the sense that it can’t be altered (via the chdev command) or removed (via the rmdev command). However, it can still be used by Oracle or whatever is using it.
  2. It allows you to add a comment, so you can indicate the usage of the disk.

So, let’s lock the disk and add our ASM device name “asm_disk2” as our comment, and see how it looks in lspv:

# lkdev -l hdisk2 -a -c asm_disk2
# lspv
NAME            PVID                                VG              STATUS
hdisk0          00f66e57f9ceb0ac                    rootvg          active
hdisk1          00f66e57fa1796bc                    oraclevg        active
hdisk2          none                                asm_disk2       locked

Our “asm_disk2” comment shows up where the volume group is normally listed, and the status shows “locked” instead of being blank. It is easy to see at a glance what the disk is used for, and we can no longer accidentally (or on purpose) delete the disk:

# rmdev -dl hdisk2
rmdev: 0514-558 Cannot perform the requested function because hdisk2
        is currently locked.

We have to unlock the disk if we want to change or delete the disk. And we do that with the lkdev command again, by using the “-d” flag:

# lkdev -l hdisk2 -d
hdisk2 unlocked
# lspv
NAME            PVID                                VG              STATUS
hdisk0          00f66e57f9ceb0ac                    rootvg          active
hdisk1          00f66e57fa1796bc                    oraclevg        active
hdisk2          none                                None
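
On a server with many disks, it can help to list the ones that still show no volume group and no status, since those are the candidates to investigate and possibly lock. This is a minimal sketch, assuming the four-column lspv layout shown above; the function name unlabeled_disks is hypothetical.

```shell
# Read `lspv` output on stdin and print the disks that have "None" in the
# VG column and an empty STATUS column, i.e. disks that merely look unused.
unlabeled_disks() {
    awk '$1 != "NAME" && $3 == "None" && $4 == "" { print $1 }'
}

# Hypothetical usage on a live system (not executed here):
#   lspv | unlabeled_disks
```

A disk on this list is not necessarily free. It only means that nothing in lspv says what it is for, so check with the DBAs before touching it.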

VSCSI Disks

If you use VIO Servers and map hdisks to the client LPARs using Virtual SCSI adapters, you encounter a similar situation to the ASM disks, where the disks are in use, but appear to be unused when listing them with the lspv command. And likewise, the lkdev command can help to more easily identify the usage of the disks, and to prevent unwanted changes to the disks.

Let’s go through another brief example. We have a VIO Server that has 3 disks – hdisk0 is used for the VIO Server’s rootvg disk, and the other 2 disks (hdisk1 and hdisk2) are currently unused. After logging on as padmin, we map hdisk1 and hdisk2 to two different Virtual SCSI server adapters, to be used as rootvg disks by two different client LPARs:

$ lspv
NAME            PVID                                VG              STATUS
hdisk0          00ceab3c5d51f0ea                    rootvg          active
hdisk1          00ceab3c6edbaec5                    None
hdisk2          00ceab3c8e55a8dd                    None
$ mkvdev -vdev hdisk1 -vadapter vhost0 -dev client1_rootvg
$ mkvdev -vdev hdisk2 -vadapter vhost1 -dev client2_rootvg
$ lsmap -vadapter vhost0
SVSA            Physloc                                      Client Partition ID
--------------- -------------------------------------------- ------------------
vhost0          U7879.001.DQM0Z74-P1-C5-T1-W507680140EC78-L6 0x00000003

VTD                   client1_rootvg
Status                Available
LUN                   0x8100000000000000
Backing device        hdisk1
Physloc               U7311.D11.102199B-P1-C2-T1
Mirrored              false
$ lsmap -vadapter vhost1
SVSA            Physloc                                      Client Partition ID
--------------- -------------------------------------------- ------------------
vhost1          U7879.001.DQM0Z74-P1-C5-T1-W507680140C639-L3 0x00000004

VTD                   client2_rootvg
Status                Available
LUN                   0x8100000000000000
Backing device        hdisk2
Physloc               U7311.D11.102109A-P1-C2-T1
Mirrored              false

So, hdisk1 and hdisk2 are now assigned to client LPARs, but you cannot tell that by using lspv:

$ lspv
NAME            PVID                                VG              STATUS
hdisk0          00ceab3c5d51f0ea                    rootvg          active
hdisk1          00ceab3c6edbaec5                    None
hdisk2          00ceab3c8e55a8dd                    None

So, let’s lock the disks and this time we’ll add the device mapping name as our comments for the hdisks. We’ll have to use the oem_setup_env command to switch to root prior to executing the lkdev commands, since lkdev is not available in the padmin shell:

$ oem_setup_env
# lkdev -l hdisk1 -a -c client1_rootvg
# lkdev -l hdisk2 -a -c client2_rootvg
# lspv
NAME            PVID                                VG              STATUS
hdisk0          00ceab3c5d51f0ea                    rootvg          active
hdisk1          00ceab3c6edbaec5                    client1_rootvg  locked
hdisk2          00ceab3c8e55a8dd                    client2_rootvg  locked

Now, it is very easy for us to see how our disks are being utilized, whether we are in the padmin or root shells. And the disks are locked so they can’t mistakenly be removed:

# rmdev -dl hdisk1
rmdev: 0514-558 Cannot perform the requested function because hdisk1
        is currently locked.
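
With more than a handful of mappings, typing each lkdev command by hand gets tedious. The sketch below reads lsmap output and prints one lkdev command per hdisk-backed mapping, using the VTD name as the comment. It assumes the lsmap layout shown above, the function name lkdev_cmds_from_lsmap is my own, and it only prints the commands so they can be reviewed before anything is run.

```shell
# Read `lsmap` output on stdin and print one lkdev command per mapping whose
# backing device is an hdisk. Commands are printed, not executed.
lkdev_cmds_from_lsmap() {
    awk '
        $1 == "VTD" { vtd = $2 }
        $1 == "Backing" && $2 == "device" {
            # Skip logical-volume or file backed devices; only lock whole hdisks.
            if (vtd != "" && $3 ~ /^hdisk[0-9]+$/)
                print "lkdev -l " $3 " -a -c " vtd
            vtd = ""
        }
    '
}

# Hypothetical usage (not executed here): save the lsmap output as padmin,
# then run the function from the root shell after oem_setup_env:
#   lkdev_cmds_from_lsmap < /tmp/lsmap.out
```

Printing rather than executing is a deliberate choice here, given the version-specific problems described under Final Thoughts.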

Final Thoughts

The lkdev command adds a layer of protection to vulnerable disks that are in use on AIX servers, but not managed by the AIX LVM. It is not without flaws, however, and caution should be exercised when using it. It blocks the use of the chdev and rmdev commands, but some disk-altering commands will still work, even on disks that are locked. For example, locked disks can still be added to an existing volume group (extendvg), or be used to create a new volume group (mkvg). Some of these commands may fail because the disks are “busy”, but using a force flag could override it, causing problems.
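
Since extendvg and mkvg do not honor the lock, one option is to check the lock status yourself before running them. The following is a sketch of such a check, not a feature of lkdev: the function name disk_is_locked and the volume group name in the usage comment are hypothetical, and it assumes the four-column lspv layout shown earlier.

```shell
# Return 0 if the named disk shows "locked" in the STATUS column of the
# `lspv` output read from stdin, and 1 otherwise.
disk_is_locked() {
    awk -v d="$1" '$1 == d && $4 == "locked" { found = 1 } END { exit !found }'
}

# Hypothetical usage on a live system (not executed here):
#   if lspv | disk_is_locked hdisk2; then
#       echo "hdisk2 is locked; refusing to run mkvg" >&2
#   else
#       mkvg -y datavg hdisk2
#   fi
```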

An extra word of caution if you plan to use lkdev on VIO Servers (or AIX servers without recent service packs) – I tested lkdev on two different versions of VIO software (v2.2.3.52 and v2.2.3.3), and ran into problems on both versions.

With the more recent version, v2.2.3.52, the lkdev command successfully locked the disk. However, when I tried to unlock it, I was met with the following error:

# lkdev -l hdisk1 -a -c lkdev_test1
hdisk1 locked
# lspv | grep hdisk1
hdisk1         00c991fa4ea0351d                   lkdev_test1     locked
# lkdev -l hdisk1 -d
lkdev: 0514-518 Cannot access the CuLk object class in the device
        configuration database.

The only way to unlock it at that point was by using ODM commands:

# lspv | grep hdisk1
hdisk1         00c991fa4ea0351d                   lkdev_test1     locked
# odmget CuLk
CuLk:
        name = "hdisk1"
        status = 8192
        label = "lkdev_test1"
# odmdelete -q name="hdisk1" -o CuLk
0518-307 odmdelete: 1 objects deleted.
# lspv | grep hdisk1
hdisk1         00c991fa4ea0351d                   None

The older version of VIO software, v2.2.3.3, was actually worse – the lkdev command seemingly locked the disk – however you could still rmdev the disk!

# lkdev -l hdisk1 -a -c lkdev_test1
hdisk1 locked
# lspv | grep hdisk1
hdisk1         00c991fa4ea0351d                   lkdev_test1     locked
# rmdev -dl hdisk1
hdisk1 deleted

That version of VIO software also had the same ODM issue that the newer version had, but that problem seems minor when compared to being able to remove a “locked” disk.

I opened a PMR with IBM, and they quickly issued ifixes for both versions of VIO Software, pertaining to existing APAR IV74417, which addresses the ODM issue. I downloaded and installed both ifixes (IV74417s3a for VIOS v2.2.3.3 and IV74417s5a for VIOS v2.2.3.52). No reboot was necessary and both ifixes indeed solved the ODM problem. However, on VIOS v2.2.3.3, the more significant problem of lkdev not truly locking the disk still existed. I informed IBM Support and they indicated that it isn’t a problem with lkdev – rather it is a problem with the rmdev command. There is already an APAR for this issue (IV56840) – they are creating an ifix for it, and I’ll apply that when it is ready.

Note that regular AIX servers may be vulnerable to these issues as well (if patches aren’t up to date), but the versions I checked were OK.  The bottom line is that the lkdev command can be a useful tool in your arsenal. Just be aware of the limitations, and regardless of whether you are using it on AIX or VIOS – test it out first!
