Back in October 2015, IBM announced the release of PowerHA 7.2 to align with the release of AIX 7.2 in December 2015. If your experience with migrating to PowerHA 7.1 from previous PowerHA versions was anything like mine, you’re probably still recovering from the battle scars. (A friendly reminder that you should be at PowerHA 7.1 by now, as PowerHA 6.1 has been end of support (EOS) since April 2015.)
The good news is that the worst of it is over, as the real pain was felt during the move from RSCT Topology Services to Cluster Aware AIX (CAA) in PowerHA 7.1. With the PowerHA 7.2 release, IBM has added further integration with Cluster Aware AIX and accounted for the new features included with AIX 7.2.
Support for AIX Live Updates
AIX Live Update is a feature included with AIX 7.2 that allows you to apply technology level (TL) and service pack (SP) updates to a partition without rebooting.
I haven’t had a chance to see it in action just yet as AIX 7.2 is fairly new, but my understanding of the Live Update feature is that a temporary partition, essentially a clone of your running partition, is created. The actual updates occur on the temporary LPAR, and once completed, the temporary LPAR becomes active.
PowerHA 7.2 recognizes that a Live Update is occurring, unmanages the resource group, and stops cluster services on the running LPAR. Once the Live Update is completed, cluster services are started on the newly updated LPAR and the resource group is managed by PowerHA again.
Nondisruptive Cluster Upgrades
If your cluster is currently running PowerHA 7.1.3, you’ll be able to perform a nondisruptive upgrade to PowerHA 7.2. In practice, you stop cluster services on all nodes in the cluster with the resource groups unmanaged, which leaves the applications running. At that point, you upgrade the PowerHA filesets and simply start cluster services again so that PowerHA resumes managing the resource groups.
Live Partition Mobility (LPM) Support
PowerHA 7.2 includes hooks into the Live Partition Mobility (LPM) framework that monitor for LPM events and account for any delays or freezes that may occur while the LPM operation is running. You’re able to set the following LPM-specific variables:
HEARTBEAT_FREQUENCY_DURING_LPM – this variable can be used to specify a longer heartbeat failure interval during an LPM operation
LPM_POLICY – this variable can be used to specify whether you’d like the resource group to be managed or unmanaged during an LPM operation. Unmanaging a resource group during an LPM operation protects you from any unwanted cluster events in the middle of your LPAR move.
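To make the interaction between the two variables concrete, here is a toy model in Python. It is only a sketch of the idea described above, not PowerHA's actual logic: the policy values "manage" and "unmanage", the timeout numbers, and the rule that the longer interval wins during LPM are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LpmSettings:
    # Field names mirror the two variables above; everything else in this
    # sketch (values, decision logic) is an illustrative assumption.
    heartbeat_frequency_during_lpm: int  # seconds without heartbeats tolerated during LPM
    lpm_policy: str                      # assumed values: "manage" or "unmanage"


def node_declared_failed(freeze_seconds, normal_timeout, settings, lpm_in_progress):
    """Return True if a heartbeat gap of freeze_seconds counts as a node failure."""
    timeout = normal_timeout
    if lpm_in_progress:
        # Assumption: during LPM the longer of the two intervals applies.
        timeout = max(normal_timeout, settings.heartbeat_frequency_during_lpm)
    return freeze_seconds > timeout


def cluster_reacts(freeze_seconds, normal_timeout, settings, lpm_in_progress):
    """Return True if the cluster would act on the resource groups."""
    if lpm_in_progress and settings.lpm_policy == "unmanage":
        # Unmanaged resource groups are left alone for the duration of the move.
        return False
    return node_declared_failed(freeze_seconds, normal_timeout, settings, lpm_in_progress)
```

In this model, a 45-second freeze against a 30-second normal timeout counts as a failure outside of LPM, but not during an LPM operation with a 600-second LPM interval.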
Automatic Cluster Repository Disk Replacement
This is really an enhancement of Cluster Aware AIX included in AIX 7.1 TL4 and later. In short, you’re able to define up to six disks as backup repository disks. In the event that your primary repository disk fails, the cluster repository data is restored to one of the defined backup repository disks.
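The replacement behavior can be pictured with a short Python sketch. The six-disk limit comes from the text above; the selection rule (first healthy backup in the order defined), the disk names, and the health-check callback are hypothetical and only there to illustrate the idea.

```python
MAX_BACKUP_REPOSITORY_DISKS = 6  # limit stated in the text above


def pick_replacement(backup_disks, is_healthy):
    """Return the first healthy backup repository disk, or None if none is usable.

    backup_disks: disk names in the order they were defined (assumed ordering).
    is_healthy:   hypothetical callback that reports whether a disk is usable.
    """
    if len(backup_disks) > MAX_BACKUP_REPOSITORY_DISKS:
        raise ValueError("at most six backup repository disks can be defined")
    for disk in backup_disks:
        if is_healthy(disk):
            return disk
    return None


# Example: hdisk3 is unavailable, so hdisk4 is chosen.
chosen = pick_replacement(["hdisk3", "hdisk4", "hdisk5"], lambda d: d != "hdisk3")
```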
Newly Added Verification Checks
The following verification checks are new in PowerHA 7.2:
- The reserve_policy value should not be single_path. Clear if reserve is single path.
- Check for consistency of /etc/filesystems, for example whether mount points exist.
- LVM PVID checks across LVM and ODM on various nodes.
- Exploit AIX Runtime Expert checks for LVM and NFS.
- Check for network errors. If they cross a threshold (5% of packet count, receive and transmit), warn the administrator about the network issue.
- GLVM buffer size checks.
- Security configuration (password rules).
- Kernel parameters: tunables related to AIX network, VMM, and security.
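As an illustration of the network error check in the list above, here is a minimal Python sketch. The 5% threshold is from the list; applying it separately to the receive and transmit counters is my reading of the wording, and the function name and message format are made up for the example.

```python
ERROR_THRESHOLD = 0.05  # 5% of packet count, as stated in the list above


def network_warnings(rx_packets, rx_errors, tx_packets, tx_errors):
    """Return a warning string for each direction whose error ratio exceeds 5%.

    Assumption: receive and transmit counters are evaluated independently.
    """
    warnings = []
    counters = (
        ("receive", rx_packets, rx_errors),
        ("transmit", tx_packets, tx_errors),
    )
    for direction, packets, errors in counters:
        # Skip idle interfaces to avoid dividing by zero.
        if packets > 0 and errors / packets > ERROR_THRESHOLD:
            warnings.append(f"{direction} errors at {errors / packets:.1%} of packets")
    return warnings
```

With 60 receive errors in 1,000 packets and 10 transmit errors in 1,000 packets, only the receive side would be flagged.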
Split Brain Enhancements
If you’ve supported PowerHA/HACMP for a number of years, there is a high probability that you’ve experienced a split brain condition at some point.
Serious data corruption can occur when your resource group is active across multiple nodes in a cluster, so I am excited to see the following quarantine policies in PowerHA 7.2:
Disk fencing – uses SCSI-3 persistent reservation to prevent a problem node that thinks it’s active from doing any disk writes.
HMC active node shoot down – ensures that a dead node is forced down via an HMC operation before bringing a resource group online on a standby node.
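What both policies have in common is ordering: the suspect node is isolated first, and only then is the resource group brought online elsewhere. The Python sketch below models just that ordering. The policy names and the isolate/acquire callbacks are hypothetical stand-ins, not PowerHA interfaces.

```python
QUARANTINE_POLICIES = ("disk_fencing", "hmc_shoot_down")  # hypothetical labels


def takeover(resource_group, failed_node, policy, isolate, acquire):
    """Acquire resource_group on the standby only after failed_node is isolated.

    isolate(policy, node) -> bool : hypothetical callback that fences the disks or
                                    forces the node down, returning True on success.
    acquire(resource_group)       : hypothetical callback that brings the group online.
    """
    if policy not in QUARANTINE_POLICIES:
        raise ValueError(f"unknown quarantine policy: {policy}")
    if not isolate(policy, failed_node):
        # Isolation failed, so do not risk two nodes writing to the same disks.
        return False
    acquire(resource_group)
    return True
```

The point of the sketch is the early return: if the failed node cannot be confirmed as isolated, the resource group stays offline rather than risking a split brain.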
Resource Optimized High Availability (ROHA)
Licensing is always a concern when scoping out a high availability solution, as you’re often left paying licensing costs for an inactive set of resources on standby nodes. With ROHA, you can allocate a minimal set of CPU and memory resources to a standby node, and dynamically increase resources during a failover or resource group move. The following table shows ROHA enhancements included with PowerHA 7.2:
[Table: ROHA enhancements included with PowerHA 7.2]
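The basic ROHA idea, running the standby lean and topping it up at failover, comes down to a simple difference calculation. The Python sketch below shows that arithmetic only; the resource names and numbers are invented for the example, and the real product obtains the capacity through its own mechanisms.

```python
def resources_to_acquire(current, required):
    """Return how much of each resource the standby must add before hosting the workload.

    current:  what the standby LPAR has allocated now (hypothetical figures).
    required: what the resource group needs when active (hypothetical figures).
    """
    return {
        name: max(0, amount - current.get(name, 0))
        for name, amount in required.items()
    }


standby = {"cpu": 0.5, "memory_gb": 8}
workload = {"cpu": 4, "memory_gb": 64}
delta = resources_to_acquire(standby, workload)
```

Here the standby would need an extra 3.5 processors and 56 GB of memory at failover, and nothing extra for any resource where it already meets the requirement.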
PowerHA SystemMirror 7.2 supports POWER6 and later servers.
AIX 6.1 technology level (TL) 09 and AIX 7.1 TL03 support PowerHA V7.2. Refer to those levels for specific hardware support.
Note that some of the advanced features available in PowerHA V7.2 are available only with the 2015 AIX releases: AIX 7.1 TL04 and AIX 7.2.
The PowerHA Director GUI is not available or supported with PowerHA SystemMirror V7.2.