I’m sure many people have run into this before. From personal experience, I found nothing roaming the Interwebs on how to fix it, and seeing as I just fixed it, I’ll write up a little how-to post.
The setup:
Two SunFire X2100 M2 servers connected to a StorageTek 2530 via iSCSI. The two nodes are running CentOS 6.2 with Red Hat cluster software. I have a server running Nagios for monitoring, and it checks for failed disks on the StorageTek by running a script on either node that returns the number of “optimal” and “failed” disks.
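For anyone building a similar check, here is a rough sketch of the counting half of such a script. It is not my actual script: the function name summarize_disks is made up, and it assumes the disk listing arrives on standard input with one line per disk containing the word "Optimal" or "Failed", which may not match the real sscs output format. The return values follow the usual Nagios plugin convention (0 OK, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/bash
# Hypothetical helper: count "optimal" and "failed" disks for Nagios.
# ASSUMPTION: one status line per disk on stdin; the real sscs output
# format may differ, so adjust the patterns to what your array prints.
summarize_disks() {
    local optimal=0 failed=0 line
    # The "|| [ -n ... ]" part keeps a final line that lacks a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            *[Oo]ptimal*) optimal=$((optimal + 1)) ;;
            *[Ff]ailed*)  failed=$((failed + 1)) ;;
        esac
    done
    if [ "$failed" -gt 0 ]; then
        echo "CRITICAL - optimal=$optimal failed=$failed"
        return 2
    fi
    if [ "$optimal" -eq 0 ]; then
        # Nothing recognizable came back, so report UNKNOWN rather than OK.
        echo "UNKNOWN - no disk status found"
        return 3
    fi
    echo "OK - optimal=$optimal failed=$failed"
    return 0
}
```

You would pipe whatever disk listing command your check uses into summarize_disks and let Nagios read the message and exit code.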
Updating system software is important. Keeping packages up to date protects against security vulnerabilities. Unfortunately, sometimes it breaks things. In this case, updating the suggested packages broke my setup, so that the Sun Storage CAM (Common Array Manager) software did not work anymore.
I was first alerted to this when Nagios sent me errors from the script checking the StorageTek disks. I ran the command from the script by hand to see what was up, and it returned several errors. Here they are for future googlers:
sscs list -i 192.168.128.101 device
returned “Command failed due to an exception. null” and
sscs list -a arrayname host
returned “arrayname : The resource was not found.”
Not particularly helpful messages.
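Since these messages say nothing about disk health, a monitoring script can treat them as "management is broken" rather than "disks are fine". Below is a small sketch of that idea, matching only the two error strings quoted above. The function name sscs_output_status is hypothetical, and return code 3 is the standard Nagios UNKNOWN state.

```shell
#!/bin/bash
# Hypothetical helper: flag the two CAM error messages quoted in this post.
# Takes the captured output of an sscs command as its first argument.
sscs_output_status() {
    local output=$1
    case "$output" in
        *"Command failed due to an exception"*|*"The resource was not found"*)
            echo "UNKNOWN - CAM management command failed: $output"
            return 3   # Nagios UNKNOWN
            ;;
    esac
    # No known error string found; the caller can go on to parse the output.
    return 0
}
```

For example, a check script could capture the output of its sscs call into a variable, pass it to sscs_output_status, and exit early with status 3 when the function returns non-zero.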
Fortunately, the nodes could still mount the array partitions, which allowed them to continue running as web and MySQL servers. I just couldn’t run management commands on the array.
Since some of the required software for CAM was updated, I supposed that was causing the issue. The required software is listed below:
- libXtst-220.127.116.11-3.el6.i686.rpm and its dependent rpm (InstallShield requirement)
I couldn’t figure out on my own exactly what was wrong, so I contacted Oracle support, and they finally tipped me off to the solution: completely remove the CAM software and reinstall it. The steps are outlined below:
- Go to the CAM software folder in
- Here is a good spot to run yum update and restart the server if needed.
- Change directories to where you have the CAM software install CD. There should be a folder called components in there. Change into that directory and install the JDK available there:
rpm -Uvh jdk-6u20-linux-i586.rpm
- Next, run the RunMe.bin file in the CAM Software CD folder.
- Install the RAID Proxy Agent package located in the Add_On/RaidArrayProxy directory of the latest CAM software distribution.
rpm -ivh SMruntime.xx.xx.xx.xx-xxxx.rpm
rpm -ivh SMagent-LINUX-xx.xx.xx.xx-xxxx.rpm
- Register the array with the host/node. This process can take several minutes.
sscs register -d storage-system
One additional issue I ran into was that some update or other process had shut down the NIC connecting the node to the array. I had to make sure it was up before I ran the sscs register -d storage-system command above.
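A quick way to guard against that is to check the interface state before registering. This is only a sketch under a few assumptions: nic_is_up is a made-up helper, eth1 is a placeholder for whichever NIC faces the array, and the check relies on the kernel's /sys/class/net/<interface>/operstate file, which some drivers report as "unknown" even when the link works.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the interface reports operstate "up".
# The optional second argument overrides the sysfs directory, which is
# only there to make the function easy to test.
nic_is_up() {
    local iface=$1 sysroot=${2:-/sys/class/net}
    local statefile="$sysroot/$iface/operstate"
    [ -r "$statefile" ] || return 1
    [ "$(cat "$statefile")" = "up" ]
}

# Example use (eth1 is a placeholder interface name):
# if nic_is_up eth1; then
#     sscs register -d storage-system
# else
#     echo "eth1 is down; bring it up (for example with ifup eth1) first" >&2
# fi
```

If the check fails, bring the interface up and confirm the node can reach the array before retrying the registration.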