One little drawback I have noticed about using zones on Solaris 10 is the amount of time it takes to patch a machine. Right now I am waiting on a Sun Fire V890 with 8 processors and 16 GB of RAM, with 12 zones (counting the global zone), to finish patching. I started it at around 8:54 this morning; it is now 11:16 and there are still 2 or 3 patches left to go. Since all the zones are basically sparse zones, I wonder why the patching takes so much longer. I also hope all this patching fixes my power supply problem. We have replaced the power supplies a couple of times, as well as the power distribution board. I put the latest OBP on it this morning and that did not seem to fix it either. So hopefully, after this set of patches is done, I will have a better idea whether it is a hardware or software problem.
I got called the other night by our operations group because the keyboard and mouse would not work on their three-head group of Sun Ray 150s. So I went in and killed their session and had them restart it, but that did not work. So I went looking in the log files and saw this:
Sep 11 17:53:41 [10.198.11.221.2.2] 0x0.0x1c392b7 0:3:ba:3c:1b:c1 USB: enable change: 4 lost enable state!
Sep 11 20:28:44 [10.198.11.221.2.2] 0x0.0x2a1 0:3:ba:3c:1b:c1 USB: usb port 1 overcurrent
Sep 11 20:28:46 [10.198.11.221.2.2] 0x0.0x307 0:3:ba:3c:1b:c1 USB: usb port 2 overcurrent
Sep 11 20:28:46 [10.198.11.221.2.2] 0x0.0x36d 0:3:ba:3c:1b:c1 USB: usb port 3 overcurrent
Sep 11 20:28:47 [10.198.11.221.2.2] 0x0.0x3d3 0:3:ba:3c:1b:c1 USB: usb port 4 overcurrent
Sep 11 20:28:48 [10.198.11.221.2.2] 0x0.0x439 0:3:ba:3c:1b:c1 USB: usb port 5 overcurrent
Sep 11 20:45:34 [10.198.11.221.2.2] 0x0.0x291 0:3:ba:3c:1b:c1 USB: usb hub port 4 overcurrent!
Sep 11 20:45:35 [10.198.11.221.2.2] 0x0.0x2f9 0:3:ba:3c:1b:c1 USB: usb hub port 1 overcurrent!
Sep 11 20:45:36 [10.198.11.221.2.2] 0x0.0x35f 0:3:ba:3c:1b:c1 USB: usb hub port 2 overcurrent!
Sep 11 20:45:37 [10.198.11.221.2.2] 0x0.0x3c5 0:3:ba:3c:1b:c1 USB: usb hub port 3 overcurrent!
Sep 11 20:45:38 [10.198.11.221.2.2] 0x0.0x42b 0:3:ba:3c:1b:c1 USB: usb hub port 5 overcurrent!
Sep 11 20:46:21 [10.198.11.221.2.2] 0x0.0x304 0:3:ba:3c:1b:c1 USB: usb hub port 1 overcurrent!
Sep 11 20:46:22 [10.198.11.221.2.2] 0x0.0x36a 0:3:ba:3c:1b:c1 USB: usb hub port 2 overcurrent!
Sep 11 20:46:23 [10.198.11.221.2.2] 0x0.0x3d0 0:3:ba:3c:1b:c1 USB: usb hub port 3 overcurrent!
Sep 11 20:46:24 [10.198.11.221.2.2] 0x0.0x436 0:3:ba:3c:1b:c1 USB: usb hub port 4 overcurrent!
Sep 11 20:46:25 [10.198.11.221.2.2] 0x0.0x49c 0:3:ba:3c:1b:c1 USB: usb hub port 5 overcurrent!
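With this many near-identical lines it is easier to see the pattern by tallying the overcurrent messages per port. What follows is only a rough sketch of how that could be done in Python, not something that ships with SRSS: the names `OVERCURRENT` and `count_overcurrent` are made up for illustration, and the regular expression assumes the two message shapes seen in the log above ("usb port N overcurrent" and "usb hub port N overcurrent!").

```python
import re
from collections import Counter

# Matches both message variants seen in the log above:
#   "USB: usb port 1 overcurrent" and "USB: usb hub port 4 overcurrent!"
OVERCURRENT = re.compile(r"USB: usb (?:hub )?port (\d+) overcurrent")


def count_overcurrent(lines):
    """Return a Counter mapping USB port number -> number of overcurrent lines."""
    counts = Counter()
    for line in lines:
        match = OVERCURRENT.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# A few lines copied from the log above; the first is not an overcurrent
# message and is skipped by the pattern.
sample = [
    "Sep 11 17:53:41 [10.198.11.221.2.2] 0x0.0x1c392b7 0:3:ba:3c:1b:c1 USB: enable change: 4 lost enable state!",
    "Sep 11 20:28:44 [10.198.11.221.2.2] 0x0.0x2a1 0:3:ba:3c:1b:c1 USB: usb port 1 overcurrent",
    "Sep 11 20:45:34 [10.198.11.221.2.2] 0x0.0x291 0:3:ba:3c:1b:c1 USB: usb hub port 4 overcurrent!",
    "Sep 11 20:46:21 [10.198.11.221.2.2] 0x0.0x304 0:3:ba:3c:1b:c1 USB: usb hub port 1 overcurrent!",
]

for port, n in sorted(count_overcurrent(sample).items()):
    print(f"port {port}: {n} overcurrent line(s)")
```

Fed the whole log instead of the four sample lines, a tally like this makes it obvious that every port is complaining, which points at the unit itself rather than at one attached device.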
Well, that could not be good. So I ended up going in to the office. I tried unplugging the Sun Ray and plugging it back in. This is when I saw the 9 D error icon: a nice little icon with a picture of a USB connector and a yellow triangle. So I unplugged it, disconnected the keyboard and mouse, and plugged it back in. Still got the same error. The funny thing about the error is that it is listed like this in the docs:
This is an over current condition on the USB bus, i.e., the total number of devices draws too much current. Consider using a powered hub.
So I ended up swapping it out with one that was in my office and rebuilding the multi-head group, and they were all set. The interesting thing is that the status LED stayed green instead of turning amber. The next morning I tried the bad unit on a different server running SRSS 3.1 (the original server it was attached to is still running SRSS 2.0). This time nothing showed up in the log files, but the Sun Ray still showed the USB 9 icon and the keyboard and mouse did not work. So I ended up calling it in for replacement. It is nice that the Sun Rays have a long warranty period. This one was bought 2 or 3 years ago.
On an unrelated note, I have to go in early to get a power backplane replaced in one of our V890s, because we have gone through three power supplies in the PS0 slot in under a month. The bad part about this is that the 890 has 11 zones on it and 1 TB of disk, so we are going to have some services out while Sun replaces the backplane and power supply. Hopefully this will fix it though.
Time to back up 300+ GB of data: 8 hours
Time to install Solaris 10 06/06: 1 hour
Time to fix the PATA controller card: 10 minutes
Time to create a mirrored 387 GB ZFS file system: 5.2 seconds
For some people MS Windows is the only thing they know, but for those who use Solaris and ZFS, things go much faster and work much better.
Now that I have a ZFS file system, I can clone my zones instead of having to install each one. I really like ZFS!
One note on something I came across while creating the ZFS file system: one of the disks previously had a UFS file system on it, and it would not let me create the pool. To get around it I did this:
The error that caused me to do this was:
invalid vdev specification
use '-f' to override the following errors:
/dev/dsk/c1d0s0 contains a ufs filesystem.
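Going only by the hint in the error message, overriding the check means passing `-f` to `zpool create`. Here is a small Python sketch that just builds and prints such a command line without running anything. The helper name `zpool_create_cmd`, the pool name `tank` and the second disk `c2d0s0` are all hypothetical; only `c1d0s0` comes from the error above. Keep in mind that forcing the create throws away whatever was in the old UFS file system.

```python
import shlex


def zpool_create_cmd(pool, devices, mirror=True, force=False):
    """Build the argument list for `zpool create`; nothing is executed here."""
    argv = ["zpool", "create"]
    if force:
        # -f overrides the "contains a ufs filesystem" safety check
        argv.append("-f")
    argv.append(pool)
    if mirror:
        # "mirror" groups the devices that follow into one mirrored vdev
        argv.append("mirror")
    argv.extend(devices)
    return argv


# "tank" and c2d0s0 are placeholders; only c1d0s0 appears in the error above.
cmd = zpool_create_cmd("tank", ["c1d0s0", "c2d0s0"], force=True)
print(shlex.join(cmd))
```

Printing the command instead of running it keeps the sketch harmless; anything that really calls `zpool create -f` should only be pointed at disks whose contents are already backed up.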
Just finished watching this video from Marc Andreessen, talking about ning.com (a new social web app site he has started). One of the things he talks about is the cost of using Linux vs. Solaris. Pretty interesting listen…
The numbers he mentioned: they had assumed that Linux on white boxes would be their best and cheapest option because of a totally open source software stack. These are fully loaded costs per server for 36 months, including electricity, space and support:
- Intel whitebox hardware + Linux software: $10,350
- AMD whitebox hardware + Linux software: $9,180
- Non-Sun AMD (whitebox) hardware + Solaris: $5,700
- Sun’s AMD hardware + Solaris: $4,760
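To put those four figures side by side, here is a quick back-of-the-envelope calculation in Python. The dollar amounts and the 36-month period are the ones quoted above; the helper name `compare` and the choice of the Intel + Linux row as the baseline are mine, purely for illustration.

```python
MONTHS = 36  # the period the quoted fully loaded costs cover

# Fully loaded cost per server over 36 months, as quoted above (USD).
costs = {
    "Intel whitebox + Linux": 10350,
    "AMD whitebox + Linux": 9180,
    "Non-Sun AMD whitebox + Solaris": 5700,
    "Sun AMD + Solaris": 4760,
}


def compare(costs, baseline):
    """Return (name, total, cost per month, percent saved vs. baseline) rows."""
    base = costs[baseline]
    rows = []
    for name, total in costs.items():
        per_month = round(total / MONTHS, 2)
        saved_pct = round(100 * (base - total) / base, 1)
        rows.append((name, total, per_month, saved_pct))
    return rows


for name, total, per_month, saved_pct in compare(costs, "Intel whitebox + Linux"):
    print(f"{name:32s} ${total:>6,}  ${per_month:>7.2f}/month  {saved_pct:5.1f}% saved")
```

By this arithmetic the Sun AMD + Solaris option comes out at about 54% less than the Intel whitebox + Linux option over the same 36 months.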
Pay close attention to the end of the video when he talks about Linux support costs.
Since I have not yet had a chance to look at the new 5.3.3 client to see if/how they fixed the zones problem, I will post how I did it.
First off, for whatever reason IBM/Tivoli decided that the config files (dsm.sys/dsm.opt) should go in /usr/bin. Why, I don’t have a clue, but that is not a place where they should go. What is even worse is that when you install the client, it puts symlinks to /usr/bin/dsm.[opt|sys] in the /opt/tivoli/tsm/client/ba/bin directory.
lrwxrwxrwx 1 root bin 33 Dec 21 08:21 dsm.opt -> ../../../../../../usr/bin/dsm.opt
-r--r--r-- 1 root bin 782 May 18 2005 dsm.opt.smp
lrwxrwxrwx 1 root bin 33 Dec 21 08:21 dsm.sys -> ../../../../../../usr/bin/dsm.sys
-r--r--r-- 1 root bin 971 May 18 2005 dsm.sys.smp
What is even better is how they make the symlinks… Anyway, to get TSM to work in zones, what I did was reverse the direction of the symlinks. I put the actual config files in /opt/tivoli/tsm/client/ba/bin and then, in the global zone, created symlinks in /usr/bin pointing to the files in /opt/tivoli/tsm/client/ba/bin (the /usr file system in the non-global zones is read-only), so now I have this:
lrwxrwxrwx 1 root root 37 Jul 22 2005 dsm.opt -> /opt/tivoli/tsm/client/ba/bin/dsm.opt
lrwxrwxrwx 1 root root 37 Jul 22 2005 dsm.sys -> /opt/tivoli/tsm/client/ba/bin/dsm.sys
This way each zone can have its own dsm.[opt|sys]. There is one “gotcha” with this method: make sure you back up your files before you upgrade the client. I am not sure at the moment whether they would be removed by a client upgrade. Technically the files (dsm.[sys|opt]) should go in /etc, with symlinks from /usr/bin and /opt/tivoli/tsm/client/ba/bin pointing to them.
N.B. This is probably unsupported by IBM, so use it at your own risk.
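For anyone who would rather script the swap than do it by hand, here is a rough Python sketch of the same idea. It is not the procedure I actually ran, and it is not anything IBM ships. The function names `reverse_links` and `demo` are invented for illustration, and the demo works on scratch directories that stand in for /usr/bin and /opt/tivoli/tsm/client/ba/bin, so it touches nothing real.

```python
import os
import shutil
import tempfile


def reverse_links(opt_dir, usr_bin, names=("dsm.opt", "dsm.sys")):
    """Move each real config file into opt_dir and leave a symlink in usr_bin."""
    for name in names:
        in_opt = os.path.join(opt_dir, name)
        in_usr = os.path.join(usr_bin, name)
        if not os.path.islink(in_opt):
            continue  # not the stock layout (or already reversed): leave it alone
        os.unlink(in_opt)            # drop the stock symlink in opt_dir
        shutil.move(in_usr, in_opt)  # the real file now lives in opt_dir
        os.symlink(in_opt, in_usr)   # the usr_bin entry becomes the symlink


def demo():
    """Recreate the stock layout in a scratch directory and reverse it."""
    with tempfile.TemporaryDirectory() as root:
        usr_bin = os.path.join(root, "usr_bin")  # stand-in for /usr/bin
        opt_dir = os.path.join(root, "opt_bin")  # stand-in for .../ba/bin
        os.makedirs(usr_bin)
        os.makedirs(opt_dir)
        for name in ("dsm.opt", "dsm.sys"):
            with open(os.path.join(usr_bin, name), "w") as fh:
                fh.write("* placeholder contents for " + name + "\n")
            os.symlink(os.path.join(usr_bin, name), os.path.join(opt_dir, name))
        reverse_links(opt_dir, usr_bin)
        # Report (is usr_bin entry a link?, is opt_dir entry a link?) per file.
        return {
            name: (
                os.path.islink(os.path.join(usr_bin, name)),
                os.path.islink(os.path.join(opt_dir, name)),
            )
            for name in ("dsm.opt", "dsm.sys")
        }


print(demo())
```

On a real machine the swap would be done once in the global zone, and, per the gotcha above, only after copying dsm.opt and dsm.sys somewhere safe.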
