OpenSolaris vs Solaris

This weekend I went to install the new Communications Suite with Convergence, and I decided to install OpenSolaris 2008.05 on my machine and put the Comms Suite in a zone on it (so I could easily blow it away after my testing was done).

Let me be probably not the first to say that OpenSolaris != Solaris. I have been using Solaris 10 since it was in beta, and OpenSolaris threw me for a couple of loops…

First are some of the cool things I liked:

1. The interface: it is updated and seemed a lot faster.

2. The ease of “patching”: it only took about half an hour to do a pkg image-update.

3. ZFS root made it easy to roll back changes.

Now the parts that I had problems with and did not like too well.

1. I had to download a driver for my Ethernet card, as the one Sun delivers (sk98sol) is still too old and did not support my card, which is built onto a 3+ year old motherboard.

2. To create a zone, you MUST have a network connection (and, for the time being, one that reaches the Internet). This really made me mad, as I sometimes don’t have access to the Internet, and if I need to create a zone, I don’t want to have to wait for it to download 200+ MB of packages that are already on the machine in the first place.

3. No more “full root zones”. I created a zone in the hopes of installing the Comms Suite in it, only to find out that it was not a full root zone: things the Comms installer requires were not there, and therefore I could not install it… Such simple things as unzip and perl are missing from the newly created zone.

In the end, I ended up reinstalling the box with Solaris 10 05/08, which was a task in itself. See, when you install OpenSolaris it makes the root drive ZFS, and it did some weird things to the VTOC. So when I went to install the “older” Solaris 10 05/08, the installer would show me the disk and let me “carve” it up the way I wanted, both in the GUI and via the command line, but when the install went to proceed, the installer always came back saying that there was not enough disk to install Solaris. What I ended up having to do was run “format -e”, then fdisk, delete the Solaris partition that was made by OpenSolaris, and let the Solaris installer create its own fdisk partition again.

So after finally getting Solaris 10 installed and the latest Recommended/Security/Sun Alert patches put on, I called it a night and left the Comms install for next weekend.

Overall I think OpenSolaris is going in the right direction, but there are a lot of things in it that need to be fixed. The biggest is the zones: there should be an option for “cloning” the already installed OS, since it is already on a ZFS pool. The second is that there should be an option when creating the zone as to what kind of zone it should be: full (which would load every package, so you don’t have to try to do it yourself), sparse, or maybe a new one called Jail, which has everything in it read-only.

Solaris 10 with zones and patching

One little drawback I have noticed about using zones on Solaris 10 is the amount of time it takes to patch a machine. Right now I am waiting on a Sun Fire V890 with 8 processors and 16 GB of RAM with 12 zones (counting the global) to finish patching. I started it at around 8:54 this morning; it is now 11:16 and there are still 2 or 3 patches left to go. Since all the zones are basically sparse zones, I wonder why it takes so much longer to do the patching. I also hope all this patching fixes my power supply problem. We have replaced the power supplies a couple of times, and the power distribution board. I put the latest OBP on it this morning and it did not seem to fix it either. So hopefully after this set of patches is done, I will have a better idea whether it is a hardware or software problem.
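To put a number on the wait, here is a small sketch of the arithmetic using the two clock times and the zone count quoted above. The per-zone figure is only a rough average that assumes the time is spread evenly across zones, which is a simplification of mine and not something I measured.

```python
from datetime import datetime

# Clock times quoted in the post; only the difference between them matters.
start = datetime.strptime("08:54", "%H:%M")
now = datetime.strptime("11:16", "%H:%M")

zones = 12  # counting the global zone, as stated in the post

elapsed_minutes = int((now - start).total_seconds() // 60)

# Assumption: time is divided evenly across zones (a simplification).
minutes_per_zone = elapsed_minutes / zones

print(f"Elapsed so far: {elapsed_minutes // 60}h {elapsed_minutes % 60}m")
print(f"Rough average per zone: {minutes_per_zone:.1f} minutes")
```

That is already well over two hours with patches still left to apply, so the real total will be higher.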

Interesting Sun Ray problem

I got called the other night by our operations group because the keyboard and mouse would not work on their 3-head group of Sun Ray 150s. So I went in, killed their session and had them restart it, but that did not work. So I went looking in the log files and saw this:

Sep 11 17:53:41 [] 0x0.0x1c392b7 0:3:ba:3c:1b:c1 USB: enable change: 2 lost enable state!
Sep 11 17:53:41 [] 0x0.0x1c392b7 0:3:ba:3c:1b:c1 USB: enable change: 4 lost enable state!
Sep 11 20:28:44 [] 0x0.0x2a1 0:3:ba:3c:1b:c1 USB: usb port 1 overcurrent
Sep 11 20:28:46 [] 0x0.0x307 0:3:ba:3c:1b:c1 USB: usb port 2 overcurrent
Sep 11 20:28:46 [] 0x0.0x36d 0:3:ba:3c:1b:c1 USB: usb port 3 overcurrent
Sep 11 20:28:47 [] 0x0.0x3d3 0:3:ba:3c:1b:c1 USB: usb port 4 overcurrent
Sep 11 20:28:48 [] 0x0.0x439 0:3:ba:3c:1b:c1 USB: usb port 5 overcurrent
Sep 11 20:45:34 [] 0x0.0x291 0:3:ba:3c:1b:c1 USB: usb hub port 4 overcurrent!
Sep 11 20:45:35 [] 0x0.0x2f9 0:3:ba:3c:1b:c1 USB: usb hub port 1 overcurrent!
Sep 11 20:45:36 [] 0x0.0x35f 0:3:ba:3c:1b:c1 USB: usb hub port 2 overcurrent!
Sep 11 20:45:37 [] 0x0.0x3c5 0:3:ba:3c:1b:c1 USB: usb hub port 3 overcurrent!
Sep 11 20:45:38 [] 0x0.0x42b 0:3:ba:3c:1b:c1 USB: usb hub port 5 overcurrent!
Sep 11 20:46:21 [] 0x0.0x304 0:3:ba:3c:1b:c1 USB: usb hub port 1 overcurrent!
Sep 11 20:46:22 [] 0x0.0x36a 0:3:ba:3c:1b:c1 USB: usb hub port 2 overcurrent!
Sep 11 20:46:23 [] 0x0.0x3d0 0:3:ba:3c:1b:c1 USB: usb hub port 3 overcurrent!
Sep 11 20:46:24 [] 0x0.0x436 0:3:ba:3c:1b:c1 USB: usb hub port 4 overcurrent!
Sep 11 20:46:25 [] 0x0.0x49c 0:3:ba:3c:1b:c1 USB: usb hub port 5 overcurrent!

Well that could not be good. So I ended up going in to the office. Tried unplugging the Sun Ray and plugging it back in. This is when I saw the 9 D error icon. Nice little icon with a picture of a USB connector and a yellow triangle. So I unplugged it and disconnected the keyboard and mouse and then plugged it back in. Still got the same error. The funny thing about the error is, it is listed as this in the docs:

This is an over current condition on the USB bus, i.e., the total number of devices draws too much current. Consider using a powered hub.

So I ended up swapping it out with one that was in my office and rebuilding the multi-head group, and they were all set. The interesting thing about it is that the status LED stayed green instead of turning amber. The next morning I tried it on a different server that was running SRSS 3.1 (the original server it was attached to is still running SRSS 2.0). This time nothing showed up in the log files, but the Sun Ray still showed the USB 9 icon and the keyboard and mouse did not work. So I ended up calling it in for replacement. It is nice that the Sun Rays have a long warranty period. This one was bought 2 or 3 years ago.

On an unrelated note, I have to go in early to get a power backplane replaced in one of our V890s because we have gone through three power supplies in the PS0 slot in under a month. The bad part about this is that the 890 has 11 zones on it and 1 TB of disk, so we are going to have some services out while Sun replaces the backplane and power supply. Hopefully this will fix it though.
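If you want to see at a glance which ports are complaining, a log excerpt like the one above can be summarized with a few lines of Python. This is just a sketch: the sample lines are copied from the log in this post, and the regular expression is my own guess at the message format based on those lines, not anything taken from the Sun Ray docs.

```python
import re
from collections import Counter

# A few lines copied from the log excerpt in this post.
LOG = """\
Sep 11 17:53:41 [] 0x0.0x1c392b7 0:3:ba:3c:1b:c1 USB: enable change: 2 lost enable state!
Sep 11 20:28:44 [] 0x0.0x2a1 0:3:ba:3c:1b:c1 USB: usb port 1 overcurrent
Sep 11 20:28:46 [] 0x0.0x307 0:3:ba:3c:1b:c1 USB: usb port 2 overcurrent
Sep 11 20:45:34 [] 0x0.0x291 0:3:ba:3c:1b:c1 USB: usb hub port 4 overcurrent!
Sep 11 20:45:35 [] 0x0.0x2f9 0:3:ba:3c:1b:c1 USB: usb hub port 1 overcurrent!
"""

# Assumed format: matches both "usb port N overcurrent"
# and "usb hub port N overcurrent!" as seen in the excerpt.
OVERCURRENT = re.compile(r"USB: usb (?:hub )?port (\d+) overcurrent")


def overcurrent_counts(log_text):
    """Return a Counter mapping USB port number to overcurrent event count."""
    counts = Counter()
    for line in log_text.splitlines():
        match = OVERCURRENT.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


counts = overcurrent_counts(LOG)
for port in sorted(counts):
    print(f"port {port}: {counts[port]} overcurrent event(s)")
```

Run against the full log, every port from 1 to 5 shows up, which fits a fault in the unit itself and not in one attached device.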

ZFS on home server

Time to back up 300+ gig of data: 8 hours
Time to install Solaris 10 06/06 : 1 hour
Time to fix the PATA controller card: 10 minutes
Time to create a mirrored 387 gig ZFS file system: 5.2 seconds

For some people MS Windows is the only thing they know, but for others if you use Solaris and ZFS, things go much faster and are much better.

Now that I have a ZFS file system, I can clone my zones instead of having to install each one. I really like ZFS!

One note on something I came across while creating a ZFS file system: one of the disks previously had a UFS file system on it, and zpool would not let me create the pool. To get around it I did this:

zpool create -f tempspace c1d0s0

The error that caused me to do this was:

# zpool create tempspace c1d0s0
invalid vdev specification
use '-f' to override the following errors:
/dev/dsk/c1d0s0 contains a ufs filesystem.

Solaris on Sun AMD cheaper than Linux on whitebox

Just finished watching this video from Marc Andreessen, talking about the new social web app site he has started. One of the things he talks about is the cost of using Linux vs Solaris. Pretty interesting listen…

The numbers he mentioned: They had assumed that Linux on white boxes would be their best and cheapest option because of a total open source stack of software. These are fully loaded costs per server for 36 months, including electricity, space and support:

– Intel whitebox hardware + Linux software: $10,350
– AMD whitebox hardware + Linux software: $9,180
– Non-Sun AMD (whitebox) hardware + Solaris: $5,700
– Sun’s AMD hardware + Solaris: $4,760

Pay close attention to the end of the video when he talks about Linux Support costs.
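To make the gap easier to see, here is a small sketch that turns the four figures above into a monthly cost and a saving relative to the most expensive option. The dollar amounts and the 36-month period come from the list above; the dictionary labels and the choice of the Intel whitebox as the baseline are mine.

```python
# Fully loaded cost per server over 36 months, as quoted above (USD).
COSTS = {
    "Intel whitebox + Linux": 10350,
    "AMD whitebox + Linux": 9180,
    "Non-Sun AMD whitebox + Solaris": 5700,
    "Sun AMD + Solaris": 4760,
}
MONTHS = 36


def saving_percent(baseline, cost):
    """Percentage saved by paying `cost` instead of `baseline`."""
    return (baseline - cost) / baseline * 100


# Baseline chosen for illustration: the most expensive option in the list.
baseline = COSTS["Intel whitebox + Linux"]

for name, total in COSTS.items():
    per_month = total / MONTHS
    saved = saving_percent(baseline, total)
    print(f"{name}: ${per_month:.2f}/month, {saved:.1f}% below baseline")
```

By these numbers, Sun’s AMD hardware with Solaris comes in at a bit under half the cost of the Intel whitebox with Linux.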