OpenSolaris vs Solaris

This weekend I went to install the new Communications Suite with Convergence, and I decided to install OpenSolaris 2008.05 on my machine and put the Comms Suite in a zone on it (so I could easily blow it away after my testing was done).

Let me be probably not the first to say that OpenSolaris != Solaris. I have been using Solaris 10 since it was in beta, and OpenSolaris threw me for a couple of loops.

First are some of the cool things I liked:

1. The interface: it is updated and seemed a lot faster.

2. The ease of “patching”: it only took about half an hour to do a pkg image-update.

3. ZFS root made it easy to roll back changes.

Now the parts that I had problems with and did not like too well:

1. I had to download a driver for my Ethernet card, as the one Sun delivers (sk98sol) is still too old and did not support my card, which is built onto a 3+ year old motherboard.

2. To create a zone, you MUST have a network connection (and, for the time being, one that reaches the Internet). This really made me mad, as I sometimes don’t have access to the Internet, and if I need to create a zone, I don’t want to have to wait for it to download 200+ MB of packages that are already on the machine in the first place.

3. No more “full root zones”. I created a zone in the hopes of installing the Comms Suite in it, only to find out that it was not a full root zone, and stuff that the Comms installer requires wasn’t there, so I could not install it. Such simple things as unzip and perl are missing from the newly created zone.
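
A quick pre-flight check inside the new zone would have caught this before the installer did. This is only a sketch: the function name missing_tools is my own, the tool list is just the two I noticed missing (the installer may well need more), and it assumes a POSIX-style shell where command -v works.

```shell
# Hypothetical helper (not part of Solaris or the Comms installer):
# print each named tool that cannot be found on the current PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# The two tools I noticed were missing; extend the list as needed.
missing=$(missing_tools unzip perl)
if [ -n "$missing" ]; then
    echo "missing in this zone:" $missing
fi
```

Run it inside the zone (for example after a zlogin) before starting the installer; if it prints nothing, at least those tools are present.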

In the end, I reinstalled the box with Solaris 10 05/08, which was a task in itself. When you install OpenSolaris, it makes the root drive ZFS and does some weird things to the VTOC. So when I went to install the “older” Solaris 10 05/08, the installer would show me the disk and let me “carve” it up like I wanted, in the GUI and via the command line, but when the install went to proceed, the installer always came back saying that there was not enough disk to install Solaris. What I ended up having to do was run “format -e”, then fdisk, delete the Solaris partition that OpenSolaris had made, and let the Solaris installer create its own fdisk partition again.

So after finally getting Solaris 10 installed and the latest Recommended/Security/Sun Alert patches put on, I called it a night and left the Comms install for next weekend.

Overall I think OpenSolaris is going in the right direction, but a lot of things in it need to be fixed. The biggest is zones: there should be an option for “cloning” the already installed OS, since it is already on a ZFS pool. The second is that there should be an option when creating the zone as to what kind of zone it should be: full (which would load every package, so you don’t have to try to do it yourself), sparse, or maybe a new one called Jail, which has everything in it read-only.

Solaris 10 Security exam and bugs?

So I started studying for the Solaris 10 Sun Certified Security Administrator test. I installed a couple of copies of Solaris 10 in VMware so I could test some things, and I was not even all the way into chapter three of the System Administration Guide: Security Services manual when I think I found a bug. The bug is in the “logins” command. For example, if you have CRYPT_DEFAULT set to __unix__, set a password for a user, and run the logins -x -l username command, you get this:

# useradd -m -d /export/home/test1 test1
64 blocks
# passwd test1
New Password:
Re-enter new Password:
passwd: password successfully changed for test1
# grep test1 /etc/shadow
# logins -x -l test1
test1 102 other 1
PS 071806 -1 -1 -1

Everything seems OK; PS means the password is set. Now if you change CRYPT_DEFAULT to md5 and do the same thing, you get this:

# useradd -m -d /export/home/test2 test2
64 blocks
# passwd test2
New Password:
Re-enter new Password:
passwd: password successfully changed for test2
# grep test2 /etc/shadow
# logins -x -l test2
test2 103 other 1
LK 071806 -1 -1 -1

For some reason it now reports the user as “locked”, but it really isn’t; I can log in to the account. I then tried logging in to the account and changing the password. That works, but the logins command still shows that the account is locked. So I went to search for the bug. It is there as 5003383, with a state of fixed, but it doesn’t really say what date it was fixed on, so I am going to guess the fix has not made it into the Solaris 10 06/06 release yet (weird, since it was reported back in 2004). Going to search for another bug now: the logins -p command does not show accounts without passwords either if md5 is enabled, so I am off to see if I can find that bug as well.
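
To double-check that the account really is not locked, you can look at the password field in /etc/shadow directly instead of trusting logins. The helper below is a rough sketch of my own and not how logins is implemented: the name shadow_field_kind and its categories are made up, and it assumes a locked entry starts with *LK*, a traditional crypt hash is 13 characters long, and the newer algorithms store their hash as $id$... (with the id possibly followed by ,options).

```shell
# Hypothetical helper: classify the password field (2nd field) of an
# /etc/shadow entry. The categories are my own reading of the field
# format, not the logic used by logins(1M).
shadow_field_kind() {
    case "$1" in
        "")
            echo "empty" ;;                 # no password set at all
        '*LK*'*)
            echo "locked" ;;                # locked entries start with *LK*
        '$'*'$'*)
            id=${1#?}                       # drop the leading "$"
            id=${id%%[\$,]*}                # keep text up to the next "$" or ","
            echo "crypt:$id" ;;
        *)
            if [ "${#1}" -eq 13 ]; then
                echo "crypt:__unix__"       # traditional 13-character crypt
            else
                echo "other"
            fi ;;
    esac
}

# Example use as root (test2 is the account from the transcript above):
#   shadow_field_kind "$(awk -F: '$1 == "test2" { print $2 }' /etc/shadow)"
```

If the helper reports a crypt: result for an account that logins shows as LK, the entry itself is fine and the report is what is wrong.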

Interesting Article on IBM’s site

While I was trying to wade through IBM’s site to find out if they are ever going to port Tivoli Storage Manager to Solaris X86, I found this article: Guide to porting from Solaris to Linux on Power. First off, let me say I would never think of doing this. I like Solaris way better than I do Linux, and if I were to port something to Linux, it would not be on Power (because I can’t personally afford any IBM Power computers). Aside from that, the most interesting part is the Summary:

The porting effort from Solaris to Linux on POWER in most cases involves just a recompile or minor changes in compiler/linker switches. However, the design of Solaris and Linux is fundamentally different. Solaris is focused on performance, scalability, and reliability, while sacrificing portability. On the other hand, Linux is designed with portability in mind, and it is supported on almost all hardware platforms available today. The Linux 2.6 kernel, however, has significantly improved performance, scalability and reliability from the 2.4 kernel. As a result, there are some system-specific features available on Solaris that are not available on Linux.

What is interesting is that I have never had a lot of portability problems going from Solaris to Linux. If you write the software correctly the first time, then it should be just a recompile. And when did Solaris start sacrificing portability? If anything, I have seen it be the other way around: people write non-portable code on Linux, and then we have to clean it up to make it run on Solaris or any other OS besides Linux.

But my favorite part is that BrandZ is now in OpenSolaris. What does this mean? As quoted from the page:

What is BrandZ?

BrandZ is a framework that extends the Solaris Zones infrastructure to create Branded Zones, which are zones that contain non-native operating environments. The term “non-native” is intentionally vague, as the infrastructure allows for the creation of a wide range of operating environments.

Each operating environment is provided by a brand that plugs into the BrandZ framework. A brand may be as simple as an environment with the standard Solaris utilities replaced by their GNU equivalents, or as complex as a complete Linux userspace.

BrandZ extends the Zones infrastructure in user space:

* A brand is an attribute of a zone, set at zone create time
* Each brand provides its own installation routine, which allows us to install an arbitrary collection of software in the branded zone.
* Each brand may provide pre/post-boot scripts that allows us to do any final boot-time setup or configuration.
* The zoneadm and zonecfg tools can set and report a zone’s brand type.

BrandZ provides a set of interposition points in the kernel:

* These points are found in the syscall path, process loading path, thread creation path, etc.
* At each of these points, a brand may choose to supplement or replace the standard Solaris behavior.
* These interposition points are only applied to processes in a branded zone
* Fundamentally different brands may require new interposition points

Did you say something about Linux?

The lx brand enables Linux binary applications to run unmodified on Solaris, within zones running a complete Linux userspace. The combination of BrandZ and the lx brand will be productized as Solaris Containers for Linux Applications.

The lx brand is not a Linux distribution and does not contain any Linux software at all. The lx brand enables user-level Linux software to run on a machine with a Solaris kernel, and includes the tools necessary to install a CentOS or Red Hat Enterprise Linux distribution inside a zone on a Solaris system.

The lx brand will run on x86/x64 systems booted with either a 32-bit or 64-bit kernel. Regardless of the underlying kernel, only 32-bit Linux applications are able to run.

We do not support SPARC linux. This might be an interesting community project, but it’s not on our roadmap.

BrandZ/lx is still very much a work in progress. This means that it should be expected to crash at any time, set fire to your datacenter, and kick your cat.

Running BrandZ is fairly straightforward, but installing it requires a significant level of technical expertise and familiarity with OpenSolaris development procedures. We have provided some documentation to help climb the learning curve, but if you are not comfortable BFUing your system (or if you don’t even know what that means :), then you will probably be better off waiting until the project is more polished and user-friendly. You are, of course, welcome to try it out and ask questions on the discussion board, but please understand if we cannot provide detailed, hands-on support.

So hopefully you will be able to run Linux inside a Solaris Container and not need to port anything to any platform.

As a footnote to my original search on IBM’s site: why is it that they have a TSM Client for Linux on Power, X86, zSeries and iSeries? Why can’t they do a simple port to Solaris X86? You can’t tell me that there are that many people running the TSM Client on Linux on a zSeries mainframe or an iSeries (AS/400) machine.


TSM controlled by SMF part 3

I made one little enhancement to the manifest for controlling the TSM Scheduler via SMF. This is useful if you are running multiple scheduler processes on the same machine in different zones (which is “not supported” by IBM). Also, to make TSM work in a zoned environment (where /usr is read-only in the zones), you will need to remove the symlinks from /opt/tivoli/tsm/client/ba/bin/dsm.[opt|sys] to /usr/bin/dsm.[opt|sys] and make the links the other way around, by putting the actual config files in /opt/…/ba/bin/dsm.[opt|sys] and making a symlink to them from /usr/bin/dsm.[opt|sys]. This allows for per-zone configs. With my previous manifest, if you stopped and restarted the dsmc scheduler in the global zone, it killed the schedulers in all zones. So I have changed the pkill command to just be pkill -z`zonename` dsmc. This way only the schedulers in the zone where you run it get touched. (Note, those are backticks around the zonename command.)
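
The symlink swap can be sketched as a small shell function. This is only an illustration under my own assumptions: the name relink_tsm_config and its root argument are made up (the argument lets you try it against a scratch directory first; pass an empty string for the live system), it assumes the real files currently sit in /usr/bin with the links in /opt, and it would most likely have to be run from the global zone, since /usr is read-only inside the zones.

```shell
# Hypothetical helper: move the real dsm.opt/dsm.sys under /opt and leave
# only symlinks in /usr/bin. $1 is a root prefix of my own invention for
# dry runs; use "" to operate on the live system.
relink_tsm_config() {
    root=$1
    tsm_bin="$root/opt/tivoli/tsm/client/ba/bin"
    usr_bin="$root/usr/bin"
    for f in dsm.opt dsm.sys; do
        # Only act when /opt holds the symlink and /usr/bin the real file.
        if [ -h "$tsm_bin/$f" ] && [ -f "$usr_bin/$f" ] && [ ! -h "$usr_bin/$f" ]; then
            rm "$tsm_bin/$f"
            mv "$usr_bin/$f" "$tsm_bin/$f"
            # Absolute target, so each zone resolves the link in its own /opt.
            ln -s "/opt/tivoli/tsm/client/ba/bin/$f" "$usr_bin/$f"
        fi
    done
}
```

Because the link target is an absolute /opt path, every zone that shares /usr ends up reading the dsm.opt and dsm.sys in its own /opt, which is what gives you the per-zone configs.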

Here is the manifest:

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">


TSM Client on SMF

With the help of Stephen from Sun, I got the TSM scheduler running in SMF. Here is the manifest:

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
