Cash for Cache

I decided to build a new VMware host for my home “lab” last week to replace the HP workstation I had been using. (The real motive was to turn the HP workstation into a large NAS since it has 12 SATA ports on it, but more on that later.) So off I went to part out my new server. What I ended up purchasing was the following (prices as of 3/24/2015 in USD):

 

The plan was to set this system up with VMware vSphere 6 and then migrate everything from my VMware 5.1 system to it. So I began building it as the parts arrived last Friday night. Everything was going swimmingly until I was reminded that the LSI2308 SAS/SATA RAID card doesn’t have any cache. What I found was that the two 480GB SSD drives in a RAID 1 on that card were fast, extremely fast, as in I could boot a Windows 7 or Windows 2012R2 VM in about 3 seconds. However, the two 2TB SATA drives that I put in a RAID 1 on there were slow as hell. (Same as the issue I was having with the HP XW8600 system.) I had originally thought it was just the RAID rebuilding, so I left it at the RAID BIOS overnight rebuilding the array.

Well, after going to bed with the rebuild at 51% complete and waking up 8 hours later to find it only at 63%, I knew that I would never be able to use the SATA drives as a hardware mirror on that device. So I powered the system down, disconnected them from the LSI2308, and moved them over to the Intel SATA side of the motherboard. This is where things get interesting, as I really wanted a large 2TB mirrored datastore for some of the test VMs that I don’t run 24×7 (the ones I do are on the SSD RAID 1). In order to achieve this I had to do some virtualization of my storage…

The easiest way I could get the “mirrored” datastore to work was to do the following:

  1. Install a FreeNAS VM on the SSD datastore (pretty simple: a small 8GB disk with 8GB of RAM, which leaves me 24GB of RAM for my other VMs).
  2. On each of the 2TB disks, create a VMware datastore. I called them nas-1 and nas-2, but you can name them anything you want.
  3. Next create a VMDK that takes up nearly the full 2TB (or smaller; in my case I created two 980GB VMDKs per 2TB disk; see the vmkfstools sketch after this list).
  4. Now present the VMDKs to the FreeNAS VM.
  5. Next create a new RAID 1 volume in FreeNAS using the 2 disks (or 4 in my case) presented to it.
  6. Create a new iSCSI share of the new RAID 1 volume.
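
If you prefer the ESXi shell for step 3, vmkfstools can create the VMDKs directly on each datastore. A sketch; the sizes match what I used, but the file names and the zeroedthick format are just examples:

vmkfstools -c 980G -d zeroedthick /vmfs/volumes/nas-1/freenas-data1.vmdk
vmkfstools -c 980G -d zeroedthick /vmfs/volumes/nas-1/freenas-data2.vmdk
vmkfstools -c 980G -d zeroedthick /vmfs/volumes/nas-2/freenas-data3.vmdk
vmkfstools -c 980G -d zeroedthick /vmfs/volumes/nas-2/freenas-data4.vmdk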

Now comes the part that gets a little funky. Because I didn’t want the iSCSI traffic to affect my physical 1Gb NIC on the motherboard, I created a new vSwitch but didn’t assign any physical adapters to it. I then created a VMkernel port on it and gave the vSphere host a new IP in a different subnet. I then added another Ethernet (e1000) card to the FreeNAS VM, placed it in that same vSwitch, and assigned it an IP in the same subnet as the vSphere host.
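
If you would rather script that part than click through the vSphere Client, the same thing can be done from the ESXi shell with esxcli. A sketch; the vSwitch name, port group names, and the 10.10.10.0/24 subnet are just examples I made up:

esxcli network vswitch standard add -v vSwitch1
esxcli network vswitch standard portgroup add -v vSwitch1 -p iSCSI-Internal
esxcli network ip interface add -i vmk1 -p iSCSI-Internal
esxcli network ip interface ipv4 set -i vmk1 -I 10.10.10.1 -N 255.255.255.0 -t static
# separate port group on the same vSwitch for the FreeNAS VM's second NIC
esxcli network vswitch standard portgroup add -v vSwitch1 -p FreeNAS-iSCSI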

With the networking “done”, it is now time to add the iSCSI software adapter:

  1. In the vSphere Client, click on the vSphere host, and then the Configuration tab.
  2. Under Hardware, select Storage Adapters, then click Add in the upper right.
  3. Then select the iSCSI adapter and hit OK. You should now have another adapter called iSCSI Software Adapter; in my case it was vmhba38.
  4. Click on the new adapter and then click Properties.
  5. Next I clicked on the Dynamic Discovery tab and clicked Add.
  6. In the iSCSI Server address I entered the IP address I gave the FreeNAS box on its second interface (the one on the “internal” vSwitch).
  7. Click OK (assuming you didn’t change the port from 3260).
  8. Now if you go back and click Rescan All at the top, you should see your iSCSI device.
  9. Now we just need to make a datastore out of it, so click on Storage under the Hardware box.
  10. Then Add Storage…
  11. Then follow through adding the Disk/LUN and the naming steps. (The same thing can be scripted with esxcli; see the sketch after this list.)
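
For the command-line route, esxcli can do the same adapter setup. A sketch; vmhba38 is the adapter name I got, and 10.10.10.2 is just an example for the FreeNAS box’s internal IP:

esxcli iscsi software set --enabled=true
esxcli iscsi adapter discovery sendtarget add -A vmhba38 -a 10.10.10.2:3260
esxcli storage core adapter rescan --all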

You should now have a new iSCSI datastore on the 2 disks that could not be “hardware” mirrored. Using HD Tune in a Windows 7 VM on that datastore I got this:

HD Tune running in Windows 7

As you can see, the left side of the huge spike was actually the write portion of the test, which got drowned out by the read side of the test. Needless to say, the cache on the FreeNAS side makes reads extremely fast. As an example, a cold boot of this Windows 7 VM took about 45 seconds to get to the login screen from power on, while a reboot takes about 15 seconds or less.

Now on the FreeNAS side here is what the CPU utilization looked like during the test:

FreeNAS CPU usage

You can see that it barely touched the CPUs while the test was running. So let’s look at the disks to see how they dealt with it:

FreeNAS disks

It looks like the writes were averaging around 17MB/s, which for a SATA 6Gbps drive is a little slow, but we are also doing software RAID, with caching being handled in memory on the FreeNAS side. The reads looked to be about double the writes, which is expected in a RAID 1 config.

The final graph I have from the FreeNAS is the internal network card:

FreeNAS Network

Here we can see the transfer rates appear to be pretty close to those on the disk side. This is, however, on the e1000 card; I have yet to try the VMXNET3 driver to see if I get any faster speeds.

While the above may not show very “high” transfer speeds, the real test was when I was transferring the VMs from the HP box to the new one. Before I created the iSCSI datastore, when I was just using the straight LSI2308 RAID 1 on the 2x 2TB disks, the write speed was so bad that it was going to take hours to move a simple 10GB VM. After making the switch, it was down to minutes. In fact the largest one I moved was 123GB in size and took 138 minutes to copy using the ovftool method.

So why did I title this post Cash for Cache? Quite simple: if I had more cash to spend on a RAID controller that actually had a lot of cache on it, and a BBU, I wouldn’t have had to go the virtualized FreeNAS route. I should also mention that I would NEVER recommend anyone doing this in a production environment, as there is a HUGE catch-22. If you only have one vSphere host and no shared storage, when you power off the vSphere host (and consequently the FreeNAS VM) you will lose the iSCSI datastore (which would be expected). The problem is that when you power it back on, you have to go and rescan to find the iSCSI datastore(s) after you boot the FreeNAS VM back up. Sure, you could have the FreeNAS VM boot automatically, but I have not tested that yet to see if vSphere will automatically rescan the iSCSI adapter and find the FreeNAS share.
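
For what it’s worth, that manual rescan after the FreeNAS VM comes back up doesn’t have to be done in the GUI; from the ESXi shell it is just the following (assuming the same vmhba38 adapter name as above):

esxcli storage core adapter rescan -A vmhba38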

 

Looking to the future, if SSDs drop in price to where they are about equal to current spindle disks, I will likely replace all the SATA hard drives with SSDs, and then this would be the fastest VMware server ever.

 

Moving VMs between hosts

About a year ago I purchased a 1U IBM X3550 server to run VMware vSphere 5 on. While it was cool to have a server with dual quad-core procs and 8GB of RAM in it, the noise it put off was too much for my family room. (Just think of half a dozen 1-inch fans running at 15,000RPM almost constantly.) Recently I have been spending more time in the family room, and the noise has gotten to a level where it is almost impossible to do anything in the room without hearing it (like watching TV or a movie, playing a game, etc.). So I started looking at my favorite used hardware site, geeks.com, for a new “server”. Well, it finally arrived today: an HP XW8600 workstation. It is another dual quad-proc machine, however it has 16GB of RAM, 12 SATA ports, a larger case, and, best of all, it is almost completely quiet.

So with it installed, I needed to start moving the VMs from the IBM server to the HP server. In an enterprise environment this usually isn’t a problem, as you usually have shared storage (a SAN) that each of the hosts connects to. Well, in my little home lab I don’t have shared storage. I did try to use COMSTAR in Solaris 10 to export a “disk” as an iSCSI target. While this would work, it was going to take forever to transfer 1TB of VMs from one server to a VM running on my Mac and back to the new server.

So off to Google I went, and what I found was a much easier way to copy the VMs over: ovftool, which runs on Windows, Linux, and Mac. It allows you to export and import OVF files to and from a VMware host. The side benefit is that you can export from one host and import to another all on one line.

So I downloaded the Mac version and started copying. The basic syntax is like this:


./ovftool -ds=TargetDataStoreName vi://root@sourcevSphereHost/SourceVM vi://root@destvSphereHost

So if one of my VMs is called mtdew, it is thin provisioned on the source host and I want it the same on the destination host, and my datastore is called “vmwareraid”, I would run this:

./ovftool -ds=vmwareraid -dm=thin vi://root@ibmx3550/mtdew vi://root@hpxw8600

where ibmx3550 is the source server and hpxw8600 is the destination server. If you don’t specify “-dm=thin”, then when the VM is copied over it will become a “thick” disk, i.e. it will use the entire space allocated when it was created. (For example, a 50GB disk that only has 10GB in use would still consume 50GB if -dm=thin is not used.)

There are some gotchas that you will have to look out for:

  1. Network configs: I had one VM that had multiple internal networks defined. Those were not defined on the new server, so there is a “mapping” that you have to do. I decided I didn’t need them on the new server, so I just deleted them before I copied it over (though ovftool can map them for you; see the sketch after this list).
  2. VMs must be in a powered-off state. I tried them in a “paused” state and they did not want to run right.
  3. It takes time. Depending on the speed of the network, disks, etc., it will take a lot of time to do this, and the VMs have to be down while it happens. So it is definitely not a way to move “production” VMs unless you have a maintenance window.
  4. It shows % complete as it goes, which is cool, but the way it does it is weird. It will sit at 11 or 12%, and then I turn my head and all of a sudden it says it is completed.
  5. I did have some issues with one VM that I am not sure what happened to; when I try to copy it, I get an error: “Error: vim.fault.FileNotFound”… It may be due to me renaming something on the VM at some point in the past.
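
On the network mapping gotcha: if you would rather keep the internal networks than delete them, ovftool has a network mapping option (--net) that should handle it. A sketch, where “Internal LAN” and “VM Network” are made-up source and destination port group names, not from my setup:

./ovftool -ds=vmwareraid -dm=thin --net:"Internal LAN"="VM Network" vi://root@ibmx3550/mtdew vi://root@hpxw8600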

Hope this helps some other “home lab user”…

 

 

LDAP and BER size

I recently came across a unique problem that didn’t stand out until I got to thinking about a couple of different situations that I had tested this in. The scenario is that I needed to create a static group of unique members in LDAP (Sun Directory Server Enterprise Edition 6.3.1 and/or Oracle Directory Server Enterprise Edition 11.1.5.0) with an extremely large number of members in it. So I created the LDIF file with all 60,000+ uids in it and proceeded to run an ldapadd against the server with the file. Well, it immediately came back with:

adding cn=testgroup,ou=group,dc=sungeek,dc=net

However, when looking in the directory, the group never showed up. Also, when you look at the access log on the server, you see something similar to this:

[07/Aug/2012:21:22:39 -0400] conn=3 op=-1 msgId=-1 – closing from 127.0.0.1:48160 – B1 – Client request contains an ASN.1 BER tag that is corrupt or connection aborted

Now, sometimes, depending on the versions of the LDAP server and ldapadd programs, I got a “broken pipe” right after the adding output.

As you can see from the output in the access log, it is not very descriptive about what the actual error is. I know I spent about 6 hours looking into it to figure out what the problem actually was. Well, this morning I was poking around the cn=config docs and found this:

http://docs.oracle.com/cd/E19528-01/820-2495/nsslapd-maxbersize-5dsconf/index.html

What this document shows is the attribute nsslapd-maxbersize, which is:

Defines the maximum size in bytes allowed for an incoming message. This limits the size of LDAP requests that can be handled by Directory Server. Limiting the size of requests prevents some kinds of denial of service attacks.

The limit applies to the total size of the LDAP request. For example, if the request is to add an entry, and the entry in the request is larger than two megabytes, then the add request is denied. Care should be taken when changing this attribute.

So by DEFAULT it is set to 2MB. Well, my LDIF file was over 3.5MB in size, which means that it was too big for the addition. To change it, do an ldapmodify with this LDIF:

dn: cn=config
changetype: modify
replace: nsslapd-maxbersize
nsslapd-maxbersize: ########
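
Applied with the ldapmodify client, it looks something like this (a sketch; the host, port, password, and file name are placeholders, and dsadm is the DSEE way to bounce the instance afterward):

ldapmodify -h localhost -p 389 -D "cn=Directory Manager" -w <password> -f maxbersize.ldif
dsadm restart /path/to/instance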

I changed mine to 6MB, or 6291456, to hopefully cover any sizable additions in the future. Once done, I restarted the directory server and tested again, and everything was good. According to the docs, the max size you can make this attribute is 2GB, and a size of 0 means the default of 2MB, or 2097152. I think Oracle needs to make the error in the access log a little more descriptive, like “hey, your query/add is too big, yo”.

Hope this helps someone…

 

 

Fixing Dynamic DNS

My Solaris machine that ran DHCP/DNS and routing for my home network died tonight after having been running non-stop for over three and a half years. So I had to re-set up my DHCP and DNS on another machine. Luckily I had backed up the stuff that was on the old machine a month or so ago, but some info had changed, in particular the Dynamic DNS that I had set up and linked with the DHCP server (I use ISC’s DHCP and DNS). So I got the backup restored on another server and everything running, but a couple of hosts would not work. Come to find out, the backup I had was several months old (no problem, the machine had not changed that much), but what did change was my IP address to the world (it changed some time in March or April after having been the same for over 3 years).

Well, I had forgotten how to update the Dynamic DNS stuff, so I had to go hunting. This is what I did:

1. You can update the info dynamically using nsupdate (if you have it configured to do so, which I did). So I did the following:


#nsupdate
server 10.0.0.69
key dhcpupdate u23ove098uy2ok3n12339==
zone homenetwork.net
update delete homenetwork.net
send
update add homenetwork.net 18000 IN A 10.0.0.1
send
^D

So now that part worked, but I noticed that I had screwed up one of the NS records at some point (it had the IP concatenated with the domain). So again, to delete the bad record and add a new NS record:


#nsupdate
server 10.0.0.69
key dhcpupdate u23ove098uy2ok3n12339==
update delete homenetwork.net. NS 10.0.0.69.homenetwork.net.
send
update add homenetwork.net. 86400 IN NS ns.homenetwork.net.
send
^D

So that is all fine and well, but I am used to editing the zone files by hand… I didn’t realize until tonight that I could actually still do that. Anyone who has used DDNS from ISC will notice that in the zones directory there are files with a .jnl extension for the zones that are dynamic-DNS enabled. Those files are binary, so viewing them looks very weird. I always thought for some reason that those were “it”, and that the files I used to edit were no longer used. But they are… the old files are updated about every 15 minutes with the info that is in the .jnl files.

But if you want to edit the files like you always have, and still use DDNS, you can. All you need to do is “freeze” updates, edit the files, and then “thaw” the zones. When you freeze a zone, it flushes the info in the .jnl files to the files you are used to editing. All you need to do is the following:


rndc freeze
(edit the zone files)
rndc thaw

Your changes will now be available.