Three Platforms, One Server Part 7: Testing...

So, our primary authentication server, which will be used to enable network home accounts for our entire internal network — Mac, Windows and Linux — is up and running. I've installed it in our server room, put it on the KVM, and switched over all the workstations. So far, so good. (My fingers are so crossed they hurt.)

For the Windows quota requirements we went with the first solution mentioned in this article, and outlined in detail here: Windows roaming profiles are on a separate, quota-enabled volume on the authentication server. This — and the Windows implementation in general — is my biggest concern, and will require a great deal of testing. Summer Session has begun, and this will give us the opportunity to test on live subjects as a limited pool of students begins using the systems for work again.

There are a couple of immediate benefits of this new system. For one, users can now change their password lab-wide from any Mac in the lab. Now, users don't tend to do this much, but they can if they want to, and it's easy as pie. But as an admin, the most exciting benefit to all this is the ease with which new users can now be created. In the past, creating a new user was a multi-step, multi-computer process that involved coordination between multiple sysadmins: Admin 1 gets the user info, creates the Linux and Windows accounts and hands it off to admin 2, who then creates the Mac account and uploads the Mac skel. This process was lengthy and extremely error-prone. Using the new system, user creation is a breeze. One single, easy-to-use script on the authentication server pretty much does it all. Any sysadmin — or even a trained assistant — can do it. Just log in to the authentication server, run the script, and you're basically done. It's fast, easy and accurate. Which is really the whole reason we're doing all this. I'm pretty happy about it.
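For the curious, here's a rough idea of what the core of a user creation script like this might look like. It's a minimal Python sketch and not our actual script. It only builds the `dscl -create` commands and never runs them. The directory node (`/LDAPv3/127.0.0.1`), the home path, the default group ID and the function name are all assumptions made up for illustration, and the admin authentication you'd need against a real directory is left out.

```python
def build_user_commands(short_name, full_name, uid, gid=20,
                        node="/LDAPv3/127.0.0.1",
                        home_root="/Network/Servers/homes"):
    """Return the dscl commands that would create one directory user.

    This is a dry run: nothing is executed. The node, home_root and gid
    defaults are hypothetical placeholders, not values from a real setup.
    """
    record = f"/Users/{short_name}"
    attributes = [
        ("RealName", full_name),
        ("UniqueID", str(uid)),
        ("PrimaryGroupID", str(gid)),
        ("UserShell", "/bin/bash"),
        ("NFSHomeDirectory", f"{home_root}/{short_name}"),
    ]
    # The first command creates the record; the rest set one attribute each.
    commands = [["dscl", node, "-create", record]]
    for key, value in attributes:
        commands.append(["dscl", node, "-create", record, key, value])
    return commands


if __name__ == "__main__":
    for command in build_user_commands("jdoe", "Jane Doe", 1501):
        print(" ".join(command))
```

Returning the commands as a list instead of running them makes it easy to review what would happen before touching the directory, which matters when a trained assistant is the one running the script.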

There are a few minor details I need to work out. The main one is file sharing. In addition to being an authentication server, our original Mac server is also a file server. On it there are numerous shares for various purposes (one for staff, one for students, etc.). At this point I'm thinking of keeping the file server and the authentication server separate. This will reduce the load on both, and eliminate the need to migrate any data. The catch is that user authentication data on the file server will need to match that of the new authentication server. This should be a simple matter of changing the old server's role to "Connected to a Directory System" and pointing it at the new authentication server. I've never done this though, and it's complicated by the fact that the old server is running Panther whereas the new one is running Tiger. Probably the best thing to do will be to wipe the old server (yes, I have a backup) and put Tiger on it, then redo the shares. But I'll probably test with Panther first, just to see what happens. I'll let you know.

In any case, this is, again, a minor problem that shouldn't affect the overall plan much and should be relatively easy to figure out and implement. It's just one of those little things you forget about when you're drawing up your master plan for world conquest. "Oh yeah... What about the file server?..."

Otherwise, the conversion is on and seems to be going smoothly so far. Once we get into serious testing I'll post the results. Hopefully this will all be done in the next few weeks and the conversion will finally be complete.

Three Platforms, One Server Part 6: More Windows Quota Problems

Of course I knew it was too good to be true. I've found the first fatal flaw in my plan to unify authentication on the internal network. It goes back to the Windows quotas problem I studied some time ago, and to which I'd thought I'd found a solution.

I won't go into great detail about the problems and various solutions to the Windows roaming profile issue. I've already written plenty on it, and the previous posts outline it all fairly well. I will say that the intended solution was to provide Windows roaming profile quotas by setting them locally on each workstation. But, last week, as we moved forward with the plan, one of my fellow sysadmins, who is far more capable with Windows than I happen to be, pointed out the fact that certain applications (e.g. Photoshop, Maya, etc.) need a certain amount of disk space for temp files and what-not in order to operate. Setting small local quotas effectively keeps these applications from running properly.

We are currently testing a few other scenarios in which quotas for Windows roaming profiles can be implemented to our satisfaction:

  1. The Authentication Server (Mac OS X Server 10.4.6) has a separate volume for Windows roaming profile storage and that drive has quotas enabled (this was the original plan). One drawback is that the user's home account data ends up in two places: Mac and Linux keep it on the home account server, while Windows would store it on a reserved volume. The other drawback is that the client machine is not aware of the quotas until the user logs out and the Windows client attempts to upload the new data, at which point Windows issues an error if the user has exceeded his quota. Because the user gets no warning about quota violations before logout, this could cause some minor problems.
  2. The Windows Server continues to host roaming profiles and determine quotas for Windows users, but gets user authentication information through a connection to our Mac Authentication Server. The drawback here is that we have to continue using the Windows Server, which we don't really want to do, though it does seem to give us a slightly higher level of control than does the Mac Server. This method is also complicated by the fact that we are currently running Windows 2000 Server, which does not include native authentication to LDAP, so third-party solutions would be required. This method could also complicate user creation.
  3. Through some combination of Windows and Mac Servers, we convince the Windows roaming profiles to be situated on the Temp volume of the local workstations, rather than the default location on the "C" drive in the "Documents and Settings" folder, and then set quotas for the Temp volume. I'm skeptical that this is even possible. Windows seems to be hard-coded in ways that make specifying the location of roaming profiles anywhere other than "C:\Documents and Settings" impossible. So this last option seems the least likely to succeed, though if it did it would match the way Mac and Linux behave much more closely. And if we could figure out how to do this without the Windows server, it would be almost ideal.

(Actually, I just thought of a problem with this last method: The Windows Temp drives get erased every Friday. If users happen to be working during the deletion period, what happens to their roaming profiles? The same thing happens on the Mac, but the deletion script on Mac does not delete work owned by the currently logged in user. Can such a scenario be implemented on Windows?)
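To make the Mac-side rule concrete, here's a minimal Python sketch of a cleanup pass that skips anything owned by a given user. It is not our actual deletion script. The function name and the idea of passing in a numeric user ID are assumptions for illustration, and it relies on POSIX file ownership, so it says nothing about whether the same trick would work on Windows.

```python
import os


def purge_except_owner(root, keep_uid):
    """Delete files under root unless they are owned by keep_uid.

    Returns the list of paths that were removed. Directories are left in
    place; only files (and symlinks themselves) are deleted.
    """
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # lstat looks at the entry itself rather than a symlink target.
            if os.lstat(full_path).st_uid == keep_uid:
                continue  # work owned by the logged-in user is kept
            os.remove(full_path)
            removed.append(full_path)
    return removed
```

In practice `keep_uid` would be the ID of whoever is logged in at the console when the weekly purge runs.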

Most likely we will go with the first solution. We already know that it works. It's a little extra effort when creating new users, but that's totally scriptable. We plan to do user creation from a single script from now on anyway, so these extra few steps, once incorporated into our user creation script, won't really be a big deal at all. The only other problem with this is that our replica will now need to sync the Windows roaming profile volume as well if it's to work as a proper fallback system. This, too, should not be terribly difficult to accomplish. Overall, this solution is less elegant than the original one, but it should be workable. Hopefully, Windows Vista will mitigate a lot of these problems. (Yes, that was totally a joke. Please chuckle softly to yourself and move on.)
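One drawback of the first solution is that the user hears nothing about the quota until logout. A check that runs earlier could soften that. What follows is a minimal Python sketch of the idea, assuming the profile is just a directory tree whose size can be added up. The function names, the 90% warning threshold and the status strings are all made up for illustration, and this is not how Windows or Mac OS X Server enforce quotas.

```python
import os


def directory_size(path):
    """Return the total size in bytes of the regular files under path."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if not os.path.islink(full_path):  # skip symlinks
                total += os.path.getsize(full_path)
    return total


def quota_status(profile_path, quota_bytes, warn_fraction=0.9):
    """Classify a profile as 'ok', 'warn' or 'over' against its quota."""
    used = directory_size(profile_path)
    if used > quota_bytes:
        return "over"
    if used >= quota_bytes * warn_fraction:
        return "warn"
    return "ok"
```

A login or idle-time hook could call `quota_status` and nag the user on "warn", so the logout error stops being the first sign of trouble.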

I guess what amazes me still is how contrary to other operating systems the Windows OS is. Everything we're doing here can be done the same way on Mac as it can on Linux, or virtually any other *NIX system: home accounts can be read directly from a network disk, their locations can be specified, and therefore, all this quota nonsense is unnecessary. On Windows, roaming profiles apparently must be downloaded to the client machine (an unbelievably stupid requirement), and the location of said profiles apparently must always be on the root drive in the "Documents and Settings" folder. I guess these are the ways in which Windows continues to force people to use Microsoft products (I can almost hear Bill Gates whispering in my ear, "Wouldn't it all be easier if you just used Windows Server?"). But for software that's become the dominant standard in both the business and personal markets, Windows sure seems non-standard in baffling and infuriating ways. Though this may be how Microsoft has managed to stay on top all these years, I still, perhaps naively, believe that some day, if they don't change this strategy, it will hurt them. Frankly, though I'm sure they're out there, I don't know a single sysadmin who likes Windows. Can you blame them?

Three Platforms, One Server Part 5: Away We Go!

So it's the last week during which students have access to the lab, and that means I can finally implement my plan to unify internal network user authentication. Finally! I'm so jazzed. I've been waiting for months (well, years, really) for the chance to do this, and it's here at last.

The general outline of what I'll be doing over the next few days goes something like this:

  • Back up my current Mac server (for safety)
  • Build my master authentication server
  • Back up a clone of the clean server install
  • Configure the new server with:
    • Users and Groups
    • Home account automounting
    • Home account sharing to SMB (for Windows Roaming Profiles)
    • A skel account for Windows users (to live on the home account server)
    • Other share points
  • Create a replica of the new master server

What I did today:
Well, it's amazing how long a base install of Tiger Server can take. I've pretty much been doing that all day. Not that I'm so incompetent that I can't install the software in seconds flat, but software updates take forever and a day. Planning and getting drives to do all this on was a bit of an effort too. Plus I just wanted to make sure I did it right the first time, so I went slow, gave myself the day. I'm also making clones of everything along the way, for building my replica, and in case I goof and need to start over. So that takes a while. I guess I'm just saying that I'm taking my time with this, 'cause I want it to be as perfect as possible from the get-go.

By Monday we should have:

  • A base install of Tiger Server 10.4.6 with requisite Software Updates on a FireWire drive
  • A backup of our old Mac Server
  • A new Tiger 10.4.6 authentication server that's configured to host Mac, Windows and Linux users
  • A replica of same

We will spend part of next week pointing all our workstations at the new server. The Windows machines will be the biggest pain as 1) they are running Windows, and 2) they need local quotas set (which could really be just a subset of point 1, but whatever). The reason for all this quota nonsense, you ask? Well, for the answer, you'll just have to read the previous posts on the matter. Suffice it to say, I'm hoping the quota setting nonsense will be the worst part of this job, which it should be if all goes according to plan, which, I'm sure you're aware, it rarely does.

Finally, I wanted to mention this quote that I read on Daring Fireball, by someone called John Gall, author of Systemantics, as it really jibes with a lot of the stuff I've been thinking about with regard to the lab:

“A complex system that works is invariably found to have evolved from a simple system that worked…. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over, beginning with a working simple system.”
— John Gall

Next week should be interesting. I'll keep you posted.

Apple Software RAID Tests or: What to Do if You Don't Have an Intel Mac

Yes, I got very excited about what I saw in Boot Camp, though for maybe different reasons than most, and now I'm over it. I don't have an Intel Mac, so I couldn't really do much with the Boot Camp Assistant, but I did get turned on to the latest software RAID capabilities included with the Mac OS. Specifically, two items I was previously unaware of: concatenated RAID and the new resizeVolume verb included with the latest version of the diskutil command. Since I promised I'd report about anything I found, and since I did take the time to perform a battery of tests on all things Apple Software RAID, I thought I'd share what I found, though there's not much here that's very exciting, or perhaps even very enlightening. Still, what follows is a fairly thorough explanation of the various RAID options available in Mac OS X. For PPC, that is.

First of all, software RAIDs cannot be partitioned, which I actually didn't realize, though it does make sense.

Attempting to Partition a Software RAID: No Go


The two most common and useful types of RAID are Striped and Mirrored. A RAID stripe combines two drives for increased storage and performance. A mirror combines two drives for redundancy. Here are some details on creating RAID stripes and mirrors on Mac OS X.

Striped RAID

  • Increase in capacity
  • Increase in performance
  • No redundancy
  • Works best with disks of equal size/type
  • Can be used with disks of different sizes
    • Consequences unknown
  • Deletes disk contents on creation

Mirrored RAID

  • Half capacity of combined disks
  • No performance increase
  • Redundancy for disk failures
  • Works best with disks of equal size/type
  • Can be used with disks of different sizes
    • Total size/redundancy is equal to the smaller of the two members
    • Other consequences unknown
  • Deletes disk contents on creation

Most of this I already knew, though I did discover that the RAID mirror can also be set to automatically rebuild in the event that one of the disks fails. Unfortunately, there is no notification for disk failure. So you can automatically rebuild the set, but you have to manually check its status. This is obviously putting the cart before the horse. Kind of stupid. But whatever.

RAID Mirror AutoRebuild: I Didn't Know it was Broken


What I did not already know about, however, was something called concatenated RAID. Turns out concatenated RAID (or "JBOD" for Just a Bunch Of Disks) is the oldest form of RAID, and has generally fallen out of fashion. No one uses concatenated RAID anymore. With the current size of hard drives, there's little reason to.

Basically, concatenated RAID does only one thing. It combines a set of drives for increased capacity. There is no performance increase. Concatenated RAID can be used with combinations of drives of any size and can be added to dynamically. If you add a disk to a concatenated RAID, the added disk is erased but the contents of the RAID remain untouched. There is also no redundancy with concatenated RAID, so if you lose one drive in the set, you lose all the data on the RAID. Here's how concatenated RAID works in Apple's Disk Utility.

Creating a Concatenated RAID Set: Data Will be Deleted


Concatenated RAID

  • Increase in capacity
  • No increase in performance or redundancy
  • Works with any set of disks
  • WILL delete disk contents on creation
  • Unmounting one member of the set will cause the RAID to fail completely; RAID must be mounted/unmounted all at once
    • Rebooting fixes the broken RAID
  • Adding to the set deletes data on the new disk, but keeps data on the existing set

Updating a Concatenated RAID Set: RAID Data is Preserved
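The capacity rules for the three RAID types described above boil down to a few lines of arithmetic. Here's a minimal Python sketch. The mirror and concatenated rules come straight from the lists above. The stripe rule for unequal disks (smallest member times the number of disks) is only an assumption based on how striping commonly behaves, since I didn't verify it for Apple's software RAID. The function name and the type strings are made up for illustration.

```python
def usable_capacity(raid_type, disk_sizes_gb):
    """Return the usable capacity in GB of a software RAID set."""
    if len(disk_sizes_gb) < 2:
        raise ValueError("a RAID set needs at least two disks")
    if raid_type == "concatenated":
        # Concatenation simply adds the member disks together.
        return sum(disk_sizes_gb)
    if raid_type == "mirror":
        # A mirror is only as big as its smallest member.
        return min(disk_sizes_gb)
    if raid_type == "stripe":
        # ASSUMPTION: each member contributes only as much as the
        # smallest disk. Not verified against Apple's implementation.
        return min(disk_sizes_gb) * len(disk_sizes_gb)
    raise ValueError(f"unknown RAID type: {raid_type}")


if __name__ == "__main__":
    for kind in ("stripe", "mirror", "concatenated"):
        print(kind, usable_capacity(kind, [250, 160]))
```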


Resizing Partitions with diskutil resizeVolume

Finally, in Mac OS X 10.4.6 there is a new option for diskutil, the command-line version of the Disk Utility application. This new verb is called resizeVolume and is intended for doing just what it says: taking existing partitions and resizing them dynamically. Unfortunately, as indicated in the comments of this MacGeekery article, though the option is included in the PPC version of 10.4.6, it only works on GPT (GUID Partition Table) formatted partitions, which are not normally created by PPC Macs. PPC Macs use the APM (Apple Partition Map) format for their partitions, so on my machine the command failed with the following errors:

Attempting to Resize an APM Volume on PPC: Fat Chance!


This appears to be more than just a partition format problem though. The new Disk Utility in 10.4.6 allows for creating partitions with the GUID scheme:

Choosing a Partition Scheme: No Help for the Intel-Challenged


But attempting to resize these with diskutil still fails thusly:

Attempting to Resize a GPT Volume on PPC: Nice Try!


So, it would appear that, although there are some intriguing new partitioning tidbits in Mac OS X 10.4.6, they are not to be enjoyed by the Intel Mac-less. Alas! Such is life on an educational salary. If I can convince someone to buy me an Intel Mac, I'll surely mess around with this stuff some more. Until then I guess we'll leave the real cutting edge stuff up to the early adopters.

Lucky bastards!
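For anyone scripting around this, the findings above amount to a pre-flight check: don't bother calling the resize verb unless the disk is GPT and the Mac is Intel. Here's a minimal Python sketch that only builds the command and never runs it. The two conditions reflect what I saw in my own tests on 10.4.6, not documented behavior. The function name and parameters are made up for illustration, and the argument order (device, then size) is an assumption you should check against `man diskutil`.

```python
def plan_resize(device, new_size, partition_scheme, is_intel):
    """Return the diskutil command for a resize, or raise if it looks futile.

    partition_scheme is a label such as "GPT" or "APM" supplied by the
    caller; this sketch does not detect it.
    """
    if partition_scheme.upper() != "GPT":
        raise ValueError("resizeVolume only works on GPT-formatted disks")
    if not is_intel:
        # In the tests above, PPC failed even on a GPT-formatted disk.
        raise ValueError("resizing failed on PPC, even with GPT")
    return ["diskutil", "resizeVolume", device, new_size]
```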

To Repair or not to Repair

I don't really understand why everyone's all of a sudden interested in the Repair Permissions function of Disk Utility. But lately there's been a flurry of writings on the subject of whether or not you should repair permissions before and/or after a software update. It started way the hell back in god-knows-when (actually it was in May of 2005) with an article by a guy called Rosyna who happens to make Unsanity's haxies. This was only recently followed up with two posts by the brilliant author of Daring Fireball, John Gruber. In his second post he blasts MacFixIt for recommending the procedure. And now we have a rather thorough response from them as well.

Why, it's a regular permissions flame war, I tell ya.

Well, I wanted to weigh in on this from my own personal experience, briefly. As someone who manages rooms full of Macintosh computers that are chock full of every imaginable piece of software — both from Apple and third-party vendors — on a daily basis, I have to say, repairing permissions before and/or after a software update — or on a regular basis for that matter — is a perfectly reasonable precautionary measure. I can't tell you how many times a simple permissions repair has fixed an errant problem on a workstation or a client computer. Repair Permissions has saved my ass on countless occasions. And as far as repairing permissions affecting software updates goes, it seems to me that if an incorrect permission can affect overall system or application behavior, it's quite likely it could affect a software update. What happens when the software update package runs and hits a file it's supposed to modify or replace but can't because the permissions are incorrect? Well, seems to me like that could screw things up and leave you with an incomplete install of the update. And that could be bad. This doesn't sound like voodoo to me; it sounds like common sense.

So why doesn't Apple recommend repairing permissions before/after software updates? Well, I have long thought they should. In fact, I think it would be really smart if Apple's software updaters checked the permissions of any dependencies needed by the updater and alerted you to any inconsistencies, then allowed you to choose whether or not to proceed with a repair of the incorrect permissions, and finally proceeded with the update. I doubt this will happen, but a boy can dream. I think the reason Apple doesn't recommend repairing permissions on any sort of regular basis, or before or after a software update, is because they are loath to admit the generally sad state of permissions settings in the OS. Permissions get changed on a very regular basis. Epson printer software regularly changes permissions on my systems. Flash Player updates do too. And that's just off the top of my head. The fact is, just about any third-party app that installs with an installer (as opposed to drag and drop) can change permissions on any file it wants, and this happens routinely. I've even seen Apple software updates change permissions on my systems (though, admittedly, this hasn't happened for a long time). Basically, Apple just doesn't want to say up front, "Look, this software update might break something, or a third-party app might have changed something that will break something when you install this update, so you should take certain precautions," because doing so would be like saying, "there are huge problems with our permissions model which has gotten much better, but is still seriously flawed in many respects."

So there's an unspoken rule that you should do some general permissions repair on a semi-regular basis. Actually, it's not even unspoken, it's just buried in Apple's Knowledge Base. The fact that the Repair Permissions function exists at all is proof that there is wonkiness in the permissions system, and that running this routine can actually fix stuff. It sure makes a lot of sense to me to perform whatever fixes you can before running a software or OS update, and if that includes Repair Permissions, then that's fine. It takes about one minute of my time and, if nothing else, is well worth the peace of mind it provides.
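The check-before-update idea I described above is simple enough to sketch. This is a minimal Python illustration, not how Apple's updater or Repair Permissions actually works. It assumes you already have a table of expected permission bits for the files an update touches, and both the function name and that table are hypothetical.

```python
import os
import stat


def find_permission_mismatches(expected_modes):
    """Compare files on disk against their expected permission bits.

    expected_modes maps a path to an octal mode such as 0o644. Returns a
    list of (path, expected, actual) tuples; actual is None when the file
    is missing.
    """
    mismatches = []
    for path, expected in expected_modes.items():
        try:
            actual = stat.S_IMODE(os.lstat(path).st_mode)
        except FileNotFoundError:
            mismatches.append((path, expected, None))
            continue
        if actual != expected:
            mismatches.append((path, expected, actual))
    return mismatches
```

An updater could show this list to the user and offer to repair the entries before installing anything. An empty list means go ahead.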

My question is, why now, guys? Why all of a sudden is this such an issue? The Repair Permissions function has been around for years, and MacFixIt has recommended using it for almost as long. And why John Gruber cares what I do to my system before and after a software update is beyond me. But frankly, I think I trust the MacFixIt guys — and myself — on this one. Don't get me wrong. I love Daring Fireball. It's one of my favorite sites. But MacFixIt — like myself — is in the business of troubleshooting. They/we know about what works and what doesn't. Out there. In the big, bad, complicated world of multiple computers with every imaginable combination of hardware and software, where things most definitely and most decidedly go very, very wrong. I mean, where do you go for troubleshooting advice?

Anyway, I think this whole thing is silly. I respect both these sites immensely, and it pains me to see them argue over such a trivial piece of advice.

Guys, can't we all just get along?