Three Platforms, One Server Part 3: Another Quota Solution

Another solution to the problem of quotas occurred to me today: local quotas. Since the Windows workstation copies everything in the roaming profile from the server to the local machine at login, and uses that data as the user works, it actually makes a lot more sense for quotas to happen on the local workstation itself. This way, rather than becoming aware of quotas at logout only, the workstation is always aware of the state of the user's home account quota, because that quota exists on the volume where the profile actively lives.

Doing this also prevents the error message at logout due to exceeding a server-side quota. In this new, local-quota scenario, the user is alerted the instant he exceeds his quota and can rectify the situation immediately. But if he doesn't, no problem. As long as the local quota and the server-side quota match, he'll never run into the problem of being unable to save his settings to the server, since he'll never be able to exceed his quota anyway.

It turns out that, unlike on Mac OS X, setting quotas for NTFS volumes is incredibly easy. In fact, it's a simple matter of viewing the properties of the drive on which you want to set quotas.


Windows Quota Settings: Easy as Pie

Here, go to the Quota tab, check "Enable quota management" and "Deny disk space to users exceeding quota limit," set the limits, and you're basically done. The administrator will have no quotas, and any new user will have the quotas specified in this panel. You may, however, want to set certain users (particularly other admin or local users) to have larger quotas, or none at all. Again, exceptionally easy: Click that badly named, but ever-useful little "Quota Entries..." button and you'll get a list of local users and their quotas.


Windows Quota Entries: Edit Specific User Quotas

Here, you can set quotas for specific users. Initially, you'll only see local users. But after any network user logs in, you'll see them listed here as well.

The more I think about it, the more I like the idea of local Windows quotas. Using them does away with all the problems mentioned in earlier posts, and may help with a lot of the quota-related problems users have with our current, Windows-server-based system. Also, this would allow me to store all profile-related stuff in the same place for Windows as for Mac and Linux -- on our home account RAID -- as I'd wanted to do in the first place. And, in general, it should be much easier -- or at least not too difficult -- to maintain.

A last note on the timing of this project: I just spoke with my boss about the fact that I'm way ahead of schedule and have been thinking about implementing our unified password server over the Winter break. Lately I've had some misgivings about doing this, and he echoed those sentiments. His take was basically, "If it ain't broke, don't fix it." Always sage advice. And as gung-ho as I am to see this working, waiting and implementing it over the Summer will give me time to plan more carefully and test more thoroughly, so that we can iron out problems at the beginning of the school year -- when students' work is not quite so in-progress -- rather than at the end, when students are trying to finish their theses. This seems like a wise approach, and at this point I'm leaning toward erring on the side of caution.

Finally, credit where due: In-depth information on NTFS quotas can be found on this Microsoft page, which is where I lifted the images in this post 'cause I'm too lazy to figure out how to get good screen captures in Windows, and 'cause those images described perfectly what I wanted to show. I believe the images are in the public domain and, therefore, legally usable in my post, but if I'm wrong, I apologize, and if I need to take them down, someone can let me know in the comments section.

Three Platforms, One Server Part 2: Windows and Quotas

The Problem
So we've hit a (hopefully) minor snag in our migration to a single authentication server for all platforms. It's, of course, a problem with Windows and its roaming profiles system.

Roaming profiles for Windows are like networked home accounts for Mac. The idea is simple: Your home account is stored on a networked file server of some sort, and when you log on to a workstation this home account is mounted and used for storing files, application preferences and other user-specific settings like bookmarks, home pages, and Desktop backgrounds. It's nice because it allows you to log in to any system on the network, retrieve your documents, and have consistent settings that follow you from computer to computer. On the Mac, the home account is simply a network mount that acts very similarly to a local hard drive or partition. That is to say, your settings are read directly from the server mount as though they were local to the system. This is how Linux works as well. Windows, however, quite infuriatingly, behaves differently.

What Windows does, instead of reading the roaming profile documents and settings directly from the server, is to copy everything in the roaming profile folder on the server to the local machine at login. Then, at logout, the changed files are copied back up to the server. This is a horrific nightmare for any user with more than 100 MB or so of data in his roaming profile, because as the data grows, logins and logouts take increasingly long as the workstation tries to copy the account data to and from the server. Our user quotas are up to 6.5 GB. So you can see where we get stuck. Because of this copying behavior, Windows roaming profiles just can't handle the amount of data we allow in home accounts for Mac and Linux users. It would choke and kill our network anytime a user logged in. And that would be bad.

This problem has been handled in the past by our separate and discrete Windows Server, which allows assignment of roaming profile quotas. But now we want this handled by the Mac Server, which also handles Mac and Linux accounts. The Mac Server, though, doesn't really allow Windows accounts to be handled much differently than Mac accounts. I can't tell the Mac Server, "Give the systemsboy user a 100 MB quota for Windows, but a 6.5 GB quota for all other types of accounts." It just doesn't work that way. The only difference I can specify is where the Windows roaming profile is stored versus the Mac/Linux profile, and this is going to be key to the solution to this problem.

The Solution
So, on the Mac Server, the trick will be to specify a separate volume for Windows profiles, and then apply a quota for each user on that volume. Specifying the volume is easy. I've created a partition called "Windows." And I have set the Windows roaming profile to use this volume for all users.

Quotas, on the other hand, are a bit trickier, and I've had to do some research here, but I think I have it. The Mac implements quotas using the FreeBSD model, so quotas on Mac Server work just as they do on FreeBSD. Setting quotas is both volume-specific and user-specific. That is, you set quotas per-user, per-volume. And it's not the most user-friendly process.

The first thing to do is to set a volume up with quotas. A volume that has quotas enabled will have these four files at its root:

.quota.user (data file containing user quotas)
.quota.group (data file containing group quotas)
.quota.ops.user (mount option file used to enable user quotas)
.quota.ops.group (mount option file used to enable group quotas)

To create these files you need to run a few commands. The first runs the quotacheck command under ktrace, which records the quota file paths quotacheck looks for in the ktrace.out file:

sudo ktrace quotacheck -a

We can check our ktrace output, to make sure we have some useful data, with the following command:

sudo kdump -f ktrace.out | fgrep .quota

Which should produce output akin to:

7870 quotacheck NAMI  "//.quota.ops.group"
7870 quotacheck NAMI  "//.quota.ops.user"
7870 quotacheck NAMI  "/Volumes/Windows/.quota.ops.group"
7870 quotacheck NAMI  "/Volumes/Windows/.quota.ops.user"

The next command takes the output of the ktrace file and uses it, through some shell voodoo, to create our needed mount option (.quota.ops) files on all mounted volumes:

sudo kdump -f ktrace.out | fgrep quota.ops | perl -nle 'print /"([^"]+)"/' | xargs sudo touch

Lastly, we need to generate the binary data files (.quota.user, .quota.group) on all our volumes, thusly:

sudo quotacheck -a

Or we can be more selective about which volumes to enable quotas on by specifying:

sudo quotacheck /Volumes/Windows

(NOTE: Be sure to leave the trailing slash OFF the end of the file path in this command, lest you get an error message.)

If we do an ls -a on our Windows partition, we should now see the above mentioned files listed. Any volume that lacks these files will not have quotas enforced on it. Any volume with these files will have quotas enforced once the quota system has been activated.
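In other words, something like this (a rough sketch of what I see on my test volume, using ls -a1 to force one entry per line; your listing may include other files as well):

ls -a1 /Volumes/Windows
.
..
.quota.group
.quota.ops.group
.quota.ops.user
.quota.user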

So the second step in this process is simply to turn the quota system on -- to make the OS aware that it should be monitoring and enforcing quotas. To do this we use a command called quotaon. (Conversely, to turn off quotas, we use the command quotaoff.) This command:

sudo quotaon -a

will explicitly tell the system to begin monitoring quotas for all quota-enabled volumes. Again, to be more specific about which volumes should have quotas turned on, use:

sudo quotaon /Volumes/Windows

(Again, mind the trailing slash.)

This setting should be retained after reboot.

Now that we have a volume set up with quotas, we need to set the limits for each and, in our case, every user. Before we can set a user quota, however, the user in question must have a presence on the volume in question -- that is, must have a file or folder of some sort on the quota-enabled volume. So let's create a folder for systemsboy and set the permissions:

sudo mkdir /Volumes/Windows/systemsboy; sudo chown systemsboy:systemsboy /Volumes/Windows/systemsboy

Next we must set systemsboy's user quotas. This requires editing the .quota.user and .quota.group files. Since these are binary data files, not flat text files, editing them requires the use of the edquota command. The edquota command will format and open the files in the default shell editor, which is vi. If you're not too familiar with vi, you may want to specify a different editor using the env command. To edit the user systemsboy's quota in pico, for instance, use this command:

sudo env EDITOR=pico edquota systemsboy

You should see something like this in your editor:

Quotas for user systemsboy:
/: 1K blocks in use: 0, limits (soft = 0, hard = 0)
	inodes in use: 0, limits (soft = 0, hard = 0)

The first line in the file specifies the user whose quotas are being set. The second line is where you specify the disk space allotted to the user. The hard quota is the actual limit -- the maximum amount of disk space the user may consume. The soft quota is a lower threshold above which users would normally receive warnings that they are nearing their limit. That's how it would work in an all-UNIX world; since our server is providing quotas for Windows, this warning system is effectively useless, so we don't need to worry much about the soft quota, but for consistency's sake we'll put it in. The third line specifies the maximum number of files (inodes) the user can own on the volume. We're going to give all our users a quota of 100 MB, with a soft quota of 75 MB. Since the limits are counted in 1K blocks, that works out to a hard limit of 100000 blocks and a soft limit of 75000. We don't need a limit on the number of files, so we'll leave those numbers alone. Leaving any value at 0 means that no quota is enforced. So here's our modified quota file for systemsboy:

Quotas for user systemsboy:
/: 1K blocks in use: 0, limits (soft = 75000, hard = 100000)
	inodes in use: 0, limits (soft = 0, hard = 0)
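To double-check that those numbers actually took, the repquota command (part of the same BSD quota toolset) will print a per-user summary for a quota-enabled volume. A quick sanity check might look like this:

sudo repquota -u /Volumes/Windows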

There's one last step we need to worry about, and that's how to set these values for every user. Obviously, with a user base of 200+, it would be prohibitively time consuming to set quotas for each and every user with the above method. Fortunately, edquota provides a method for propagating the quotas of one user to other users. The syntax looks something like this:

sudo edquota -p systemsboy regina

where systemsboy is our "prototypical" user from which we are propagating quotas to the user regina. To modify all our Mac Server users, we'll need a list of all users in a text file, and of course, a directory for each user on our Windows volume. We can generate the user list by querying the LDAP database on our Mac Server, and outputting the response to a text file, like so:

dscl localhost -list /LDAPv3/127.0.0.1/Users > ~/Desktop/AllUsers.txt

A simple for loop that reads this file can be used to create the users' directories and propagate quotas to all our users en masse. This should do the trick:

for user in `cat ~/Desktop/AllUsers.txt`; do sudo mkdir /Volumes/Windows/$user; sudo chown $user:$user /Volumes/Windows/$user; sudo edquota -p systemsboy $user; done

Or we can combine these two steps and forego the text file altogether:

for user in `dscl localhost -list /LDAPv3/127.0.0.1/Users`; do sudo mkdir /Volumes/Windows/$user; sudo chown $user:$user /Volumes/Windows/$user; sudo edquota -p systemsboy $user; done
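And if one-liners make your eyes cross, the same thing reads a little more comfortably spread out. Here's a rough sketch of that loop as a script, under the same assumptions as above (systemsboy as the prototypical user, /Volumes/Windows as the quota volume):

#!/bin/sh
# Create a directory for each LDAP user on the quota volume,
# then propagate the prototypical user's quotas to that user.
for user in $(dscl localhost -list /LDAPv3/127.0.0.1/Users); do
    sudo mkdir -p "/Volumes/Windows/$user"
    sudo chown "$user:$user" "/Volumes/Windows/$user"
    sudo edquota -p systemsboy "$user"
done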

Done at Last
I warned you it wouldn't be pretty. But that's it. You should now have a volume (called "Windows" in this case) with quotas enabled. Every user in the LDAP database will have a 100 MB quota when accessing the Windows volume.

To back out of this process at any time, simply run the command:

sudo quotaoff -a

This will turn off the quota system for all volumes. It will also reset any user quota values, so if you want to turn it back on, you'll have to recreate quotas for a user and propagate those quotas to other users. If you're sure you want to get rid of all this quota junk once and for all, you can run the quotaoff command and then also remove all the volume-level quota files:

sudo quotaoff -a
sudo rm -rf /.quota*
sudo rm -rf /Volumes/RepeatForAnyOtherVolumes/.quota*

Doing this takes you back to a clean, quota-free state, and will require you to start from scratch if you want to reimplement quotas.

Final Notes
There are a few things I'd like to note regarding the Windows client interaction with this quota system. For one, the Windows client is not really aware of the quotas on the Mac Server. And since Windows copies everything to the local machine at login, Windows does not become aware of any quota violations until logout, when it tries to copy all the data back to the server and hits the user's limit. At this point, Windows will provide an error message stating that it cannot copy the files back to the server, and that settings will not be saved. In the event that this happens, the user's settings and documents will remain available on the particular workstation in question, and all he should have to do to rectify the problem is log back in to the machine, remove enough files to stay under his quota on the server, and then log back out. Still, I can already see this being a problem for less network-savvy users, so it's something to be mindful of, and perhaps solve in the future.

A couple other, more general things I'd like to note: There's a nice free utility called webmin which can be used to set quotas for volumes and users via a simple web interface. I tried using webmin, and for setting the volume quotas it was pretty good, but I could not for the life of me figure out how to get it to propagate quotas to multiple users. If you feel a bit put off by some of the above commands, you might try fiddling with webmin, though by the time you install webmin, you could've just as easily done things via the command-line, so I don't really recommend it for this purpose. It's a little more user friendly, but in the end you'll definitely have to get your hands dirty in the shell.

Also, as one of the purposes of unifying our authentication servers is to ultimately simplify user creation, all this is leading to one big-ass user creation script. I have a Mac-only version of this from the old days that should be easy to repurpose for our new Windows user additions. The hardest part of modifying this script will be setting user quotas (though with the edquota -p option, it shouldn't be too bad). Having an understanding of the UNIX side of quota creation, though, will be essential in creating this script, and I for one am glad I took the time to learn it, rather than using the webmin GUI. I really like webmin, and would recommend it for some stuff, but I really like scripting too, and usually prefer a well written script to a web interface. Sure it might be "harder" to write a script. But the tradeoff is speed and the ability to affect a number of things -- be they users, files or folders -- en masse. With webmin, I could create new users, but I could never use it to, for instance, propagate Mac and Windows skel accounts. That would be a separate step. With the proper script, I should be able to do complete Mac/Linux/Windows user creation from one Terminal window, any time, any place. So, again, that's the direction I'm headed in and recommend.
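Just to give a flavor of where that script is headed, here's a very rough, hypothetical sketch of the core of it. The directory admin name (diradmin), the skel paths, and the group ID are all placeholders, and the real thing will need proper error checking plus whatever else the Mac/Linux side requires, but it shows how the pieces fit together:

#!/bin/sh
# Hypothetical user-creation sketch: create the Open Directory record,
# lay down the home account and skel data, and set the Windows quota.
USERNAME="$1"; REALNAME="$2"; USERID="$3"; PASSWORD="$4"
NODE="/LDAPv3/127.0.0.1"

# Create the account record (dscl prompts for the directory admin's password)
dscl -u diradmin -p "$NODE" -create /Users/"$USERNAME"
dscl -u diradmin -p "$NODE" -create /Users/"$USERNAME" RealName "$REALNAME"
dscl -u diradmin -p "$NODE" -create /Users/"$USERNAME" UniqueID "$USERID"
dscl -u diradmin -p "$NODE" -create /Users/"$USERNAME" PrimaryGroupID 20
dscl -u diradmin -p "$NODE" -create /Users/"$USERNAME" NFSHomeDirectory /home/"$USERNAME"
dscl -u diradmin -p "$NODE" -passwd /Users/"$USERNAME" "$PASSWORD"

# Home account plus Mac and Windows skel data (paths are made up)
sudo mkdir -p /home/"$USERNAME"
sudo cp -R /Library/Management/mac_skel/ /home/"$USERNAME"/
sudo cp -R /Library/Management/windows_skel /home/"$USERNAME"/Windows
sudo chown -R "$USERNAME" /home/"$USERNAME"

# Windows quota: a presence on the quota volume, then propagate from systemsboy
sudo mkdir -p /Volumes/Windows/"$USERNAME"
sudo chown "$USERNAME:$USERNAME" /Volumes/Windows/"$USERNAME"
sudo edquota -p systemsboy "$USERNAME"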

Finally, most of the information regarding quotas on Mac OSX was gleaned from this article:
http://sial.org/howto/osx/quota/

Big thanks to Jeremy Mate, whoever you may be, for the generally clear explanation of setting up quotas on the Mac. I'd recommend folks stop by his site for other Mac command-line and admin hints and documentation as well. There looks to be a lot of good and relatively obscure information there:
http://sial.org/

Three Platforms, One Server Part 1: Planning, Testing, Building

Background
As I've said probably a thousand-billion times, I work in a very multi-platform environment. Our lab consists of Mac, Windows and Linux workstations, all of which rely, in some form or another, on a centralized home account server. Currently, the way this works is problematic in several ways: Each platform -- Mac, Windows, Linux -- authenticates to a different server, and then mounts the home account server via one of two methods, either NFS or SMB. The main problems arise when we want to create new users, or when a user wants to change his or her password. When creating a user, we must do so on each platform's respective server -- i.e. on the Mac server, the Windows server and the NIS server which hosts Linux accounts. The user's short name, UID, GID and password must match across all three servers. Then, if a user later wants to change his password, he must do so on each platform separately or the passwords will not match. There is also an inconsistency in the way the actual home account location appears and is accessed on the various platforms. On Mac and Linux, the home account directories are mounted at /home, but on Windows, the home account is mounted at a location that is separate from the users' roaming profile, adding another layer of confusion for any user who might use both Windows and any other platform.

These problems make user creation painful and error-prone, and discourage regular password changes for the community. Also, educating the community about this system of servers is next to impossible. It's a bad, old system that's long overdue for an overhaul. And that's just what I intend to do.

The Plan
The plan for this overhaul is to migrate all user accounts to one server which will store all user information -- UID, GID, password, name, everything -- and which will also share out the home account server in as consistent a manner as possible. I've chosen Mac OSX Server for this job because it integrates both LDAP for Mac and Linux and Active Directory for Windows into its Open Directory system, and because it's capable of re-sharing an NFS mount out to each platform in a relatively consistent manner appropriate to said platform. There are three major hurdles to overcome in doing this.

The first problem to overcome is building and setting up the new server to host Windows accounts in an acceptable manner. Windows is the real wildcard here. The problem child. Setting up Linux clients is as simple as flipping a switch. Set the Linux client to get authentication info from the Mac Server and you're done. Linux sees the LDAP data just like Mac does and can mount our NFS server directly in /home. No problem. Standard *NIX stuff. Same as it is on Mac. Windows, on the other hand, understands neither LDAP nor NFS natively. So we need to make special arrangements on our Mac Server to accommodate it.

The second hurdle is getting users migrated from all three platforms over to the Mac Server. Here, two complications arise. The first is that my existing Mac Server has never hosted Windows accounts, and therefore, is not capable of doing so using the existing database. So for all intents and purposes, the user database needs to be rebuilt from scratch, which brings us to the second complication: We need to reset everyone's password. What do we reset the passwords to?

The final major hurdle we face is that of redundancy. Since we're now moving all users to a single database, every system is now dependent upon a single server. A single point of failure. We suddenly find all our eggs in one basket, and this is potentially dangerous. If that server goes down, no user can log in and work, and we're dead in the water. So we need to figure out a method for keeping a live backup of our server that can take over if it goes down.

These are the major hurdles, and there are some minor ones along the way, which we'll talk about as we get to them. So let's start building that server. Our authentication server. The mother of all servers.

Building the Server
I started fresh with a Tiger server. Nice clean install. I set it up to host Mac accounts, as usual, using the standard LDAP stuff that's built in, as I always do. Master server, and all of that. But this time I've also set the server up as a Primary Domain Controller (PDC) so that it can also host Windows accounts. Getting the server to host Windows was not particularly difficult, but getting it to host them on a re-shared NFS mount, and having the Windows roaming profiles live on that NFS share, was a bit more tricky. To do this, I first mounted our NFS export on the Mac Server at /home. I then set /home/username to be the location for our Windows roaming profiles. This should theoretically work. But the Windows client wants to copy everything in the roaming profile directory to the local computer at login, and then copy it all back at logout. This is a problem for any user who might have more than a few megabytes of data in his home account, as the time it takes to do the copy is painfully long. Also, Windows ran into permissions problems when it hit the Spotlight directories, and errored out.

So I've created a special directory in the home accounts, called Windows, that only holds the roaming profile data, and set the server to have Windows clients look here for said data. This initially failed to work until I realized that roaming profile data must actually exist for Windows clients to understand what's going on. Once I copied some basic roaming profile data (skel data, essentially) to the user's Windows folder, the Windows clients were able to log in using this folder for the roaming profile. I'm almost surprised that this worked, as we're essentially re-sharing an NFS mount out as SMB from a Mac Server to Windows clients. But work it did, and so far, at least, I've encountered no problems here. Windows users will now use the same home account location as Mac and Linux users. This is extremely cool.

So, on the Mac server we've mounted our home account RAID at /home, and in the Windows section of the user account, we have a roaming profile path that looks something like this:
\\ServerName\home\UserName\Windows

And in that Windows directory we have some basic Windows skel data. And that's pretty much all we need to host our Windows accounts.
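Getting that skel data into place is just a one-time copy per user. Something along these lines should do it (the source path here is invented; point it at wherever your Windows skel data actually lives):

sudo cp -R /Library/Management/windows_skel /home/systemsboy/Windows
sudo chown -R systemsboy /home/systemsboy/Windows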

Migrating Users
Next we need a plan. How will users access this new system? We'll have to reset their passwords. To what? And how do we inform them of the change? And how do we actually migrate the users from the old system to the new?

To answer the last question first, I decided that I wanted to start fresh. No actual user migration would occur. I want to build the new user database from scratch, since we'll be needing to reset all passwords anyway, and just in case there are any inconsistencies with our old server's database. Which brings us to the how of this, and that will answer the question of password resets as well. What I decided to do was to create a record descriptor -- a text file that can be imported into the Open Directory database on the Mac Server -- using easily obtained and formatted data. All I really need to do this is a list of users, their long names, their short names, their UIDs and GIDs, and their passwords. Since we're resetting their passwords, we can actually use whatever we want for this value. But it makes the most sense to use, for the reset passwords, something that is already in the possession of the users. This way, disseminating passwords is far less of a chore. Each user is given an ID card by the school when they start. This ID card has a number on it which we can use for the new passwords. Building this record descriptor is a fairly simple matter of obtaining the aforementioned data and tweaking it a bit. This can be done using a combination of a basic text editor, like TextEdit, and something like Excel for properly formatting the data. I will leave the details of this process for another exercise. Suffice it to say, we have a clean slate upon which to build our user database, and a good method for reassigning passwords. So this step is effectively done.
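For the curious, the character-delimited import format that Workgroup Manager's import function understands looks roughly like the fragment below. This is a hypothetical two-user example: the header line declares the record delimiter, escape character, field and value separators, the record type, and the attribute order, and the ID-card numbers stand in as the reset passwords:

0x0A 0x5C 0x3A 0x2C dsRecTypeStandard:Users 5 dsAttrTypeStandard:RecordName dsAttrTypeStandard:Password dsAttrTypeStandard:RealName dsAttrTypeStandard:UniqueID dsAttrTypeStandard:PrimaryGroupID
systemsboy:0012345:Systems Boy:1025:20
regina:0067890:Regina Example:1026:20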

When it comes time to implement the new system, users will be informed of the password change via email, and signs posted throughout the facility. And if they want to change their passwords after that, it will be as simple as logging onto a Mac and changing the password in the Accounts System Preference pane.

Redundancy
Finally, we have the problem of redundancy, and this is fairly crucial. Having all user data on a single machine is far more convenient, but also has certain risks associated with it. First and foremost among these is the idea of a single point of failure. That is, if our user account server -- our Mac Server, which in our model is soon to be responsible for user authentication for every workstation in the lab -- fails, no user can log in, and therefore, no user can work. Classes cannot happen. We're screwed. Hard. Ideally, we'd have a second server -- a replica, if you will -- of our primary Mac Server. If our primary went down, the replica would automatically step in, and take over the task of authenticating users until the primary is running again. Well, such a thing is indeed quite possible, and is also built in to Mac OSX Server. It will be crucial for us to set this up and make sure it works before we implement our plan.

What's Next
So, this is where we are at this point. I've built a model Mac Server. I've begun compiling the user data into a record descriptor (actually, all students have been added, but I need faculty and staff data to complete the job). I've tested the behavior on test Linux and Windows machines. The next steps will be: more thorough testing of the above, setting up and testing the server replica behavior, and, finally, alerting users to and implementing the new system. These will be covered in upcoming articles. If you're interested in such things, do check back. I hope to have all this completed by the end of the Christmas break. But then, you never know what sort of problems will crop up in a move like this, so we'll see.

Either way, I'll be documenting my experiences here. More soon to come.

Spotlight Problems with Networked Homes

Well, I had a feeling this would be a problem. Spotlight is having trouble with our networked home accounts. Actually, that may not be entirely accurate. Spotlight is probably also having problems with the sheer number of users in our networked lab. Basically, the Spotlight indexes on user accounts, which reside on an NFS mount, are failing all over the place. We've got mds crash logs on all our systems as a result of these failures, and mds tends to freeze a lot and cause Finder beachballs. I think what's going on here is this:

User logs in. Spotlight begins to index user's account. User plugs in a firewire drive. Spotlight begins indexing that. User then unmounts the firewire drive and logs out. Spotlight's all like, "What the fuck? Where'd everybody go? I'm crashing now." And that's when we see our mds crash log.

On my own personal networked home account, this whole state of affairs has rendered Spotlight completely useless. As in, I can't search for shit. As in, command-space, search-for-term yields nothing, anywhere, ever. Using Spotlight on a local account on the same machine works fine, though. So clearly my Spotlight databases are okay, at least the local ones. And clearly there is something about networked stuff Spotlight does not like.

My understanding was that Spotlight was supposed to never index networked volumes. So believe me, I was surprised to find it indexing our home accounts. Perhaps, since they're mounted via NFS, it was unaware that they were network volumes (which would be incredibly stupid). I don't know. But it sure is confused now.

Ironically, now that Spotlight has become completely useless in our lab, it's gone from being one of the most excitement-generating features of Tiger to one of the most ignored here. Once again, its lack of configurability, its premature release, and Apple's continued ignorance of multi-user environments have given this product a head start on the road to obscurity. Remember how cool we all thought Sherlock would be when it started taking after Watson? But then Apple just ignored it, and now it's gone the way of the Dodo. (Does anyone ever use Sherlock anymore?) I don't think Apple will do the same kind of thing with Spotlight. It's got way too much potential, and way too much press, to let languish. But if they do with Spotlight what they did with Sherlock, it will wither and die, as has Sherlock, and as I can already see Automator doing. (Is anyone using Automator these days who didn't already use AppleScript?)

What all these technologies have in common is that 1) they rely on third-party developers to develop for them before they really become useful, and 2) they are essentially there to make our lives easier, not harder. If either of these things doesn't come through, the product stagnates. Of all these technologies, Spotlight is the least reliant on third-party development to be really useful. So it's incumbent on Apple to really see the product through. Apple needs to get to work on this, and fast. They really need to make Spotlight work better, and they really need to make it way more configurable. It needs to work better for the folks who don't want to configure it -- the folks who just go to Google and use the regular search field. And it needs to be more configurable for the power users, the lab admins, and the folks who use Google's advanced search.

As an admin, all it's done for me so far is create extra work. I'll be spending this week devising a method for turning off Spotlight on our networked home accounts. And that's not exactly what I'd call productivity enhancement.
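For what it's worth, the two candidate approaches I'm looking at are mdutil and the .metadata_never_index marker file. Neither is tested on our setup yet, and the path below assumes the NFS home mount lives at /home:

sudo mdutil -i off /home

sudo touch /home/.metadata_never_index

The first explicitly turns indexing off for the given volume; the second drops a marker file at the volume's root that is supposed to tell Spotlight to skip the volume entirely.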

Tiger Lab Migration Part 10: Up and Running

Take it as a symbol of my incompetence. Take it as a sign of the inherent difficulties in unifying any highly cross-platform environment. Take it as an indication that major OS updates in said environments are often far more difficult than you might have ever guessed. Take it for what you will. I'm just happy and relieved to report that, after many months of difficulties, we seem to be out of the woods: Our Tiger Lab Migration is complete.

(Note the sound of me frantically knocking wood.)

There are several things that have transpired -- things mostly out of my control -- that have made this migration, at last, complete. If you recall, there were two initial major problems from which we were suffering at the outset of the migration: 1) Mac OS X 10.4.2 was unable to write files across filesystems, which was impeding our use of certain software, particularly Final Cut Pro, which is used heavily in our lab; and 2) our Panasas brand home account RAID (on which we host our networked Mac home accounts via NFS) was crashing, almost daily at the end there. Problem number one was mitigated by certain workarounds I was able to implement using mount_nfs and loginhooks rather than the preferred automount method for mounting home accounts on our client workstations. Problem number two had both myself and the Panasas engineers scratching our heads.

Ironically, both problems are solved by system updates.

The inability of Mac OS X 10.4 to successfully write files across filesystems is cured, thankfully, with the latest update to the OS: 10.4.3. I have not yet implemented 10.4.3 lab-wide, so we're still running my mount_nfs workaround on the client systems, but on my admin machine I am testing the new update, and so far so good. Things behave properly again. I will also be running the update and the automount method on a test system on the floor for good measure, but I'm reasonably confident that this issue can, at last, be put to rest.
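For context, that workaround in its simplest form is a loginhook that mounts the home export by hand. A stripped-down sketch (the server name and export path are invented, and the real script does a bit more sanity checking):

#!/bin/sh
# Loginhook sketch: loginwindow runs this as root at login,
# passing the logging-in user's name as $1.
/sbin/mount_nfs homeserver:/vol/home /home

The hook itself gets installed on each client with something like:

sudo defaults write com.apple.loginwindow LoginHook /Library/Management/mount_home.sh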

The home account server crashes also seem to have been cured by updating the Panasas RAID OS software. We upgraded on Friday, November 4th, 2005, and have not had a single crash since. This could be mere coincidence, but I'd be pretty surprised if that were the case. The crashing seems to have stopped, and I'd bet a fair chunk of cash that the update is the reason why this is so.

The best part of all this is that, for the past week, the lab has been quiet. People come in. They log on to the systems. They do their work. And everything functions properly, without incident, for the first time in months. The knocks on my door, the panicked moments, the hair-pulling-filled long nights and weekends have all subsided. My stress levels are slowly returning to what they once were, which is not to say normal, but rather normal for me. It is truly a thing of beauty.

There are two things a SysAdmin likes best in the world: a good challenge, and successfully overcoming said challenge. The Tiger Lab Migration has afforded me a hefty dose of both. And despite all the trauma, it's been an enlightening and illuminating experience from which I've learned a very great deal.