Take it as a symbol of my incompetence. Take it as a sign of the inherent difficulties in unifying any highly cross-platform environment. Take it as an indication that major OS updates in said environments are often far more difficult than you might have ever guessed. Take it for what you will. I'm just happy and relieved to report that, after many months of difficulties, we seem to be out of the woods: Our Tiger Lab Migration is complete.
(Note the sound of me frantically knocking wood.)
There are several things that have transpired -- things mostly out of my control -- that have made this migration, at last, complete. If you recall, there were two initial major problems from which we were suffering at the outset of the migration: 1) Mac OS X 10.4.2 was unable to write files across filesystems, which was impeding our use of certain software, particularly Final Cut Pro, which is used heavily in our lab; and 2) our Panasas brand home account RAID (on which we host our networked Mac home accounts via NFS) was crashing, almost daily at the end there. Problem number one was mitigated by certain workarounds I was able to implement using mount_nfs and loginhooks rather than the preferred automount method for mounting home accounts on our client workstations. Problem number two had both myself and the Panasas engineers scratching our heads.
Ironically, both problems are solved by system updates.
The inability of Mac OS X 10.4 to successfully write files across filesystems is cured, thankfully, with the latest update to the OS: 10.4.3. I have not yet implemented 10.4.3 lab-wide as yet, and so we're still running my mount_nfs workaround on the client systems, but on my admin machine I am testing the new update, and so far so good. Things behave properly again. I will also be running the update and the automount method on a test system on the floor for good measure, but I'm reasonably confident that this issue can, at last, be put to rest.
The home account server crashes also seem to have been cured by updating the Panasas RAID OS software. We upgraded on Friday, November 4th, 2005, and have not had a single crash since. This could be mere coincidence, but I'd be pretty surprised if that were the case. The crashing seems to have stopped, and I'd bet a fair chunk of cash that the update is the reason why this is so.
The best part of all this is that, for the past week, the lab has been quiet. People come in. They log on to the systems. They do their work. And everything functions properly, without incident, for the first time in months. The knocks on my door, the panicked moments, the hair-pulling-filled long nights and weekends have all subsided. My stress levels are slowly returning to what they once were, which is not to say normal, but rather normal for me. It is truly a thing of beauty.
There are two things a SysAdmin likes best in the world: a good challenge, and successfully overcoming said challenge. The Tiger Lab Migration has afforded me a hefty dose of both. And despite all the trauma, it's been an enlightening and illuminating experience from which I've learned a very great deal.