Tiger Lab Migration Part 9: More Problems and Solutions

systemsboy October 23, 2005

Well, when last we visited this issue, we were having problems saving, among other things, Final Cut Pro files. After working long and hard with the Panasas folks, I was confident I'd found a solution. And I had, but the solution has created new problems. I want to kill Apple for Tiger.

To refresh, the original problem was that Final Cut was trying to write files across filesystems, and the OS was choking on this operation. This was happening because our RAID exports its /home volume -- and this is what we mount -- but each user account is also an individual volume. FCP writes to the /home level, and then has to cross filesystems to save the file to the user's home account, itself a seperate volume. Our solution was, rather than mounting all of /home, to mount each user's home account individually. This required a whole bunch of scripting and launchd voodoo to really work the way we wanted, particularly with regards to adding new users. But we got it working, and it fixed the Final Cut problem. I implemented it, tested it, and emailed the community about the fix.

Unfortunately, I recently discovered that using this method causes some new, less ugly, but certainly annoying problems: the sidebar home account link no longer works, causing untold confusion among users who suddenly think their home account no longer exists (as per the alert message that appears when clicking this link); but most alarming, users can no longer open documents from their home accounts. Not only can users not open docs, but root via cron fails to open documents. This is a big problem, as our "Scratch" partition gets deleted every Friday, and the users rely on an alert message to warn them of the oncoming doom. This message no longer launches, and that's a real bummer.

So I was in the middle of drafting an email to the community informing them of the latest bugs and workarounds, when I was struck with extreme embarrassment and shame. How can I continue to expect users to work around these bugs when each week brings a new set? Sure, it's not my fault that Tiger is broken. It's not my fault that the Panasas is set up the way it is. But it is my responsibility to create a user environment that works seamlessly and properly for the user. So I resolved to do everything in my power to implement a solution. I stayed at work on Friday night until 2:30 AM, learning much, but left with no solution in hand.

What I learned on Friday, essentially, is that automount in Tiger is totally fucked. I already knew that NFS in Tiger was fucked in that it can't cross filesystems. But it turns out there have been some major changes to the way automount is handled in the GUI, and thus, for all intents and purposes -- or at least for our intents and purposes -- it is indeed fucked. Hard. The inability to follow sidebar references is a direct result of automount's new Finder behavior: automount mounts in the Finder are now invisible! Unless the path is explicitly called, the user accounts, mounted as they are, are quite invisible, both to the user, and apparently to the Finder's sidebar. I also discovered that the Finder's inability to launch files from the user's home account has something to do with automount, or at least how automount is implemented in the GUI. This behavior only exhibits itself from these invisible NFS mounts; it goes away if we mount the user's home account in /home, as we used to do. And it goes away if we use a different command to mount user homes.

Enter mount_nfs.

The mount_nfs command is used to directly and statically mount NFS exports, and it has its own set of peculiarities. First and foremost, mount_nfs grafts mounts to existing folders, which means that mount_nfs requires a folder named for the user who wants to log in. Secondly, and equally important in our scenario, mount_nfs magically mounts the export in the Finder. Or, rather, shows the mount in the Finder (and this has actually, probably more to do with how the Finder translates mounts requested by nfs_mount). Lastly, since the mounts are static, they remain mounted and appear in the Finder until they are explicitly unmounted. Mounting via mount_nfs is equivalent to mounting nfs exports in the Finder with command-k. It also seems to work, you may notice, completely opposite to automount.

To do what we've been doing -- which is to automount home accounts at boot -- using mount_nfs instead would require us to mount every home account at boot on every machine, and those 200+ mounts would be present in the Finder at all times. This simply would not work. So, mount_nfs at boot: not good. You thinking what I'm thinking?

Enter loginhooks.

To use mount_nfs at boot would be disastrous. What we need in this case is something that will dynamically mount and unmount users' home accounts at login and logout respectively. And that's just what loginhooks and logouthooks do, respectively. Getting loginhooks to work in Tiger was again an exercise in frustration, but this time it was due to my own poor understanding of the technology. There is a great reference on login hooks at Mike Bombich's site, and a decent one at Apple's Knowledge Base as well. Between these two resources, I was finally able to cobble together a solution over the weekend which I think will work.

Briefly, there are three things you need to do to implement loginhooks, and some info you need to know.

The info first:
1) Scripts called by loginhooks run as root (good)
2) They run before login to the GUI takes place, and after authentication (also good)
3) The user requesting login can be represented in your script by the variable $1 (excellent!)

What to do:
1) Write a script to be executed at login, make it executable, put it somewhere accessible
2) Run this command:
sudo defaults write com.apple.loginwindow LoginHook /path/to/loginScript
3) Test the script! This is imperative! You can test it as any user you want by running it thusly:
sudo /path/to/script User

For "User," specify a user who would actually log in, and who is available to the system. "User" here will be interpreted in your script with the $1 variable, just as it will at login. If it works in your test, it should work in practice as a loginhook.

Setting up a logouthook works the same way, except you run the command:
sudo defaults write com.apple.loginwindow LogoutHook /path/to/logoutScript

And, BTW, to ch eck that the loginhook (or logouthook, or both) has been successfully added, you should see your script(s) listed, when running this command:
sudo defaults read com.apple.loginwindow

Finally, to disable, login(logout)hooks run:
sudo defaults delete com.apple.loginwindow LoginHook
and:
sudo defaults write com.apple.loginwindow LogoutHook

So, what I have now are two scripts. The login one creates a folder in /home named for the user who's logging in, and then uses mount_nfs to requset the user's home account from the server and mount it in this folder. The logout one forcibly unmounts the user's home account. And that is all.

I've only tested this at home, but it seems to work brilliantly here, and is fast even over a wireless connection. I will be trying it at work on Monday morning and will post back with my results, success or failure.

Wish me luck.

UPDATE 1:
So far, so good. There were some snags, though. I came in a little early this morning (in hopes of starting before the lab became overrun with students) to implement and test my loginhook plan, only to find the home account server had crashed. So, I had to spend the first hour restarting and troubleshooting the Panasas. By the time I'd finished, the lab was filling up. So I had to do everything piecemeal, and it took a bit longer then I'd hoped. That was snag one.

The next snag was that, after one user logged in, the next user would be locked out of his/her home account. This turned out to be a simple matter of setting permissions (on the /home directory) via the loginhook in a manner in which this did not happen.

There are some advantages and disadvantages to this new mounting method. The greatest benefit is that any changes to the mount method are done via very simple shell scripts, and changes don't require a reboot to take effect. Just change the script and you're done. Another big plus it that, if the home account server goes down, the machines aren't really affected to the extent they once were. If no one's logged in, there's no server access happening, so the machines are fine. Only if someone's logged in is it a problem, but then, that's always a problem. It occurs to me, though, that I may want to add a -hard option to my mount script sometime soon. Another advantage is the fact that, since only individual user accounts are mounted, and not the entire home account server, a command like sudo rm -rf /home can't wipe out the entire server as it could before. The final advantage is that reboots are now much faster, since nothing happens at boot time.

The disadvantages are twofold, and solvable but minor: First, since the entire home account server is never mounted now, users no longer have easy access to each other's home accounts. To access another user's home account, they now must connect to the home account server via the Finder's "Connect to Server..." command (or command-k). Secondly, any user who wants to ssh to one of the other Macs, say, from a laptop, will not have access to his home account since the mount only occurs when logging in to the GUI. These issues are minor, and can be solved by simply mounting the home account server somewhere other than /home, but I think we'll try and live without it for now, as I kind of like not having the whole RAID mounted.

Anyway, no complaints thus far. I'll be convinced this is a go after a week or so without problems. But I'm cautiously optimistic at this point.

UPDATE 2:
Day two, and all's quiet on the Western front, as it were. I'm obsessively monitoring the situation, but it's looking very good at this point. Fingers crossed.

Tiger Lab Migration Part 8: Home Account Woes

systemsboy September 20, 2005

Boy have we got troubles. Sure I tested everything. Sure I did everything I could to make sure this upgrade went smoothly. But does that ever matter? No. It doesn't.

Everything tested out fine. Tiger was able, from day one, to mount our NFS home account server -- a network RAID known as the Panasas. It was able to read files, to write files. It seemed fine. But we'd had our fair share of troubles with the Panasas, even in the beginning, even in Panther. Certain Macromedia products -- actually, Flash MX 2004, to be specific -- had certain functional disabilities. In particular, when any user was logged onto the Mac as a network user whose home account was located on the Panasas, he would find himself unable to publish HTML previews from within the Flash MX 2004 application. The error message was something along the lines that Flash was unable to read the HTML templates, which live in the user's home account. Oddly, on first launch, Flash could write the templates, but for some reason, it could not read them for the preview. If you have any Flash knowledge whatsoever (which I don't -- I was informed by student experts) this renders Flash essentially useless. The problem did not, however, exhibit itself with network home accounts mounted on Apple shares, neither via AFP nor NFS. So the workaround was to create a special account for Flash developers, which they could log into when doing Flash work, and which was mounted directly on the MacServer and shared via AFP. Hence was born our FlashDev account, which has worked fine all year.

Jump ahead to the present: we've upgraded to Tiger. It seems to have gone fine. Suddenly, certain apps begin acting strangely. Illustrator can't save files. Word also has trouble saving files, and behaves erratically, complaining of "permissions problems" and the inability to write temp files to a "network disk." Final Cut, too, begins acting flakey. Shit. It's like one of these sci-fi-horror flicks where the brain transfer experiment seem to have gone perfectly, but then, gradually, the patient begins acting... Different somehow...

And then all Hell breaks loose.

Tracking down all the problems, I finally came to the conclusion that a number of applications are unable to read their preference files in Tiger when those files reside on the Panasas. This perhaps causes a chain reaction that makes these apps unable to properly write or save files: FCP writes 0 byte files; Illustrator CS just can't save your document. Oddly, Illustrator CS2 doesn't seem to suffer from this problem, and all these apps (except, of course, Flash) seem to work properly from within the Panther environment.

Seems to me like a nasty reaction between Tiger and the Panasas home account server.

I've done some testing, and I plan to do more. So far, I've tried booting into a Panther client and found that the problem is greatly reduced in Panther (though it always existed to some extent, particularly, again, with Flash) and exacerbated in Tiger. I've also tried resharing the Panasas NFS export via AFP from the MacServer. I was fairly sure this would provide a reasonable, if temporary, workaround, but it didn't. I want to try it again, just to be sure, but if resharing via AFP does not provide a cure, it would indicate that the fault lies somewhere with the Panasas. The other possible culprit is Tiger's implementation of NFS. To test this, I am building a more generic (non-Apple, non-Panasas, non-proprietary) NFS server to test from, on a Linux install on my new Dell laptop.

So, best laid plans, and all that. What a frickin' mess.

I'll keep you posted as I learn more.

UPDATE 1:
I built a Linux install for the express purpose of testing networked Mac home accounts on a non-proprietary (non-Mac, non-Panasas) NFS mount. When running a Mac home account from this mount, the Final Cut problem disappeared, and the application acted normally. The same was true for Microsoft Word. Illustrator CS still behaved badly, though, as did Flash MX 2004. It's clear now that these problems result both from Tiger's implementation of NFS as well as Panasas's. So, the breakdown goes something like this:

The Flash problem is probably Flash/NFS-specific (though possibly Mac-specific as well)
The Illustrator problem is Tiger/NFS-specific
The Final Cut and Word problems are Tiger/Panasas-specific

None of this makes my job any easier. It kind of sucks to realize that there is no magic bullet, because that means that there probably is no single fix. Still, it's useful to know.

UPDATE 2:
The issue with Final Cut is not a matter of the application being unable to read files; it's a matter of it being unable to write them. Essentially, FCP cannot write a preference file. Launching with no preference file, FCP creates a 0 byte preference file in the user's home account. Also, if you try to save the FCP project file to the NFS share, it will be a 0 byte file and unopenable. Saving the FCP project file to a local volume results in a perfectly usable FCP project file. The complete inability to write files seems to be unique to Final Cut. Other applications (Illustrator, Flash) seem to have trouble reading files, but writing them seems to work fine (which is why I assumed this was the case in FCP). Word (now updated to 11.2) also has trouble writing files, but only when saving an open, modified document. That is, if you create a new document in Word, and hit "Save," it saves just fine. If you keep the document open, make changes, then hit "Save" again, it complains that it can't write to the shared disk. Hit "Save" again, and Word will create a new document and use the first string of characters in the document as the name. "Save As..." however, works as expected.

UPDATE 3:
It occurs to me (through the suggestion of other, smarter folks) that maybe this is not completely an NFS problem, particularly in the case of the apps that have problems writing files (Word and FCP). Those apps work just fine when we mount our homes on a generic NFS export. So it's possible that the write problems we're having are filesystem-related, not NFS-related. This theory holds a lot of water considering the fact that the Panasas uses its own, proprietary file system.

UPDATE 4:
I've been in contact with the Panasas folks, and I'm working with them in trying to determine the problem. So far, I've sent them tcpdump results from both the Panasas and the Mac while FCP tries to save a file. I'm told that there are a "massive number of 'file not found' type errors" from the dump. That can't be good. I've also sent them a few file listings from the home account of the user in question. Meanwhile, I'm wondering if the next Tiger update will do anything to correct these issues, or if upgrading the Panasas shelf would help, while at the same time, trying to come up with a reasonable workaround, in case Panasas is unable to reslove the issue. The only thing I can think of is to move the Mac home accounts to another server. Which would suck hard. Two steps backward.

UPDATE 5:
Still in constant contact with Panasas. I'm sending them log files now, and planning to do a ktrace, which will be a new experience for me. Also, I'm seeing this issue occur with other applications now. In particular, Macromedia Director MX 2004 is unable to write files to the Panasas mount. This is particularly odd because under Panther that application had a completely different problem: it just wouldn't launch. So I'm currently dealing with three seperate incident reports: one for the problem noted here, one for a blade crash that happened a couple of weeks ago, and one for the original Flash problem from our Panther days. It's a pain. Fortunately, the Panasas people are very nice, professio nal, and helpful.

UPDATE 6:
So, this does seem to be a filesystem issue at its core. Sort of. After much work with the Panasas guy, we've narrowed down the problem considerably.

Mac applications write temporary files -- files that are in use but that haven't yet been saved -- to various locations depending on the app. Final Cut writes its temporary files to a folder called .TemporaryItems at the top of whatever volume the user is working from. So in our model, the top volume is /home, and the sboy home account is inside /home. The whole thing looks like this: /home/sboy. Final Cut wants to write its temp files in /home/.TemporaryItems. And it is able to do this. If I look in /home/.TemporaryItems, I find all the files I've been unable to save to my user account, perfectly intact. So, temporary files go in /home/.TemporaryItems, and saved files in /home/sboy. Now, here's the grind: The Panasas setup requires that each home account be a seperate volume. So, /home is one big volume, and /sboy is another volume inside /home. This means that when Final Cut tries to move the temporary files from the .TemporaryItems folder, which is located on the volume /home, to the sboy user account, which is another volume at /home/sboy, it is moving the files across filesystems. And we get an error.

Interestingly, moving files across filesystems when Panasas is mounted on a Linux box produces the same error, but Linux does something to compensate for this problem, and is then able to save the file. But the error still gets produced. It's just that Mac OS X 10.4.2 (this did not happen in Panther) does not have the mechanism to correct for this error. It gives up and leaves the file in its original location: the /home/.TemporaryItems directory.

Also, this explains why everything works fine when I mount the user's home account on my Linux NFS export. In that case, everything is on the same volume, and there is no moving of files across filesystems. It's got nothing to do with NFS at all.

I'm curious to know why this doesn't happen in Panther. Is it possible that Panther has the same error correction function we see on Linux? What about earlier versions of 10.4? Do they have this mechanism? Will 10.4.3 have it? I surely hope so. I'll be looking into who I should contact at Apple about all of this.

Incidentally, the Panasas guys have figured all this out by examining ktrace and tcpdump files made while the program ran and the error occured. I did not figure any of this out on my own. All I did was run the commands and upload the files. Still, it's a nifty bit of sleuthing. I'm learing a lot.

UPDATE 7:
So, after mulling this over for a while, a rather obvious workaround occured to me. Currently, we mount the entire Panasas volume on our systems. This volume includes all home accounts within it, as well as the .TemporaryItems folder at its top level. Each user's home account on Panasas is a separate volume, and so is the top-level volume that contains .TemporaryItems, so saving files must cross filesystems (from the .TemporaryItems folder on /home to the user's folder). The workaround is to mount each user's home account individually. Doing so causes FCP to treat the user's home account as the active volume, and to use the .TemporaryItems folder in said home account. In this scenario, no filesystem boundaries are crossed, and no error occurs. I tried it, and it works. FCP (and Word, incidentally) can now both save files to the user's home account normally, and application preference stick. Final Cut is again usable.

Hallelujah!

I'm testing this on a couple machines before I go and announce it to the community and put it on all the Macs, just to make sure it works properly. But it's looking good. I'm glad to know that the past two weeks I've spent on this haven't been in vain.

Tiger Lab Migration Part 7: Wiggly-Niggly

systemsboy September 11, 2005

Well, we're getting close. We're down to last, wiggly-niggly little tidbits. Here's where we've been in the last few weeks:

The Master System
We have completed a very nice build of Tiger with most of the software we need on all machines, though we're lacking all the most recent Adobe stuff due to not having received the most recent Adobe stuff. 'S'okay. We'll make do with CS1 for now. Our system is all loaded up with our custom scripts, and a new, beefier home account mount script. And it's got a nice, big, 75GB system partition. Why, it's just swell.

We've used this system to clone to all our other lab machines. We boot these other machines in firewire target disk mode, and then we use Disk Utility to restore the Master System to the new machine's system partition. Or, we wanted to use Disk Utility. Turned out that our first Disk Utility-cloned system had some problems due to -- yep, you betcha -- Tiger bugs. The problem was that Disk Utility wasn't blessing the new system partition, so when we'd boot the new machine, we'd get the flashing question mark of doom. So we'd have to then boot off the Tiger install disk, and set the new system partition on the newly built machine to be the startup disk. I decided this was a pain, and that it would be much easier and more instructional to do our clones with a tiny, tiny shell script that calls ASR (Apple Software Restore, via the asr command).

#!/bin/bash

## Script to Clone Local Volumes #### systemsboy - Aug 18, 2005 ##

clear

## Define Variables ##

echo "Please enter the path to the SOURCE volume or ASR-ready disk image...(Example: /Volumes/MyVolume)"read SOURCE

echo "Please enter the path to the TARGET volume...(Example: /Volumes/MyVolume)"read TARGET

## ASR Portion of Script ##echo "Building Clone... Please wait...________________________________"sudo asr -source "$SOURCE" -target "$TARGET" -erase

exit 0

This did the trick indeed, and was as easy to use, if not as purdy, as the Disk Utility application. This method only yielded one problem, and that was that all the invisible folders -- /var /etc /private and the like -- were not made invisible on our cloned systems. This is another Tiger bug, but one that was fixed in 10.4.2's version of Disk Utility. Apparently not fixed in ASR from the command-line, however. This problem yielded both an intersting solution, and some interesting information. Back in the day, these files were made invisible to the Finder by the magic of the .hidden file. The .hidden file was simply a list of files or folders at the top level of the root drive that should be treated as invisible by the Finder. Want to make something invisible on /? Just add it to the .hidden file and restart the Finder. (Ironically, the .hidden file was also at the root level of the system, but was rendered invisible by the fact that its name began with a period.) Nowadays, however -- in Tiger, that is -- this is no longer that case. Files on root are now made invisible, according to this Apple KBase article, by setting certain file attributes. I'll tell you, I have yet to figure out what those attributes are, or how to modify them, but when I do I'll shout. In the interim, I was forced to locate the "SetHidden" program on my original Tiger install DVD. After insering the DVD, it can be found at:

/Volumes/Mac\ OS\ X\ Install\ DVD/System/Installation/Packages/OSInstall.mpkg/Contents/Resources/

I've now copied this file to a folder called SetFile, which I've placed in the Applications folder for future use. Also, I needed the hidden_MacOS9, which can be found, and should be placed, in the same folder as the SetHidden command. So I now have a folder in my Applications folder, called SetHidden, in which the SetHidden command and the hidden_MacOS9 files reside, installed on my master, and I've added the commands:

cd /Applications/SetHiddensudo SetHidden /Path/To/My/Clone/System hidden_MacOS9

to my shell script, and voila! Clone script complete!

#!/bin/bash

## Script to Clone Local Volumes #### systemsboy - Aug 18, 2005 ##

clear

## Define Variables ##

echo "Please enter the path to the SOURCE volume or ASR-ready disk image...(Example: /Volumes/MyVolume)"read SOURCE

echo "Please enter the path to the TARGET volume...(Example: /Volumes/MyVolume)"read TARGET

## ASR Portion of Script ##echo "Building Clone... Please wait...________________________________"sudo asr -source "$SOURCE" -target "$TARGET" -erase

## Set Hidden Files (Tiger Bug Workaround) ##echo "Mac OS X 10.4 does not properly set hidden files.We will now use the SetHidden command to rectify the problem.For this step, the SetHidden folder must be present in /Applications.---------------------------------"cd /Applications/SetHiddensudo ./SetHidden "$TARGET" hidden_MacOS9

exit 0

Cloning all 14 systems on the floor took my Lab Assistants a few short hours.

Spotlight
I mentioned some concerns regarding Spotlight and the particular way in which users use our lab. After some mucking around, and testing of various methods of disabling Spotlight, both forced and optional, I've decided to do nothing. I do this kind of thing all the time. Worry and worry about something, and then, finally, throw caution to the wind and do nothing about it. It's my way. It's part of what makes me so gosh-darn lovable. But I do have two things behind that decision: a rationale, and a backup plan.

The first problem I'd anticipated was that Spotlight would begin indexing user home accounts over the network all at once as soon as everyone logged in, and that this would cause problems with network performance and incomplete indexes. Upon further reflection (read: obsessing), I decided that that Spotlight's network resource usgae would probably be low, and that it would probably avoid incomplete indexes by either continuing in the background after logout, or by resuming at next login. Also, since home accounts have quotas, the largest of which is 7 GB, indexing shouldn't take too long, and any problems should be quickly mitigated.

The second possible problem I'd worried about was that Spotlight's insistance on indexing firewire drives would take forever to yield complete indexes on large drives -- a process that would likely be interrupted at logout before it was complete -- leaving many users with partially indexed drives. In my tests, however, Spotlight picked up right where it left of with drives that were partially indexed when ejected and then re-mounted, so I'm not convinced this will even be a problem. I also worried that Spotlight's firewire drive indexing would cause performance problems that would incapacitate users who were working on video. This may yet be the case. We'll know soon enough.

I came up with a number of possible solutions to these problems, all messy, all imperfect. Ultimately, I decided that the best option was to wait and see what problems arose, and then tackle those problems with the most appropriate of the solutions I devised. Who knows? Maybe it will all work out great without any intervention from me. And that'd be just ducky.

Network
We have an intersting and complex network. It consists of Mac, Linux, and Windows machines, a network RAID, and web and mail storage. And all of this is accessible, in one way or another, via the network. The problem comes when you try to explain all this to new (or sometimes even old) students. It's not an easy picture to paint. And overwh elmed first years usually don't quite catch on for some time. But as we design our network, and build our systems, we're very concerned about how all that is presented to the users. And we try to make it as simple and as understandable as possible, which is, after all, the Mac way, and is why people love the Mac. So let me just say, right here, right now, what the fuck is up with the "Network" view? That piece of shit gets worse with every OS revision. And no one seems to care. I can't find diddly about it on the forums.

So here's the background on this outburst. Back in the Panther years, the Network view offered up a pretty nifty way to present a great deal of our network -- the shared drives of all platforms -- to our Mac users. The coolest part was that it looked much like what you saw on Windows, which simplified things greatly by presenting a consisten network view across platforms. There were a couple things that made this possible. One was SMB workgroup names. SMB does a great job of not only sharing files, but broadcasting itself as a service on a network. Setting up computers under SMB with a workgroup name based on the platform of the given machine got all our Mac, Linux and Windows computers grouped appropriately on the network. And Mac can read those groups and smartly creates folders in the Network directory named after those groups and populated with the systems that belong to them. So in my Network browser, there'd be a folder called "Linux," and all the Linux computers would be in there. Same for Windows. Here's the catch: Our Macs share via SMB to Windows, but we want them to share via AFP to other Macs. Personal File Sharing, which uses AFP, does not, by default broadcast any kind of grouping. If you turn on AFP, you just broadcast the service, but you can't specify it's appearance or where on the client the shares will appear. Computers broadcast via AFP in Panther just show up in the "local" folder, not in a "Macs" folder like their SMB sharing counterparts. There was, however, a sneaky way to define this in Panther, by editing a little file called /etc/slpsa.conf, and thereby magically creating what are known as SLP scopes. In fact, SLP scopes are just that -- a way to group a particular service from a particualr device into logical groups. So we scoped our Macs to the "Macs" scope and they showed up in our "Macs" group in /Network, and all was well in the universe. If you're unbearably astute, you'll probably be wondering why, though, our Macs, when connecting to other Macs via the "Macs" scope in the Network folder, didn't connect to said Macs via SMB rather than AFP. And here was something amazingly cool in Panther. Panther smartly dealt with the folder full of Macs in our Network view. So, when it saw these other Macs broadcasting and sharing over two protocols -- AFP and SMB -- it intellegently chose the best one, which for a Mac was AFP, of course. That was the thing of beauty that allowed us to present the same view of the entire, heterogeneous network the same way on all platforms while still allowing us to connect with different protocols. It was frickin', frackin', goddamn sweet, is what it was. Brilliant. And now it's gone.

Not only is this intelligent decision making gone from Tiger, at least in the latest incarnation (10.4.2 as of this writing), but so too is the ability to define SLP scopes. So, this means two bad things: 1) I can no longer define specific locations for my AFP-shared Macs, and 2) any connection made via the "Macs" folder in the Network folder (which, remember, is now only defined by SMB, not by SLP) connects via SMB; my Macs connect to each other via the Windows file sharing protocol, which is a big fat problem.

So, I'm currently trying to devise a way to scope the Mac shares into some sort of logical grouping that will allow me to present those shares in a palatable way. But it ain't easy. I know you can do this with Tiger Server, and this is probably why they've crippled it on the client, but I don't a Tiger Server yet, and I'm hesitant to build one this close to the beginning of the semester. So we shall see...

The migration is nearly complete. Time permitting, I would abasolutely love to build a fresh new Tiger Server. I'll probably give it a shot in this last week before classes begin, but I'll be sure and keep a clone of the existing Panther Server in case things go awry. Beyond that, and the aforementioned wiggly-niggly, we're done. We're upgraded, on the workstations anyway. And it is good.

There will, most likely, be problems that arise once the students come in and start using the new systems. I will share those in a later post. Until then...

The Busy Time

systemsboy August 31, 2005

Well, it's that time of year again. School is starting. Which leaves us quite busy in the lab, running around, making sure everything looks all purdy, making sure the network and the computers all work, and making sure new users are at least marginally oriented to our way of doing things. Unfortunately, this leaves me little time to post blog articles. In fact, in the short time I've been writing this very article, I've been interupted a grand total of some-odd umpteen times. Yes, it's true. Umpteen.

Anyway, all I really want to say is that I have posts in the pipe: More on the Tiger lab migration, more on antivirus issues, more on my Windows experience. They're coming. Soon. I promise. ('Cause I know you're all just waiting with bated breath.)

Until then, make like a McBLT: keep the hot side hot, and the cool side cool.

I can't believe I just wrote that.

Okay; got's to run.

systemsboy

Automounting NFS Home Accounts

systemsboy August 11, 2005

In the lab where I work, we have networked home accounts for all our Mac users. These accounts live on an NFS RAID on another, non-Apple machine. This, as they say, "took some doin'," but we've had it working very reliably for some time now. It's a neat process, and one I'm rather proud of figuring out. So I thought I'd write a quick (yeah, right!) explaination of what we do.*

General Overview
Generally speaking, in our setup, three things need to happen:
1. The client must be set up to bind to the MacServer with the Directory Access application.
2. The client must automount the NFS RAID at startup so that home accounts are available for the user.
3. The MacServer must authenticate the user and specify where her home account is mounted.

On The NFS Server
I do not administer our NFS RAID. I am not an NFS expert, but I can tell you what I do know:
1. For our purposes, the entire directory containing the user accounts must be exported.
2. Root, I believe, should be mapped to root. It is crucial that the client system have root access to the NFS export.
3. Most typical NFS setups should work withouot a great deal of tweaking, but, if I remember correctly (it's been awhile since we set this up), that last root thing is a deal breaker.

On The Client
The client needs a couple things done to it:
1. The client must be bound to a properly configured MacServer (see below), using the Directory Access application. Most folks who've set up networked home accounts know how to do this. If you don't, read the manual. It's not hard.
2. The client needs to mount the NFS share, preferably at each startup. And this is where the fun begins. Our goal here will be to create a custom StartupItem that automounts our NFS export at each boot.**

For purposes of this example, we'll call our local mount point /home, and our NFS export we'll say lives at the IP address 192.168.1.100 in the folder /Users/Home. (If you're following along at home, feel free to substitute your own values for anything provided in these examples.)

To mount our NFS server, we use a command called automount. automount is sweet, and you can do a lot with it, which we'll get to in a minute. For now, a command you may want to use to test your NFS setup before adding startup scripts and whatnot, is the mount_nfs command, and it looks something like this: IPaddress_of_NFSShare:/path/to/share /local_mount_pointDon't forget to create that local mount point directory first:sudo mkdir /home So, for this example:sudo mount_nfs 192.168.1.100:/Users/Home /home This is a good command to use for temporary mounts of the NFS export. Anything mounted this way will unmount after reboot. Or you can simply use:sudo /umount /hometo umount the NFS share. Now let's get into automount. One of the cool things about automount is that it uses maps to call NFS and other shared disks. Once you've established a startup procedure, you can use maps to add, remove, or change your automount setup. This is handy if you're using scripts (which we are) because it means we really shouldn't have to ever change our scripts. Any changes can happen to the maps and are easy to do. The automount map file looks like this:home rw,net,tcp 192.168.1.100:/Users/Home The first field specifies the local mount point, the second field specifies NFS options (these work best for us, your mileage may vary, but the rw option is necessary and the net option is recommended), and the third field is the NFS export. Place these values in a space-delimited, plain text file, and call it something you'll remember. For our example we'll call it MyMounts. (Do not use a .txt file suffix on this file. Doing so will break all the examples to come.) automount syntax is fairly simple, if confusing at times. It looks something like this:automount -m /mount /path/to/mymountswhere /mount is where the NFS mount will be mounted. The -m flag tells automount to use a map file, the path to which is specified in the second argument to the command. automount then reads the map file, and grafts the home mount point to a symlink inside the directory /mount.*** So, with your NFS server properly configured, and your MyMounts file on your Desktop, if you do this:sudo automount -m /mount ~/Desktop/MyMountsYou should see your home mount appear in the directory /mounts. In Tiger, however, this initial mount point does not show up in the Finder. To see it, you must type "command-shift-g" and type /mount in the text field. Or you can look in the Terminal with:ls /mountYou should also see the mount point listed when using the df command in Terminal. If you don't see the /mount with home inside, something is wrong. You need to troubleshoot your NFS setup. If the share is there, move on to the next step. Once you've got automount properly mounting the NFS export, it's time to create a very simple StartupItem to handle all this at each boot automatically. This is about the simplest StartupItem imaginable. You need three files:1. A simple shell script2. Your MyMounts file3. A StartupParameters.plist filePut these in a folder, which we'll call MountNFS. If you're following along, you have the MyMounts file already, so that's done. Put it in the folder. Next, let's make the StartupParameters.plist file. This file just specifies a thing or two about how the startup item should run, and what messages it will generate. Copy and paste the following text into a plain text file, call it StartupParameters.plist, and save it to your MountNFS Folder:

<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE plist SYSTEM "file://localhost/System/Library/DTDs/PropertyList.dtd"><plist version="0.9"><dict> <key>Description</key> <string>Automount NFS</string> <key>Messages</key> <dict
>      <key>start</key>      <string>Mounting NFS</string>      <key>stop</key>      <string>Mounting NFS</string> </dict> <key>OrderPreference</key> <string>Late</string> <key>Provides</key> <array>      <string>AutomountNFS</string> </array> <key>Requires</key> <array>      <string>NFS</string> </array></dict></plist>

Finally, we need the shell script, which simply looks like this****:

#!/bin/sh

### Automount NFS Export##

. /etc/rc.commonConsoleMessage "Automounting NFS"rm -rf /homeautomount -m /mount /Library/StartupItems/MountNFS/MountNFSln -s /mount/home /home

Copy this text into a new plain text file, and save the file as MountNFS (it must be the same name as the folder, and it must not end with a .txt or any other file suffix). Make sure it is executable:
chmod 755 ~/Desktop/MountNFS/MountNFS

So, this folder, MountNFS, becomes your actual StartupItem. At this point, you probably want to have this thing run at startup, and the way to do that is to place the MountNFS folder in /Library/StartupItems. (If the StartupItems folder doesn't exist, create it.) You should also set permissions on the MountNFS folder and its contents as well since Tiger will complain (and then kindly fix things) if there are any errors. The permissions should be set so that the owner is root, the group is wheel, and (I think) permissions on all files can be 755, so:
sudo chown -R root:wheel /Library/StartupItems/MountNFS
sudo chmod -R 755 /Library/StartupItems/MountNFS

Once NFS and automount are working properly together, and this StartupItem is in place, all you need to do is reboot your client Mac. You should see your NFS share mounted at /home. (If Tiger complains that the permissions are wrong after the first reboot, tell it to "Fix" the problem and reboot again. It's just making sure the permissions are secure, and if all's good, your StartupItem should work ever after.)

Congratulations! That was the hard part.

On The MacServer
As I said, home accounts for our Mac users live on a RAID which is shared via NFS. Authentication is handled (at present) by a MacServer. Briefly, this is what happens when a user logs in to one of our Macs:
1. When the user types in her username and password, the information is sent to the MacServer, which authenticates the user.
2. The MacServer also specifies where the home account of the user is located on the client machine, in our case, the NFS mount point /home.
3. The client allows the user access to the workstation, and places them in the home directory specified by the MacServer, which, again, is our NFS mount point /home.

This involves a little voodoo on the MacServer. Our MacServer users have their home accounts set in a way slightly different than what is generally done on OSX Server. Usually the home accounts are set to AFP or NFS shares that reside on the MacServer and that get automounted by the client. In this scenario, three fields are populated in the Workgroup Manager's home account settings for any given user. Go to the Home tab for any user, and click the edit button (the one that looks like a pencil) to examine these fields. The first field specifies where on the server the home account lives. The second field specifies the name of the folder for the home account (usually just the user's name). The third field specifies where on the client the home account will mount. In our setup, there is no AFP or NFS share on the MacServer itself, so the first two fields are irrelevant. The only field we need to concern ourselves with is the third field -- the one that tells the client machine where to find the user's home account. And all we need to put here is the absolute path to the mount point of our NFS share, which, by our example, would be /home/username. (Subsequent users can have their home directories indicated by simply selecting the new home location that gets created after setting this up. The "username" is assumed by Workgroup Manger, and does not need to be added for each user.)

That's it. Done.

If you've got all this set up properly, you should be able to reboot your client and log in to your Mac as a networked user whose home account is actually located on an NFS share on another computer. It's what we do, and it works great. And it allows us to centralize our Mac and Linux home account locations. Windows is another story. But we're working on it.

* NOTE: These instructions are for Tiger client authenticating to Panther Server. If details change when we get Tiger Server, I'll post them here.

** There is a simpler, though less elegant way to do all this if you don't feel like creating your own StartupItem. You can edit the existing /System/Library/StartupItems/NFS/NFS script. To do this, add the line:
automount -m /mount_point /path/to/mount_map
at the end of the "Start the automounter" section. This may, however, cause problems in Tiger client as the mount may not show in the Finder. Symlinks can be created here, as they are in our script, to alleviate this problem. The other problem with this is that system updates may overwrite this edit, causing you to redo everything. So I strongly recommend the custom StartupItem method outlined above.

*** Clever readers may notice that this method precludes mounting an export in a top-level directory in /. Unfortunately, using automount, the only way I've gotten it to work is by mounting the share inside the directory specified in the command, so if you want your share at the top level of the file system -- i.e. in / -- you'll have to symlink it. This is what we do. It works fine in Tiger (in fact, in Tiger the initial mount point -- in this case /mount -- doesn't appear in the Finder), but we had problems with this in Panther. In Panther we just used the nested mount point and lived with it.

**** This is the script we use for our Tiger clients. Tiger will not reveal the original mount point specified in the script in the Finder, so we use a symlink to the mount point for our actual home location. This is why you see symlink creation in the script. The first line destroys the symlink before recreating it at boot. If this doesn't happen, a broken link could interfere with the script. And, BTW, the symlink method was unreliable in Panther.

Malcontent Comics Incorporated Presents:

the various and sundry projects of mike barron