Tiger Lab Migration Part 4: Spotlight Worries

Here's an interesting little gotcha. Not sure if it's a good thing or a bad thing yet.

Probably bad.

In our lab, home accounts are on a server. Or, more accurately, they're on an NFS mount that is shared from a network RAID. The way this works is fairly simple, but kind of tricky. The Macserver handles the authentication for network users who log into client workstations. The clients always have the NFS RAID mounted at /home. When users log in, Macserver specifies their home accounts as /home/username. To make sure the RAID is always mounted, we use a little startup item that's just a very simple automount script to call the RAID and mount it in /home.

There's one other little thing that's weird about our setup, and that's the way the RAID is configured. Our RAID is a very nice but proprietary system made by a company called Panasas. The way user accounts are created on the RAID is unique, and I don't fully understand it, as I did not set the RAID up, nor do I maintain it. But essentially, from my understanding, each user's home account on the RAID is a separate partition.

Does anyone see the problem here?

Well, I won't keep you in suspense. If you haven't figured it out, here's the problem -- and the more I think about it, the more I realize that it is a problem and not a boon: Whenever a user logs in, a new partition is mounted via NFS. And guess what happens then. You guessed it (or maybe you didn't): Spotlight starts indexing.

Holy fucking shit.

I have 207 users currently on the RAID. They have quotas between 2 and 7 gigs. And they're mostly completely and totally unaware of Spotlight and its idiosyncrasies. If one of them logs in and then, say, shuts down the machine (it will happen, trust me), the Spotlight index will get hosed and the machine will most assuredly begin acting flaky. Or how about this: What if a user logs in, checks his email, then logs out? Then what if another user logs in, does the same, and logs out? What if five users come along and do this? Now we've got Spotlight indexing five different network mounts on the same machine. What if that machine then gets rebooted before indexing is complete? I shudder to think. Or, what if a user logs in to a machine, indexing begins, she logs out and then logs into another machine? Now Spotlight is indexing the same mount point -- accessing the same database -- from two different machines. I'll say it again: Holy fucking shit.

This is a recipe for disaster.

Fortunately, I am an expert in the various methods for turning off Spotlight. And that's what we'll have to do: turn off Spotlight for all 207 mount points. This isn't really that big a deal. One simple command should do it (I hope). But it gets me thinking about all the various other Spotlight-related problems we're bound to encounter. For instance, our users do a lot of video, and they're encouraged to use firewire drives for this. Well, firewire drives are indexed as soon as they're mounted. I have no way to change this. What happens if indexing on a firewire drive is stopped (i.e. the user unmounts his/her drive) before it's complete? Now we have a hosed index on the user's firewire drive. Next machine he/she goes to will try to index the drive again, possibly completing the index, possibly not. And, during the indexing period, will performance drop to levels that do not permit video editing? I just don't know. But if it does, it's going to be a real problem. And just how do you educate 200+ users about this? It's way over most people's heads. This is a technology that is supposed to "just work." Unfortunately, in a multi-user, networked environment like ours, my worry is that it will "just break."

I'm feeling very hesitant about this migration. It wouldn't be the first time Apple's plans for the home user have come at the expense of the networked lab user. They often seem to forget about us, even though, in many ways, it's this sort of environment for which OS X is so great. Ironic. But if you think about it, one of the greatest features of Tiger -- Spotlight -- is completely useless in a networked environment. In fact, Spotlight is not even supposed to run on networked volumes. (Why it does on ours, I do not know, though I suspect it's because we're using NFS.) But the firewire thing is really disturbing, and I think really underscores the need for significantly more control over the behavior of Spotlight. If, in the Spotlight Preferences, there were a checkbox for "Disable Spotlight Indexing on External Drives," I'd be the happiest man alive right now.

As it is, I'm just plain worried.

UPDATE:
Another thought occurs to me: Okay, so I disable Spotlight on all 207 accounts. Well, what happens when we create a new account? Spotlight needs to be turned off for that account too. So basically, what this amounts to now, is a script that gets run at least every time a new user account is created -- possibly at every login, just to be safe -- that disables Spotlight on all mounted home accounts.
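Here's a rough sketch of what that login-time script might look like, in Python. It assumes every home account shows up as a directory directly under /home, and that `mdutil -i off <path>` is the command that switches indexing off for a mount. The function names and the dry-run default are made up for illustration, and none of this has been tried against the actual RAID.

```python
import os
import subprocess


def home_mounts(home_root="/home"):
    """Return the full path of every directory directly under home_root."""
    entries = sorted(os.listdir(home_root))
    paths = [os.path.join(home_root, name) for name in entries]
    return [p for p in paths if os.path.isdir(p)]


def disable_commands(mounts):
    """Build one 'mdutil -i off <path>' command per mount point."""
    return [["mdutil", "-i", "off", path] for path in mounts]


def disable_spotlight(home_root="/home", dry_run=True):
    """Print the commands (dry run) or actually run them (needs root)."""
    commands = disable_commands(home_mounts(home_root))
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.call(cmd)
    return commands
```

Calling `disable_spotlight("/home")` only prints what it would do; passing `dry_run=False` from a root-owned login hook would actually run the commands. Because it walks whatever is under /home at that moment, a newly created account gets picked up without anyone having to remember it.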

Oh joy.

Tiger Lab Migration Part 3: Radmind

Okay. So, Tiger client is working. Moreover, Tiger client seems to work with my Panther Mac Server. And I have a backup disk image of my working Tiger install, should anything go amiss. Time to start setting up Radmind.

The Logic: I have about thirty Mac systems to maintain in a lab set up for art students doing all manner of computer-based art, including: web design, graphics, video, audio, interactive authoring (from screen-based to installation art), and some 3D. Certain software -- like the operating system, for instance -- is installed on all the machines. Certain software -- like Max/MSP, for which we have only so many licenses -- is installed only on select machines. Also, some systems are workstations used by students for the creation of their work, but there are also staff machines which serve vastly different purposes and are set up very differently. This means we have multiple hardware/software configurations for the various systems in our department. Keeping these machines up to date can be very challenging. Not only do we need to keep tabs on which systems have which software, we also need to keep tabs on which systems have been recently updated and which ones are in need of being updated. In the past, this has meant keeping a database of system configurations, logging into or polling (via ARD or some such utility) systems to see which ones need updates, and, when updates are required, personally sitting at machines and running software updates by hand. This process is tedious, inefficient, and more importantly, quite error-prone: Do one thing differently on one machine, and you've suddenly introduced inconsistencies throughout the lab, with no way to track them, or even revert them should the need arise.

Clearly what is needed is a centralized system for software and OS update management and reversion, whereby changes to be made to workstations on the lab floor can be applied to a single system, tested, and then propagated to the appropriate systems during off-hours or scheduled maintenance times. Radmind is such a solution. And, miraculously, it's completely free.

Wow.

The Goal: We have several different configurations of Mac on the lab floor, and in the various staff offices. Before we start, let's outline them: We have Basic Workstations (BWs) with a basic (though still quite large) set of software; we have what I'll call Max Workstations (MWs), which are essentially the same as the BWs, but with Max/MSP added; we have Physical Computing Workstations (PWs), which have the Max/MSP config plus certain drivers required for programming Basic Stamp and the like; we then have Staff Workstations (SWs), which have a leaner software set overall, but which also have software not generally found on the public workstations; next, we have Audio Workstations (AWs), which have the basic set plus -- you guessed it -- audio software and drivers; and finally we have Video Workstations (VWs) that are set up like the Basics, but with a few video doodads to boot.

A quick note about the Audio and Video workstations: I share maintenance of these machines with our A/V SysAdmin. Essentially, he manages them, but I provide him with a baseline OS install (by way of some sort of cloned image) and advise him with regard to OS updates and the like; he installs, configures, manages and updates any audio- or video-specific software not found in my base config. This area of the lab will be tricky to control with Radmind, particularly the Audio Stations as they often require certain hardware to be available for the software to be installed. Also, since our A/V guy handles updates to those systems, and since I do not (and this is a good thing), using my Radmind system for A/V updates might prove tricky. I will save the A/V systems for last, and figure out how best to handle them later. Fortunately, Radmind allows for this sort of gradual implementation. Ultimately, though, I may leave them out of my Radmind setup.

The ultimate goal will be to set up one alpha workstation that has everything required for every configuration. Then Radmind can be used to create subsets of this uber-station (henceforth referred to as the Master Client or "MC"), for propagation to workstations with less than the maximum of software installed. The MC will be built around the configuration of the Physical Computing Workstations, as those systems have all the software needed on any other system.
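To make the "subsets of the MC" idea concrete, here's a small Python sketch that treats each configuration as a set of layers. The layer names are hypothetical labels of my own, not real Radmind loadset names, and only the three public configurations are modeled; the staff and A/V machines are left out, since how they'll be handled is still up in the air.

```python
# Layer names are hypothetical labels, not real Radmind loadset names.
CONFIGS = {
    "BW": {"base"},
    "MW": {"base", "max_msp"},
    "PW": {"base", "max_msp", "physcomp_drivers"},
}


def master_client(configs):
    """The Master Client needs the union of every layer any config uses."""
    layers = set()
    for needed in configs.values():
        layers |= needed
    return layers


def layers_to_strip(configs, name):
    """Layers to remove from the MC to arrive at the named configuration."""
    return master_client(configs) - configs[name]
```

With these three configurations the union comes out identical to the PW set, which is why building the MC around the Physical Computing Workstations works. `layers_to_strip(CONFIGS, "BW")` gives the two layers a Basic Workstation should not receive.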

The Process: The first step is to set up the Radmind server, which will be my admin box. That's really easy, and is done. It's simply a matter of downloading the Radmind software packages, and then running the Radmind Assistant application. These can be found here. In the Radmind Assistant I just set my system up as a server. That's all. Oh, and I made sure to set it to use Bonjour for discovery. This makes server discovery from the client a breeze. The client simply looks on the network, via Bonjour, for any Radmind servers, and when it finds them it gives you a list of available server IPs to choose from. Nice.

The next step is a bit more complex. It's time to build the MC. To begin, I am doing a fresh install of Tiger, running all current system updates (we're on 10.4.1 as of this writing), and then setting this base install up the way I want it. A surprising amount of stuff gets set at this stage: Network, Energy Saver, Sharing, QuickTime Pro (license and settings), Accounts, and Security preferences all get set here. I am setting up my two local admin accounts at this stage as well. Also, I'm setting up binding to my Macserver for authentication, as well as installing my custom NFS mount Startup Item for mounting our home account RAID. Finally, I will install Radmind, of course. What I want to end up with is a very basic, clean system that represents the bare minimum installation for running in the lab, with no third-party apps yet installed, and no customization of cron or any login scripts or anything, except Radmind, which should be set up as well, and should be part of the Base Install on the server so that it can create Radmind-controllable clones of itself, which can then be easily updated. This is my Base Install.

Once the Base Install is done, I will configure the machine to be a Radmind client that is controlled by the Radmind server. The MC doesn't know it's the MC, and it doesn't really have to. In fact Radmind doesn't even need to be aware of this. The concept of the MC is really for us humans. So the MC will be set up as a client, just like any of the other clients. Essentially, Radmind will then begin using this machine to set up lists of files. These lists are what's really important. The lists will be used to compare files on various clients. Additional clients will be updated based on these lists.

At least that's how it's supposed to work.

Failure: I spent an entire day attempting to set up my MC with Radmind. You need to do two things on the MC before you can really get down to business with Radmind: 1) create a negative transcript for the server to use, and 2) create a positive transcript for the server to use. These are the most basic, fundamental lists that the server uses to compare against clients. The negative transcript is a list of files that should not be propagated to clients, and the positive transcript is a list of files that should get propagated. For some reason, I had endless problems creating these transcripts. The first problem was a discrepancy between how the GUI application creates transcripts, and how it reads them, by default. The GUI app is set to "Begin transcript comparison from this path: / (slash)" whereas the default transcripts created by the application use ./ (dot slash) at the beginning of their file paths. So, right out of the box, the Radmind GUI fails horribly, and my first, vanilla, Radmind-built negative transcript generated all manner of error messages. Changing the defaults fixed it, and I was able to generate a usable negative transcript.
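The slash versus dot-slash mismatch is easy to check for once you know to look. Here's a minimal Python sketch that takes a plain list of paths (however you pull them out of a transcript; it makes no assumptions about the transcript file format itself) and reports the ones that don't match the comparison path's style. The function names are mine, not anything Radmind provides.

```python
def prefix_style(path):
    """Classify a path as 'relative' (./...), 'absolute' (/...), or 'other'."""
    if path.startswith("./"):
        return "relative"
    if path.startswith("/"):
        return "absolute"
    return "other"


def mismatched_paths(paths, comparison_root):
    """Return the paths whose prefix style differs from the comparison root."""
    # Treat a bare "." the same as "./".
    root = "./" if comparison_root == "." else comparison_root
    expected = prefix_style(root)
    return [p for p in paths if prefix_style(p) != expected]
```

With the comparison path set to `/` and a transcript full of `./` paths, every entry comes back as a mismatch, which matches the out-of-the-box behavior described above.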

The second problem... Well, I don't know what caused the second problem. Basically, I can't seem to generate a positive transcript that will verify without errors. All I'm trying to do is create the base-loadset.T transcript, using all the defaults in Radmind, and each time I do, on my server I get a list of positive transcripts with numbers like "994" appended to the file names, and my base-loadset.T file generates an unspecified error when I try to verify it on my server. I've tried this numerous times, and the same thing happens each time. Frankly, I'm sick of it. Each base-loadset.T creation takes upwards of half an hour, as the client must compare and then copy the entire loadset (essentially, all the files on the hard drive) to the server. Multiple failures at this stage are infuriating. But what's worse is that there seems to be no way to modify the configuration once it's been uploaded. Making a new loadset with the same name gives me the error "Loadset exists." So the only way to re-attempt loadset creation, or modify a loadset, is to erase it and start over. For a system that's all about monitoring and tracking changes to systems, this seems like a backwards approach. In any case, after a day of trying, I still have not successfully created a working positive transcript.

There are lots of problems with the Radmind implementation. One irksome problem is the inconsistency of just about everything in the application. (I'm talking about the GUI here.) For instance, running through the setup steps frequently yielded different results, both on the server and on the client -- sometimes I'd get errors, go back, repeat and get no errors; sometimes, after setting up my negative transcript, I'd be asked to set up my positive, sometimes I wouldn't. Also, the interface is ridiculously inconsistent: while running the setup steps, pressing the "Go Back" button does not take you back to the beginning of the setup steps. WTF? Maybe they should rename that button "Go Somewhere You've Never Been Before," because that's where it takes you. Another issue I faced was in altering a transcript: the latest version of the Transcript Editor completely garbles your transcript if you add an item. I had to use a previous version to add items to the list. And there's more: Adding a server to your server list in the Radmind preferences does nothing apparently. Even after doing this I was always queried for my server IP. Also, once added, a server cannot be removed from the list. There is a "remove" button, but it does nothing.

This is why Systems Admins should never design software.

It's been suggested that I try using Radmind from the command-line. I am tempted. But the problem is that the Radmind CLI environment and implementation are so complex that I'm liable to spend a week learning them, only to find that it still doesn't work. I've already been online reading various Radmind mailing lists, and people are having all manner of difficulty there as well, particularly going to Tiger. I just don't think, at this stage, it would be wise to continue with this plan when the product is clearly so problematic on so many levels.

Resignation: So, after all this planning and testing, I've decided not to use Radmind after all. My reasoning is basically twofold: 1) Radmind is supposed to make my life easier, not harder. Thus far it's only introduced complications to my life and to the process of administering my systems. And this is just in setting up my base system. What problems will I encounter when I start adding the many gigabytes of application files? Seems to me a product that is designed to simplify lab management should be fairly straightforward and easy to master. If it's not, what's the point? Using Radmind only adds an extra layer of complexity that I'm not even sure I really need. 2) Radmind is supposed to make updates and installs less error-prone and more consistent throughout the lab. But, again, the Radmind process itself is inherently error-prone, at least in my (and many others' online) experience. How can I rely on such a system for lab maintenance with any confidence at all? I simply don't trust it. And if Radmind breaks with each upgrade of Mac OS X (which it might or might not, I just don't know, but indications are that it does), then again, what's the point? For all my work, what do I get?

There's got to be a better way.

Seems to me like the ultimate Radmind solution, at least in GUI-land (or maybe even as a CLI solution -- why not?) would be something quite seamless to the admin. (And yes, I will now try to outline what I would like to see in a Radmind-like solution despite having said, not three paragraphs earlier, that Sys Admins should never design software.) I envision something like this: There is an interface called "Base Builder." Here you configure your base system, which would be your Master Client. On the MC you open Base Builder and tell it to use the "Current System" as your base install and it uploads everything to your server. I don't need to see a list of files at this point. Just build the damn system. Keep your lists to yourself, thank you. After the system is built, you can create your "Exclusions" from within the Base Builder app. This is real simple too. You just drop the folders you want to exclude into a GUI window, and then set the properties of each exclusion. Now you've altered your base install, so you get a window that says something like, "Base install has changed. Would you like to rebuild the base install on the server?" and a big, fat "Yes" button. Hit "Yes" and your changes are propagated to the server. Simple.

And when it's time to add applications or other layers to your base, you go to the "Layer Builder" interface. This is similarly easy to use. Here, you tell the app where to find your base install, or you can say "Use Current System." Layer Builder will take a snapshot of the base install and keep that snapshot. You'll install your apps, then tell Layer Builder to create the new layer from a comparison between the base install snapshot and the new system. Layer Builder will allow you to name your new layer, and then save a snapshot of the layer.

Finally, you'll have a "Sets Builder" interface. Here you can combine various sets of layers to create different configurations for different machines. This would have three panes: A "Layers" pane, a "Sets" pane and a "Computers" pane. You'd drag layers from the Layers pane into sets in the Sets pane to create your various configs. Then you'd drag sets to computer lists in the Computers pane. In the Computers pane you could run "Compare" to see differences between the actual computer's files and the files in the set. And if there were differences, you could propagate them to the client using something like, oh, I don't know, an "Update" button, let's say.

And that's it, basically. For advanced users, you could look at the file lists and make changes between the server and the MC and the various clients. This seems like a key missing feature in Radmind. The ability to change the base config, or any of its transcripts in any meaningful way seems to be absent. These sorts of changes require rebuilding everything, which takes a great deal of time and effort, and is error-prone. And isn't the whole point of this to make the process of lab maintenance easier and less error-prone?
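The snapshot-and-compare step at the heart of the imaginary "Layer Builder" is simple enough to sketch in Python. This is purely an illustration of the idea, not how Radmind does it: it hashes file contents only and ignores permissions, ownership, symlinks and everything else a real tool would have to track. The function names are invented.

```python
import hashlib
import os


def snapshot(root):
    """Map each file's path (relative to root) to a SHA-1 of its contents."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as handle:
                digest = hashlib.sha1(handle.read()).hexdigest()
            result[os.path.relpath(full, root)] = digest
    return result


def build_layer(base, current):
    """Compare two snapshots and describe the layer that separates them."""
    return {
        "added": sorted(p for p in current if p not in base),
        "changed": sorted(
            p for p in current if p in base and current[p] != base[p]
        ),
        "removed": sorted(p for p in base if p not in current),
    }
```

You'd call `snapshot()` on the base install, install your apps, call it again, and hand both results to `build_layer()`. The point is that the admin never has to look at the file lists unless they want to.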

I realize that what I've described is essentially what Radmind is and does. But Radmind does it in such an abstract and confusing way that the process becomes needlessly complex and defeats its own purpose. Something that companies like Apple understand -- and this is a large part of why I prefer the Mac platform -- is that good visual design and clear language can make an interface, or even a CLI app, a breeze to use, and that that is actually the point of GUI applications: To make a difficult and confusing process clear and intelligible. Radmind's visual interface is a mess, and its language is dizzyingly obscure. Here's a list of terminology, for example: negative transcript, positive transcript, command file, loadset, base loadset, overload, configuration. Here is a list of some of the files involved: negative.T, base-loadset.T, base.K. The files that end in .T are transcript files, which are lists of files belonging to a loadset. Get it? Of course not. Who would?

It's totally ridiculous.

Now that I've gotten that off my chest, I need to come up with a good way to proceed with my lab update. And although I am scrapping Radmind for now, I would still like to think of a way to ease future updates and remove the inconsistencies from the way I've done things in the past. There are a few options here. These involve disk images and databases for the most part. In any case, I will be giving this some serious thought as I move forward. But these issues will be the topic of a future article.

Tiger Lab Migration Part 2: Client/Server Interaction

As foretold, I've rebuilt my admin machine with fresh-from-the-factory Tiger. Nice. Things are going well.

I'm testing right now. Partly, I'm testing how Tiger does from a clean install. I'm also testing its interaction with my Panther Server. So far I have hit one, minor snag here, and it appears to be a bug in Tiger's Directory Access application.

In Panther Client's version of DA, the LDAPv3 configuration panel was set up by double-clicking the "LDAPv3" entry under "Services," then clicking "New..." at the bottom of the drop-down panel, and entering your server info in the available fields of the new entry. The Tiger version is slightly different: In Tiger, pressing "New..." brings up a dialogue box called "New LDAP Connection." Here you can enter information about how you want your client to use LDAP (i.e. for contact, or authentication, and whether or not to use SSL encryption). At the bottom of this dialogue is a "Manual" button which allows you to set up the panel the old, Panther-style way. Being old-school, I chose "Manual."

Silly me.

Turns out, setting up the server binding with the "Manual" button works, but the settings don't survive a restart. I tried this numerous times, and it would initially work, immediately binding to the server (which, by the way, is still Panther, and this may be part of the problem, but I seriously doubt it). But after restarting, though the entry would still exist in DA, the binding would be broken. No authentication, no computer management, nothing. The way to get it to work is to use the new, and supposedly improved, dialogue that pops up when you press "New..." in the LDAPv3 configuration panel -- the aforementioned "New LDAP Connection" window. Using this method to set up binding to my Mac Server worked. The new entry, once created in this manner, can be edited after the fact if need be. And best of all, the binding survives a reboot of the client.

Actually, the new dialogue does have one really nice thing going for it: It sets up the server paths in "Authentication" and, if you tell it to, in "Contacts." You used to have to set these up manually, in separate steps, under the "Authentication" and "Contacts" panes, but now, if you check the appropriate checkboxes, DA does it all for you from the "New LDAP Connection" panel. Nice.

Well, it would be nice, if the "Manual" method worked. Still, it's better than a kick in the teeth.

So, with authentication working between Tiger client and Panther Server, I'm over halfway there. In addition to authentication, preference management (things like Login Items and such) appears to be working. The last thing on the client/server relationship checklist is services, mainly printing services. I'll be kicking this around for a bit, and then I'll be starting my Radmind work.

Just a preview of that: To start, I will install Radmind on my admin box and set it up as a Radmind server. Once that's all up and running (and I should really do some checking up to make sure Radmind is Tiger compatible), the odious task of setting up the master client from scratch will begin. To reiterate, this will be a machine that has everything that I plan to install on the various workstations on the floor. From this, sets will be made for each hardware/software configuration. Each change to the master client will have to be tracked by the server, starting with the base install, and building loadsets (overloads?) as software is added. This will take some time and patience, but it should be worth it in the end.

Let's hope.

So, when next we meet, I will be setting up Radmind. I will do my best to be as detailed and comprehensive in the documentation of this process as I possibly can be.

Oh yeah, and one other thing before I install Radmind: I'm cloning my working Tiger install. 'Cause you never know.

UPDATE 1:
I have begun building my Master Client machine. The problem of server/client binding failing after reboot is now occurring on this machine, and using the "New LDAP Connection" window to set it up does not seem to work in this case. Thus far, I have been unable to bind my Master Client to the Macserver in a way that the binding survives a restart. This would seem to be the salient error in the system log:
Jun 27 19:43:45 systemsBoyMac /System/Library/CoreServices/mcxd.app/Contents/MacOS/mcxd: DSOpenNode(): dsOpenDirNode("/LDAPv3/192.168.1.10") == -14002

Simply opening the Directory Access application, authenticating, and opening the LDAPv3 configuration panel will bind the client to the server. The message that gets written to the log in this case is:
Jun 27 19:50:02 systemsBoyMac /System/Library/CoreServices/mcxd.app/Contents/Resources/MCXCacher: CacheUser(0, systemsboy) == -14136
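When you're rebooting over and over and comparing logs, it helps to pull out just the call and the error code. Here's a small Python sketch that does that for lines shaped like the two above. The regular expression is only a guess based on these two samples, and it makes no attempt to say what the codes mean.

```python
import re

# Matches 'SomeCall(args) == -12345' at the end of a log line.
RESULT_PATTERN = re.compile(r"(\w+)\(([^()]*)\) == (-?\d+)\s*$")


def parse_result(line):
    """Return (call, args, code) from a log line, or None if it doesn't match."""
    match = RESULT_PATTERN.search(line)
    if match is None:
        return None
    call, args, code = match.groups()
    return call, args, int(code)
```

Fed the first line above, it returns the `dsOpenDirNode` call with the code -14002; fed the second, `CacheUser` with -14136. Anything that doesn't end in that `call(...) == number` shape comes back as `None`.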

I don't get it. I'll post when I find the solution.

UPDATE 2:
Deleted the /Library/Preferences/Directory Access directory -- which contains the preferences for, obviously, Directory Access -- rebooted (twice) and was able to login as a network user both times. I'd done this before. The difference this time was that I logged in as a network user first, before logging in as a local user. I'm making a third attempt this very moment. The computer is rebooting... Now logging in as a local user... Good... Now as a network user... No luck! The login screen shakes its head at me. It would appear that setting up binding, rebooting, then logging in as a local user will break the binding. Man. That's fucked up.

More to come...

UPDATE 3:
So now I've trashed the DA prefs again. But when I launch the DA application, it's still set up with the old binding config. So apparently, there's some pretty nasty caching going on. I'm trashing all prefs in /Library now, and clearing the mcx_cache from NetInfo. Rebooting. Setting up the Network. Setting up DA. Rebooting. Now I'm bound, and network logins work. What a blast. I hate this shit. Trying again -- rebooting, logging in as a network user, success. One more time, this time local user logs in first -- reboot, wait... Binding is broken... WTF!? I must admit, I'm stumped.

Will keep you posted...

UPDATE 4:
Okay, now this is too bizarre. There are telltale signs that the client is not bound. One of them is that the "Other..." button does not appear in the list of users at the login window; there are only local users in the list. The other is that the server does not appear in /Network/Servers. When the client is bound, you will see a network mount (or actually, a symlink to the network mount) in this directory. So I'm poking around in the Terminal on my client, with the /Network/Servers window open, and I'm looking at logs and whatnot, and all of a sudden I see the server mount point show up. Out of nowhere. And it occurs to me, maybe the binding is just slow. So I reboot the client and just leave it at the login screen. After about three minutes, the "Other..." button shows up. Just pops right up. Client is bound.

I can reproduce this on two machines now, but I can't explain why it takes so long, and why this behavior is so inconsistent. My server is at 10.3.8 and has some strangeness about it. I will be looking at it to see if these problems are perhaps now manifesting themselves more obviously with Tiger clients. I will also consider upgrading the server to 10.3.9, as there are apparently changes to the LDAP schema that were made with that update that were in preparation for Tiger.
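To put an actual number on "about three minutes," something like the following Python sketch could be left running after a reboot. It assumes the server's entry appearing under /Network/Servers is a fair stand-in for "the client is bound," which is only my observation from above, not anything documented. The function names, the timeout and the polling interval are all arbitrary.

```python
import os
import time


def wait_for(check, timeout=300.0, interval=5.0,
             clock=time.time, sleep=time.sleep):
    """Poll check() until it returns True or timeout seconds pass.

    Returns the seconds waited on success, or None on timeout.
    """
    start = clock()
    while True:
        if check():
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)


def server_is_visible(server_name, servers_dir="/Network/Servers"):
    """True once the server's entry has appeared under servers_dir."""
    # lexists() also counts a symlink whose target isn't reachable yet.
    return os.path.lexists(os.path.join(servers_dir, server_name))
```

For example, `wait_for(lambda: server_is_visible("pantherServer"))` returns roughly how many seconds it took for the entry to show up, or `None` if it never did within five minutes. Logging that across a few reboots on both machines would show whether the delay is really consistent.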

I'll let you know what I find...

UPDATE 5:
Another anomaly: Logging in to a bound client and navigating to /Network/Servers and selecting the server automount (we have a home account automount) causes the Finder to beachball indefinitely. Relaunching the Finder kills the beachball, but the network mount is broken (i.e. there is an empty mount point), and lots of automount errors in the system log. Oy! Time to fix/update the server.

I am now cloning my Macserver in preparation for moving to 10.3.9. Hopefully this will at least resolve the automount issues, and maybe even the slow binding issues. Of course, the ultimate solution will be to migrate to Tiger server. Still no idea when I'll be getting my disks, but when I do, there's a great article on the migration process at AFP548.

UPDATE 6:
I have updated my Panther Server to 10.3.9. Though the upgrade went smoothly, and I have experienced no problems with it, it has not solved my Tiger client problems, which are, to reiterate:
1. Binding to the server from a Tiger client takes approximately three minutes after reboot to occur. So there is a three minute period after a reboot during which a network user on a Tiger client cannot login. Suddenly, after three minutes or so, he/she can.
2. Network home account mounts from the Panther server do not automatically mount on the Tiger client. They should, and they do in Panther client. Here's the error message from the Tiger client system log:
Jun 30 16:42:18 systemsBoyMac automount[241]: Can't mount pantherServer:/Volumes/FlashDeveloper on /private/Network/Servers/pantherServer/Volumes/FlashDeveloper: Authentication error (80)

It's looking more and more like I'm going to have to install Tiger server before I can proceed much further.

Shit...

UPDATE 7:
Well, this is turning out to be more fun than a barrel of monkeys.

Finally got the network home account mounting and the user authenticating. It is not an automount problem, but rather an authentication problem. Apparently, Tiger client cannot login to Panther server if the user's password is of type "Open Directory." Crypt passwords work fine. (I knew there was a reason I wanted to stay with crypt, but nooo, Apple said "Use Open Directory passwords. They're better." Yeah right.)

I'm on my way to the Apple Discussions to see what I can dig up.

I'll let you know...

UPDATE 8:
(Hey, that rhymes.)

So I changed the password type of my network user to "Crypt" and could suddenly log in, right? Now here's something weird: I changed that same user's password back to "Open Directory," and guess what? It worked.

This would seem to indicate something screwy with the password server, but I'm hard pressed to say what it is. In any case, this presents a real problem for me, as I have 50+ users with Open Directory passwords, and the way things are right now, they're not going to work in Tiger. The only way for me to change them is to get all 50 users to come in and change their passwords in Workgroup Manager, and that's just unacceptable to me.

This is why I long for a utility that lets OD admins change user password types without having to reset the password. This doesn't seem like a stretch to me, nor does it seem unreasonable, particularly if you have a lot of users, which I do. Because now I'm stuck with 50 or so users with Open Directory passwords that just won't work, and the only way to fix this is to reset their passwords, when really, all I want to do is change the password type to "Crypt," and all the data I would need to do that (i.e. the passwords) is there on the server. It's just inaccessible by the admin. So fine, give me a utility that lets me change the password type without resetting the password. I don't need to see the password to do this; the utility can do it. I just want to make the change, and I can't, and that's Bad. (And, BTW, the reverse could be true as well: What if you have 300 crypt-style users and you want to change them to Open Directory passwords? As it stands, I guess you're going to have a mighty long line outside your door.)

Anyway, from what I've read at the Apple Discussions, this sounds like it might be a problem not just with Panther servers, but with Tiger servers as well. I'm betting these are all upgraded or migrated servers, and that fresh Tiger (and maybe even Panther) server installs work just dandy, which is why only some people are experiencing problems.

Anyway, this is endlessly annoying. I'm done for the night...

Seeing the (Spot)Light

Well, as you can guess from my last article, and from the title of this one, I'm starting to see the potential, at least, of Spotlight. Initially, I was pretty down on it. Perhaps because I was expecting so much from it. I don't think I was alone in that regard. I think for a lot of people it was like going to see a movie that everyone said was incredible. They raved and raved about it forever, and by the time you got around to seeing it, you couldn't help but be let down. That's how Spotlight was for me: Its bugginess stood out like a sore thumb against the backdrop of hype and praise, and all I could do was hate it.

And, of course, turn it off.

So now I have my PowerBook (an old-ish 867MHz G4 Titanium) running 10.4 sans Spotlight. And it's fine. But in light (no pun intended) of my recent revelation, I wanted to give Tiger, and Spotlight, another try, and approach things a bit differently this time. For this attempt I have done the hard thing. I have wiped my G5. My G5 at work. A Dual 2.0 with 2.5 GB of RAM. By all accounts, a decent machine. Those two factors -- wipe and G5 -- have made a world of difference.

Up front I have to say, I still have not seen the much-touted performance gains of Tiger -- in fact, even on my G5, Tiger seems slower in many regards (opening/typing in emails, alphabetizing newly created folders, weird little things like that) -- and there are still plenty of Tiger and Spotlight bugs. But after performing an "Erase and Install" on my G5 and installing and updating Tiger from scratch, things are definitely better. Primarily in Spotlight.

Now, I have a lot of data. And I really think that the people who are saying they have no problems with Spotlight, and that it finds their files "instantly," just don't have very much data, particularly in the way of hundreds of thousands of small files, and more particularly, hundreds of thousands of small files that contain text. These are the files that, I think, really bring Spotlight almost to its knees. For instance, I just searched the term "html." 41,300 or so hits were found. That took about 5 seconds. That's really not bad at all, but it's also nowhere near "instant." Nowhere near. Searching the term "doc" yielded around 160,000 and took about 15 seconds. It looks like, on my system, it takes Spotlight on average about 1 second to search 10,000 documents, give or take, depending on the most prevalent types of documents and their content. Wow. That's really good. But when I hear "instantly" and I get "15 seconds," well who wouldn't be disappointed? In any case, the major problems I was initially having with Spotlight locking up, beachballing, and being a general pain in the ass, seem to be mostly gone with this new install. And that's good. And I can say I'm much more inclined to see how great Spotlight is and will be.
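
For anyone who wants to check that "1 second per 10,000" estimate, here's the back-of-the-envelope arithmetic as a tiny script. The numbers are just the rough hit counts and timings quoted above, and it measures hits returned per second, which is only a stand-in for however many documents Spotlight actually had to consult.

```python
# Search timings quoted above: (term, approximate hits, approximate seconds).
timings = [("html", 41_300, 5), ("doc", 160_000, 15)]

def hits_per_second(hits, seconds):
    """Rough throughput: hits returned divided by elapsed seconds."""
    return hits / seconds

for term, hits, seconds in timings:
    rate = hits_per_second(hits, seconds)
    print(f"{term}: about {rate:,.0f} hits per second")
```

That works out to about 8,260 hits per second for "html" and about 10,667 for "doc," so "10,000 a second, give or take" is in the right ballpark.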

(Though I will say that a recent search, which yielded numerous results, and which I chose to view in Spotlight's "Show All" window, completely took over the processor for a few minutes when I flipped down one of the long lists of files. Scrolling became impossible. I could not close the window. There are still some serious performance problems with Spotlight, if only from time to time. But I suppose that's to be expected at this stage.)

The other thing that's helped me understand the usefulness of Spotlight is actually coming up with a reason for using it. Initially I had no real need for it, or at least I didn't think I did. But the other day I was looking for something in a script that did a particular something or other. Problem was I had no idea what script I had put that something-or-other line in. So I'm looking through nested folder upon nested folder, trying to decipher my own script names, and realizing that it was all futile because I had no idea what the script was about, only that it had a line in it that used such-and-such a command. And that's when it hit me: how great would it be if I could search inside my scripts? And then it hit me again: Spotlight!
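
To be clear about what "searching inside my scripts" means, here's a minimal sketch of the brute-force version. This is not how Spotlight does it (Spotlight consults an index it builds ahead of time); it just opens every file under a folder and looks for a string. The function name is made up for illustration.

```python
import os

def find_scripts_containing(root, needle):
    """Walk root and return the sorted paths of files whose text contains needle."""
    matches = []
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(folder, name)
            try:
                # Decode leniently so a stray binary file doesn't stop the walk.
                with open(path, "r", encoding="utf-8", errors="ignore") as handle:
                    if needle in handle.read():
                        matches.append(path)
            except OSError:
                continue  # unreadable file; skip it
    return sorted(matches)
```

With a hypothetical scripts folder, calling `find_scripts_containing(os.path.expanduser("~/Scripts"), "rsync")` would list every script that mentions rsync. Reading every file on every search gets slow as the folder grows, and that's exactly the cost an index is supposed to take away.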

Shit!

So now that I have a reasonably well-working install of Tiger, and a good reason to use Spotlight, I'll be testing and using them on a daily basis. So far (it's only been a day, really) things are good. They could be better, but I'll save the complaints for another day. I would like to say, though, on a final note, that the one thing Spotlight really needs is more customization options. The above is a good example of when I'd really like to use Spotlight, but I don't always want to use it. Sometimes I prefer a simple name-search. And I'd really like to be able to toggle between the two. Easily! (Maybe via a dropdown a-la the old Panther-style Finder window search?) So yeah, customization would be real nice. We all know this. Apple surely knows it as well. Hopefully it's something they'll act on quickly, like during the 10.4 product cycle (like now), and not after Leopard finally arrives. I would really hate to see Spotlight go the Sherlock route of neglect.

In the meantime, I'll be eagerly awaiting the 10.4.2 update. Here's hoping it's a good one.

Spotlight vs. Google: Metaphor Smackdown

I just had a long, really interesting, feisty conversation with a fellow systems geek (though, by his account, he is the geek and I am the nerd) about Spotlight, and partially about the Spotlight/Google metaphor that's floating around various sectors of the internet, and that's captured the minds and hearts of Mac and Google fans alike. You know: "Spotlight is like Google for your hard drive." (Okay, it's a simile, but whatever.)

My argument was that the comparison is inherently flawed. That searching your hard drive is fundamentally different than searching the internet. His argument was that they're the same, it's just the presentation that's different. Walking home I had time to consider the whole matter on my own, and I have to say, I may be starting to come around.

Basically, what I was saying was that when you're searching your hard drive, you're searching for files. And not just any files, but (usually) your own files. Files you created. Files you filed. Files you're fairly intimately familiar with. Whereas, when searching the internet, you're actually not searching for files at all. You're searching for content. And searching for content is different than searching for files. It's a fundamentally different way of approaching a search. On your hard drive you generally think to yourself, "I am looking for the file called 'smith' in my 'Documents' folder." On the internet you think, "I am looking for information about smith."

Now, when you search your hard drive using Spotlight, you are essentially searching the content of files (and, actually, a whole lot of other information about those files, but let's forget about that for the moment). So, in that respect, I will concede, it is quite a bit like doing a Google search. The main difference, or the crucial difference really, is, as my friend said, the presentation of the results. Google gives much more useful content-oriented information, in the form of those little, pertinent text passages at the bottom of each search result, than Spotlight does. Spotlight presents content-based search results the same way it presents file-based search results, usually by kind primarily, and by name secondarily, but unfortunately never, ever by content. You can't even see the content of these content-based search results without opening the file. If Google worked that way it would be worthless.
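
Here's roughly what I mean by those little text passages, as a minimal sketch. It's purely illustrative: the function name and the radius parameter are made up, and it says nothing about how Google or Spotlight actually build their results.

```python
def snippet(text, term, radius=30):
    """Return a short excerpt of text around the first match of term, or None."""
    position = text.lower().find(term.lower())
    if position == -1:
        return None
    # Keep up to `radius` characters on either side of the match.
    start = max(0, position - radius)
    end = min(len(text), position + len(term) + radius)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    # Flatten line breaks so the excerpt fits on a single result line.
    return prefix + text[start:end].replace("\n", " ") + suffix
```

Show one of those excerpts under each file name in a results list and you have the content-oriented presentation in miniature: you can tell whether a hit is the right one without opening the file.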

The heart of my original argument was that that's not how we're used to thinking about a local search, so it's inappropriate to discuss it that way. And therein lies my shortsightedness and my friend's and Apple's (and probably a lot of other folks') genius. Their contention is that we should think of local file searches differently, even if some of us don't quite get that yet. (I'm slow. I admit it.) And one useful way of thinking about local searches is the Google approach. I finally got this on my way home, when I began to think about what my friend had said about presentation. I started imagining local search results presented the way Google presents internet search results, and I got it. And I said to myself, "Yes. That would be awesome."

I do still have some problems with Spotlight, and I realize now that many of them revolve around that idea of presentation. I mean, it seems a bit narrow to be able to search by all sorts of criteria (name, kind, content, etc.) and not really be able to organize the results by said criteria. Also, Spotlight always searches by all criteria. Wouldn't it be great if you could search by specific criteria only? Or combinations of criteria? And I guess you can, to some extent, with the standard "Find" command or with the weird boolean syntax in the Spotlight menu-bar. And that's great. But still, despite the fact that you can search by content, you can never display your results by content, and that's a problem, in my mind. Seems to me that when the Spotlight developers designed the Spotlight search-results interface, they were having the same problem making the conceptual leap that I was having. They missed the Google/content thing entirely.

But all the problems with Spotlight can, and probably will, be resolved when Spotlight becomes customizable. Which it will. It's a beast. A good and loyal beast, but a beast nonetheless, and it could use some reins to really make it the incredibly useful thing it's bound to someday be. And I truly hope that some of that future customizability will revolve around the idea of search-results presentation. Because my friend was right. That's the key, or at least a big part of the key.

And you know? He's right about one other thing too: I am a nerd.

Shit. I hate that guy.