Seeing the (Spot)Light

Well, as you can guess from my last article, and from the title of this one, I'm starting to see the potential, at least, of Spotlight. Initially, I was pretty down on it. Perhaps because I was expecting so much from it. I don't think I was alone in that regard. I think for a lot of people it was like going to see a movie that everyone said was incredible. They raved and raved about it forever, and by the time you got around to seeing it, you couldn't help but be let down. That's how Spotlight was for me: Its bugginess stood out like a sore thumb against the backdrop of hype and praise, and all I could do was hate it.

And, of course, turn it off.

So now I have my Powerbook (an old-ish 867MHz G4 Titanium) running 10.4 sans Spotlight. And it's fine. But in light (no pun intended) of my recent revelation, I wanted to give Tiger, and Spotlight, another try, and approach things a bit differently this time. For this attempt I have done the hard thing. I have wiped my G5. My G5 at work. A Dual 2.0 with 2.5 GB of RAM. By all accounts, a decent machine. Those two factors -- wipe and G5 -- have made a world of difference.

Up front I have to say, I still have not seen the much-touted performance gains of Tiger -- in fact, even on my G5, Tiger seems slower in many regards (opening/typing in emails, alphabetizing newly created folders, weird little things like that) -- and there are still plenty of Tiger and Spotlight bugs. But after performing an "Erase and Install" on my G5 and installing and updating Tiger from scratch, things are definitely better. Primarily in Spotlight.

Now, I have a lot of data. And I really think that the people who are saying they have no problems with Spotlight, and that it finds their files "instantly," just don't have very much data, particularly in the way of hundreds of thousands of small files, and more particularly, hundreds of thousands of small files that contain text. These are the files that, I think, really bring Spotlight almost to its knees. For instance, I just searched the term "html." 41,300 or so hits were found. That took about 5 seconds. That's really not bad at all, but it's also nowhere near "instant." Nowhere near. Searching the term "doc" yielded around 160,000 and took about 15 seconds. It looks like, on my system, it takes Spotlight on average about 1 second to search 10,000 documents, give or take, depending on the most prevalent types of documents and their content. Wow. That's really good. But when I hear "instantly" and I get "15 seconds," well, who wouldn't be disappointed? In any case, the major problems I was initially having with Spotlight locking up, beachballing, and being a general pain in the ass, seem to be mostly gone with this new install. And that's good. And I can say I'm much more inclined to see how great Spotlight is and will be.
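For the curious, here is the back-of-the-envelope arithmetic behind that "1 second per 10,000" figure, as a tiny Python sketch. The hit counts and timings are just my rough numbers from the two searches above, and the function name is made up for illustration.

```python
# Approximate figures from the two searches described above.
searches = {
    "html": (41_300, 5),    # (hits, seconds)
    "doc": (160_000, 15),
}

def hits_per_second(hits, seconds):
    """Return the rough number of hits returned per second of searching."""
    return hits / seconds

for term, (hits, seconds) in searches.items():
    rate = hits_per_second(hits, seconds)
    print(f"{term}: about {rate:,.0f} hits per second")
```

Both searches land somewhere in the neighborhood of 8,000 to 11,000 hits per second, which is where the "give or take" comes from.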

(Though I will say that a recent search, which yielded numerous results, and which I chose to view in Spotlight's "Show All" window, completely took over the processor for a few minutes when I flipped down one of the long lists of files. Scrolling became impossible. I could not close the window. There are still some serious performance problems with Spotlight, if only from time to time. But I suppose that's to be expected at this stage.)

The other thing that's helped me understand the usefulness of Spotlight is actually coming up with a reason for using it. Initially I had no real need for it, or at least I didn't think I did. But the other day I was looking for something in a script that did a particular something or other. Problem was I had no idea what script I had put that something-or-other line in. So I'm looking through nested folder upon nested folder, trying to decipher my own script names, and realizing that it was all futile because I had no idea what the script was about, only that it had a line in it that used such-and-such a command. And that's when it hit me: how great would it be if I could search inside my scripts? And then it hit me again: Spotlight!
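To make the idea concrete, here is a minimal Python sketch of what "searching inside my scripts" amounts to. To be clear, this is not how Spotlight works (Spotlight keeps an index, this just brute-forces through every file), and the function name and the list of script extensions are my own assumptions for illustration.

```python
import os

def find_files_containing(root, needle, extensions=(".sh", ".pl", ".py")):
    """Return the sorted paths under root whose text contains needle."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Only look at files that appear to be scripts.
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="ignore") as handle:
                    if needle in handle.read():
                        matches.append(path)
            except OSError:
                continue  # unreadable file: skip it
    return sorted(matches)
```

Calling something like find_files_containing("/path/to/scripts", "hdiutil") would list every script that mentions that command, no matter what cryptic name I gave the file.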

Shit!

So now that I have a reasonably well-working install of Tiger, and a good reason to use Spotlight, I'll be testing and using them on a daily basis. So far (it's only been a day, really) things are good. They could be better, but I'll save the complaints for another day. I would like to say, though, on a final note, that the one thing Spotlight really needs is more customization options. The above is a good example of when I'd really like to use Spotlight, but I don't always want to use it. Sometimes I prefer a simple name-search. And I'd really like to be able to toggle between the two. Easily! (Maybe via a dropdown a-la the old Panther-style Finder window search?) So yeah, customization would be real nice. We all know this. Apple surely knows it as well. Hopefully it's something they'll act on quickly, like during the 10.4 product cycle (like now), and not after Leopard finally arrives. I would really hate to see Spotlight go the Sherlock route of neglect.

In the meantime, I'll be eagerly awaiting the 10.4.2 update. Here's hoping it's a good one.

Spotlight vs. Google: Metaphor Smackdown

I just had a long, really interesting, feisty conversation with a fellow systems geek (though, by his account, he is the geek and I am the nerd) about Spotlight, and partially about the Spotlight/Google metaphor that's floating around various sectors of the internet, and that's captured the minds and hearts of Mac and Google fans alike. You know: "Spotlight is like Google for your hard drive." (Okay, it's a simile, but whatever.)

My argument was that the comparison is inherently flawed. That searching your hard drive is fundamentally different than searching the internet. His argument was that they're the same, it's just the presentation that's different. Walking home I had time to consider the whole matter on my own, and I have to say, I may be starting to come around.

Basically, what I was saying was that when you're searching your hard drive, you're searching for files. And not just any files, but (usually) your own files. Files you created. Files you filed. Files you're fairly intimately familiar with. Whereas, when searching the internet, you're actually not searching for files at all. You're searching for content. And searching for content is different than searching for files. It's a fundamentally different way of approaching a search. On your hard drive you generally think to yourself, "I am looking for the file called 'smith' in my 'Documents' folder." On the internet you think, "I am looking for information about smith."

Now, when you search your hard drive using Spotlight, you are essentially searching the content of files (and, actually, a whole lot of other information about those files, but let's forget about that for the moment). So, in that respect, I will concede, it is quite a bit like doing a Google search. The main difference, or the crucial difference really, is, as my friend said, the presentation of the results. Google gives much more useful content-oriented information, in the form of those little, pertinent text passages at the bottom of each search result, than Spotlight does. Spotlight presents content-based search results the same way it presents file-based search results, usually by kind primarily, and by name secondarily, but unfortunately never, ever by content. You can't even see the content of these content-based search results without opening the file. If Google worked that way it would be worthless.
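Here is a rough Python sketch of the kind of presentation I mean: each hit shown with a little passage of the text around the match, Google-style. This is purely an illustration of the idea, not anything Spotlight or Google actually does under the hood, and the function name, the width parameter, and the sample documents are all invented.

```python
def snippet(text, term, width=30):
    """Return the text around the first case-insensitive match of term, or None."""
    index = text.lower().find(term.lower())
    if index == -1:
        return None
    start = max(0, index - width)
    end = min(len(text), index + len(term) + width)
    # Collapse line breaks and runs of spaces so the excerpt fits on one line.
    excerpt = " ".join(text[start:end].split())
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + excerpt + suffix

# Made-up documents standing in for files on a hard drive.
documents = {
    "notes.txt": "Meeting with Smith on Tuesday about the lab budget.",
    "todo.txt": "Buy more firewire drives.",
}

for name, text in documents.items():
    excerpt = snippet(text, "smith", width=15)
    if excerpt is not None:
        print(f"{name}: {excerpt}")
```

The point is that the result line carries some of the content with it, so you can tell whether a hit is the one you want without opening the file.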

The heart of my original argument was that that's not how we're used to thinking about a local search, so it's inappropriate to discuss it that way. And therein lies my shortsightedness and my friend's and Apple's (and probably a lot of other folks') genius. Their contention is that we should think of local file searches differently, even if some of us don't quite get that yet. (I'm slow. I admit it.) And one useful way of thinking about local searches is the Google approach. I finally got this on my way home, when I began to think about what my friend had said about presentation. I started imagining local search results presented the way Google presents internet search results, and I got it. And I said to myself, "Yes. That would be awesome."

I do still have some problems with Spotlight, and I realize now that many of them revolve around that idea of presentation. I mean, it seems a bit narrow to be able to search by all sorts of criteria (name, kind, content, etc) and not really be able to organize the results by said criteria. Also, Spotlight always searches by all criteria. Wouldn't it be great if you could search by specific criteria only? Or combinations of criteria? And I guess you can, to some extent, with the standard "Find" command or with the weird boolean syntax in the Spotlight menu-bar. And that's great. But still, despite the fact that you can search by content, you can never display your results by content, and that's a problem, in my mind. Seems to me that when the Spotlight developers designed the Spotlight search-results interface, they were having the same problem making the conceptual leap that I was having. They missed the Google/content thing entirely.

But all the problems with Spotlight can, and probably will, be resolved when Spotlight becomes customizable. Which it will. It's a beast. A good and loyal beast, but a beast nonetheless, and it could use some reins to really make it the incredibly useful thing it's bound to someday be. And I truly hope that some of that future customizability will revolve around the idea of search-results presentation. Because my friend was right. That's the key, or at least a big part of the key.

And you know? He's right about one other thing too: I am a nerd.

Shit. I hate that guy.

Tiger Lab Migration Part 1a: Snags!

Snags! I've hit snags! And in the most basic part of this migration.

I don't know what it is, but it sure seems like any time I want to do something ambitious I end up having the weirdest problems. This lab migration is no exception. My first step is cloning my system drive, in case I need to revert back to the previous working state for some reason. A tedious but necessary precaution. And drop-dead simple, right? I mean, how many drives have I cloned in my lifetime? Well, I don't know for sure, but I lost count somewhere around twelve-bajillion or so. And how many times have I had a problem with it? Maybe three. And usually it was because I was doing something stupid.

But today, of all days, when I'm finally ready to take the plunge, wipe my system and install the dreaded Tiger, I find myself unable to successfully clone my drive to a disk image. Every time (and I've tried Carbon Copy Cloner and Apple's Disk Utility) I get the same error message telling me that the disk image/folder is too big. Too big for what? I'm cloning a 13 GB volume to a 230 GB hard drive. Seems like I should have plenty of space. The error occurs, in all cases, right after the initial sparse image is created, and right before the ASR scan/conversion begins.

I tell you, I'm stymied.

I'm on my fourth attempt at this point, and this time, rather than being booted from the root drive, I'm mounting the system in firewire target disk mode, and cloning on a separate system. (See? The advantages of having a lab full of computers. Nice.) My reasoning here is that I'm worried that there's something terribly wrong with my boot drive, and it's confusing the hell out of hdiutil. (Did I mention, the error message is from hdiutil?) So I figured I'd try from a presumably happy, healthy boot drive and see what happens.

I've been doing this all day, and it's getting pretty old. So, while my attempts at cloning run I am also:
1) Writing this blog (obviously)
2) Reading other blogs
3) Installing Red Hat Linux on a Windows box.

I am surrounded by progress bars, and yet I can't seem to make any progress.

Systems work can sure be frustrating sometimes.

Oh well.
_______
Update 1:
Argh! It happened again! Below is a screen capture of the error message.

Also, SprintPCS has been down for two days doing "maintenance." Probably removing cool features and replacing them with shitty ones. This seems to be a general trend in the industry. But I'll tell you, if I took so long to do maintenance, I'd be fired. What this means for me, of course, is that I cannot upload any photos.

Apparently, I can't do anything today. Maybe I should just go home.

Again I say, Argh!

_______
Update 2
Well, I've figured out the problem. Should've just googled that error message in the first place, but no, that would've been far too quick and easy. The problem is a bug in hdiutil and OSX 10.3.9 that prevents creating an asr restore image of disk images over 8 GB. So right this very moment I'm making a lean OSX 10.3.2 boot disk on a 6 GB firewire drive I have lying around. I'll boot off that drive to make my restore image. Hopefully, that will do the trick and I can move on.

Hmph! I knew there was a reason I didn't want to upgrade to 10.3.9. And there it is.


Um... WTF?

Tiger Lab Migration Part 1: Introduction

First, a very little about me: I run a Mac lab at an art school. I am in charge of about 30 Macs that are used for a huge variety of things, including office work, graphics, video, interactive authoring, web authoring, the teaching of these topics, and just about anything else you can think of. I manage multiple hardware configurations in an extremely heterogeneous network that consists of Windows and Linux machines in addition to my Macs. I run two primary servers: a Quicktime Streaming Server, and a Macserver that handles login authentication and other network services (including print services and preference management and maybe a few other things I'm forgetting offhand). I also run a test server or two, and soon I'll be running (if all goes well) an LDAP replica for my Macserver.

Phew! Okay...

As the administrator of this lab, I am constantly being asked if I will be upgrading to the latest, greatest Mac OS. This Summer is no different. And I just want to outline, if only for myself, some of the pros and cons, as well as how I might proceed, in doing so this year.

First let me say, the release date of Tiger affords me the perfect opportunity to upgrade. Previous revisions of the Mac OS always seemed to come at odd times, often in the middle of the semester, and made it inconvenient if not downright stupid to upgrade immediately. Usually what I would do in these instances was test the new OS and then, depending on how important the upgrade was and how smoothly the transition could be made, either upgrade between semesters or wait until Summer. Summer, you see, gives me a full three months to plan, test, and implement an OS upgrade. So I'll say right here and now that the release date of Tiger makes it almost certain that I will upgrade this Summer. Tiger is complex enough that it's not something I'd like to attempt between semesters, when my time is much more limited. Waiting would likely postpone an upgrade to next year, and that's unacceptable to me, and would probably become unacceptable to students and faculty in the near future, when the real Tiger benefits start making themselves more apparent.

What I mean by this is that, from everything I've read, a lot of the really good, juicy, exciting changes to Tiger -- the things that will really be a boon to users -- are low-level. For instance, the offloading of many tasks to the graphics card. While these changes might not be readily obvious, or even useful, at the moment, when new apps like Final Cut and Motion begin shipping, I think we'll see some really good reasons to upgrade, and I'll be kicking myself if I haven't. Worse, my students will be kicking me. (Yowch! That's a lot of kicking.) I suspect Tiger will be really good for video. And we do a lot of video on our Macs. With the FCP Suite soon to hit the streets, I've one more big reason to upgrade.

The low-level changes, however, are also what make this, in my mind, such a challenging update. I've installed Tiger on my Powerbook for testing using the "Archive and Install" feature, and I must say, I've been underwhelmed with the performance of the new system. Most people are claiming that Tiger brings with it performance gains, but I've not seen them. If anything, my Powerbook seems a bit slower than it was on Panther. Mind you, I haven't done a clean install on that machine since I bought it (three years ago), so there's probably plenty of old junk that needs cleaning out (like, maybe, all those old, non-binary preference files) and I can't help wondering if an "Erase and Install" would have been the better route. Nevertheless, I may prove far too lazy to ever attempt such a process, as re-installing all my apps would take days, and I just don't have days.

Enter The Lab.

The great thing about working in education (or one of them anyway) is that I get to test and try stuff out on someone else's hardware, and I have a lot of machines at my disposal upon which to do so. The rationale for all this is that I will then implement, based on these tests, and build a productive and efficient lab for all to use. Ideally, the whole lab benefits from my test experiences. So, though I may not be inclined to Erase and Install on my Powerbook, or test generally on my personal computers, The Lab is the perfect place to try such things.

The thing about Tiger is that it's complex. It's a beast. There's a lot of stuff that I've implemented in Panther that will break instantly, and a great deal more that will need serious modification and coordination. I use rsyncx for backups of staff machines, for instance. What is the best way to seamlessly preserve that functionality given the facts that A) Tiger introduces a whole new, resource-fork-aware, version of rsync, B) the staff machines will probably be among the last to get upgraded, C) my machine, which performs the backups, will be the first to receive the upgrade, and D) the versions of rsync must match between client and server? It's a chicken-or-egg problem that will require a hack to work around, but it's do-able. But what about my Macserver? What will be the interaction there? Should I upgrade the server first and then the clients, or vice-versa? (I still have not received my copy of Tiger Server, BTW, so that pretty much answers that question.) There are a whole host of questions and problems like these that will require some serious testing and planning.
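Taking point D at face value, the check I need before trusting a backup could be sketched in Python like this. It only parses the text that `rsync --version` prints on its first line; the function names are hypothetical, the banner strings in the example are made-up samples rather than the actual versions on my machines, and I'm assuming rsyncx prints a banner in the same general shape.

```python
import re

def parse_rsync_version(banner):
    """Extract a version string such as '2.6.3' from `rsync --version` output."""
    match = re.search(r"version\s+(\d+(?:\.\d+)*)", banner)
    return match.group(1) if match else None

def versions_match(client_banner, server_banner):
    """Return True only if both banners parse and report the same version."""
    client = parse_rsync_version(client_banner)
    server = parse_rsync_version(server_banner)
    return client is not None and client == server

# Illustrative sample banners, not real output from my lab.
backup_machine = "rsync  version 2.6.3  protocol version 28"
staff_machine = "rsync  version 2.6.0  protocol version 27"

print(versions_match(backup_machine, staff_machine))
```

If that comes back false for a given staff machine, that machine's backups are the ones I'd need the hack for.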

So, at this point, I've pretty much decided on a first stage of my migration plan. And since Tiger is so complex, I've decided to shake off my complacency and implement this upgrade boldly, and I might add, from scratch. Yes, scratch.

Here's The Plan.

There are a few things I've been wanting to do that this Tiger-migration-from-scratch really gives me a good chance to do. The first thing is repartitioning. In the past I've partitioned my lab computer drives into two partitions: a SysApps partition that holds, what else, the system and application components, and a Work partition from which students can, yes, work. With the ever-increasing list of applications installed on my systems, and their ever-increasing size, my SysApps partition is getting a little cramped at 20 gigs. And with hard drives getting bigger all the time, and students primarily relying on their firewire drives or network storage at this point, it seems like a good time to repartition. I'm thinking that SysApps will grow to 50-75 GB in the new scheme.

The second thing I want to try is to implement Radmind to track and monitor system configurations and to apply updates over the network. This may be overkill given the size of my lab. Radmind is really meant for managers with hundreds of computers and far more resources than I have at my disposal. And it's really not necessary for a lab of thirty or so Macs. Still, I see an opportunity here to try something new and learn. But also, there may be a real advantage to implementing Radmind, for a lot of the reasons I complained about with regards to this and past upgrades. Seems to me like Radmind could effectively take some of the sting out of upgrades. Radmind gives the admin the ability to create sets of updates to both applications and the OS in what amount to layers on top of the base install. Ideally I want to create a system whereby updating the entire lab to a new OS revision is as easy as a few mouse clicks, so that after I've gone and tested Leopard, upgrading the lab will simply mean upgrading one machine and then porting those changes to the Macs in the lab. And, should there be a major problem I've overlooked, reverting will be just as fast and easy. Again, this is best handled from scratch, starting with a clean install of the OS for the base config.

Finally, if my Powerbook is any indication, I think that Tiger will be a much smoother upgrade going from scratch, and doing this gives me an opportunity to rethink, to some extent, my current implementation of the Mac lab. Things are working pretty well, but they could always be better.

Where to begin?

The lab admin is always Guinea Pig Number One, so I'll be starting, of course, with my machine. Yup, that's right. Today is the day. I'm wiping my admin box and installing Tiger. Needless to say, I will be backing up my Panther system with Mike Bombich's wonderful Carbon Copy Cloner, just in case I need to get back to a fully working system. If I could, I'd clone a working copy to a spare firewire drive and boot off that if I needed to, but alas, the downside to the smaller educational environment is that you don't have endless hardware resources, and so, no such drive is available to me now. I'm going to have to really take the plunge here, making a clone image for backup, and if necessary, restoring over my fresh Tiger install. This will be good motivation, however, to stay with Tiger unless it's absolutely imperative I revert.

After wiping and installing Tiger, I'll need to get backups online as fast as possible. This is my first priority. I will need to continue to use rsyncx as a stop-gap until I can get staff machines upgraded. Upgrading staff machines will be the most problematic and scary, as these machines are in continual use throughout the Summer, and as they have data and applications that absolutely must be preserved and in working order as fast and as seamlessly as possible (yet another reason to make sure my backups are working before proceeding). That being the case, I may opt to use the "Archive and Install" method on those systems, but that decision can wait a bit. One potential major fly in the ointment is the possibility that rsyncx won't work in Tiger. If that's the case, I will need to upgrade my staff machines sooner (read: much sooner) than later. Like fast!

Next, I should probably start installing applications, just to get back up to speed. But this introduces an interesting problem: Perhaps my next step should be to install Radmind, test it, and then use my machine as the base config. My machine is different than all the other lab machines, though, so actually it might be unwise to use it as a model upon which to base all the other lab Macs. Using an admin machine for this model also seems like a bit of a security risk, so I probably won't begin with Radmind on my system. I think the best way to proceed will be to get my apps installed and get my system to a useful state again, then start building my Radmind server, modeling the base config on a lab workstation, which will also need to be built from scratch. This will require a lot of extra work, but I believe it's the best way to go. In any case, this is a really good example of the kind of logistical problems involved in a major upgrade such as this.

Once all my important apps are installed and my machine is doing the things I need it to do for work, then I can begin hammering Tiger for faults and problems some more. At this point I'm pretty familiar with all the changes, so this stage shouldn't take too long. The trickiest part will be getting things like cron set up again and porting over all my Startup Items from my backup. (I have some custom Startup Items that do lab-specific things that I'd really rather not delve into here. Suffice to say, these things, too, will likely require some testing and tweaking.)

When everything on my system is back up to speed and working to my general satisfaction, it will be time to test how Tiger client interacts with Panther Server. That is, can I still log in as a network user? Do print services still work? What about managed preferences? If the answer to any of these questions is "no," then I need to figure out the best way to migrate my Macserver to Tiger before updating the lab's workstations (at which point you can expect another article). If the Tiger workstations can be peacefully managed from the Panther Macserver, I can wait a bit to upgrade the server. This would be optimal, and there is a good chance that this will be the case. I've already tested logins, and I know they work. Hopefully the rest works too, with minimal changes, as I'd like to (and may be forced to, as I have no idea when to expect my Tiger Server CD) update the server last.
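The decision above boils down to a simple rule, which I'll sketch in Python just to have it written down. The check names and the function names are my own labels, and apart from logins (which I've actually tested) the True/False values here are placeholders, not test results.

```python
def failing_services(results):
    """Return the names of the service checks that did not pass."""
    return [name for name, works in results.items() if not works]

def server_must_go_first(results):
    """If any check fails, the server has to be migrated before the workstations."""
    return len(failing_services(results)) > 0

# Placeholder results for a Tiger client talking to the Panther server.
checks = {
    "network login": True,        # already tested, and it works
    "print services": True,       # placeholder, not yet tested
    "managed preferences": True,  # placeholder, not yet tested
}

if server_must_go_first(checks):
    print("Migrate the server first:", ", ".join(failing_services(checks)))
else:
    print("Clients first; the server can wait.")
```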

And just an aside on that last comment: The logic behind upgrading the server last, I realize, is a bit skewed. The main idea is that, if I upgrade the server first, it is more likely to break my clients than if I upgrade the clients first. Though this may not actually be the case, my primary concern here is user logins. If those break, I essentially have no lab. Other services I can live without, but user logins are paramount. And I already know they work, so it's a much safer bet to upgrade the clients first. Also, there is the plain fact that I have no Tiger Server CD and I want to get moving. Finally, doing the client side the way I am is a much bigger job than the server is likely to be, and I want to get started on the migration now. In a perfect world, I suppose I'd build a Tiger Server on a separate machine, and its client on yet another machine and do my test builds in an all-Tiger environment. Alas, that is just not possible for me at this time. I am forced to be a bit more messy than that. But this is the only way I can move forward right now. And it should be fine.

Okay, so Tiger is installed on my admin box, backups are going again, apps are up and running, admin stuff is generally working. Great. Now it's time to start building my Radmind server. (And let me just say that, before I proceed, I may clone this fresh, working version of Tiger to a disk image, just in case.) The Radmind setup process is twofold. For the first part, I need to install the entire Radmind package on my admin machine and set it up as a server. (I will not go into Radmind setup in this article, as I don't know it nearly well enough to describe it from memory, but perhaps in another article.) For the second part, I need to begin building my master client. Ideally, the master client will be a conglomerate of all the various configurations in the lab, built in stages. Each stage represents an installation set. This will be one heavy machine. Essentially we'll install everything we have on it. It will also take any new updates in the future. These updates will be tracked as well. Finally, sets of installs for various configurations of Macs can be built. For instance, some of our Macs run Motion. Motion will get installed on the master machine, and any machine that runs Motion will be put into the "Motion Computers" list, which defines what computers get what software/updates. To install Motion, or anytime it's updated, we install on or update the Master client and tell Radmind to update all the computers in "Motion Computers." Cool! So this Master client needs to get built first, and very carefully, tracking each change along the way with Radmind. It's probably a good idea to start with a list of the various system configurations in the lab, i.e. which systems get which software. This will eventually become the "Computers" list(s) in Radmind.
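Since I said I should start with a list of which systems get which software, here is how I picture that list, sketched in Python. Let me be loud about this: it is only a model of the idea as I understand it, not Radmind's actual file formats or commands, and the machine names and set names are invented.

```python
# Which named list each (hypothetical) machine belongs to.
computer_lists = {
    "All Computers": {"lab-01", "lab-02", "lab-03"},
    "Motion Computers": {"lab-02", "lab-03"},
}

# Which install sets each list receives, layered on top of the base install.
sets_for_list = {
    "All Computers": ["base-tiger"],
    "Motion Computers": ["motion"],
}

def machines_to_update(changed_set):
    """Return the machines that should receive a given install set."""
    hosts = set()
    for list_name, sets in sets_for_list.items():
        if changed_set in sets:
            hosts |= computer_lists[list_name]
    return sorted(hosts)

def sets_for_machine(host):
    """Return every install set a machine should carry, base layer first."""
    result = []
    for list_name, members in computer_lists.items():
        if host in members:
            for name in sets_for_list[list_name]:
                if name not in result:
                    result.append(name)
    return result

print(machines_to_update("motion"))
print(sets_for_machine("lab-02"))
```

So when Motion gets updated on the master client, only the machines in "Motion Computers" are touched, and every machine's full load is just the base layer plus whatever lists it belongs to.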

(By the way, you might be wondering why I don't manage all this from Tiger Server's "Software Update Server." Well, from my understanding, Tiger Server only updates Apple software on clients. Radmind can update anything. Plus, Tiger Server doesn't let you revert changes, which is one sweet feature of Radmind. Radmind is very complex, but vastly more powerful in what it does, as it is specialized for that application. Tiger Server is really a different beast that will probably never be able to do everything Radmind does. If it ever can, it will probably be because Apple has bundled Radmind with Tiger Server. Which I could see happening someday, actually.)

Okay, this is where things get a bit hazy. But that's alright, because this stage won't be happening for at least another few weeks, and by then I'll have a lot more of the information gaps filled in my roadmap. The final stage will be to wipe all the machines on the floor, repartition them, install Tiger on them, and then install Radmind on them. Actually, I'll probably just build one general system, set it up as a Radmind client, and clone it to my remaining systems. Once all that is done, it's time to whip out my Radmind. At this point it should just be a matter of designing sets in Radmind for each hardware/software configuration in the lab, and then simply telling Radmind to setup the systems. The staff machines can get updated then too, or along the way (though I may or may not manage them with Radmind -- staff machines are still up in the air at this point, and if I do use Radmind on them, I may make a completely new Staff Master config). And then I can turn my attention to the server. But that shouldn't be too bad, right? (I can't believe I just said that!) The major hurdle is the lab migration and Radmind implementation. Once that's under way, life should be pretty good.

Sound easy? My life should be so easy.

I'll keep you posted.

Another Tiger Bug

Tiger has some genuine downright bugs. I guess this is to be expected, but man it does get old. The Panther upgrade was extremely smooth compared to this. Here's another honest-to-goodness bug I've found:
Under the Accounts Preference Pane you are no longer able to drag Login Items to define their startup order. According to an Apple Tiger support page you should be able to do this. Which in my mind means it's a bug.

Neat-O.