UNIX as Literature or: Why Visual Artists Suck at Shell Scripting

I just read a terrific article entitled "The Elements Of Style: UNIX As Literature" by "feral information theorist" Thomas Scoville. In it, Scoville proposes that UNIX has a certain appeal to literary types because of its reliance on text rather than image, and that the GUI has really been successful because we live in an increasingly visually oriented culture.

This article got me thinking about a few things. The first is that I wonder if the reverse of his premise might also apply to my situation. Since the advent of the UNIX-based Mac OS, I've been forced to use the UNIX command-line. Perhaps not coincidentally, I've been writing more. I write this blog. I write more instructions for my users. And I've become a much better and faster typist (I can almost touch type now). I wonder if there is a relationship between my increasing reliance on, and fondness for, UNIX and the fact that I now write a great deal about technology. I wonder if it's possible that the former has influenced the latter. I wonder if learning UNIX has fundamentally changed my brain in some small but important ways. Is UNIX responsible for The Adventures of Systems Boy?

The second thing I got thinking about is the idea that, if Scoville's theory holds, visual artists would tend to be pretty lousy at, and generally uninterested in, the command-line. This affects me in a number of ways. For one, I am a visual artist. I used to paint and now I make videos. But I've also always had a strong technical side to my personality and interests, though it took me quite some time to actually warm to UNIX. (I remember my first foreach loop. That was the defining moment for me, really, as there was no way to do anything like that in the GUI.) For two, many of my freelance clients are visual artists. Most of them are fairly clueless as to the intricacies of systems or network setups, which I guess is why they need me. And lastly, I work in an art school and am surrounded by would-be visual artists. I've often lamented the inability of many of my students to grasp fundamental technical concepts, and as I've grown more dependent upon UNIX, I've found it frustrating trying to teach people basic command-line skills. Artists, by and large, just plain tend to suck at shell scripting. I've usually written this off as people just being generally uninterested in such arcane and sometimes difficult tools as UNIX (as I was myself, initially). But Scoville's theory suggests that visual artists in particular may be especially immune to the charms of the command-line. They just might not be wired for it. I'll have to factor this into my thinking the next time I want to throttle someone for making me describe, for the umpteenth time, the syntax of ls. It's just not their fault.

The last thing I got to thinking about was Mac OS X. The real beauty of Mac OS X, in my mind, has always been that it unites two distinctly different ways of thinking: the visual and the linguistic. Mac OS X employs a visually beautiful GUI over a UNIX foundation. It's the best of both worlds in one system, equally appealing to both the aesthetes and the geeks. This is truly remarkable, and goes one more step in explaining why it's my operating system of choice. For someone like me — a visual artist with strong technical leanings who works with a multitude of other artists — Mac OS X represents the ideal operating system. And whether people realize it or not, I think this has at least something to do with its broader appeal.

I also think I may just have made the perfect career choice when I took a job as a SysAdmin in an art school.

External Network Unification Part 3: Migrating to Joomla

When last we visited this issue I had just gotten the venerable Joomla CMS to authenticate to our LDAP server. I decided to build a replica of our existing CMS, which is based on the Mambo core, and do some testing to see how easy it would be to port to our LDAP-savvy Joomla core. The Joomla site gives instructions for doing this, and frankly it sounded drop-dead simple.

It turns out it is drop-dead simple. The hard part is successfully replicating the existing Mambo site and all its requisite databases and components. To do this, I first copied all the files over to the test server. This is easy of course. Then I had to get the MySQL databases over. This was a bit more challenging. Using the command mysqldump was the way to go, but I encountered numerous errors when attempting this with the standard options. After some research I discovered that I needed to apply the --skip-opt option to the command. My final command looked something like this:

mysqldump --skip-opt -u mambosql -p -B TheDatabase > TheDatabase.bak.sql

I honestly don't remember why the --skip-opt flag was necessary, or even if it was the right approach, only that it seemed to do the trick: the dump completed without errors. So I copied the database over and set everything up on my test server exactly as it was on the original server, putting the Mambo site in the proper web root and importing the databases on the test system. After some fidgeting — specifically, making sure the Mambo config file was edited to use the new server — I was able to get the test site working. The only problem was (and still is) that the admin pages don't work. No matter what I do, I can't log in and I'm told that my username and password are wrong, though they work on the front end. I suspect a problem with my dump. It's also possible that the admin pages require a different user — one I'm unaware of — than the front end does. Since I didn't build the original server, I can't be sure. But whatever.
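Incidentally, moving the dump over and importing it on the test server was the easy part. It looked something like this (the host and account names here are placeholders, not the real ones):

scp TheDatabase.bak.sql admin@testserver:
mysql -u root -p < TheDatabase.bak.sql

Because the dump was made with the -B flag, it includes the CREATE DATABASE and USE statements, so there's no need to create the database by hand before importing.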

The next part of this test was to try and port the Mambo install to the Joomla engine with the LDAP hack enabled. This turned out to be fairly straightforward: Install and configure Joomla (v.1.0.8 — later versions do not work with the LDAP hack) to authenticate to LDAP; copy over all the custom Mambo files to the new Joomla site (without overwriting any Joomla stuff); copy the Mambo config file over and edit it for the new site root; trash the "Installation" folder (we won't be needing it); and that was it. My old Mambo site was now running on an LDAP-enabled Joomla engine.
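Boiled down to shell commands, the whole port looks roughly like this. The paths are hypothetical stand-ins for the real web roots, and rsync's --ignore-existing flag is simply one way to copy the Mambo files without clobbering anything Joomla has already put in place:

# copy the custom Mambo files into the Joomla root without overwriting any Joomla files
rsync -av --ignore-existing /path/to/mambo/ /path/to/joomla/

# bring over the Mambo config file, then edit its paths and URLs for the new site root
cp /path/to/mambo/configuration.php /path/to/joomla/configuration.php

# trash the installation folder (we won't be needing it)
rm -rf /path/to/joomla/installation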

There were some major snags here though. Because I could not get into the admin pages (a problem that persisted even with the new Joomla engine), I could not configure user authentication. I was able to directly access the MySQL database, however, with phpMyAdmin. Here I was able to edit my user account to use LDAP rather than using the password stored in the MySQL database by entering "@LDAP" into the password field. This worked well in fact.
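If you prefer the command-line to phpMyAdmin, the same edit boils down to a one-line SQL update. Something like this, assuming the default Mambo "mos_" table prefix (yours may differ) and substituting a real username:

mysql -u root -p TheDatabase -e "UPDATE mos_users SET password = '@LDAP' WHERE username = 'someuser';"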

One feature, however — automatic user creation — did not work so well. That is, when a user who doesn't yet exist in the MySQL database, but does exist on the LDAP server, logs in, the LDAP hack creates that user in the MySQL database with a flag that says, "Get this user's password from the LDAP server." Logging in as a new user on my test Joomla server produced erratic results. I'm assuming this had something to do with my lack of access to the admin pages.

Still, we've accomplished some things here. For one, we've figured out a method for porting our current Mambo CMS to an LDAP-enabled Joomla engine. Secondly, we've shown, at least in theory, that this system can work with LDAP. The next step will be to try all this out on a copy of our live Mambo CMS on the actual web server. Hopefully, when we do that, access to the admin pages will function normally and the LDAP hack can be configured so that new users are properly added at login. If all goes well, our CMS will be authenticating to LDAP in the next post in this series.

If all goes well.

Remote Network (and More!) Management via the Command-Line

There's a lot you can do in the Terminal in Mac OS X. It's often the most expedient method of achieving a given task. But I always get stuck when it comes time to make certain network settings via the command-line. For instance, in the lab all our machines are on Built-In Ethernet, and all are configured manually with IP addresses, subnet masks, routers, DNS servers and search domains. But sometimes we need to make a change. While I suppose it's easy enough to go over to the computer, log in, open the System Preferences, click on the Network pane, and configure it by hand in the GUI, this becomes a problem when you have to interrupt a staff member busily working on whatever tasks staff members tend to busily work on. Also, it doesn't scale: If I have to make a change on all 25 Macs in the lab, the above process becomes tedious and error-prone, and exponentially so as we add systems to the network. When performing such operations on multiple computers, I generally turn to Apple's wonderful Remote Desktop. But ARD (I'm still using version 2.2) lacks the ability to change network settings from the GUI. Fortunately, there is a way.

ARD comes with command-line tools. They're buried deep in the ARD application package, for some reason, making them a serious pain to get at, but they're there and they're handy as hell in situations like the above. You can use these tools to set up virtually anything normally accessible via the System Preferences GUI. Including network settings. There are two main commands for doing this:
systemsetup for making general system settings, and networksetup for making network settings.

Both commands are located in the folder:

/System/Library/CoreServices/RemoteManagement/ARDAgent.app/Contents/Support

You'll need to type the full path to use the commands, or cd into the directory and call them thusly:

./networksetup

Or you can do what I do and add an alias to them in your .bash_profile:

alias networksetup='sudo /System/Library/CoreServices/RemoteManagement/ARDAgent.app/Contents/Support/networksetup'
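A matching alias for systemsetup follows the same pattern:

alias systemsetup='sudo /System/Library/CoreServices/RemoteManagement/ARDAgent.app/Contents/Support/systemsetup'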

There are no man pages for these commands, but to get an idea of what they do and how they work, just run them without any arguments and they'll print a list of functions. Or you can run the commands with the -printcommands flag to get an abbreviated list of functions. There is also a PDF reference online. The syntax for these tools is not always straightforward, but by and large looks something like this:

networksetup -option <"network service"> <parameter 1> <parameter 2> <etc...>

For "network service" you'll want to enter the name of the interface as it's called in the preference pane, surrounded by quotes. You also need to run these commands as root. Here's an example that makes network settings (IP address, subnet mask and router, respectively) for Built-In Ethernet:

sudo networksetup -setmanual "Built-In Ethernet" 192.168.1.100 255.255.255.0 192.168.1.1
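Most of the "set" commands also have "get" counterparts. If your copy lists a -getinfo command in its -printcommands output, you can use it to confirm the change took:

sudo networksetup -getinfo "Built-In Ethernet"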

Here's an example for setting multiple search domains:

sudo networksetup -setsearchdomains "Built-In Ethernet" systemsboy.blog systemsboy.blog.net
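Setting DNS servers works the same way, assuming your version of the tool includes the -setdnsservers command (the addresses below are just examples):

sudo networksetup -setdnsservers "Built-In Ethernet" 192.168.1.53 192.168.1.54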

Why these tools aren't included in the GUI version of ARD is almost as confounding as the fact that they're buried deep within the guts of the app. The wordy syntax doesn't help either (I'd much prefer typing "en0" to "Built-In Ethernet" any day). But despite being difficult to access, I've found these commands to be unbelievably useful from time to time. And though I certainly do hope to edify TASB readers as to the existence and use of these tools, I must confess, my main reason for this post is as a reminder to myself: I'm always forgetting where the damn things are and what the hell they're called.

So there you have it. Remote network (and more!) management via the command-line.

Please resume admin-ing.

UPDATE:
So a scenario occurred to me which might be a bit tricky with regards to the above information: Say you want to change the router setting for your 25 lab Macs. Nothing else, just the router. The obvious way to do this would be to send a networksetup command to all of your Macs via ARD. The problem is that the networksetup command has no way of specifying only the router address without setting the IP address and subnet mask as well. The only way to set the router address is with the -setmanual flag, and this flag requires that you supply an IP address, subnet mask and router address, in that order:

sudo networksetup -setmanual "Built-In Ethernet" 192.168.1.100 255.255.255.0 192.168.1.1

If you send this command to every Mac in your lab, they'll all end up with the same IP address. This will wreak havoc in numerous ways, the worst of which will be the fact that you now have lost all network contact with said lab Macs and will have to go and reset all the IP addresses by hand on each machine. The solution is to specify the current IP address on each machine, which you could do by sending an IP-specific version of the command to each machine individually. But that sort of defeats the purpose of using ARD in the first place. A better way is to substitute a variable in place of the IP address option in the above command. The variable will get the IP address of the machine it's run on. This is what I use:

ifconfig -m en0 | grep "inet " | awk '{print $2}'

This command sequence will get the IP address for the en0 interface, which is what Built-In Ethernet uses. If your machines are on wireless, you can use en1 in place of en0:

ifconfig -m en1 | grep "inet " | awk '{print $2}'

So, instead of entering the IP address in the networksetup command, we'll enter this command sequence in backticks. The command we'd send to our lab Macs would look something like this:

sudo networksetup -setmanual "Built-In Ethernet" `ifconfig -m en0 | grep "inet " | awk '{print $2}'` 255.255.255.0 192.168.1.1

This will set the subnet mask and router address — both of which would generally be consistent across machines on any given network — but will leave the unique IP address of each machine intact. You can try this on your own system first to make sure it works. Open the Network System Preference pane, and then run the command in Terminal. You'll see the changes instantly in the preferences pane, which will even complain that an external source has changed your settings. If all goes well, though, your IP address will remain the same while the other settings are changed. You should now be able to safely change the router address of all your machines using this command in ARD (remember to specify the full command path though). Save that sucker as a "Saved Task" and you can do this any time you need to with ease.
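For reference, the full command as pasted into ARD's "Send UNIX Command" window (sent as the root user, so no sudo is needed) would look something like this, all on one line:

/System/Library/CoreServices/RemoteManagement/ARDAgent.app/Contents/Support/networksetup -setmanual "Built-In Ethernet" `ifconfig -m en0 | grep "inet " | awk '{print $2}'` 255.255.255.0 192.168.1.1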

Huzzah!

On Backups

Let me just say up front, historically I've been terrible about backing up my data. But I'm working on it.

As far as backups go, I've tried a lot of things. I am responsible for backups of staff data at work, and here is where the bulk of my trials have occurred. For my personal data I've always just archived things to CD or DVD as my drive got full, or as certain projects wrapped up, but I've never had any sort of emergency backup in case of something like a drive failure or other catastrophe. Both at home and at work, though, the main problem I've faced has been the ever-expanding amount of data I need to back up. Combined staff data typically takes up a few hundred gigabytes of disk space. And at home my Work partition (I store all user data on a partition separate from the System) currently uses 111 GB. This does not even take into account the multiple firewire drives attached to my system at any given time. All told, we're talking several hundred gigabytes of data on my home system alone. I don't know what "the best" way is to back all this up, but I think I have a pretty good solution both at home and at work.

The Olden Days
Back in the day, staff backups were performed with Retrospect to a SCSI DAT drive. This was in the OS 9 days. The tapes each held about 60 GB, if memory serves, and this worked fine for a while. But with the high price of tapes, a limited tape budget, and ever-increasing storage needs, the Retrospect-to-tape route quickly became outmoded for me. It became a very common occurrence for me to come in on any given morning only to find that Retrospect had not completed a backup and was requesting additional tapes. Tapes which I did not have, nor could I afford to buy. Retrieving data from these tapes was also not always easy, and certainly never fast. And each year these problems grew worse as drive capacities increased, staff data grew and tape capacities for our $3000 tape drive remained the same. The tape solution just didn't scale.

Enter Mac OS X
When Mac OS X arrived on the scene, I immediately recognized the opportunity — and the need — to revise our staff backup system. First off, Retrospect support was incredibly weak for OS X in those early days. Second, even when it did get better, there continued to be many software and kernel extension problems. Third, SCSI — which most tape drives continue to use to this day — was on the way out, annoying as hell, and barely supported in OS X. Fourth, the tape capacity issue remained. On the other hand, from what I was reading, Mac OS X's UNIX underpinnings would provide what sounded like a free alternative, at least on the software side: rsync. My two-pronged revision of our backup system consisted of replacing Retrospect with rsync and replacing tape drives with ever-cheaper, ever-larger hard drives.

RsyncX
The only problem with the UNIX rsync was that it famously failed to handle HFS+ resource forks (as did, incidentally, Retrospect at the outset). This situation was quickly remedied by the open source community with the wonderful RsyncX. RsyncX is a GUI wrapper around a version of rsync that is identical in most respects to the original UNIX version except that it is capable of handling resource forks. Once I discovered RsyncX, I was off to the races, and I haven't found anything to date — including the Tiger version of rsync — that does what I want better.

My Process
These days I do regular, weekly staff backups using RsyncX over SSH to a firewire drive. For my personal data, I RsyncX locally to a spare drive. This is the most economical and reliable data backup solution I've found, and it's far more scalable than tape or optical media. It's also been effective. I've been able to recover data on numerous occasions for various staff members.

My system is not perfect, but here's what I do: Every day I use RsyncX to perform an incremental backup to an external hard drive. Incremental backups only copy the changes from source to target (so they're very fast), but any data that has been deleted from the source since the last backup remains on the target. So each day, all new files are appended to the backup, and any changes to files are propagated to said backup, but any files I've deleted will remain backed up. Just in case. Eventually, as I'm sure you've guessed, the data on my backup drive will start to get very large. So, at the end of each month (or as needed) I perform a mirror backup, which deletes on the target any file not found on the source, essentially creating an exact duplicate of the source. This is all run via shell scripts and automated with cron. Finally, every few months or so (okay, more like every year), I backup data that I want as part of my permanent archive — completed projects, email and what not — to optical media. I catalog this permanent archive using the excellent CDFinder.
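For the curious, the guts of those shell scripts are little more than a couple of rsync calls and a pair of crontab entries. Stripped down, with hypothetical paths, volume names and schedule times, they look something like this (the --eahfs flag is what the RsyncX build of rsync uses to preserve HFS+ resource forks):

# daily incremental: copy new and changed files, but never delete anything on the backup
/usr/local/bin/rsync -a --eahfs /Volumes/Work/ /Volumes/Backup/Work/

# monthly mirror: make the backup an exact duplicate, deleting anything gone from the source
/usr/local/bin/rsync -a --eahfs --delete /Volumes/Work/ /Volumes/Backup/Work/

# staff data, pulled over SSH from the server to a firewire drive
/usr/local/bin/rsync -az --eahfs -e ssh admin@staffserver:/Users/ /Volumes/StaffBackup/Users/

# crontab: incremental every night at 2 AM, mirror on the first of the month at 3 AM
0 2 * * * /Users/admin/bin/incremental-backup.sh
0 3 1 * * /Users/admin/bin/mirror-backup.sh

The only real difference between the incremental run and the mirror run is the --delete flag.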

Almost Perfect
There are some obvious holes in this system, though: What if I need to revert to a previous version of a file? What if I need a deleted file and I've just performed the mirror backup? Yes. I've thought about all of this. Ideally this would be addressed by having a third hard drive and staggering backups between the two backup drives. A scenario like this would allow me to always have a few weeks worth of previous versions of my data, while still allowing me to keep current backups as well. Alas, while I have the plan, I don't have the drives. Maybe someday. But for now this setup works fine for most of my needs and protects me and the staff against the most catastrophic of situations.

Consider Your Needs
Still, when devising a backup scheme, it's important to understand exactly what you need backups to do. Each situation presents a unique problem and has a unique set of requirements. Do you need a permanent, historical archive that's always available? Or do you simply need short-term emergency backup? Do you need versioning? What data needs to be backed up and what doesn't? For my needs, previous versions are less important; emergency backups are critical. You also need to consider how much data you have and what medium is most appropriate for storing it, with an eye toward the future. In my case I have a lot of data, and I always will. Hard drives are the most economical way for me to store my large backups — as data needs grow, so too do drive capacities — and they are also the most future-proof. In a few years we may not be using DVDs anymore, or tapes. But drives will be around in some form or another for the foreseeable future, and they'll continue to get bigger and bigger. And since I'm not so much worried about having a permanent archive of my backup data (except in the case of data archived to optical media), I can continually and easily upgrade my storage by either purchasing new drives every so often, or by adding additional drives as needed. And transferring the data to new media — to these new drives — will be faster than with any other media (tape and optical media are slow). This system scales. And while it may be less reliable over the long term than optical or tape, it's plenty reliable for our needs and easily upgradeable in the future.

Lately everyone seems to be talking about backup solutions. Mark Pilgrim recently wrote an intriguing post asking how to archive vast amounts of data over the next 50 years. I don't think there's an easy answer there, and my solution would not help him one bit. But it did inspire me to share my thoughts on the matter of backups, and my own personal system. It's certainly not the be-all-end-all of backup systems, and if others have thoughts on this complex and important topic, feel free to post them in the comments. I'd be curious to hear.

Three Platforms, One Server Part 8: A Minor Snafu

So far we've hit only one very minor snag in our migration to a single, unified authentication server for Mac, Windows and Linux. Since Mac and Linux behave so similarly with regards to authentication — in fact, I'd say they're practically identical — and since Windows is so utterly, infuriatingly different, you can expect most of our problems to happen on the Windows side. At this point it should go without saying. This latest snag is no exception.

Today we discovered that applications to be used by network users on Windows machines must be installed by a network user, and this network user must be a domain admin. Put another way, if a Windows application is installed by a local administrator, it will not be fully accessible by users who authenticate via the domain hosted by the Authentication Server. Typical.

The solution is fairly easy, albeit kind of a pain. On each client workstation, you must give the network admin user (i.e., a user whose account exists only on the authentication server) full access to the "C" drive. You heard me: On each computer. Giving full access to a user on Windows requires logging in as a local admin and granting the network user full access rights. Windows then changes the access permissions on every file on the machine, which takes a long-ass time. You heard me: long-ass.
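Incidentally, if you'd rather not click through the Security dialogs on every machine, the same grant can in theory be scripted from a Command Prompt with Windows' cacls tool, something like this (where DOMAIN\netadmin stands in for your network admin account):

cacls C:\ /T /E /G DOMAIN\netadmin:F

It still takes a long-ass time either way, of course, since Windows still has to touch every file on the drive.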

If we had to do this on every machine by hand, I'd be grumpy about it (but I'd do it anyway). Fortunately, we'll be building our Windows boxes from clones, which means we only have to do this once on a master build. The change should propagate via the clones after that. So I'm a pretty happy camper. And we're back on track.

Still, Windows, what the fuck?

And just to be clear on this, I honestly don't know enough about Windows or Active Directory to understand why this is happening. All I know is what we've observed, which is that Windows apps fail to run properly for network users when installed by local users. We tried matching the users and groups we'd had on the old Windows Server, but that did no good. Setting full access locally for a networked user was the only thing we could find that worked. It's very possible, however, that there's a better solution than the one we're using. If anyone can enlighten me, I'm all ears.

Oh, and yes, our network user has full access privileges for the directory domain. Confused? Me too.

More to come...

UPDATE: I found a better solution. A network user can be added to the "Administrators" group via the Users and Groups control panel. Doing so is not particularly intuitive, but it works and saves us the trouble of having to modify permissions on the entire "C" drive.

Windows "Advanced" Users and Groups: By "Advanced" I Think They Mean "Annoying"

It essentially requires logging in as a local Administrator, navigating to the "Advanced" window in Users and Groups, and then adding the user to the "Administrators" group.


Group Properties: Do We Really Need This Window?

Windows doesn't see a network user unless she's logged in, so you have to know how to enter the network user by name. It should look something like this: DOMAIN\user, where "DOMAIN" is your Active Directory domain and "user" is the user name.


Add the User Here: Who'd a Thunk It?

You can find more complete instructions here, which is where I got these images.
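For what it's worth, the same change can also be made without all the window-hopping, from a Command Prompt run as a local admin (substitute your own domain and user name, naturally):

net localgroup Administrators DOMAIN\user /add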

Fun stuff.