External Network Unification Part 1: Research and Development

"Systems Boy! Where are you?"

I realize posting has been lean, lo these past couple weeks. This seems to be a trend in the Mac-blog world. I think some of this has to do with the recent dearth of interesting Mac-related news. In my case, however, it's also due to a shocking lack of time brought on by my latest project.

You may or may not remember, or for that matter even care, about my ongoing Three Platforms, One Server series, which deals with my efforts to unify network user authentication across Mac, Windows and Linux systems in my lab. Well, that project is all but done, except for the implementation, which we won't really get to do until the Summer, when students are sparse, time is plentiful, and love is in the air. (Sorry, but if you manage a student lab, you'll probably understand how I might romanticize the Summer months a bit.) Anyway, we've got our master plan for user authentication on our internal network pretty much down, so I've turned my attention to the external network, which is what I've been sweatily working on for the last two weeks.

Our external network (which, for the record, has only recently come under my purview) is made up of a number of servers and web apps to which our users have varying degrees of access. Currently it includes:

  1. A mail server
  2. A web host and file server
  3. A QuickTime Streaming Server
  4. A community site built on the Mambo CMS
  5. An online computer reservations system

Beyond these five systems, additional online resources are being proposed. The problem with the way all this works right now is that, as with our internal network, each of these servers and web apps relies on its own separate and distinct database of users for authentication. This is bad for a number of reasons:

  1. Each new user has to be created on five different systems, which is far more time-consuming and error-prone than it should be
  2. Users cannot easily change their passwords across all systems
  3. The system is not in any way scalable because adding new web apps means adding new databases, which compounds the above problems
  4. Users often find this Byzantine system confusing and difficult to use, so they use it less and get less out of it

The goal here, obviously, is to unify our user database and thereby greatly simplify the operation, maintenance, usability and scalability of this system. There are a number of roadblocks and issues here that don't exist on the internal network:

  1. There are many more servers to unify
  2. Some of the web apps we use are MySQL/PHP implementations, which is technology I don't currently know well at all
  3. Security is a much bigger concern
  4. There is no one on staff, myself included (although I'm getting there), with a thorough, global understanding of how this should be implemented; meanwhile, these servers, databases and web apps are maintained and operated by many different people on staff, each with a different level of understanding of the problem
  5. All of these systems have been built piecemeal over the years by several different people, many of whom are no longer around, so we also don't completely understand quite how things are working now

All of these issues have led me down the path upon which I currently find myself. First and foremost, an overarching plan was needed. What I've decided on, so far, is this:

  1. The user database should be an LDAP server running some form of BSD, which should be able to host user info for our servers without too much trouble
  2. The web apps can employ whatever database system we want, so long as that system can get user information from LDAP; right now we're still thinking along the lines of MySQL and PHP, but really it doesn't matter as long as it can consult LDAP
  3. Non-user data (computers or equipment, for instance) can be held in MySQL (or other) databases; our LDAP server need only be responsible for user data

That's the general plan. An LDAP server for hosting user data, and a set of web apps that rely on MySQL (or other) databases for web app-specific data, with the stipulation that these web apps must be able to use LDAP authentication. This, to me, sounds like it should scale quite well: Want to add a new web app? Fine. You can either add to the current MySQL database or, if necessary, build another database, so long as it gets its user data from LDAP; duplicating user data across databases is redundant and makes consistency nearly impossible to guarantee. It's important to remember that the real Holy Grail here is the LDAP connection. If we can crack that nut (and we have, to some extent) we're halfway home.

This plan is a good first step toward figuring out what we need to do in order to move forward with this in any kind of meaningful way. As I mentioned, one of the hurdles here is the fact that this whole thing involves a number of different staff members with various talents and skill sets, so I now at least have a clear, if general, map that I can give them, as well as a fairly clear picture in my mind of how this will ultimately be implemented. Coming up with a plan involved talking to a number of people, and trying out a bunch of things. Once I'd gathered enough information about who knew what and how I might best proceed, I started with what I knew, experimenting with a Mac OS X Server and some web apps I downloaded from the 'net. But I quickly realized that this wasn't going to cut it. If I'm going to essentially be the manager for this project, it's incumbent upon me to have a much better understanding of the underlying technologies, in particular: MySQL, PHP, Apache and BSD, none of which I'd had any experience with before two weeks ago.

So, to better understand the server technology behind all this, I've gone and built a FreeBSD server. On it I've installed MySQL, PHP and OpenLDAP. I've configured it as a web server running a MySQL database with a PHP-based front-end, a web app called MRBS. It took me a week, but I got it running, and I learned an incredible amount. I have not set up the LDAP database on that machine as yet, however. Learning LDAP will be a project unto itself, I suspect. To speed up the process of better understanding MySQL and PHP (and forgoing learning LDAP for the time being), I also installed MRBS on a Tiger Server with a bunch of LDAP users in the Open Directory database. MRBS is capable of authenticating to LDAP, and there's a lovely article at AFP548 that was immensely helpful in getting me started. After much trial and error I was able to get it to work. I now have a web application that stores its data in a MySQL database accessed via PHP, but that gets its user data from the LDAP database on the Tiger Server. I have a working model, and this is invaluable. For one, it gives me something concrete to show the other systems admins, something they can use as a foundation for this project, and a general guide for how things should be set up. For two, it gives us a good idea of how this all works, and something we can learn from and modify our own code with. A sort of Rosetta stone, if you will. And, finally, it proves that this whole undertaking is, indeed, quite possible.
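
For the curious, here is a sketch of the kind of settings involved in pointing MRBS at LDAP. This is not our production config: the variable names follow the MRBS configuration options as I understand them and may differ between versions, and the hostname and DNs are placeholders:

    <?php
    // config.inc.php (excerpt) -- a sketch, not our production config.
    // Variable names per the MRBS docs as I read them; all values are
    // placeholders for our real setup.

    // Authenticate users against LDAP instead of MRBS's own user table
    $auth["type"] = "ldap";

    // The Tiger Server hosting the Open Directory (LDAP) database
    $ldap_host = "odmaster.example.edu";
    $ldap_port = 389;
    $ldap_v3   = true;

    // Where user records live in the Open Directory LDAP tree; OD keeps
    // short names in the uid attribute
    $ldap_base_dn     = "cn=users,dc=example,dc=edu";
    $ldap_user_attrib = "uid";
    ?>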

So far, key things I've learned are:

  1. MySQL is a database (well, I knew that, but now I really know it)
  2. PHP is a scripting/programming language that can be used to access databases
  3. MySQL is not capable of accessing external authentication databases (like LDAP)
  4. PHP, however, does feature direct calls to LDAP, and can be used to authenticate to LDAP servers
  5. PHP will be the bridge between our MySQL-driven web apps and our LDAP user database (a sketch of the pattern follows below)
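
To make that last point concrete, here is a minimal sketch of the pattern. This is illustration, not our production code; the hostnames, DNs and database names are all placeholders:

    <?php
    // Sketch: PHP bridging an LDAP user database and a MySQL app database.
    // All hostnames, DNs and database names below are placeholders.

    function authenticate($username, $password) {
        // An empty password gets treated as an anonymous bind, so reject it
        if ($password == "") return false;

        $ldap = ldap_connect("odmaster.example.edu");
        if (!$ldap) return false;
        ldap_set_option($ldap, LDAP_OPT_PROTOCOL_VERSION, 3);

        // Bind as the user; a successful bind means the password checks out
        $dn = "uid=" . $username . ",cn=users,dc=example,dc=edu";
        $ok = @ldap_bind($ldap, $dn, $password);
        ldap_unbind($ldap);
        return $ok;
    }

    if (authenticate($user, $pass)) {
        // App-specific data still lives in MySQL; only auth touches LDAP
        $db = mysql_connect("localhost", "webapp", "secret");
        mysql_select_db("reservations", $db);
        $result = mysql_query("SELECT * FROM bookings", $db);
        // ... hand $result off to the rest of the web app ...
    }
    ?>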

So that is, if you've been wondering, what I've been doing and thinking about and working on for the past two weeks. Whew! It's been a lot of challenging but rewarding work.

This is actually a much bigger, much harder project than our internal network unification. For one, I'm dealing with technologies with which I'm largely unfamiliar and about which I must educate myself. For two, there are concerns -- security in particular -- which are much more important to consider on an external network. Third, there are a great many more databases and servers that need to be unified. Fourth, scalability is a huge issue, so the planning must be spot on. And lastly, this is a team effort. I can't do this all myself. So a lot of coordination among a number of our admins is required. In addition to being a big technical challenge for me personally, this is a managerial challenge as well. So far it's going really well, and I'm very lucky to have the support of my superiors as well as excellent co-systems administrators to work with. This project will take some time. But I really think it will ultimately be a worthwhile endeavor that makes life better for our student body, faculty, systems admins and administrative staff alike.

Notes on Live QuickTime Streaming

There are a few details to setting up a QuickTime Streaming Server for live broadcast that I always forget, particularly when it comes to authentication. I just went through the irritation of setting this up for the umpteenth time (it's an annual occasion), and thought I'd post some quick, pertinent notes on the process so that next year I can get the info here instead of scouring the 'net.

Quick Notes:

  • In order to do live streaming via QuickTime Broadcaster's Automatic Unicast, there must be a user or group authorized to do so in the qtusers or qtgroups files located in /Library/QuickTimeStreaming/Config/
  • Create users with the qtpasswd command (qtpasswd -h for help; qtpasswd username to set the password)
  • Any user in the qtusers file can do live streaming to any directory in the Movies folder that has a properly configured qtaccess file
  • The qtaccess file looks like this:

    <limit WRITE>
    require user streamuser
    </limit>
    require any-user

    where "streamuser" is the name of the user who is authorized to stream

  • Finally, the live streaming directory must be owned by, and readable and writable by, the user qtss
  • Live streaming can then occur by simply pointing QuickTime Broadcaster to the live streaming directory on the specified server with the username and password of the user specified in the qtaccess file (the commands below sum all this up)
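
Putting those notes together, the whole setup boils down to a few commands. This is a sketch that assumes the default QTSS paths; "streamuser" and "live" are placeholder names:

    # Create the streaming user (qtpasswd prompts for a password)
    sudo qtpasswd streamuser

    # Make a directory for the live stream and hand it to the qtss user
    sudo mkdir /Library/QuickTimeStreaming/Movies/live
    sudo chown qtss /Library/QuickTimeStreaming/Movies/live

    # Put the qtaccess file shown above into that directory
    sudo cp qtaccess /Library/QuickTimeStreaming/Movies/live/qtaccess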


QuickTime Broadcaster: Automatic Unicast Settings

That's it. Those are the sticking points for me every year. Hopefully, next year I can just check back here and get up and running in a few easy steps.

If not, here are some good links:
Webcast How-To
Authentication
Apple Developer

Re-Binding to a Mac Server

Last semester we had a lot of problems in the lab. Our main problems were due to two things: the migration to Tiger, and problems with our home account server. Our Tiger problems have largely gone away with the latest releases, and we've replaced our home account server with another machine. Aside from a minor hiccup here and there, things seem to have quieted down. The Macs are running well, and there hasn't been a server disconnect in some time. It's damn nice.

There has been one fairly minor lingering problem, however. For some reason our workstation Macs occasionally and randomly lose their connection to our authentication server — our Mac Server. When this happens, the most notable and problematic symptom is users' inability to log in. Any attempt at login is greeted with the login screen shuffle. You know, that thing where the login window shakes violently at the failed login attempt. This behavior is an indication that the system does not recognize either the user name or the password supplied, which makes sense, because when the binding to the authentication server is broken, for all intents and purposes, the user no longer exists on that system.

I've looked long and hard to find a reason for, and a solution to, this problem. I have yet to discover what causes the systems to become unbound from the server (though I'm starting to suspect some DNS funkiness, or anomalies in my LDAP database, as the root cause at this point). There is no pattern to it, and there is nothing helpful in the logs: only a message that the machine is unable to bind to the server -- and that only if it happens at boot; there's nothing about why, and nothing at all if it happens while the machine is running, which it sometimes does. It's a mystery. And until recently, the only fix I could come up with was to log on to the unbound machine and reset the server in the Directory Access application. Part of my research involved looking for a command-line way to do this so that I wouldn't have to log in and use the GUI every time this happened, as it happens fairly often, and the GUI method is slow and cumbersome, especially when you want to get that machine back online ASAP.

It took me a while, but I have found the magic command, at a site called MacHacks. Boy is it simple. You just have to restart DirectoryService:

sudo killall DirectoryService

This forces the computer to reload all the services listed in the Directory Access app, and rebind to any servers that have been set up for authentication. I've added the command to the crontab and set it to run every 30 minutes. That should alleviate the last of our lab problems.
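
For reference, the entry looks something like this. I'm showing the /etc/crontab format, which includes a user field; adjust accordingly if you use root's own crontab instead:

    # Rebind to the authentication server twice an hour
    0,30  *  *  *  *  root  /usr/bin/killall DirectoryService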

Hopefully the rest of this semester will go as smoothly as the past two weeks have. I could use a little less systems-related drama right now.

Three Platforms, One Server Part 4: Redundancy

One of the major hurdles in our server unification project, mentioned in Part 1 of this series, is that of redundancy. In the old paradigm, each platform's users were hosted by a separate server. Mac users authenticated to a Mac Server, Windows users to a Windows Server, and Linux users to an NIS server. While this is exactly what we're trying to avoid by hosting all users on a single server, it does have one advantage over this new approach: built-in redundancy. That is, if one of our authentication servers fails, only the users on the platform hosted by said server are affected. For example, if our Windows Server fails, Windows users cannot log in, but Mac users and Linux users can. In our new model, where all authentication for all platforms is hosted by a single server, if that server fails, no user can log in anywhere.

Servers are made to handle lots of different tasks and to keep running and doing their jobs under extreme conditions. To a certain extent, that is the very nature of being a server. To serve. Hence the name. So servers need and tend to be very robust. Nevertheless, they do go down from time to time. That's just life. But in the world of organizations that absolutely must have constant, 24-hour, 'round-the-clock uptime, this unavoidable fact of life is simply unacceptable. Fortunately for me, I do not inhabit such a world. But, also fortunately for me, this notion of constant uptime has provided solutions to the problem of servers crashing. And while I probably won't lose my job if a server crashes periodically, and no one is going to lose millions of dollars from the downtime, no SysAdmin likes it when he has to tell his users to go home for the night while he rebuilds the server. It just sucks. So we all do our best to keep key systems like servers available as much as possible. It's just part of the deal.

So how are we going to do this? Well, one of the reasons I decided to use a Mac for this project is that it has built-in server replication for load balancing and, yes, failover. We're not too concerned with the load balancing; failover is what we're after. Failover is essentially a backup database, a replica of the primary, that takes over in the case of a failure of the primary database. Mac Server has this built in, and from what I read, it should be fairly easy to set up. Which is exactly what we're about to do.

The first thing we need is our primary server. This is the main server. The one that gets used 99% of the time (hopefully). We have this (or at least a test version of it) built already, as discussed in Part 1. What we need next is what is called the replica. The replica is another Mac OS X Server machine that is set to be an "Open Directory Replica," rather than an "Open Directory Master."

So I've built a plain old, vanilla Mac Server, and set it initially to be a Standalone Server. I've given it an IP address, and done the requisite OS and security upgrades. (Oy! What a pain!) In the Server Admin application, I set the new server to be an "Open Directory Replica." I'm asked for some information here. Mainly, I need to tell this replica what master server to replicate. Specifically, I'm asked to provide the following at the outset:

  • IP address of Open Directory master
  • Root password on Open Directory master
  • Domain administrator's short name on master
  • Domain administrator's password on master

(The domain administrator, by the way, is the account used to administer the LDAP database on the master.)

Once I fill in these fields I'll get a progress bar, and then, once the replica is established, I'm basically done. There are a few settings I can tweak. For instance, I can set up secure communication between the master and the replica with SSL. But for my purposes, this would be overkill. I'm pretty much going with the out-of-the-box experience at this point. So for setup, that should be it. Setting up a replica is pretty easy stuff.

Establishing the Replica: Could it Be Any Easier?

Now here comes the fun part: testing. What happens if our primary server goes offline? Will the replica take over authentication services? Really? I'd like to be sure. What I'm going to do now is test the behavior of the Master/Replica pair to make sure it acts as intended. The best way I know to do this is to simulate a real-world crash. So I am binding one of my clients to my Master server, with the Replica in place. Then I'm going to pull the plug. In theory, users should still be able to log in to the bound client. Let's try it...

Bang! It works! I'm a bit surprised; last time I tried it, years ago, it (or I) failed. This time, though, it worked. We bound a client to the Master, our mother-ship server. Authentication worked as expected. (We knew we were bound to the new server because the passwords were different.) And then we killed it. We killed the master and logged out. There was some beachballing at logout. But after a few minutes -- like two or three, not a long wait at all -- we were able to complete logout, and then log right back in as though nothing had happened. I tell you, it was a thing of beauty.

So let's briefly recap where we've been and what's left to do.

Where we've been:

  • We've built our Mama Server. Our authentication server for the entire lab.
  • We've figured out how to migrate our users to Mama, and how to handle the required password change.
  • We've solved the inherent problems with Windows clients and figured out a few solutions for handling them involving quotas and various roaming profile locations.
  • We've built and tested the operation of the Open Directory Replica, and it is good.

What's left to do:

  • Well, honestly, not a whole Hell of a lot.
  • The next step, really, is real-world testing. We have a basic model of how our servers and clients should be configured, and it's basically working. To really test this, we'll need to take some actual clients from the lab and set them up to use the new system.
  • Stress testing (i.e. seeing if we can break the system, how it holds up under load, etc.) would also be good, and might be something to do a bit over Winter break, and definitely in the Summer. To do this, we'll need to set up several client systems, and get users (guinea pigs) to do some real work on them all at the same time.
  • Once stress testing is done, if all is well, I'm pretty sure we can go ahead and implement the change. I can't foresee any other problems.

So I'm at a stopping point. There's not much else I can do until the break, at which point I'll be sure and post my test results.

Hope to see you then!

Three Platforms, One Server Part 3: Another Quota Solution

Another solution to the problem of quotas occurred to me today: local quotas. Since the Windows workstation copies everything in the roaming profile from the server to the local machine at login, and uses that data as the user works, it actually makes a lot more sense for quotas to happen on the local workstation itself. This way, rather than becoming aware of quotas at logout only, the workstation is always aware of the state of the user's home account quota, because that quota exists on the volume where the profile actively lives.

Doing this also prevents the error message at logout due to exceeding a server-side quota. In this new, local-quota scenario, the user is alerted the instant he exceeds his quota and can rectify the situation immediately. But if he doesn't, no problem. As long as the local quota and the server-side quota match, he'll never run into the problem of being unable to save his settings to the server, since he'll never be able to exceed his quota anyway.

It turns out that, unlike on Mac OS X, setting quotas for NTFS volumes is incredibly easy. In fact, it's a simple matter of viewing the properties of the drive on which you want to set quotas.


Windows Quota Settings: Easy as Pie

Here, go to the Quota tab, check "Enable quota management" and "Deny disk space to users exceeding quota limit," set the limits, and you're basically done. The administrator will have no quotas, and any new user will have the quotas specified in this panel. You may, however, want to set certain users (particularly other admin or local users) to have larger quotas, or none at all. Again, exceptionally easy: Click that badly named but ever-useful little "Quota Entries..." button and you'll get a list of local users and their quotas.


Windows Quota Entries: Edit Specific User Quotas

Here, you can set quotas for specific users. Initially, you'll only see local users, but after any network user logs in, you'll see them listed as well.
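
As an aside, Windows also exposes NTFS quotas at the command line via the fsutil tool, which could be handy if you ever want to script this across machines. This is a sketch based on the fsutil quota syntax as I recall it, so double-check it before relying on it; the user name is a placeholder and the limits are in bytes:

    rem Turn on quota tracking and enforcement for the C: volume
    fsutil quota enforce C:

    rem 900 MB warning threshold and 1 GB hard limit for one user
    fsutil quota modify C: 943718400 1073741824 DOMAIN\someuser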

The more I think about it, the more I like the idea of local Windows quotas. Using them does away with all the problems mentioned in earlier posts, and may help with a lot of the quota-related problems users have with our current, Windows-server-based system. Also, this would allow me to store all profile-related stuff in the same place for Windows as for Mac and Linux -- on our home account RAID -- as I'd wanted to do in the first place. And, in general, it should be much easier -- or at least not too difficult -- to maintain.

A last note on the timing of this project: I just spoke with my boss about the fact that I'm way ahead of schedule and have been thinking about implementing our unified password server over the Winter break. Lately I've had some misgivings about doing this, and he echoed those sentiments. His take was basically, "If it ain't broke, don't fix it." Always sage advice. And as gung-ho as I am to see this working, giving it some time and implementing it over the Summer will give me more time to plan this more carefully and test it more thoroughly, so that we can iron out problems at the beginning of the school year -- when students' work is not quite so in-progress -- rather than at the end -- when students are trying to finish their theses. This seems like a wise approach, and at this point I'm leaning toward erring on the side of caution.

Finally, credit where due: In-depth information on NTFS quotas can be found on this Microsoft page, which is where I lifted the images in this post 'cause I'm too lazy to figure out how to get good screen captures in Windows, and 'cause those images described perfectly what I wanted to show. I believe the images are in the public domain and, therefore, legally usable in my post, but if I'm wrong, I apologize, and if I need to take them down, someone can let me know in the comments section.