Results tagged “plan”

CNS Mail Migration

Okay, for those of you that read my blog, the number of which usually astonishes me to be above single digits, I thought you might be interested in getting a personal take on the mail migration project I'm working on for the CIS department.

So, back to the beginning. When I first started with CIS I wasn't actually that pleased to be running mail services. Mail is one of those services that has to just work, but also requires more babysitting than would seem necessary. Let me use a diagram to demonstrate why:

Now, this is vastly simplified, but gets the gist of what happens when a foreign user sends an email to an account in CIS. I start with the first step of the process that involves us (i.e., I don't really care what kind of setup the other guy has). ;) Anyway, the first step is for their server to execute a DNS lookup to find the MX (Mail eXchange) record for "cis.ksu.edu" since that's the last part of the email address. You can do the same on any of our Linux boxen (or just about any Unix box, for that matter) via:

$ host -t MX cis.ksu.edu
cis.ksu.edu mail is handled by 10 mustang.cis.ksu.edu.

That is, they contact their local DNS server, which then does the whole complicated process of finding our DNS server and asks it for the MX record. Our server responds with mustang, which is the "real" name of the machine we call "smtp.cis.ksu.edu." The 10 part is the priority of the mail server, so that if we had multiple mail servers we could have them listed by priority (the lowest number is tried first).
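To make the priority bit concrete, here is a minimal sketch of how a sending server might order MX records before trying them. The second record is made up for illustration (we only have the one server), and real mail software also shuffles records that share the same number, which this sketch skips.

```python
def order_mx(records):
    """Return hostnames in the order a sender would try them.

    `records` is a list of (preference, hostname) pairs; the lowest
    preference number is tried first.
    """
    return [host for _pref, host in sorted(records)]


records = [
    (20, "backup.example.org."),   # hypothetical backup server, not a real CIS host
    (10, "mustang.cis.ksu.edu."),  # the record shown in the host output above
]

for host in order_mx(records):
    print(host)
```

With only one MX record, as we have, the ordering is trivial; the number only starts to matter once a second server is listed.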

Once the foreign host knows the mail server, it initiates a direct connection to that mail server via a communication protocol called SMTP. We use a program called Sendmail to provide a server that communicates this protocol. Among other checks, it makes sure that the email being delivered belongs to a local user. To do this, it contacts our authentication server (through another convoluted process) to see if the user exists and performs delivery if she or he does.

Mail delivery continues when Sendmail starts another program called Procmail. In the typical case, this program runs another program called SpamAssassin to decide whether the message is spam or not. Then, Procmail either finds the user's mailbox (a directory named "Maildir" in the user's home directory) or executes a script named ".procmailrc" and performs the actions requested by that script. Again, I'm simplifying for brevity.

Procmail writes to the file server (using the NFS protocol) to finally deliver your mail to your mailbox. For you to get your mail, you start up your favorite mail program (be it Microsoft Outlook, Mozilla Thunderbird, Netscape Mail, Pine, or Mutt). Your mail program logs into the IMAP server (Pine and Mutt can read the mail straight out of your home directory, but we'll not cover that case; again, brevity!), which contacts the Active Directory server with your given username and password (did I mention this is convoluted?) to verify your identity. Then the IMAP server uses NFS to read the contents of your mailboxes from the file server.

The problem is that if any one of these systems (or their subsystems) fails, part or all of the process stops. If the DNS server goes down, gets misconfigured when we make configuration changes, or becomes unreachable because we lose network connectivity, email stops being delivered and you probably can't check or send email anymore either.

If the Sendmail server goes down, gets misconfigured (rarely), or the computer the server runs on gets too busy, mail stops being sent and stops being delivered. Other problems can occur when SSL keys expire, get accidentally changed during updates, or your local Anti-Virus program decides to interfere with communication.

If the Microsoft Windows Server running Active Directory goes down, becomes inaccessible, or a critical update fails, you can't log in and you can't send or check email because none of the machines are able to verify your identity.

If you modify your ".procmailrc" file and bugger it up, you won't have your mail delivered (or in the case of the mistake I made this week, all your mail could be delivered to the wrong place). Generally, we don't have to mess with Procmail often, so this isn't something we break very often.

SpamAssassin, on the other hand, has proven to be a pretty fragile program. We actually have it set to restart on an extremely frequent basis because it would occasionally crash. The checks it does are also expensive, so it puts a pretty heavy load on the mail server.

If some part of file services goes down, obviously, nothing works again.

I could also mention PAM, NSS, LDAP libraries, SASL libraries, libc libraries, OpenSSL, various hosts running various services, etc. If one goes down, half the system goes.

We could probably keep things up and running for a long time if we could keep ourselves from touching configuration files and installing updates all the time. However, that's assuming the scriptkiddies and the hackers would leave our boxen alone if we didn't keep them up to date, so we either risk breaking services ourselves in predictable and generally quickly reparable ways, or risk damage that requires complete rebuilds.

Furthermore, our user base is small enough that I don't have staff working in Nichols 24 hours. CNS has staff on call 24 hours a day and a night crew that spends part of their time monitoring systems. In fact, I'm really the only employee at this time who understands most of the inner workings of our mail system. It's not that it's extraordinarily complicated as mail systems go (as it's pretty simple), but that this simple mail system is more complicated than I would expect an hourly student to take the time to understand. CNS has had some notable downtime in the last year or so, but our downtime has been considerably worse.

The benefit of doing this for ourselves is the ability to be flexible. We can set our own policies that best suit our small department. On the other hand, we pay for that flexibility in greater amounts of down time. As such, I'm for the move to CNS despite the fact that there will be a number of inflexible policies that we will have to cope with because it will mean a better overall uptime. It will also mean that I have more time to work on improving the systems that are really more important to what CIS does, even if not as critical (if that makes sense).

This has been a fairly politically charged move as email is one of those things that has to just work. CIS faculty have some particularly difficult needs to accommodate in the way of mail quotas, attachment sizes, pre-filtering, spam protection, etc. CIS faculty are also very picky because most of them are pretty knowledgeable about how these things can and should work. Some of our faculty also have some pretty eccentric work environments configured and aren't interested in relearning to do things in a different environment. I won't argue these things good or bad, that's not my business, but I can say that accommodating all of these disparate desires has been quite a task. (And I haven't even mentioned the students, but then, they come and go so

However, the major obstacles to this move have been overcome. There will still be parts of this that some will not like, including me. However, I am convinced that the benefits outweigh the problems.

I hope this "little" blog helps explain things a bit more clearly. Cheers.

Contentment is currently running two sites: Contentment.org and ~Sterling. I'm currently updating the CIS Support Site to also run under the new software. The process of conversion is pretty simple since a lot of what Contentment is came from the "knowledge base" used there.

I've discovered a few minor flaws in Contentment at this point that will need to be addressed soon. Primarily, to this point, the forms system that I've built is rather badly unfocused. I made some assumptions (which I stupidly did not document), which no longer hold. As such, I need to refocus the design of this subsystem as it is a major feature. This has come into play because two of the most important pages on the CIS Support Site are the password change and account application forms.

The forms system is actually very good at handling complex forms situations, but as for simple stand-alone forms, it sucks. Anyway, I just wanted to cover the basics of web forms and why they are a complex situation to be resolved.

Current HTML forms are dumb. Eventually XForms will replace them, but until then this system operates within the restrictions of HTML forms. HTML forms have a very small set of possible controls: text boxes, password boxes, checkboxes, radio buttons, buttons, popup boxes, list boxes, file uploads, and text area boxes. The HTML form also assumes that a given form only addresses a single action at a time, which is a very difficult limitation to live with if you want to do something even moderately complicated (like updating more than one record at a time).

This limitation can make having multiple nested forms very difficult or impossible (especially since a form tag cannot be embedded within another form tag). Even having multiple separate forms on a page can lead to complications. As such, I created the idea of "panes" where each form exists within a pane. Then, a pane may contain other panes and thus nested forms. Upon submit, the forms processor is run and processes all submitted panes that have had their "activate" field set. This actually works alright, but what if we don't want to use a pane? Well, that isn't handled so well.
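Here is a rough sketch of the pane idea. It is Python rather than the Perl the real system is written in, and the `Pane` class and the `<name>.activate` field naming are my assumptions for illustration, not Contentment's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class Pane:
    """A form container that may hold other panes (hypothetical structure)."""
    name: str
    children: list = field(default_factory=list)


def activated_panes(pane, submitted):
    """Walk the pane tree depth-first and collect the names of panes
    whose "activate" field was set in the submitted form data."""
    found = []
    if submitted.get(f"{pane.name}.activate"):
        found.append(pane.name)
    for child in pane.children:
        found.extend(activated_panes(child, submitted))
    return found


page = Pane("page", [Pane("login"), Pane("search", [Pane("filters")])])
submitted = {"login.activate": "1", "filters.activate": "1"}
print(activated_panes(page, submitted))
```

Since the browser only ever sees one flat HTML form, the nesting lives entirely on the server side; the activate fields are what tell the processor which panes the user actually meant to submit.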

Another problem with forms is what to do with the form when we're done with it. For example, what if we want to have a login form? We might want to have logins available on a stand-alone page or as a part of the sidebar. If stand-alone, where do we go after login? If in the sidebar, do we return to the current page on success? Do we go to a special error page on error, or do we just report the error in the sidebar on the current page? Thus, I added a "map" system that allows a pluggable module to determine how each action state is handled. This is a bit of overkill for very simple forms.
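As a sketch of what such a map boils down to (again in Python, with made-up form names, states, and URLs; the real modules are pluggable Perl code), each (form, state) pair resolves to either a destination page or "stay where you are":

```python
# Hypothetical action map: (form name, resulting state) -> destination.
# The special value "stay" means "redisplay the page the form was on".
ACTION_MAP = {
    ("login-page", "success"): "/home",
    ("login-page", "error"): "/login-error",
    ("login-sidebar", "success"): "stay",
    ("login-sidebar", "error"): "stay",
}


def resolve(action_map, form, state, current_page):
    """Return the page to show after `form` finishes in `state`."""
    target = action_map.get((form, state), "stay")
    return current_page if target == "stay" else target


print(resolve(ACTION_MAP, "login-page", "success", "/login"))
print(resolve(ACTION_MAP, "login-sidebar", "error", "/blog/today"))
```

A plain lookup table like this would be enough for a stand-alone form, which is partly why a full pluggable module feels like overkill there.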

Beyond this, the system is a little esoteric. It requires a lot of extraneous API knowledge that could be simplified. There are also a lot of features that are a little out of place because of the time span between the start of the forms system and today (I started the forms system early last year). Anyway, this thing needs a bit of work and will probably be pretty well rewritten for the 0.9 release.

My next plans for the current revision are to reconstruct my blog aggregator and create a blogging plugin. I was thinking of rewriting the VFS to add its first non-real-file-system plugin, but I think I'll create the plugins using the file system first and then try the VFS update afterward. I think it will make the VFS plugins a little more interesting if I start without them and then add them in later.

I need to also start posting documentation on the Contentment.org web site this week so all of this makes sense without having to read the code. So much work to do and so little time...

Neat. ~Sterling has been rebuilt using the latest CMS to take the Open Source world by storm...er something. Okay, so the "secret" project I've been working on for years is finally doing something because I decided to stop being "elegante" and decided to JustMakeItWork(tm).

What have I done? Well, I wrote a little tool to help me manage the CIS Support Site for work. This little tool is a combination themer/indexer for static pages. It also does some on-the-fly generation of HTML from reStructuredText, which is what we write most of our docs in. It seemed pretty useful and is similar to the software I was using to run this site before October 2004, called Blosxom, which is a lightweight file-based blogger.

Anyway, when I had some trouble getting access to K-State Online for my course last semester, I decided to try and dump my course data into it on ~Sterling. With a few modifications it worked quite well and quickly supplanted any previous ideas I had about the content management systems I'd been toying with for the past several years.

Most of the work is already done by HTML::Mason. My system just took advantage of the features already present to add indexing and theming and generation of content from other formats. ~Sterling took it a bit further by adding the ability to generate even more complicated content (especially, ripping apart zipped Keynote "files" and using XSLT to generate HTML outlines).

At this point, I ran into a few issues:

  1. Adding new generators was requiring lots of custom code and my indexing code was becoming convoluted.
  2. New content had to be added with care, otherwise Mason would try to interpret files it had no business touching. When this happened, my indexer would basically bring the entire site to its knees with a single exception.
  3. Some content is just better stored in a database. Blog entries, news items, and simple records are just a few examples. The system had no way of coping with any of these.

Thus, since about the time I put Drupal on this site, I've been working on a replacement. Drupal is merely a temporary expedient. I started completely from scratch, but have dragged in a lot of the bits from the existing "knowledge base" system to build this new system, which I am dubbing Contentment (superseding all the predecessors I'd created and called this).

This system currently has a lot of unused features, but most of the good ones are in use. I added one of the best features just this week, and after only a few days it has practically remade the quality of the system. Specifically:

  1. It features a (largely unused, as yet) forms handler that can help design forms and wizards with a fairly small amount of effort. I borrowed a lot from the kinds of work that Everything has done in this area.
  2. It uses the SPOPS object-mapping system to provide a database API. It's not required that new plugins use this API, but all the existing database pieces use it.
  3. The system automatically provides for context, sessions, and logins. The user accounting system is completely pluggable, so new support for LDAP or other login types could be added with a little effort.
  4. The system provides a basic permissions system. All of these features have been designed to make adding database-based plugins possible, but there really aren't any yet.
  5. The major feature that has really made the system work despite the lack of any database plugins is the VFS system I've put together. I've debated whether or not this should be forked into a separate project, but I'm going to leave it where it is for now. Anyway, this enhances Mason's abilities by quite a bit and allows for a much more general way of looking at files. This way, Mason no longer has primary control of generating files, but passes that control off to other plugins.
  6. Right now the system works via CGI, but I'd like to put together a mod_perl front-end to take advantage of those features. I've designed everything to this point with mod_perl in mind, so it should work with minimal effort.

That's a pretty bad mish-mash summary of the features. There's a lot more I could say, but I'll save that for documentation. I'm going to admit that I've had a SourceForge project for this for eons, but that it'd never really worked until now. I'm so excited about how this is going now, that I have registered Contentment.org and will be posting information and documentation there. I'm going to, for now, use the mailing lists, bug trackers, announcements, and CVS repository at SourceForge. (Though, I'm hunting hard for a way to keep it in Subversion as I strongly prefer it, despite its performance and other issues.)

Anyway, I wanted to announce that and say that Drupal may be saying farewell to this site soon---if I can get the plugins written and translate all of my Blosxom and Drupal entries into my new plugins.

~Sterling is Coming Along

For the last few weeks, I've been working on my work web site, ~Sterling, to improve its capabilities. The purpose of this update is many-fold. Read on if you want to learn about my software (you may want to read this post first)...

My web site is based upon the "Knowledge Base" software I developed for the CIS Support Site. This has made keeping this web site up-to-date considerably easier. The software started as a handy index generator that generates indexes of pages posted into the system automatically. Additionally, it's able to perform some basic transformations on input files to generate HTML or RSS or whatever. It's a very minimal content manager that mostly relies upon the features of Mason for most of its functionality.

Okay, so I branched the system to create ~Sterling, which was the repository for the CIS 450 web site last year. I added some more features to make it automatically detect multiple file formats with the same basename minus suffix (e.g., Session-01.pdf, Session-01.pdf, Session-01.key.zip). This made it possible to view multiple versions of a file. Using the transformation bits, I was also able to generate HTML summaries straight from my Keynote presentations, since the format is based around an XML file.
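The basename grouping is simple enough to sketch. This is Python rather than the Perl/Mason the site actually runs on, and it assumes basenames never contain a dot, so everything after the first dot counts as the suffix (which is what makes "key.zip" work). The Session-02 file is made up for the example.

```python
from collections import defaultdict


def group_by_basename(filenames):
    """Group files that differ only by suffix, e.g. a PDF and a zipped
    Keynote bundle of the same session."""
    groups = defaultdict(list)
    for name in filenames:
        # Split at the FIRST dot so multi-part suffixes stay together.
        base, _, suffix = name.partition(".")
        groups[base].append(suffix)
    return dict(groups)


files = ["Session-01.pdf", "Session-01.key.zip", "Session-02.pdf"]
print(group_by_basename(files))
```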

Problems: (1) I now have two versions of the "knowledge base" software that I wanted to have the same features. (2) Adding content always requires dumping files into the system, which is overkill for items like blog/news entries, blog aggregation, etc. (3) One bad file spoiled the whole web site—any index page that found a badly formatted Mason file caused the index to puke. (4) The transformation system was a kludge and required careful tweaking and depended dubiously on file suffixes.

Thus, I endeavored to rewrite the software and have now decided that it is satisfactory for my content management dreams of the past several years. It's not quite ready for distribution as I need to add a little bit more documentation, but I anticipate that in the next month or so, I'll be updating the woeful Contentment project page.

I should be able to take the existing CIS Knowledge Base and drop it into the new system with very little effort. My ~Sterling page is going to be repopulated over the next couple weeks with the original files, but using the new improved transformation system. I still need to migrate the indexing system from the old knowledge base to the new one, but that should be a relatively simple matter (and this time, errors will be handled gracefully!)

The new transformation system is really the key. I borrowed a lot of ideas from Cocoon since I've always kind of liked that system. Basically, each file in the system is first checked to see what input "kind" it has (determined by a set of Mason plugin components, which pick the "kind" from file suffix, file contents, etc.). Based on this information, the file is run through a "generator," which translates the file into an initial kind. The system only has two generators right now: Mason (runs the file as a Mason component) and the fallback generator which just reads the file as is.

Then, the transformation system is applied to each input file. The transformation system attempts to find a sequence of transformations that can be applied to the file to get it from its initial kind to the requested final kind (which is determined by more plugin Mason components, usually based upon the URL or query parameters). If no transformation can be found, then you'll get a 404, otherwise, it attempts to find the best transformation using a shortest path search (which is probably too costly, but works fine for now with a very small number of transformers). The transformers are applied to get the final output file.
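For the curious, the search can be sketched as a breadth-first search over "kinds," where each transformer is an edge from one kind to another. This is an illustration in Python, not the actual Perl code; the transformer names and kinds are invented, and it treats "shortest" as "fewest transformers," which is an assumption on my part.

```python
from collections import deque


def find_chain(transformers, start, goal):
    """Return the shortest list of transformer names that turns `start`
    into `goal`, or None if no chain exists (the 404 case).

    `transformers` is a list of (name, from_kind, to_kind) tuples.
    """
    if start == goal:
        return []
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        kind, path = queue.popleft()
        for name, src, dst in transformers:
            if src == kind and dst not in seen:
                if dst == goal:
                    return path + [name]
                seen.add(dst)
                queue.append((dst, path + [name]))
    return None


# Hypothetical transformers, loosely modeled on the Keynote-to-HTML case.
TRANSFORMERS = [
    ("keynote_to_outline", "keynote", "outline-xml"),
    ("outline_to_html", "outline-xml", "html"),
    ("rst_to_html", "rst", "html"),
    ("html_to_rss", "html", "rss"),
]

print(find_chain(TRANSFORMERS, "keynote", "html"))
print(find_chain(TRANSFORMERS, "rss", "html"))
```

Each lookup scans the whole transformer list, which is wasteful but harmless while there are only a handful of transformers.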

But we're not done yet. Finally, based upon the final output kind, another group of components, called "filters," is applied to the output to further modify the file. The main reason for this is that HTML files coming through the system need to have some links fixed, etc. before output. I had initially thought that this would be a good place to put the theme engine too, but I've decided against it for now until I can come up with a decent policy for regulating how themes should work. I have a theming engine in place now, but it depends upon the "autohandler" feature of Mason, in the same way as the previous version of the knowledge base.

Once all the filters have been applied, the file is finally output (after being passed through any autohandlers, as per normal Mason operation).

I should have the system adapted to take any index and turn it into a proper RSS feed and I also want to add Atom feeds this time round as well (and better put in the "alternate" links to let browsers know it's there).

The next step is to add in the database features so that the web site can store some content into MySQL. This will make a lot of the routine updates to the site much easier. A file manager would also be handy to allow users to upload and manage files through a web browser, which is another goal of mine.

If all goes well, I may be dropping Drupal for Contentment in a couple months. I won't hold my breath though...

~Sterling, TNG

I'm bored with my CIS web site and a little tired of maintaining it. For anyone looking for its content, it still exists and I'll be bringing it back, but I'm trying to put together the next generation content management system for my web site.

History: Okay, this is pathetic. I am so picky about my crud that I have literally spent the last 7 years trying to put together a content management framework for web design as my number one pet project. All of that work and I have very little to show for it. In that time, I've restarted at least once a year.

The original reason for all this work is that I wanted to have a journal of articles and essays that I updated on a regular basis, but wouldn't have to maintain the index. I wish I could say that 7 years ago, I started blogging before "blogging" was even a part of Internet lingo, but I did want to do it 7 years ago.

More History: I've been "programming" since I was able to read. Something around five years old. I wasn't really programming per se, but I did know how to copy programs from Compute Magazine. When I was eleven, my friend Lucas gave me a (pirated) copy of Turbo Pascal 6.0. He also had a tutorial he'd gotten off of a local BBS to learn Pascal. From there, I got my one wish for my sixteenth birthday, Borland C++ 3.1 with Application Framework and then taught myself C and C++. (Yes, I have been a geek since I was five years old.)

At K-State, because of the leadership of Dr. Schmidt, I learned Java and became completely immersed in the language. It was around this time, while working as a consultant for Network Resource Group, that I began trying to put together a CMS. It wasn't called a CMS back then, but that's what it was.

Over time, I got frustrated with the restrictiveness of the Java language. I always felt like I was trying to wrap my ideas around the Java language instead of just executing them. Java is so very verbose, but it does provide a standard library for doing everything. Anyway, my first attempt went up in flames.

I then had my first, brief, flirtation with Cocoon while I was taking CIS 726 as an undergraduate and built a very small implementation of my personal web site with it. Cocoon was cool, but freaking slow!

I then began my trek for a language that could express what I wanted directly. I wanted a language I can think in. I tried out PHP. PHP is real slick. Last I checked, NRG is still using the second web site I designed for them using PHP. We needed some login functionality for clients so they could check on the antivirus email service we were planning to supply (though, I understand they've since dropped that service). Anyway, I replaced a web site that took me two weeks to put together (in Java and JSP) with a new one in PHP in less than four hours. Awesome!

Therefore, I embarked on attempt number three in PHP. This is when I came up with the concept of a content management system, which everyone else obviously copied from me. ;) This is when I invented Contentment as a framework. Unfortunately, my first concept for Contentment was doomed from the beginning. PHP just wasn't elegant enough for me to do anything but RAD. I like PHP and would use it again, but I will not build frameworks in it. Too much spaghetti.

This and the fact that PHP isn't really good for anything but web templates (though, I understand PHP 5 is changing that) brought me back to my search for a language. At NRG, I had messed with Perl a bit for the antivirus stuff (we were extending Anomy Mailtools) and decided to give it a try. I also gave Python a shot. Both are nice languages. Python offers some very elegant scripting features and freedom from static typing. (I can just hear Dr. Stoughton and Dr. Banerjee cringing at that thought.) Python is just too....ugly for me. Yes, I said ugly. It feels clunky and I have to think too hard to do things their way. It's the same old issue I had with Java.

Perl, on the other hand, is like God's gift to the disorganized thinker. I can think in Perl. I write code like I write a term paper. It's easy for me to read and easy for me to write. It fits me. Unfortunately, it too, is very clunky. Many of the language features are hacked on. Imagine an object oriented language where the objects are just things that are "blessed" with a name. Thus, you could have an array or a regular value or a map "blessed" as the same type but have completely different ways of being used. Ick. So, I cope with the clunk because I know Perl 6 is coming.

Anyway, Perl brings me back to my point. I started an implementation in Perl. I failed. I started another implementation in Perl. I again couldn't stand my own code, and started again. Around this time I discovered Mason, which is a really wonderful tool for embedding Perl into HTML pages and is a mini-content management system of its own, sort of. I tried again with a Perl/Mason combo and again failed.

I then decided, let's go practical. I'll implement something that works RIGHT FREAKING NOW. Ta-da! It worked!....almost. If you go to the CIS Support Site, you can see my handiwork. It works. The original ~Sterling web site featured a forked version of the same software that was improved in a lot of ways from the original. Unfortunately, both of these were still lacking and slowly became less and less manageable—i.e., they didn't scale as well as I'd hoped.

Thus, I have endeavored to use the best of those tools and added new features for database support and a new forms system to try and make the system more scalable. Unfortunately, I've again tried to go too abstract and not concrete enough.

This time, I'm going to do it right. I'm going to take the best of my third generation Mason/Perl software and my attempted fourth generation Mason/Perl software to build the fifth generation.

This is why ~Sterling is now a blank slate. I'm going to build it up from scratch and I will make it beautiful, or I will die trying. May God have mercy on my soul.

System Plans

Now that I've finished the bulk of my Master of Science program, it's time to consider the next move for the systems. I've made most of the major changes I wanted to coming into this position, now I'd like to make those changes robust. So, the coming year's theme: stability.

I believe that, to achieve the goal of stability, we must examine four aspects of the systems (in no particular order):

  • Network infrastructure
  • Server infrastructure
  • Security infrastructure
  • Information infrastructure

Network Infrastructure: This is the foundation of our systems. In the Department of Computing and Information Sciences, this is a completely unknown quantity. None of our current staff can state with certainty the current network's history or current quality; we just don't know. The students and I have not been around long enough to know. Our hardware specialist, Earl, has been around long enough to know quite a bit, but still doesn't really know the network infrastructure well enough to gauge its condition. Before we can really begin to guarantee quality anywhere else, we will need to quantify the quality of the network. Once quantified, we will need to address any problems found. This research will begin this month and should be finished by May. At that time, we will make any changes we can to stabilize this aspect of our systems.

Server Infrastructure: Work on this aspect has already begun. Our server systems are actually in fairly good shape, but still require tweaking to bring the system completely into the 21st Century. We'll be moving to a new web server, the old web server will be handling email services, other services will be moved around, and our obsolete hardware will be retired or placed on standby to be retired in the future. Most notably, our Solaris fleet is made entirely of computers that were never meant for server use and reached End-Of-Life years ago. If we're to continue our use of Solaris, new servers must be purchased. Solaris used to be our flagship system because Sun offered such great support. Yet, as of now, the support offerings from Sun don't look to be that compelling, in my opinion.

Another notable aspect of this issue is the problem of file services on our systems. For the duration of my time in the Systems Coordinator position, we have been plagued with NFS problems. I intend to resolve this via one of several solutions. The first solution is to attempt to remove the current Solaris NFS server from our systems. It seems to be a major cause of problems on our Linux workstations (it's actually got to be a bug on Linux, but a probable work-around is to remove the Solaris NFS server). This should solve the current stability issues.

Security Infrastructure: Account management and authentication were a major focus of my M.S. Report. These were, really, the largest changes I have implemented on our systems. These have resulted in the most significant discomfort to users as well. Hopefully, most of the pain associated with these changes has ended. The account manager has not been fully implemented. During the next few weeks, it will be set in place. This should decrease the amount of effort spent performing account maintenance and allow us to monitor accounts much more closely than is currently possible.

As mentioned above, NFS is a stability problem on our system. However, while the solution given above will resolve the stability problem, it doesn't address the underlying security flaws of the NFS protocol. The NFS protocol itself has irreparable problems that demand it be replaced. This might be as simple as hoping for full NFSv4 support in Linux soon or as complicated as implementing a different file system infrastructure under Linux. This is, as yet, uncharted territory, so more news will come as we further research the issue.

Information Infrastructure: Communication is a major problem for the systems staff. As such, we're looking at better ways to communicate with students and to put a "human face" on what we do. If we can do this, students, staff, and faculty will be more likely to work with us rather than grumble about us. I have a few ideas that range from improving the support web site, to improving our support tracking system, to posting more information in the form of banners and posters in the labs, and even possibly requesting some renovations to Nichols to make the systems staff hallway more accessible.

I think lots of exciting things will happen during the coming year with the services. I hope we can prove our worth to faculty, staff, and students over that time. Consider this an early New Year's resolution. Cheers.
