August 2006 Archives

One place where Drupal is weak is document management. However, after doing a little digging, I've decided that Drupal is actually not so bad as it might seem at first.

What is a document management system (DMS)? Basically, it's a repository that allows you to upload and organize documents. It should include the ability to store document revisions so that older revisions can be reviewed and restored. It needs a workflow system so that various authors and individuals can collaborate. It needs an orderly way of organizing and fetching documents. It needs a permissions system for managing who has access to the documents. It needs to be able to store metadata about documents (log entries, categories, authors, bibliography, price structures, or anything else that might be useful as a sticky note on the document).

Drupal does several of these out of the box. There's a module for everything else. However, there are still a number of minor shortfalls that keep it from being very robust as a DMS. All of these could be addressed pretty easily. Anyway, this is my summary of what needs to happen to make Drupal act as a capable DMS.

So, as is, this is how I would set it up:

  1. Install Drupal.
  2. Install the Pathauto module.
  3. Install the Category module and turn on the menu and pathauto extensions.
  4. Install the CCK module.
  5. Turn on the built-in upload module.
  6. Create a new CCK content type named "document". Include any metadata fields you need.
  7. Configure the "document" content type to allow file uploads and make sure that "Create a revision" is the default action. You probably want to uncheck "Promoted to front page".
  8. Create a new container (i.e., the category module's replacement for vocabulary) named "documents". This should be hierarchical with single parents (at least to start). Select the "document" content type to require a setting in this hierarchy.
  9. Configure the Pathauto module so that document nodes are located at "Workflow plugin. I have not yet taken the opportunity to evaluate that module, so I don't know how it works or how well. Personally, I'd like to see a system like Relationship module---again, I have not evaluated this module and I'm not certain how well it works.

    There are a few weaknesses here.

    1. No Browser. There isn't a nice document browser. The category menu will work, but it's a bit of a kludge. It would be nice to have a configurable view that could show the hierarchy, show some metadata, etc. Preferably something with a nice AJAX interface.
    2. File Storage. This "DMS" will store document revisions just fine, but all of the documents will be stored in a single directory. This is probably fine until you start storing a few thousand entries. Unless your file system scales well in such a case (very few do), your storage is going to start slowing down. This isn't acceptable since a DMS should be able to store as many documents as you have disk space for without blinking.
    3. No search. What's a DMS without a decent search of the documents themselves? A likely solution to this would be to integrate a 3rd party search tool, like Nutch. There is, actually, a Nutch module for Drupal, so this might be another easy fix.
    4. No file typing. The system only vaguely understands different file types. It would be helpful if it really understood the difference between PDF and Word and OpenOffice documents and could perform specialized operations on each. Even just being able to identify them on a rudimentary level would be a start.
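
    One common way around the single-directory problem in item 2 is to fan files out across nested subdirectories keyed on a hash of the file name. The following is only a minimal sketch of that idea, not something Drupal's upload module does; the function name `sharded_path` and its parameters are made up for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(root: str, filename: str, levels: int = 2, width: int = 2) -> str:
    """Build a storage path that spreads files over nested subdirectories.

    Hypothetical helper: the directory names come from the leading hex digits
    of a SHA-1 hash of the file name, so the same name always maps to the same
    place and no single directory has to hold every document.
    """
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    # With the defaults, this yields 256 * 256 possible leaf directories.
    shards = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return str(PurePosixPath(root, *shards, filename))


if __name__ == "__main__":
    print(sharded_path("files", "annual-report.pdf"))
```

    Hashing the name rather than using, say, its first letters keeps the directories evenly filled even when many documents share a prefix.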

    Anyway, this is a pretty interesting problem. I find it very interesting that the various Drupal modules cooperate well enough together that this kind of enhancement is practically available as long as you glue everything together just so. I'm intrigued by the possibility of finding and using additional modules or improving or adding a new module to help fix the deficiencies I've noted here. I have a couple projects where it would be nice.

    Cheers.

    Update April 4, 2007: Please visit the document management
    group on Drupal Groups to discuss this further.

Phew! This is getting nuts. I am more busy on web development at home than I am at work. Frankly, with this flurry of stuff going on at home, I'm starting to get a bit bored with work. Well, I do want to talk about something I did at work yesterday because it was pretty cool. I've put together a knowledge base and issue tracking system using Drupal and RT. It was pretty easy and I think others might benefit from at least some aspect of the solution.

We've been using Trac to keep track of our knowledge base, issues, milestones, and source browser surrounding the web development project. For the internal IT stuff, we've been using Microsoft SharePoint. Both have had limited success. Trac worked perfectly until we needed a way for web site visitors to send in support requests via email. Trac's support for this is total crud. It's still being worked upon, but after having used RT in CIS, I'm pretty hard to please when it comes to support tracker email integration. SharePoint has worked about as well as it ever does: good enough, but it's a bit tiresome to use. If we were using the beta of the new version, it would probably be fine, but that's not a trivial upgrade to perform without causing massive alterations to lots of other services we run.

Okay, so I've started to put together something that could potentially replace both of these with flying colors. It's still a work in progress, but I think I've already got something better after only a couple days of working on the problem.

First, I installed RT 3.4. After installing it and getting Sendmail configured on our hosted server, I got RT configured. I leave learning how to do this as punishment for the reader.

Then, I went in and installed Drupal 4.7 and started configuring it. First thing to do is to install all the modules. I've installed the following:

  • Category. This is one of my new favorites. It is a drop in replacement for the Taxonomy module that makes every vocabulary and every term into a node. You can even use arbitrary content types as vocabularies and terms and can have vocabularies within vocabularies. It's very nice. Oh, and you can set the default item depth of a category so that it automatically aggregates children to a certain depth. Also very nice.
  • Interwiki. This is kind of the best available for the functionality I'm looking for, but it needs a bit of work to get it right. Basically, you can register prefixes with it that will then be translated into URLs using a wiki-like syntax. For example, it comes configured with the prefix "w" automatically linking to Wikipedia articles. Thus, you can insert [w:Nichols Hall] and it would link off to Nichols Hall at Wikipedia. You can add other prefixes to other sites or use no prefix, i.e., [:foo].
  • Pathauto. This ought to be a standard Drupal module. It generates paths for a node according to settings you set. I use it heavily on this site now and it's what allows us to have nice wiki-like articles on the new knowledge base.
  • TinyMCE. This is a must have if you have non-technical or semi-technical staff helping you write articles. We have a content editor, Doug, who understands but doesn't really converse in HTML on a regular basis. There's no need to make him when TinyMCE exists and is so easy to install and use.

That's it. I installed all of those and then went through the tedious process of setting up access controls, updating settings, etc. Now, I can create a page on the site titled "You are a foo foo" and then create another article and link to it by adding the text [:You are a foo foo]. I can use TinyMCE to format the text without worrying about learning the Wiki-syntax-flavor-of-the-month and so I won't be too afraid to show someone completely non-technical how to enter knowledge base information later on.

I added new custom interwiki prefixes for getting at RT and Trac tickets and the old Trac knowledge base articles as well for the transition period. Now, I can link to our old standards document via BoomerStandards or to a ticket in the old system via 435 or to a ticket in the new system via 219. I added a number of other shortcuts as well. This is really the only wiki-syntax anyone needs to know about, and it's a pretty easy one to remember.

The knowledge base is now in place. Let's go back to RT. I've now added a callback for the ShowMessageStanza ticket element in RT. That is, I created a file named "Callbacks/boomer/Ticket/Elements/ShowMessageStanza/Default" in the local hacks directory for RT. You may wish to see CustomizingWithCallbacks for more information on creating customizations using the RT callback system. Anyway, the contents of this callback are as follows:

<%perl>
$$content =~ s{(?<!&)\#(\d+)}
{<a href="${RT::WebURL}Ticket/Display.html?id=$1">#$1</a>}gx;

my %config = (
tracwiki => 'http://trac.example.com/trac/foo/wiki/$1',
trac => 'http://trac.example.com/trac/foo/ticket/$1',
rt => $RT::WebURL.'Ticket/Display.html?id=$1',
http => 'http:$1',
https => 'https:$1',
'' => 'http://supportkb.example.com/$1',
);

$$content =~
s{
(!|\\)? # stop now if there's a ! in front
(
\[ # match opening [
([^:\]\|]*): # match a prefix like abc:, remember the prefix
([^\]\|]+) # match the link target, remember it
(?:\|([^\]]+))? # match a pretty name like "Blah Blah is cool."
\] # match closing ]
)
}{
my ($escape, $match, $prefix, $link, $title)
= ($1, $2, $3, $4, $5);

if (defined $escape) {
qq($match);
}
elsif (defined $config{$prefix}) {
my $url = $config{$prefix};

my $term1 = $link;
$term1 =~ s/\ /_/gx;

my $term2 = $link;
$term2 =~ s/\ /\+/gx;

my $term3 = $link;
$term3 =~ s/\ /%20/gx;

my $term4 = $link;
$term4 =~ s/\ /-/gx;

$url =~ s/\$1/$term1/gx;
$url =~ s/\$2/$term2/gx;
$url =~ s/\$3/$term3/gx;
$url =~ s/\$4/$term4/gx;

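# Use the pretty name if one was given; otherwise show the bare link when
# there is no prefix, else "prefix:link". (The original condition was lost
# when this page was extracted, so this test is a reconstruction.)
$title = defined $title ? $title
: $prefix eq '' ? $link
: "$prefix:$link";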

qq(<a href="$url">$title</a>);
}

else {
qq($match);
}
}gex;


</%perl>

<%args>
$content
</%args>

If you're unfamiliar with Perl and Mason, you may have a migraine by now. You might have one anyway, unless you're a JAPH like me.

Anyway, this replicates part of the functionality of interwiki within RT messages by converting the [rt:123] sequences into links. It also creates links between tickets using a #123 syntax, which I'd like to add to the knowledge base eventually because I find the Trac-style linking of #123 (tickets), @123 and r123 (revisions), and [123] (changesets) to be very easy to remember and very handy.
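
For anyone who doesn't read Perl, here is roughly the same bracket translation as a small Python sketch. It is not the code running in RT: the `PREFIXES` table uses placeholder example.com URLs, the names `expand` and `LINK` are mine, and the title fallback for an empty prefix is an assumption.

```python
import re

# Hypothetical prefix table; every URL here is a placeholder.
# $1 = spaces as "_", $2 = "+", $3 = "%20", $4 = "-", as in the callback.
PREFIXES = {
    "rt": "http://rt.example.com/Ticket/Display.html?id=$1",
    "trac": "http://trac.example.com/trac/foo/ticket/$1",
    "": "http://supportkb.example.com/$1",
}

# Optional escape (! or \), then [prefix:link] or [prefix:link|pretty name].
LINK = re.compile(r"(!|\\)?(\[([^:\]|]*):([^\]|]+)(?:\|([^\]]+))?\])")


def expand(text: str) -> str:
    """Turn [prefix:link|title] sequences into HTML anchors."""

    def repl(match: re.Match) -> str:
        escape, whole, prefix, link, title = match.groups()
        # Escaped or unknown-prefix sequences are left as plain text.
        if escape is not None or prefix not in PREFIXES:
            return whole
        url = PREFIXES[prefix]
        for marker, filler in (("$1", "_"), ("$2", "+"), ("$3", "%20"), ("$4", "-")):
            url = url.replace(marker, link.replace(" ", filler))
        if title is None:
            title = link if prefix == "" else f"{prefix}:{link}"
        return f'<a href="{url}">{title}</a>'

    return LINK.sub(repl, text)


if __name__ == "__main__":
    print(expand("See [rt:123] and [:You are a foo foo|that page]."))
```

Like the callback, this assumes the message text has already been HTML-escaped before the filter runs.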

That gets me much of the way to an integrated Drupal/RT knowledge base/issue tracker system. I still need more, though, and I haven't gotten there yet. Stuff I'd like to add in:

  • Organic Groups. This would allow us to have a subsite for each knowledge base: one for internal IT, one for web development. Later, we could add one for sales information, one for content style guides, or whatever.
  • Fix/replace interwiki. The interwiki module is nice, but isn't quite nice enough. It would be nice to have additional formats available. With very little effort, I should be able to create a custom filter to either extend this functionality or outright replace it just for our needs.
  • Single Sign-on. I have some experience with this lately. If I can get it so that I only sign in once to Drupal and/or RT and then I'm in both, that would be great. If I could implement it using an independent LDAP server, all the better.
  • Configuring RT. Configuring the interwiki-ish filter for RT is done in code. This isn't ideal. With whatever solution I work on for Drupal, it would be nice if the RT filter could load its configuration from the same place, or at least get it from a config file and not in the code itself.

Will I get all this done? Probably not, but I'll get some of it done and if I think it's interesting, I'll try to share it with you.

Cheers.

One of the features I'm looking into for the improved New Hope Church web site is Organic Groups. Organic Groups is a module for Drupal that allows for the creation of sub-communities in a social networking site. Every social network is made up of a larger collection of individuals that all have a collective goal or interest they are pursuing. However, each individual has their own unique abilities (er, maybe that should be Unique Ability™, oh wait, this is my blog, not work ;), interests, and agenda. Generally, the individuals will sort themselves into a clique within the larger community that allows them to serve and be served in a way that fits. Organic Groups is the Drupal-way of facilitating this.

How would this be useful to New Hope Church? After installing the module, I would grant each person with the Ministry Leader or Staff Member role the ability to create groups. This includes everyone who runs a LIFE group in our church. Each leader could then create a group for the group or groups they lead. Then, each member of that group can subscribe to the group and gain access to the group discussion. Each group could have a forum for discussions, a calendar of events, documents such as current lesson notes, blogs, prayer requests, audio messages, image galleries, mailing lists, etc. Anything the main site can have, we can allow each group to have within itself. Group data could be made public for anyone to read or private just for subscribed group members and subscription can require approval by the group leader. That's a lot of flexibility.

Okay, sounds great. However, there are a few caveats. This will require work to set up and probably some amount of maintenance. Currently, I'm only one guy with some help from a couple others. However, if this became popular, I'm betting I will have at least one or two other folks interested in helping out. Another caveat is that not everyone will want to participate in the online part of the site. Some may not want to visit the site and some groups, accountability groups, for example, might need a level of privacy that prevents this from being useful to them. That's fine. They don't have to use it if they don't want to, but that doesn't need to limit the others.

The last issue is that I'm basing all of this on my reading of the Organic Groups documentation. I've found that implementation, invariably, differs from documentation. So it may not be quite as flexible or cool as I've described, but the general idea is definitely in place and the majority of it does work as I expect (I have demoed parts of the functionality).

Anyway, that's my vision for how to increase LIFE group communications online. The idea was never as fully formed as this, but facilitating communication and community through the web site has been the general vision since the beginning of this project in 2003. I think we're very close to making this a reality now. I'm excited.

Cheers.

Okay, so I've been doing quite a bit of development on this site and my church web site at home. When in Linux, I always use Firefox. It's convenient and has all the tools I need for web development, which is nearly all I do at work. On Linux, it's also pretty solid---not rock solid, but I use a heavy load of plugins.

However, at home, I do a combination of blogging, reading email, sorting photos, Bible study, etc. in addition to the home development projects I work on. When I'm browsing the web for anything other than development, even research for development, I nearly always run Safari. However, when working on web development, I nearly always use Firefox (except to test how things look in Safari). Why?

On Linux, Firefox is pretty solid. On Mac OS X Tiger, it's rock solid, but it bogs down and eats up the memory of my poor little Mac Mini if I do much with it. When I'm doing development, I usually try to close out everything I don't need and cope with the shortness of memory (which is really the problem since my Mini isn't fully upgraded in that arena). Safari, on the other hand, is rock solid and lightning fast all the time. It doesn't bog down, but it doesn't have all the tools I use in Firefox that make web development go so well. Anyway, I use both very heavily and flip between them pretty regularly.

The moral? I don't really have one, I'm just saying that's how I work and since I'm in blogaholic mode, I felt like a writeup on the topic. However, while I'm at it, I will recount my absolute favorite Firefox extensions, which are the reason why I use it for development. If you read any of the following be sure to read the first and last bullets!

  • Web Developer. This is a must have for any web developer. It places a toolbar menu in Firefox which allows for quickly testing forms, discovering form details, live editing of CSS, turning on/off plugins, clearing authentication data, and learning other information about a page.
  • Colorzilla. This is a prerequisite for web design. It gives you an eyedropper you can use on a page to figure out what any color is. I've had a few problems with it over the last few months claiming that my browser wasn't supported (when it worked before, bah), but that seems to have been fixed now.
  • Aardvark. Ever want to know which parts of the page are in which boxes and what tag that is or what the class is? Once started, it visually highlights and names which box your mouse is hovering over. I use this in situations when I just want to check real quick because there are a couple other options available for digging deeper.
  • Google Toolbar. This isn't explicitly for development, but it does make getting to my favorite search engine a lot faster.
  • Venkman. This tool is overkill for most of the problems I face. This is a full debugger for JavaScript in which you can set breakpoints and really get into the guts of the JavaScript engine. I suspect this would be very useful for anyone building Firefox extensions, but it really is too much for debugging nearly any web page issue I've faced. I still have it installed, just in case.
  • Google Browser Sync. Since I use many of the same resources at home and at work, this tool is very useful. It synchronizes my bookmarks and other data between computers and even allows me to resume a previous session if my browser crashes or if I just want to pick up where I left off when I move from the kitchen computer to downstairs.
  • Firebug. This is the coolest plugin ever! It provides a nifty debug console that you can use with JavaScript by just inserting console.log() statements to your scripts. It shows a status bar widget that shows how many JavaScript or CSS errors have occurred on the current page since it was loaded. It contains a basic debugger that is much simpler and has just the right level of functionality for trying to work out the harder script errors. Finally, it has a very detailed inspector that has an Aardvark-like inspection feature which outlines boxes in the top pane while showing the tags in the parsed source in the bottom pane. This shows the DOM tree as it exists after your JavaScript has modified it as well, which is super helpful. You can drill down pretty far.

Okay, so I'm announcing here the first mini-launch of the New Hope Church web site. The site is now running on a later version of Drupal and should have most of the old features intact, with a very few additions. I need to get the login information to Eric so he can start on the theming now so we can have the full launch.

I'm most excited right now about the new features that are coming rather than the few I've added. I'm going to talk about both and the features we already had as a reminder.

Existing Features

The New Hope Church web site has featured these capabilities to date:

Blogging
Any staff member or ministry leader in the system has had the ability to create a blog within the site. However, a very limited set of members have been given the appropriate permissions and of those, they generally have blogs elsewhere or not at all.
Comments
Any authenticated visitor to the site may add comments to most of the stories, audio messages, and events on the site. Again, however, this has never been used. It has never been advertised and I don't know if anyone has noticed the comment links since the current site design works to hide them.
Contact Forms
Each staff member has a contact form on the site. This was the only available way of contacting staff members during the previous iteration because I never implemented a better way. The main problem is that the contact forms require registration to work.
Events
This was one of the most heavily used features. The only serious issue with this was that recurring events weren't possible, so we couldn't add events like the Fusion youth meetings or Sunday gatherings unless we wanted to add each and every occurrence. I don't think anyone on staff has that much time to spare.
Messages
This is the single most used feature of the site. We upload audio to the site and then visitors can read notes about the sermon, download the audio, and listen to Podcasts.
Map
There is a map on the site showing the location of the old church office, Flint Hills Christian School where we meet on Sunday, and the location of our land. Unfortunately, an IE bug I documented a while back keeps that from working properly for anyone using IE.
Notifications
Staff, ministry leaders, and administrators (i.e., me) can set up email notifications when the site changes. I originally used this to watch for when Tony uploads the audio and to check back to make sure things were working. However, I confess that I've been simply deleting the notifications lately.
Pages
Various informational pages have been placed on the site. So far, this isn't much used, but I think that we'll see the amount of information on the site increase over time.
Biographies
Each staff member has a bio on the site associated with their user page. Any member of the site may also have a biography. More information could be added to the member profiles in the future, but I have no direction on what kind of information folks would like to see.
Search
The Drupal search abilities were a little pathetic in 4.5 and 4.6, but it looks like things are quite a bit better now in 4.7.
Announcements
In addition to events, some announcements don't have a specific date associated with them. These are heavily used in the current system and this should continue.
Categorization
Currently, categorization is only used to differentiate between which announcements and events belong to which ministries. With this update I've replaced the original categorization system with a different system that I think will serve us better.
Throttling
The site does have some throttling capabilities. That is, when it starts to see heavy loads, the site will start disabling some features of the site to keep it from being too overloaded. I've not configured this very well yet, so it probably needs to be looked at some more.
TinyMCE
No one visiting the site needs to know HTML. All pages can be edited using an editor with a Word-like interface.
Upload
Staff and ministry leaders can upload files to the site to associate them with stories, pages, events, etc. Unfortunately, our hosting service limits uploads to under 10MB, so this doesn't always serve us very well.
URL Filter
All documents on the site will automatically convert unlinked URLs and email addresses into links. This is also meant to simplify how the site is used for the non-tech-savvy.

Okay, that's what we already had. Now, on to what has been added.

New Features

Here's the list of new features that I've added with this update of the software running the web site.

Categories
I've switched the system from the "taxonomy" plugin to the "categories" plugin. One of the main reasons Drupal is popular is its "taxonomy" system which provides a very rich language for describing metadata. Metadata, in this case, is basically just extra keywords associated with any particular page on the site.

For example, an announcement might have a taxonomy term associated with it named "Announcement" in the "Section" vocabulary. It might also have a term of "LIFE Groups" associated with it in the "Ministry" vocabulary. Each term is a particular keyword and each vocabulary is just a set of related keywords. The taxonomy system is very nice, but it has a shortcoming in that you can't describe the terms themselves very well. For example, what are "LIFE Groups"? What is a "Ministry"? These might be legitimate questions we'd like to answer and the taxonomy provides only a very rudimentary solution to this problem. I used some weak solutions in the previous version of Drupal to address this problem, but they didn't really work very well.

With the latest version of Drupal there's a new module called "Categories" that promotes each vocabulary to a first-class citizen called a container and each term to a first-class citizen called a category. Each of these has the same full power as a regular web page and can contain a full description of what it is. This is a cool enough feature that I will probably start using it on this web site soon. Kudos to Greenash for pioneering this.


Repeating Events

This resolves the old problem of not being able to have recurring events on the calendar. We can now create an event for Sunday morning worship and then make it a recurring event that repeats to a certain date. Handy.

Feedback

In addition to the Contact forms already in place, I've now added three new feedback forms that do not require creating an account to use. This will surely mean a little more spam for those of us on the receiving end, but that's not easy to avoid. There's now a form that is kind of an online "white card" like we pass out in church, a form for prayer requests, and a form for site problems.

Google Analytics

The church site has started collecting more statistics about visitors to help us improve the content of the site. I used Google Analytics because it's free, though we've had some worries about its accuracy at work. I'm hoping it will be better than nothing.

Just in case, I have also added a couple other stats modules to Drupal as well, to use Drupal's built-in statistic gathering capabilities to track things directly from the site too.


Google Sitemap

The site will start generating a Google sitemap, which should help improve how our search results show up in the most popular search engine.

Menu

This is a de facto improvement that comes just with upgrading to 4.7, but we can now create much nicer menus in the system than we could before.

Automatic Path names

No more "node/123" links! I've added this in to automatically create easier to read and link paths within the system. Anytime someone joins our site, their user profile will be at "user/user_name" rather than "user/123". Similarly, whenever an event is created, it will be at "calendar/2006/08/13/some_event_name". There are still a few irritations to work out of the names being generated, but I'm much happier with the way this is working already. By the way, the old links still work too, so they won't be broken either.
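
To give a feel for what the path generation amounts to, here is a rough Python sketch that builds aliases shaped like the two examples above. It is only an illustration of the idea and not Pathauto's actual code or configuration; `slugify`, `user_alias`, and `event_alias` are hypothetical names.

```python
import re
from datetime import date


def slugify(text: str, separator: str = "_") -> str:
    """Lowercase the text and collapse anything non-alphanumeric into one separator."""
    slug = re.sub(r"[^a-z0-9]+", separator, text.lower())
    return slug.strip(separator)


def user_alias(name: str) -> str:
    """Alias for a user profile, replacing a path like user/123."""
    return f"user/{slugify(name)}"


def event_alias(title: str, when: date) -> str:
    """Alias for an event, filed under its date."""
    return f"calendar/{when:%Y/%m/%d}/{slugify(title)}"


if __name__ == "__main__":
    print(user_alias("User Name"))
    print(event_alias("Some Event Name", date(2006, 8, 13)))
```

A real alias generator also has to deal with two nodes producing the same alias, which this sketch ignores.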

That's what's already new. Now on to the stuff that I think's going to be freakin' cool.

Upcoming Features

Here's my wishlist:

Google Map Improvements/Location/Geocoding
I've already started playing with this a bit. One thing a lot of folks have asked for is an online directory. I've been thinking about how to do this and I think I can provide an online directory for members that will not compromise anyone's privacy.

In addition to providing a directory, I will be able to provide a single map showing the locations of all our members. So, you'll be able to see how close you live to fellow New Hope members. This will probably include a more public way of publishing LIFE group and event locations as well.


Forums

I would like to add forums to the site. Drupal 4.7 now includes the ability to create a pretty decent forums system directly in the site. I think we could have various forums for things, particularly site suggestions and problems. This will be combined with my next item...

Organic Groups

This is a feature added by the CivicSpace crowd to allow members of a Drupal site to create groups on an ad hoc basis. With some controls on the system, we could allow staff members and ministry leaders to create mini-sites in the system with their own stories, events, audio messages, forums, etc. I see this as being a really exciting feature for the web site with the potential to revolutionize the communications abilities of the staff and other leaders.

Okay, that's not the full list of what I want to do, but that's plenty long enough. It will probably take some weeks before any of the latter bits are implemented and I'm still looking for and waiting to hear bug reports on this latest update.

Cheers.

I'm going to make this quick because Terri needs a counter-weight in bed (we have a waterbed) and I've already wasted much time responding to Slithy on Brent's blog on the history of sexual promiscuity in western society.

I've started the module update and have gotten most of the replacement modules installed that can be installed. However, I've had to get a replacement for the taxonomy_assoc module (as I thought I probably would), since that module has been discontinued in favor of the Category module.

The Category module is a drop-in replacement for Taxonomy that replaces vocabularies with "containers" and terms with "categories". The difference is that both of these are now nodes. Not only this, but any content type can be treated as a container or category, effectively giving you a node for every term. Unfortunately, it's not quite as flexible in some ways as it doesn't seem to have a "folksonomy". Though, I haven't yet looked at it far enough. If it does have tagging or it can be added, I will probably migrate this site to it as well.

This module also replaces the Book module, but I haven't really looked into that since we don't use the Book module for anything at New Hope yet, though I think we might after talking to Dan and Tony a couple weeks ago.

Other than that, I've also added, but haven't tested the eventrepeat module. I've updated the various other modules that are available that I mentioned previously. I also need to see if Tony wants me to do anything to add image galleries, forums, and some other pieces. Finally, I think things are now to the point that Eric can start working on the design whenever he feels like it.

Cheers.

As I said earlier this week, I want to write about how I implemented single sign-on for the new Boomer.com web site. I didn't come up with the design on my own, but the implementation is from scratch and unique--i.e., proprietary. In the future, I hope to replace the system with something else, but since certain aspects of our system's integration are currently in limbo and there are no prefab Magnolia-to-Drupal implementations, a quick and dirty implementation was certainly in order. I like the results, as much as I can for a quickie, which is why I share it here.

Design

As I just mentioned, I did not invent this design myself. I merely implemented the pattern described for Cosign, a project at the University of Michigan. If my deadline hadn't been looming so close, I might have considered implementing a Drupal plugin for Cosign. As it was, though, I created a fresh implementation that has no other relationship to Cosign than the fact that I have approximately the same structure of HTTP requests and responses that it has (and probably several other implementations).

For the following use-case, I will call the web site that holds authoritative information the Auth-Site and the other site requiring authentication information from the Auth-Site the Sub-Site.

Basically, the use-case works like this:

  1. The Client attempts to connect to a protected document on the Sub-Site.
  2. The Sub-Site server returns a Sub-Site session cookie to the Client and redirects the Client to the Login-Checker of the Auth-Site.
  3. The Login-Checker of the Auth-Site determines if the Client has logged in already to the Auth-Site. Assuming that the Client has not, the Login-Checker returns an Auth-Site session cookie to the Client and redirects the Client to the Login-Form.
  4. The Client then fills out the Login-Form and returns it to the Auth-Site, which validates the login using the Login-Form-Checker. Assuming success, the Login-Form-Checker notes that the original Auth-Site session cookie came from the Sub-Site. It then performs a redirect to the original protected document on the Sub-Site that includes in the URL a Login-Token.
  5. The Client then returns to request the original protected document from the Sub-Site, but with the Login-Token this time. The Sub-Site then performs an out-of-band check (a direct connection from Sub-Site to Auth-Site) to see if the Login-Token is valid. Assuming the Login-Token is valid, the Auth-Site returns the Client profile to the Sub-Site and the Sub-Site returns the protected document.

That probably sounds pretty complicated, but it basically amounts to the Sub-Site deferring login to the Auth-Site and asking the Auth-Site for a Login-Token to validate the user by. The major hiccup is the risk of session hijacking if an attacker happens to guess the token.
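The two redirects in steps 2 and 4 can be sketched as plain URL construction. This is only an illustration, not the code running on the site: the Login-Checker address and the `return_to` and `login_token` parameter names are assumptions I made up for the example.

```php
<?php
// Hypothetical endpoint; the real Login-Checker URL is not given in this post.
const LOGIN_CHECKER_URL = 'https://auth.example.com/check-login';

/**
 * Step 2: the Sub-Site sends the Client to the Login-Checker and passes
 * along the protected document's URL so the Client can be sent back later.
 * The 'return_to' parameter name is an assumption.
 */
function build_login_checker_redirect(string $protected_url): string {
    return LOGIN_CHECKER_URL . '?' . http_build_query(['return_to' => $protected_url]);
}

/**
 * Step 4: the Auth-Site sends the Client back to the original document
 * with a Login-Token appended to the query string.
 * The 'login_token' parameter name is an assumption.
 */
function build_return_redirect(string $return_to, string $token): string {
    $separator = (strpos($return_to, '?') === false) ? '?' : '&';
    return $return_to . $separator . http_build_query(['login_token' => $token]);
}
```

The Sub-Site never sees the password in this flow; all it ever receives is the token on the return trip, which it still has to verify out-of-band.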

If that's still confusing, the Cosign site has a lovely sequence diagram of the interaction.

Implementation

So the actual implementation required two additional Java servlets to run with the Magnolia server (in addition to the login servlet already in place). One I called the CheckForLoginServlet. This servlet checks to see if the visiting user has already logged in. If the user has, the user is redirected from whence she came with the login token to verify her authenticity. If the user is new, the user is then redirected to the login form, which is notified of which URL the user needs to be returned to afterward.

The second servlet is the ValidateAuthTokenServlet. This servlet handles the out-of-band, server-to-server communication, which verifies the user and sends an XMLized version of the user's profile from the Magnolia Auth-Site to the Drupal Sub-Site. The token itself is a UUID, which should be sufficiently difficult to guess, but might still be vulnerable to snooping attacks, since the communications aren't currently encrypted. Really, though, this is no worse than any other plain text login, which has the same vulnerabilities. We plan to move the authenticated parts of the site to SSL in the next few months to make the system stronger.

Another interesting piece to the puzzle is that I made the LoginToken objects semi-autonomous in that they erase the persistent data they store with the user objects when the session they belong to is invalidated. A user can log in multiple times from different locations and each location will have a unique session token. When the user's session times out, the token is also invalidated.
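The token lifecycle can be illustrated with a small in-memory registry. The real thing is Java and hooks into the servlet session, so treat this as a sketch of the idea only: the class name, method names, and the use of random hex in place of a UUID are all assumptions for the example.

```php
<?php
/**
 * In-memory sketch of tokens that live and die with a session.
 * All names are hypothetical; the original is a set of Java objects.
 */
class LoginTokenRegistry {
    /** @var array<string, string> token => user name */
    private $userByToken = [];
    /** @var array<string, string> session id => token */
    private $tokenBySession = [];

    /** Issue a fresh token for one session; each login location gets its own. */
    public function issue(string $sessionId, string $userName): string {
        $token = bin2hex(random_bytes(16)); // 128 random bits, standing in for a UUID
        $this->userByToken[$token] = $userName;
        $this->tokenBySession[$sessionId] = $token;
        return $token;
    }

    /** Return the user a token belongs to, or null if it is unknown or expired. */
    public function validate(string $token): ?string {
        return $this->userByToken[$token] ?? null;
    }

    /** Called when a session is invalidated or times out: the token goes with it. */
    public function invalidateSession(string $sessionId): void {
        if (isset($this->tokenBySession[$sessionId])) {
            unset($this->userByToken[$this->tokenBySession[$sessionId]]);
            unset($this->tokenBySession[$sessionId]);
        }
    }
}
```

Keying the cleanup on the session rather than on the user is what lets the same person stay logged in at one location after timing out at another.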

On the Drupal side of things, I had to add a couple little hacks to get things to work just so. First, since the Drupal site is entirely contained within the Extranet, all documents are protected documents. No one should see anything on the site without logging in first. Thus, I overrode hook_init to perform a redirect to the CheckForLoginServlet unless the request came with a login token, came from a browser with an already validated session, or was coming for the cron script---which I didn't want to authenticate to execute.
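The decision made in that hook boils down to three exceptions. Here is a stand-alone sketch of just that decision, without any Drupal calls; the function name and the `login_token` parameter name are assumptions, and the real hook also has to issue the redirect itself.

```php
<?php
/**
 * Decide whether a request must be bounced to the CheckForLoginServlet.
 * Sketch only: function and parameter names are hypothetical.
 *
 * @param string $script           Path of the PHP script being run.
 * @param array  $query            Parsed query string of the request.
 * @param bool   $sessionValidated Whether this browser session was already validated.
 */
function needs_login_redirect(string $script, array $query, bool $sessionValidated): bool {
    if (basename($script) === 'cron.php') {
        return false; // the cron script runs without authentication
    }
    if (!empty($query['login_token'])) {
        return false; // a token is present: verify it out-of-band instead
    }
    return !$sessionValidated; // everyone else needs a validated session
}
```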

If the request comes with a login token, the Drupal server performs the out-of-band check against ValidateAuthTokenServlet to see if that login token matches any current user session. If it doesn't, the user is kicked out to the CheckForLoginServlet to log in again. If it does match, the user is, effectively, logged in by a special subroutine which loads the user's profile information from the XML passed back by the ValidateAuthTokenServlet. This also updates all the roles the user is in to make sure the user's permissions are correct and then drops the user into the page.

All in all, the visitors should be relatively oblivious to the system. The only hiccup I still need to take care of is making sure that any time the query string of a request to Drupal or to Magnolia contains a login token, the client is immediately redirected again to remove the token from the location bar. Those tokens shouldn't accidentally end up in linked URLs if the user copies-and-pastes, or someone may find themselves logged in as the other user---i.e., bad stuff.
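To give a feel for the profile-loading step, here is a sketch that turns a profile document into a plain array. The XML layout (a `user` element with `name`, `mail`, and `roles/role` children) is invented for this example, since the actual format the servlet returns isn't shown here.

```php
<?php
/**
 * Parse a profile document into a plain array, or return null if the
 * response is malformed. Element names are hypothetical.
 */
function parse_profile_xml(string $xml): ?array {
    $previous = libxml_use_internal_errors(true); // keep parse warnings quiet
    $doc = simplexml_load_string($xml);
    libxml_use_internal_errors($previous);

    if ($doc === false || $doc->getName() !== 'user') {
        return null; // treat anything unexpected as "not logged in"
    }

    $roles = [];
    foreach ($doc->xpath('roles/role') ?: [] as $role) {
        $roles[] = (string) $role;
    }

    return [
        'name'  => (string) $doc->name,
        'mail'  => (string) $doc->mail,
        'roles' => $roles,
    ];
}
```

Returning null for anything unexpected keeps the failure mode simple: a bad response is handled the same way as a bad token.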
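The token-stripping redirect could look something like the following. This is a sketch of the URL handling only, with `login_token` as an assumed parameter name; the caller would still need to validate the token first and then send the redirect.

```php
<?php
/**
 * Return the URL with the token parameter removed, or null if the URL
 * carries no token (so no extra redirect is needed).
 * 'login_token' is a hypothetical parameter name.
 */
function url_without_login_token(string $url): ?string {
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['query'])) {
        return null;
    }

    parse_str($parts['query'], $query);
    if (!array_key_exists('login_token', $query)) {
        return null;
    }
    unset($query['login_token']);

    // Everything before the first "?" is scheme, host, and path.
    $base = substr($url, 0, strpos($url, '?'));
    $rest = http_build_query($query);
    $fragment = isset($parts['fragment']) ? '#' . $parts['fragment'] : '';

    return $base . ($rest === '' ? '' : '?' . $rest) . $fragment;
}
```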

In the long-term, however, I will probably implement something like Cosign or CAS or SXIP, depending on how things shake out over the next few months. If we choose the right standard, we should be able to provide better integration with clients or third party apps to improve our SaaS opportunities in the future.

I've started the software update for the New Hope web site. I've got the main Drupal system updated, but I haven't started on the modules. I'm doing the updates to a copy of the site so the current site can still run in the meantime (so don't bother looking for changes as they won't show up until we're done).

The trickiest bit is going to be that some of the modules we use on the current site are used because Drupal was lacking certain features that it no longer lacks. The other tricky bit will be the modules that are being discontinued in favor of better solutions--i.e., I'll have to do some conversions.

Here's the list of modules that need to be updated:

  • event. I also plan to install eventrepeat as part of this process so that we can start showing recurring events, like Sunday mornings and youth meetings.
  • flexinode. I'm going to replace this module with the content construction kit if I can. If it will be too painful, I may leave this for an update in a month or two.
  • form_mail. I don't know what to do with this. I don't know what is available for this functionality.
  • forms. Ditto.
  • notify. This will either be an upgrade or I'll use a better notification module (not that there are many good choices for this functionality, according to my recent research).
  • sermon_customizations. I hacked in some functionality to make Podcasts and some other bits to work correctly. Most of this can probably be ripped back out.
  • taxonomy_assoc. This module associates nodes with vocabularies and terms. This is used in some important places, so I'll have to see what to do about this one (or just update, if possible).
  • taxonomy_context. I can't honestly remember what this does at the moment, but I think this is a ditto of the last comment.
  • theme_editor. I'll have to find out from Eric if this is still needed. I may just turn it off.
  • tiny_mce. This is important. This makes it possible for our staff to post to the site without knowing any HTML or special document formats. I'll probably need to discuss with other stakeholders what features to include in the future too. There has been some disagreement about which functionality should be available to authors.
  • urlfilter. This is used to turn http://... strings into links when someone doesn't link a URL. This is probably a trivial update.

The last bit that I'll need to update is making sure the script I've written to allow Tony to upload recorded audio also gets a face lift. It'll surely need a few changes. I wrote this script to get around upload restrictions set on PHP code on our host and to help standardize the MP3 tags in the audio for the Podcasts.

That pretty well summarizes my plans for the next launch. Eric is going to be responsible for importing the design that Jay Risner has put together for New Hope. I hope we can get the first release knocked out by the first week of September. I'd like it to be before the students return, but that may not be realistic given the data conversion I'll have to perform and some of the remaining design questions Eric has. We'll get it done.

Cheers.

My web site was never so popular as it were when it were Drupal. Now I'm back to the glory days. Well, maybe. I lost a lot of prestige in the months that I quit maintaining my blog very well while I was attempting to build up Contentment. It also hurts that the paths I had have changed, so the old links are now broken. However, I now have better paths due to my friend, pathauto.

I've now finished importing all the content from my previous WordPress and Drupal sites. I did the blog posts by hand because many of them needed editing and I took it as an opportunity to remember the olden days. There's still more work to be done if I'm really going to get the archives correct since some links have changed and some of the text encoding is screwed up in a few places still, but it may just stand as is.

I've been thinking of bringing in my Blosxom posts as well and even thought of going back to the stuff I built in Everything and back in the static HTML days of yore, but if that happens it probably won't be this month. ;) I'm tired of imports, but I may end up doing it.

One of the coolest new custom features I built this time around was the Categories block to the side there. The most exciting aspect of that is how I order the categories so that both quantity and how recently the tag was used count toward the ranking. I built a corresponding Popular Tags page that describes how the process works, contains a complete list, and shows individual scores. It's not perfect, but it's relatively nice.

I still need to tweak the design more. Comments look terrible and I'm not very fond of the sidebars. They should match the curviness of the banner. I also need to tweak margins because there's not enough whitespace. However, I wanted to launch as soon as possible because it's a drag having a sucky blog...at least for me.

Cheers.

Popular Tags

This is the list of all tags I've used in my blog to this point (this includes or will include tags used in content aggregated from other sites as well).

Categories

<?php
// Node ID of the category container that holds the blog tags.
$container_id = 181;

// Rank categories by a recency-weighted score: each published node filed
// under a category adds 2.5/LOG(0.25*months + 1.5) - 1 to its score.
$sql =
  "SELECT d.nid, d.title, c.description, ".
  " MAX(n.created) AS updated, ".
  " COUNT(*) AS count, ".
  " SUM(2.5/LOG(0.25*((UNIX_TIMESTAMP()-n.created)/2592000)+1.5)-1) AS score ".
  "FROM {node} d ".
  " INNER JOIN {category} c ON c.cid = d.nid ".
  " INNER JOIN {category_node} cn ON cn.cid = c.cid ".
  " INNER JOIN {node} n ON n.nid = cn.nid ".
  "WHERE c.cnid = %d AND n.status = 1 ".
  "GROUP BY d.nid, d.title, c.description ".
  "ORDER BY score DESC";

// Use the same %d placeholder here; pager_query() passes the extra
// arguments to the count query as well.
$count_sql = "SELECT COUNT(*) FROM {category} c WHERE c.cnid = %d";

$result = pager_query($sql, 30, 0, $count_sql, $container_id);

// Start with an empty list so the table still renders when nothing matches.
$items = array();
while ($category = db_fetch_object($result)) {
  $items[] = array(
    l($category->title, 'node/'. $category->nid,
      array('title' => $category->description)),
    $category->count,
    round($category->score, 1),
    t('%time ago', array('%time' => format_interval(time() - $category->updated, 3))),
  );
}

print theme('table', array('tag', 'count', 'score', 'last update'), $items);
print theme('pager');
?>

How it works

The list is sorted in order of popularity using an inverse-logarithm scoring method I developed myself. Basically, the most recently used terms get a very high score that tapers off to a relatively low score once the post is a year old or older. Each use of a term is cumulative so the more commonly used terms will appear higher on the list even if they haven't been used as recently as others. This list will sort itself as time goes on according to whatever my latest posts are about.

Originally, when I used the built-in taxonomy module of Drupal, I used this SQL query to do the calculation:

SELECT d.tid, d.name, d.description, 
    MAX(n.created) AS updated, 
    COUNT(*) AS count, 
    SUM(2.5/LOG(0.25*((UNIX_TIMESTAMP() 
        - n.created)/2592000)+1.5)-1) AS score 
FROM {term_data} d 
    INNER JOIN {term_node} USING (tid) 
    INNER JOIN {node} n USING (nid) 
WHERE d.vid = ? 
    AND n.status = 1 
GROUP BY d.tid, d.name, d.description 
ORDER BY score DESC

However, since I recently switched to the Category module, I updated the SQL to reflect this with:

SELECT d.nid, d.title, c.description,
       MAX(n.created) AS updated,
       COUNT(*) AS count,
       SUM(2.5/LOG(0.25*((UNIX_TIMESTAMP()-n.created)/2592000)+1.5)-1) AS score
FROM {node} d
     INNER JOIN {category} c ON c.cid = d.nid
     INNER JOIN {category_node} cn ON cn.cid = c.cid
     INNER JOIN {node} n ON n.nid = cn.nid
WHERE c.cnid = ? AND n.status = 1
GROUP BY d.nid, d.title, c.description
ORDER BY score DESC

Here's the function describing the math involved. I developed this function for calculating the score by arbitrarily modifying the log curve to suit my needs.

$$ s = \sum_{i=1}^{n} \left( \frac{2.5}{\ln(0.25\, m_i + 1.5)} - 1 \right) $$

In the equation, s represents the final score, n is the number of uses of the term, and i is the iterator over those uses. The m_i represents the number of months since the creation of the post that the ith use of the term belongs to. Since Drupal stores time in seconds since the epoch (i.e., January 1, 1970), the age is calculated by subtracting the node's creation time from the current time and then dividing by 2,592,000, which is the number of seconds in a 30-day month.

If you view a curve plot for an individual iteration, you would note that when the delta (m_i) is 0, the score for that use will be around 5.2. When the delta is 12, the score is about 0.7.
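For anyone who wants to poke at the curve outside of SQL, the same arithmetic can be sketched in plain PHP. The function names are made up for this example; the constants come straight from the query, and PHP's log() is the natural logarithm, just like MySQL's single-argument LOG().

```php
<?php
const SECONDS_PER_MONTH = 2592000; // 30 days, the same constant the SQL uses

/** Score contributed by one use of a tag whose post is $months old. */
function tag_use_score(float $months): float {
    return 2.5 / log(0.25 * $months + 1.5) - 1;
}

/**
 * Total score for a tag, given the creation timestamps (seconds since the
 * epoch) of the published posts that use it.
 */
function tag_score(array $createdTimestamps, int $now): float {
    $score = 0.0;
    foreach ($createdTimestamps as $created) {
        $score += tag_use_score(($now - $created) / SECONDS_PER_MONTH);
    }
    return $score;
}
```

Calling tag_use_score() for a handful of ages is a quick way to see how fast a single use fades, and tag_score() shows how several older uses can still outrank one fresh one.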

Ooblech. I'm tired. I better get this posted and then get some sleep.

I'm excited to announce the two new site launches: this web site and a new Boomer.com. The latter has involved 6 months of my blood, sweat, and late hours. The former took me all of today to revamp and re-re-re-re-re-release. I'm also working on launching about three other web sites in the near future.

Home Site

As I stated previously, I'm going to start toning down the development I do at home. Well, at least that was the plan. It didn't really work as I thought. I've probably spent almost as much time in front of a screen at home as at work over the last few months. The reason is related to some work I've been doing with Perl and Java---which, incidentally, might be published on Perl.com in the next few months.

I've been toying with the idea of using a Java-based database engine called Jackrabbit with a Perl web front-end called Catalyst. While that was a fun project to work on, the results weren't quite as nice as I wanted. Rather than continue tweaking it into what I want (which would probably only take another month or two of effort), I'm going to take what I've learned and move on to using Drupal again instead.

Why back to Drupal? Well, I was using WordPress, which is a pretty decent blog app. However, it's just not as easy to hack as Drupal is and I know that if I have a blog, I'll still want to tweak it, but that's not so easy with WordPress. Furthermore, I'm webmaster for the extranet site at work, which is Drupal, and our church web site, which is Drupal. I figure I can maximize my ability to maintain all these Drupals. Drupal is an excellent general purpose CMS even if it's a little bit of overkill for a blog.

I'm also starting to contribute to a number of different web sites and I'd like to have one central location to collate all those contributions and I hope to make this the place. With some enhancements, I think either the built-in or one of the other aggregation modules available for Drupal will come in handy for this. I also have a kiddo on the way and plan to share photos here. That is to say, this isn't going to be just a blog for long.

Finally, as I mentioned, I did learn something from the Perl/Jackrabbit integration and I plan to use some of that to improve the user interface of Magnolia (see below) and/or Drupal. I'll probably write a blog post on what I learned in the next few days so that these thoughts aren't quite so nebulous.

Boomer.com

Boomer.com is another interesting beast. I'm still trying to decide how I feel about what we've done. It looks pretty good, though it will never look as good as either Eric or I want it to. Being so close to the project, I know all of its blemishes. Like home remodeling projects, when you fix something, there are inevitably a few mistakes that bug you, but when you look at a project by someone else, you never even notice them. It's the same way here: I will always know what isn't quite as nice as it could be even though almost no one else will notice.

As for the technology, we've actually built a hybrid of Magnolia and Drupal. That's mostly interesting because Magnolia is written in Java and requires a Java-based application server, while Drupal is PHP based and requires CGI or an Apache module to run. There are some tools to help PHP and Java communicate server-to-server, but we haven't used any of them, mostly because we couldn't get them to work correctly within our timeline.

However, I'm pretty happy with how well the integration works. The most interesting project related to this so far has been the implementation of single sign-on, which required some interesting redirect tricks. It works pretty well and I'm confident that our customers will be completely oblivious to the nuts and bolts of the process. I'll probably blog on the single sign-on process in another blog coming up, so look for it! ;)

Just because we launched, though, doesn't mean we're finished. There's a long list of new features to add. Since I'm not sure how much of that list is safe to tell, I'm going to just say that we hope to sell more stuff and hope to add more multimedia and hope to put up some additional web-based applications to facilitate our consulting; that ought to be vague enough. :P

Other Sites

I'm also working on a few other projects. A couple of them I don't really want to talk about except to say that I'm working on them. ;) However, our church web site is one of the biggies I'm currently working on. This is another Drupal site. First, the web site is currently running an old version of Drupal (either 4.5 or 4.6, I don't remember which and 4.7 is current). Second, the church has been redesigning the look of our logo, letterhead, signage, and web site. The web design has been chosen and Eric and I will be putting that together as soon as we can.

Anyway, I'm excited about that and the fact that the church will be starting a monthly newsletter to help drive new traffic to the site. I hope this will improve communication for our members, which has been a historical problem of our church (as it is with so many).

That's what I'm working on. As I said, I'm tired and now it's time to sign off.

Cheers.

About this Archive

This page is an archive of entries from August 2006 listed from newest to oldest.
