October 2005 Archives

Okay, so I'm really close to making a real release of Contentment following its latest gutting and rebuilding. This time it looks nice. I owe a lot of the latest bit of nice-ness to Richard Hundt and his nifty Oryx persistence library. I discovered his bit of ingenuity via the CPAN RSS feed.

Anyway, I needed something that Oryx doesn't have (or doesn't have yet): cloning. For my readers who aren't programmers (and haven't already dropped out), cloning is just the word programmers use when we want two copies of the same data. Okay, so after a very small amount of effort I have added cloning.

The reason I need cloning is that I am adding the ability to seamlessly create object revisions. That is, when you are building a publishing system, it's nice to be able to keep old revisions of data around. That way, if someone makes a change to a page that obliterates some important information, you can go back and fetch that information from an old revision of the document. I want this to happen automatically because I don't want folks who are building on Contentment to have to think too hard about how it works. I just want it to work. I've discovered too often that if something isn't easy to do, it won't be done.

Thus, I want my code to look something like this:

my $obj = Foo->create({ title => 'Foo Version 1.0' });
$obj->title('Foo Version 2.0');
$obj->update;

After that code snippet finishes, there should be two Foo objects stored in the database. One will have the title "Foo Version 1.0" and the other will have the title "Foo Version 2.0". At the same time, I need to make sure that continued use of "$obj" uses the newer object, which means the entire internal state of that object suddenly has to refer to the new one. Since the objects use Oryx, this would mean a lot of dinking around with internal state that could change with every new release of Oryx (i.e., a very bad idea).

I could create a special new method for creating new revisions, but I don't really want a different syntax when the one I just showed is already crystal clear. The old revision isn't obviously in the database, because a search will still return only the latest revision unless you specifically ask for older ones. I'm not changing the interface so much as the implementation. Why should I muck up the interface?

Enter the power of Perl aliasing. In Perl, alias variables are special variables that don't really have any storage for themselves and exist only as second (or third or fourth...) names for existing variables. Aliases are a lot like references in C++. Also in Perl, variables are passed to subroutines in a special array named "@_". Each element of this array is an alias to the passed information.
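Here is a minimal sketch of what that aliasing means in practice. The subroutine names bump() and copy_then_bump() are made up for illustration. Assigning through an element of "@_" changes the caller's variable, while copying the arguments out first does not:

```perl
use strict;
use warnings;

# bump() is a hypothetical name; it adds one to its first argument in place.
sub bump {
    $_[0]++;          # $_[0] is an alias, so this changes the caller's variable
}

# copy_then_bump() copies the argument first, so the caller is unaffected.
sub copy_then_bump {
    my ($n) = @_;     # list assignment copies the value out of @_
    $n++;
    return $n;
}

my $count = 1;
bump($count);
print "after bump: $count\n";                               # after bump: 2

my $bumped = copy_then_bump($count);
print "after copy_then_bump: $count (returned $bumped)\n";  # after copy_then_bump: 2 (returned 3)
```

This is why the usual "my ($self, @args) = @_;" idiom is safe: it works on copies. Only code that touches "$_[0]" and friends directly reaches back into the caller.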

For example, if I run the following snippet of Perl:

my $foo = 1;
my $bar = 2;
foo($foo, $bar, 3);

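The special "@_" variable in the foo() subroutine will be set so that $_[0] is an alias for $foo, $_[1] is an alias for $bar, and $_[2] holds the value 3. The object a method is called on arrives in $_[0] the same way, so a method can assign to $_[0] and replace the caller's object: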

package Foo;
# Return a simple empty Foo object
sub new { bless { }, 'Foo'; }
# REPLACE myself with a new Foo! ($_[0] is an alias for the caller's variable)
sub replace_me { $_[0] = bless { replaced => 1 }, 'Foo'; }
package main;
my $foo = Foo->new;
print "$foo\n"; # empty Foo object
$foo->replace_me;
print "$foo\n"; # a different Foo object!!!

This is wicked and probably not a good idea in most cases, but it demonstrates how my revisioning code will work. The output of my code should look something like:

Foo=HASH(0x1801434)
Foo=HASH(0x18014a0)

Notice that the addresses listed after the type are different! By replacing $_[0] in the replace_me() method, I've overwritten $foo with a different value.

Therefore, all my code needs to do to make everything work after the call to the update() method is to replace $_[0] with the newly created record object. I don't have to muck with any internal state and the API remains unchanged.
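To make that concrete, here is a rough sketch of what such an update() might look like. It does not use Oryx at all: the Page class, the in-memory @STORE array standing in for the database, and the revision field are all assumptions made up for this illustration, not Contentment's or Oryx's real API.

```perl
use strict;
use warnings;

package Page;   # hypothetical class standing in for an Oryx-backed one

my @STORE;      # stand-in for the database: every revision ever saved

# Create the first revision and remember a snapshot of it in the store.
sub create {
    my ($class, $fields) = @_;
    my $self = bless { %$fields, revision => 1 }, $class;
    push @STORE, { %$self };
    return $self;
}

# Simple accessor: get or set the title.
sub title {
    my $self = shift;
    $self->{title} = shift if @_;
    return $self->{title};
}

# Save the pending changes as a NEW revision, then swap the caller's
# variable over to it by assigning through the $_[0] alias.
sub update {
    my $old = $_[0];
    my $new = bless { %$old, revision => $old->{revision} + 1 }, ref($old);
    push @STORE, { %$new };
    $_[0] = $new;                     # the caller's variable now holds $new
    return $new;
}

# Return every stored revision, oldest first.
sub revisions { return @STORE }

package main;

my $obj = Page->create({ title => 'Foo Version 1.0' });
$obj->title('Foo Version 2.0');
$obj->update;

print scalar(Page->revisions), " revisions stored\n";              # 2 revisions stored
print "current: ", $obj->title, " (revision $obj->{revision})\n";  # current: Foo Version 2.0 (revision 2)
```

The calling code is exactly the three lines from the start of this post, which is the whole point: the revisioning hides behind update().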

However, this does result in some potentially major-bad pitfalls that I need to be careful about—in fact, they might be bad enough that I'll have to find a different way to do this. However, I thought the concept was very interesting.

Pitfall #1: I'd better let the user know in the documentation that this happens or they might have problems that I can't anticipate.

Pitfall #2: If I switch the references without updating the original and some data structure uses a reference to the old object, the old object won't be properly updated. I don't think this is a major concern because there's only one place these objects should be directly referenced and I can control and update that one in the process of the update() method. Any other code that needs to create a reference to the object shouldn't be referring to these objects directly, if they are to properly obey the documentation given.
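A small sketch of that second pitfall, reusing the toy Foo class from above (the $other variable is just an illustrative second reference): only the variable used in the method call gets swapped, and any other reference keeps pointing at the old object.

```perl
use strict;
use warnings;

package Foo;
sub new        { my $class = shift; return bless {}, $class }
sub replace_me { $_[0] = bless { replaced => 1 }, 'Foo' }

package main;

my $foo   = Foo->new;
my $other = $foo;          # a second reference to the same object

$foo->replace_me;          # only the variable used in the call is swapped

print "foo was replaced\n"                     if $foo->{replaced};
print "other still points at the old object\n" unless $other->{replaced};
print "the two variables now differ\n"         if $foo != $other;
```

All three lines print, which is exactly the situation the documentation would have to warn about.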

Pitfall #3: There are surely other side-effects that I haven't thought of yet.

I think the Perl aliasing functionality is one of those nifty but dangerous features. We'll see if I actually end up using it the way I've presented here...

In my morning scan of the news I bumped into this article over at NewsBusters.org. The article is a commentary on (or really, mostly a quotation with a few parentheticals from) a New York Times editorial on the nationwide decline of newspapers.

I was thinking about this and a quote from I, Robot came to mind, "I don't know, maybe you would have simply banned the Internet to keep the libraries open." The quote implies that the Internet killed the world's libraries. However, I think the opposite might be true. I think the Internet could actually work toward revitalizing libraries over time. With reading becoming increasingly important, I think we may find motion media declining and a renewed interest in novels. Electronic books have really not caught on very well. Though, I think there will be changes to come in the print industry. I also think that in another dozen years or so the Internet will be a major source for independent motion media that will start to stomp on traditional television too.

On the other hand, with the blogosphere and other Internet news sources gaining credibility, and as the opinions of traditional news sources are increasingly scorned as biased and out of touch, I can see newspapers suffering greatly. I even just proved it: I get all my news from the web. Why bother flipping pages of a paper allowing me one point of view, when I can read hundreds of different sources each with their own unique bias by flipping open my laptop?

Anyway, I make no predictions, but I think the problems are interesting. The social issues faced by my parents' generation were interesting, and I think this generation is going to face some interesting upheavals of its own. This is just one of them.

The Official Google Blog is on my regular reading list as I tend to find their solutions to problems interesting. Today's post announced the latest feature in Google Labs: Google Reader.

This came just at the moment I was in the midst of switching back over to Firefox because the 1.5 Beta is nifty. Thus, I was losing one of the browser features I like in Safari, which is the RSS reader. I'm afraid that the Sage RSS for Firefox doesn't really suit me. Thus, I was ripe for something else to try.

The interface itself is a bit too much. It looks nice, but it stutters a bit, and not everything is linked up quite right. The OPML import was nice, because I was able to export my Sage RSS in OPML and then import it into Google Reader without any difficulty. However, I added a couple more feeds and the way subscriptions are added is a little disorienting. It took me a couple tries to figure it out. Also, when I added a label to a feed I added in one place, that label wasn't applied. I had to edit the feed using the link up top to get the labels working. If you want something with the same polish as GMail, don't use Google Reader. It's not there yet. We'll see how I like it in a few days.

One of these days, I'll probably take the time to set up Jesse's BlogBucket, since it looks pretty sweet.

I just read an interesting post on Slashdot about a quote by Linux creator Linus Torvalds that I think is just excellent:

"A 'spec' is close to useless. I have _never_ seen a spec that was both big enough to be useful _and_ accurate. And I have seen _lots_ of total crap work that was based on specs. It's _the_ single worst way to write software, because it by definition means that the software was written to match theory, not reality."

I totally agree. If you want to understand something you simply must adhere to the Masaaki Mantra: "You must read the code." Documentation is helpful in getting you part way there, but there's no way in hell that the documentation 100% matches the code. This is the essential problem with Microsoft and other closed-source software. The documentation is usually close, but rarely on target. I can't see the code to know what's really happening.

It is my opinion that the best "specification" for code is more code. This is the basic philosophy of Damian Conway's Perl Best Practices (see Ten Essential Development Practices for a summary). This is a software engineering strategy with which I can agree. (I despise all that UML crap.)

Basically, if you want your code to work well, design your code interface first. Build your tests second. Finally, fill in the details of your implementation. I'm not in a good habit of building things this way, to be honest, but these are habits every developer should work on. As part of interface design, test writing, and implementation, you should be documenting and updating your documentation. Then, your documentation can help a programmer get started, but when they want answers your docs don't contain, they can read the code. By examining the interface directly and looking at your test cases, they can understand what you intend. Finally, by reading the implementation, they can find out how it really works.
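As a tiny sketch of that order of work, here is a made-up Counter class (not from any real project) with its tests written against the interface using Test::More. In a single file Perl needs the package defined before the tests run, so the implementation sits at the top even though it would be written last.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Step 1 (the interface) is just the three method names: new, add, total.
# Step 3 (the implementation) is filled in last, once the tests exist.
package Counter;   # hypothetical class, for illustration only
sub new   { my $class = shift; return bless { n => 0 }, $class }
sub add   { my ($self, $by) = @_; $self->{n} += $by; return $self }
sub total { my $self = shift; return $self->{n} }

package main;

# Step 2: the tests, which double as a record of what the interface promises.
my $c = Counter->new;
is($c->total, 0, 'a new counter starts at zero');
is($c->add(2)->total, 2, 'add() increases the total');
isa_ok($c->add(1), 'Counter', 'value returned by add()');
```

Someone reading only the three test lines already knows what the class is supposed to do, without a separate spec.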

Mr. Torvalds, you rock. Software specs suck!

About this Archive

This page is an archive of entries from October 2005 listed from newest to oldest.
