Phew! This has been a whirlwind week and I can now break the silence and talk about it openly. Today, I gave Boomer Consulting two weeks’ notice and signed my shiny new contract to start work with Grant Street Group. This has been simmering for a bit, so I’ll tell the story of how we got here.

A little more than two years ago I left Kansas State University, giving up systems administration to seek a job in software development. I had some pretty strict limits on what I was looking for location-wise (Manhattan, Kansas only) and a friend of mine let me know that his company, Boomer Consulting, was looking for a software developer. The job was to be primarily customizing a Java-based CMS called Magnolia. I interviewed, was offered a position, and took the job. From there we added PHP-based Drupal and some other Perl-based apps to the list of things to do, which was quite a lot for a lone dev. Things at Boomer have been rocky from time to time, but overall it’s been a good experience. In the end, I’m leaving because Boomer is refocusing how it improves its software and I no longer feel like I’m a good fit for this direction.

When I determined that things at Boomer were drawing to a close, I began searching for positions. This time around I was a little freer to decide on location, but I still had a very strong desire to stay in Kansas or at least a neighboring state. My two main goals in this job search were to find a job primarily working in Perl, since I’m having fun with it and it’s a focus for me right now, and to find a job that was either in Kansas or nearby or would allow me to work from home. I even considered a couple of positions out of the area to make sure I wasn’t missing out on anything important, but only if it looked like a really good opportunity.

I talked with a recruiter for a hospital informatics company in Nashville. I also talked with a certain movie database company in Seattle. However, neither opportunity was exciting enough to justify uprooting our family. As much as I love the Seattle area and as much as the housing market in Nashville is appealing, extended family and our friends in town are too important to leave for a so-so job opportunity.

Other than those and the company I’m moving toward, I also spoke with the folks at a certain wiki company in California and a company that develops software for building surveys for Fortune 500 companies. All of these were telecommute positions that would allow me to work almost entirely from home with the possibility of travel to home offices a few times every year.

This past week I actually interviewed in both Seattle and in Pittsburgh (that’s PA not KS, which is Pittsburg without the “H”). This was a lot of flying and odd sleep patterns, which unfortunately ended in a migraine on Saturday. In Pittsburgh, I had a chance to meet with the folks at Grant Street Group and I got a really good feeling from everyone I spoke with. The offer they gave me is exciting and, frankly, better than I really expected. Therefore, I will be learning the ropes for Grant Street Group in the next few weeks.

Anyway, I wanted to let all my friends and fans (all 4 of them, cough) know as soon as possible because I’m kind of tired of being discreet about the fact that I was searching for a job.

Cheers.

I read an article at Ars Technica quoting Bill Gates’s answer to a question during a speech he gave on pharmaceutical research. The answer shows that he is both badly informed and very elitist in his view of Free Software and Open Source.

His first mistake was making the statement that there’s a difference between “free software” and “open source.” For those of us who know the history, what is most commonly called “Open Source” today was originally called “Free Software,” and this “Free” was advertised using the motto: “Free as in speech. Not free as in beer.” Gates mistakenly uses the term “Free Software” to refer to software given away gratis, specifically mentioning that Microsoft gives away its software in developing countries. Of course, he wasn’t referring to “Free Software” as Richard Stallman coined the phrase, but since he mentions Stallman’s license, the GPL, shortly afterwards, it seems awkward that he would do this. This was a mistake and shows that he’s somewhat misinformed on the topic.

Second, he states that the GPL prevents anyone from improving the software. This could not be further from the truth. In fact, Microsoft’s own proprietary EULAs are the licenses that prevent improvement. The whole reason Stallman began the Free Software crusade is that he was tired of finding bugs in proprietary software that he wasn’t permitted to fix because the license forbade it. Any software licensed under the GPL can be fixed by anyone, anytime. Software under a Microsoft-style EULA may only be fixed by Microsoft or a licensed partner.

Now, what Bill Gates really objects to is the fact that software written under the GPL doesn’t grant him the kind of flexibility he wants to exploit the profits of his software. By holding a lock grip on who may and may not modify the software people depend upon, he’s guaranteeing that everyone has to pay him money to do anything: monopoly.

The GPL allows businesses to compete not only to produce the same piece of software, but to improve it and support it. You can’t get that from Microsoft’s proprietary software. If there’s a bug in the software itself, you have to pay Microsoft to fix the problem. If you don’t want to pay Microsoft, you have to find another solution or work around the issue, which is what is usually done instead.

Bill Gates might argue, then, that he still isn’t reaping the full benefits of the innovation his company produced. However, I would argue back that if his company still manages to provide the very best support for your software, then that’s who the folks that want the best support will talk to. Plus, his company has the opportunity to benefit from the work of others at very low (i.e., free as in beer) rates. Opening up forces his company to be good at what it does, not just be the sole gatekeeper that we all have to bow to in order to get our work done whether they’re any good at it or not. That’s reality, Mr. Gates.

Cheers.

Sorry for the Sir Mix-a-Lot reference there. Ahem. Anyway, I just had a coworker walk by my desk and say something like, “Don’t you just wish we could get rid of all these glitches and be done with them?” She left before I could answer and I think it was rhetorical (it’s hard for me to switch gears to interpret social signals while in the midst of concentrating on code). My answer was going to be, “Uh, no. I like finding bugs.” In this sense, I’d agree with what Nat Torkington twittered recently (edited for language, I try to run a PG-rated blog):

acid test of whether I’m still a hacker: do I think “oh goody!” or “oh [skittles]!” when I find a bug?

Not only this, but a few years back I worked for a company, NRG, that shared office space with another company owned by John Devore. John and I were chatting one afternoon and we were talking about debugging things. He said that when he finds a bug, he doesn’t just want to fix it. He wants to know why he didn’t find it before: why does the code work so well except here? That’s certainly my passion.

Debugging is an interesting opportunity to learn. Not only can you learn from the mistake, but if the mistake happens to be interesting in some way, you can use it as the seed of a future enhancement. Sometimes, you can even end up turning a bug into a feature if it happens to do something really cool with a slight change.

Anyway, I just wanted to post that I enjoy debugging and when I get to the point where I say, “Ah crap” when I need to fix something, it’s time to switch careers.

Cheers.

I’ve been taking DDJ for a couple years now. It’s cheap and occasionally has something interesting in it, but it’s been less interesting than I remember it being when I read it in college. I’ve been much more enamored with the Communications of the ACM. Today, I received my issue and there’s an interview with Paul Jansen of TIOBE Software. In the article, he’s quoted saying:

Another language that has had its day is Perl. It was once the standard language for every system administrator and build manager, but now everyone has been waiting on a new major release for more than seven years. That is considered far too long.

While I am biased, I have to admit that I disagree pretty strongly with Jansen’s assessment. First, let me go into the problems with how he came to this conclusion and then explain why I think I’m justified trusting that Perl is in it for the long haul despite my bias that would have me think so anyway.

I want to first evaluate the way Jansen has collected the data he’s used to make this statement. TIOBE puts together what they call the TIOBE Index. This is a rating of the popularity of various programming languages. The TIOBE web site claims, “The ratings are based on the number of skilled engineers world-wide, courses and third party vendors.” How do they measure this? By performing a search for:

+"<language> programming"

on 5 popular search engines, including: Google, Google Blogs, MSN, Yahoo!, and YouTube. That’s it.

What they are measuring is not actual popularity, but the amount of hype surrounding each one. Not only are they measuring hype, but only hype that discusses “programming”. What if everyone prefers to say “programming Perl is fun!” That wouldn’t get picked up by the search they use. What about “Perl scripting”? Nope. Missed. (Here I should point out that Andy Lester appears to have been on to something when he gave his lightning talk about Perl programs versus scripts at OSCON last year.) In essence, this is, if they’re disclosing the complete metric, incomplete. It’s a shortcut that might be 90% right or 50% right. This is just poor statistics.
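To make the blind spot concrete, here is a rough sketch of an exact-phrase match. The function name and the sample sentences are made up for illustration, and real search engines tokenize and rank far more cleverly than a substring test, so treat this as a toy model of the query rather than how any of those engines actually work.

```javascript
// Hypothetical stand-in for an exact-phrase query such as +"Perl programming".
// A real search engine does much more; this only models the phrase requirement.
function matchesPhraseQuery(text, language) {
  const phrase = (language + " programming").toLowerCase();
  return text.toLowerCase().includes(phrase);
}

// Made-up sample sentences that all talk about writing Perl.
const samples = [
  "An introduction to Perl programming",
  "Programming Perl is fun!",
  "Perl scripting for system administrators",
];

const hits = samples.filter((s) => matchesPhraseQuery(s, "Perl"));
console.log(hits.length + " of " + samples.length + " samples counted");
// prints: 1 of 3 samples counted
```

All three sentences are about Perl, but only the one that happens to use the words in that exact order gets counted.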

The second aspect of Jansen’s comments I take issue with is the statement that there has not been a major release in seven years. That’s not strictly true. Perl 5.10 has just been released and it includes new features like the new smart match operator. Beyond that, there has been some very active development on a closely related project, Parrot, and language development toward a huge milestone, Perl 6. Furthermore, where Perl truly shines is in all the development on CPAN. CPAN is getting large and complex enough now that we’re having to rethink how it works just so we can find anything on it. This is a good problem to have.

This comment by Jansen does, however, serve to indicate a certain perception gap caused by the long wait for Perl 6. It’s even been suggested that the name of Perl 6 is harmful to Perl 5. This has been discussed by others for some time.

In my opinion, Jansen is on shaky ground with his claims, probably because he’s not well informed by anything but his own metrics. I should think that he’d at least research the trends and issues facing the top 10 languages listed by his survey so as to provide some better justification for its accuracy.

As for the reasons I still have warm and fuzzy feelings toward Perl’s future, I can list them off rather easily.

  1. I am participating in a number of growing projects that depend on Perl’s future. Jifty and rethinking-cpan are just a couple I’m involved in. I can point you to several other vital projects that I use or am familiar with.
  2. I know of several companies actively pursuing Perl to develop core projects and continuing to train developers. This includes imdb.com, Socialtext, Best Practical, Six Apart, and several others.
  3. Recently, Google launched Google App Engine. This tool provides services to Python developers as part of the initial release. As of this writing, the most-voted-for issues are, first, to add support for Ruby and, second, to add support for Perl.
  4. There’s an average of 50 new and updated modules being posted to CPAN every day. That’s not a small number.

I can probably come up with more, but now it’s getting late, so I’d better end this thing. If Perl is going to die, it’s got some years left before it happens. I think there will be enough activity to keep it going and increasing during those years rather than dying.

Cheers.

Google StreetView is cool, but mildly disturbing. Since Manhattan hasn't been cataloged yet and I can't show you my actual house, here's a view of the house I grew up in in Lawrence. If you follow around Lance Court you can see that one of the owners since we lived there added a big fence and a pool around back.


Anyway, I thought it was interesting enough to share with y'all.

Cheers.

Here's a nifty little slideshow of my Mac screen from a few minutes ago while the compositing engine went haywire. Basically, I went to lunch, came back and when I moved the mouse to get rid of the screensaver I was getting some really nice visual effects. Maybe I should have taken a movie because there was some nice animation going on too.

Fortunately, a reboot corrected the problem.

Cheers.

I have a certain frustration with due dates. I have determined that this frustration stems from a misunderstanding of the nature of software projects and their timelines. I am, therefore, banishing the terms "due date" and "deadline" from the realm of software project timelines (as a lot of software engineering literature has wisely already done) and insisting that the word "milestone" be used in their place. Why?

When you decide to create a piece of software to perform some task, you are starting down a path that has no end. Once a software project is created, that project immediately takes on an eternal soul. If you want to add feature X to the software by deadline Y, you will soon find that even if you succeed in getting this feature in by the deadline you still have more work to do on that feature after the deadline. Why?

Features are nebulous little critters. Even when you think you have them well defined, you find out they are more complex once actually implemented. For example, I'm in the process of converting our server infrastructure to use a tool that will help us scale upwards as we add more clients. We aren't done once we move our servers to this new infrastructure because we'll still need to retool it and play with the optimization settings to make sure it's performing well. This will continue for as long as we have the servers.

There's no finality on a software project ever. It will always require a little more maintenance until the business decides we're not doing that anymore. Even then, depending on the project, we may need to do some level of support down the road. I would be surprised if Microsoft doesn't still get support calls for DOS or Windows 95, which have been dead products for years.

My conclusion: A software project is never finished. A deadline is not a deadline, it's a milestone. "Deadline" and "due date" imply finality. A "milestone" is just another landmark along the path, a much better description. There's always one more bug to knock out. There's always one more improvement to be made. Anyone who thinks otherwise is simply naive.

Andy Lester has issued an interesting challenge regarding CPAN. First, I have to say that CPAN is probably at least one-third of the reason I love Perl. Having a centralized, searchable, browsable, and documentation-oriented repository of modules makes development so much easier than any other alternative I'm familiar with. In addition to all that, it has reviews, forums, a bug tracker, test result summaries, etc. That said, it is far from perfect.

There are two main complaints that I get. The first is from friends who are systems administrators. They often despise CPAN because they don't care if it helps developers easily locate and include modules. All they see is that to install some application, they have to install 15 modules, which installs another 22 dependencies for those, and then another 18 for those, and then another 12 for those, and then another 5 for those and then finally 2 more. It can take hours to build and install all of them. This is just a hassle for them.

From a developer perspective, I also have the problem that Andy highlights: which effing module solves my problem? For example, if I want to build Subversion/CVS/Git style commands with sub-commands I could use App::CLI by clkao or App::Cmd by Ricardo Signes. Which one is better for my problem space? Both are by smart developers and both are well put together, but they are completely different implementations for a similar problem. This is something that CPAN won't help you with very much. Andy's example of XML is even more complicated.

There are other problems as well. Here are some ideas I've had that could be implemented in individual chunks that could help.
  • Wikifying the documentation. This is an idea I've thought some about and even worked on designing an implementation for a bit over my Christmas break. This is basically about allowing visitors to contribute POD to a module on CPAN using a wiki-ish interface. This could improve the documentation. The challenge is that CPAN documentation is generated out of the POD stored in the modules themselves. There needs to be some way to make sure that the POD updates made on the wiki site can get back into the modules easily. I developed what I think would be a working solution to that problem, but it requires the author to pay attention to the wiki and make sure to download the patches. Build tools could help automate this process, though.
  • Dependency mapping to cluster modules. Every module contributes a metadata file that should list the dependencies required to run that module. I can think of several useful things we can do here on a large scale. We can use that to suggest which modules are more popular or better based upon how many other modules depend upon them. We can use this to indicate (or estimate) how many overall dependencies a particular module has, i.e., you will need to install 184 module dependencies to use this module. We can also tie this dependency information into other features like reviews and ratings to generate other helpful statistics and heuristics.
  • Incorporating use statistics. Sites like ohloh or iusethis have a good idea in how they rate content. It's not a perfect system, but letting a person just click a counter that says, "this is good" or "I like this" provides a very simple mechanism to rate a module. In addition, it gives you statistics for grouping modules in yet another way.
None of this is a new idea. None of these are really very innovative, but I think they would be steps in the right direction toward making CPAN work better and better. Incorporating additional metrics into how the search works, adding tagging, or who knows what else along the way would go a long way.
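As a rough sketch of the dependency-mapping idea, here's how both numbers could be computed once the metadata has been collected into one big table. The module names and the dependency table below are entirely made up for illustration; real data would have to be pulled from each distribution's metadata file.

```javascript
// Hypothetical dependency table: module name -> modules it requires directly.
const requires = {
  "Example::App": ["Example::Cmd", "Example::JSON"],
  "Example::Cmd": ["Example::Getopt", "Example::Plugin"],
  "Example::Getopt": ["Example::Validate"],
  "Example::Report": ["Example::JSON", "Example::Validate"],
};

// How many modules directly depend on each module (a crude popularity signal).
function dependentCounts(requires) {
  const counts = {};
  for (const deps of Object.values(requires)) {
    for (const dep of deps) counts[dep] = (counts[dep] || 0) + 1;
  }
  return counts;
}

// Every module that must be installed to use `name`, direct or indirect.
// The `seen` set also keeps the walk from looping on circular dependencies.
function transitiveDependencies(requires, name, seen = new Set()) {
  for (const dep of requires[name] || []) {
    if (!seen.has(dep)) {
      seen.add(dep);
      transitiveDependencies(requires, dep, seen);
    }
  }
  return seen;
}

console.log(dependentCounts(requires)["Example::JSON"]);           // 2
console.log(transitiveDependencies(requires, "Example::App").size); // 5
```

The first number feeds the "which module is more popular" question and the second is the "you will need to install N dependencies" warning.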

I wonder if we could expose the CPAN services currently available in some sort of unified web service that would allow developers to try and enhance and experiment too. This could make it so that CPAN grows in new ways without a lot of overhead to get TPF's attention or what-not.

I don't know. More ideas...

Cheers.

Okay, so now that I have my fancy new blog, it's time to start blogging again. For the past month or so I've been a little short on spare time, but what time I've had to spare has been spent playing with two new toys: Twitter and OpenResty.

Lance got me hooked on Twitter and it's pretty cool. For those that don't know, it's kind of like a Facebook status if you know what that is, but it's more than that. It's also like having a public instant messenger that you send to everyone and everyone you're following sends to you. It's also like having really short blog posts. Anyway, it's kind of cool for those quick things you find and think, "Dang, that's neat, I'd like to share that with my friends." Things I used to post to the K-Slug channel on IRC now go to Twitter. Now, I just need to get more of my friends on there to see them. :)

The other thingy I've been messing around with is OpenResty, which is somewhat of an intriguing solution for serving a RESTful web service. Basically, it's just a web service API that allows you to create database tables, rows in those tables, users, control access to the tables, etc. It provides no UI at all, just the ability to GET lists of or individual resources, POST new resources, PUT updates to resources, and DELETE resources. You can then build a front-end to the data using whatever application server you like, client-side JavaScript, Adobe AIR, etc. (If you want the buzzword, this is very "Web 3G"---I'm now looking for a waste basket to puke in.)
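For anyone who hasn't run into the REST style before, here's a tiny in-memory sketch of how the four verbs line up with create, read, update, and delete. Nothing in it comes from OpenResty itself: `createStore`, `handle`, and the status codes are just assumptions chosen to show the shape of the idea.

```javascript
// Hypothetical in-memory resource store; none of these names are OpenResty's.
function createStore() {
  const rows = new Map();
  let nextId = 1;

  return function handle(method, id, body) {
    switch (method) {
      case "GET": // list everything, or fetch one resource by id
        if (id === undefined) return { status: 200, body: [...rows.values()] };
        return rows.has(id) ? { status: 200, body: rows.get(id) } : { status: 404 };
      case "POST": { // create a new resource and assign it an id
        const row = { ...body, id: nextId++ };
        rows.set(row.id, row);
        return { status: 201, body: row };
      }
      case "PUT": // update an existing resource
        if (!rows.has(id)) return { status: 404 };
        rows.set(id, { ...rows.get(id), ...body, id });
        return { status: 200, body: rows.get(id) };
      case "DELETE": // remove a resource
        return rows.delete(id) ? { status: 204 } : { status: 404 };
      default:
        return { status: 405 };
    }
  };
}

const handle = createStore();
const created = handle("POST", undefined, { name: "apple" });
handle("PUT", created.body.id, { name: "pear" });
console.log(handle("GET").body.length); // 1
```

A real service would sit behind HTTP and a database, of course, but the front-end only ever sees those four operations.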

Anyway, I'm kind of in search of a problem to solve using OpenResty to find out what it would be really helpful for and I'm just really intrigued by what it offers. But that's what I've been dinking around with lately.

Cheers.

I'm currently experimenting with some stuff related to using JavaScript on the server and the client. The results I'm getting are somewhat mixed. The idea I had was to see about letting someone customize some aspects of an application I'm developing that is intended to help you live more healthily. Basically, based upon the statistics of a food or other parts of the application, you could create custom metrics to measure your success or failure according to your particular preferences. (Some people monitor carbs, some monitor Weight Watchers points, others are interested in fat and protein, etc.)

I've successfully added tools that will allow you to add your custom metrics to the system using your own JavaScript snippet. This snippet is passed an object representing the food item (and eventually to include other information). I can then calculate your metric based upon the current food you're viewing and your code.

This calculation is performed on the server side using Mozilla SpiderMonkey and the JavaScript module on CPAN. On the server, this ends up looking something like this:

use JavaScript;
my $runtime = JavaScript::Runtime->new;
my $context = $runtime->create_context;
$context->eval_file('benchmark.js'); # base API library

# do some other setup of the objects I need in __tmp.food

$context->eval(
    q/function evalIndicator(food) {/
    . $user_calculation # this variable contains code from the end-user
    . qq/\n}/           # newline so a trailing // comment can't swallow the brace
);
my $result = $context->eval(q/evalIndicator(__tmp.food)/);

I've added a little more in there for error handling and such, but that's the core of how I handle the work on the server. The really nice thing here is that the user's JavaScript can't really do much bad other than a denial of service by looping or recursing internally. The server-side JavaScript doesn't include any BOM objects that could be used to do terrible things like steal user information, etc. Furthermore, if I run this code as a separate process that communicates back to the server, I can easily wipe it out if it takes longer than a few seconds to work.
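For comparison, here's a rough sketch of the same idea in plain JavaScript, using Node's built-in `vm` module as a stand-in for SpiderMonkey. This is not what the application runs. The bare context plays the part of the BOM-free environment, and the `timeout` option plays the part of killing a separate process. The wrapper name, the result shape, and the sample food values are all assumptions for illustration, and Node's own documentation warns that `vm` is not a real security boundary.

```javascript
const vm = require("node:vm");

// Wrap the user's snippet in a function body, run it in a bare context that
// only contains `food`, and give up if it runs longer than timeoutMs.
function evalIndicator(userCalculation, food, timeoutMs = 1000) {
  const source =
    "(function evalIndicator(food) {\n" + userCalculation + "\n})(food)";
  try {
    const value = vm.runInNewContext(source, { food: food }, { timeout: timeoutMs });
    return { ok: true, value: value };
  } catch (err) {
    // Covers syntax errors, runtime errors, and the timeout itself.
    return { ok: false, error: String(err && err.message) };
  }
}

// A made-up metric: calories from fat and carbs.
const food = { fat: 10, carbs: 20 };
console.log(evalIndicator("return food.fat * 9 + food.carbs * 4;", food).value); // 170

// A runaway snippet gets cut off instead of hanging the server.
console.log(evalIndicator("while (true) {}", food, 50).ok); // false
```

Running the snippet in a genuinely separate process is still the safer design, since the operating system can then enforce the kill no matter what the snippet does.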

Then, using a combination of Dean Edwards' sandbox and some custom hacking of my own, I can provide on-the-fly updates from the client using most of the same JavaScript code. This allows the JavaScript to run in a relatively safe container within a hidden IFRAME.

On the client, I've implemented the sandbox in this form. I've left out some of the error handling and other details for simplicity:

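// Dean Edwards' sandbox in OO form:
function Sandbox(preloads) {
  this.iframe = document.createElement("iframe");
  this.iframe.style.display = "none";
  document.body.appendChild(this.iframe);

  // each sandbox registers its eval hook under its own serial number
  this.serialNumber = Sandbox.serialNumber++;
  this.frame = frames[frames.length - 1];

  // load the preloaded libraries inside the sandbox first
  var preloadTags = "";
  for (var i = 0; i < preloads.length; i++) {
    preloadTags += "<script src=\"" + preloads[i] + "\"><\/script>";
  }

  this.frame.document.write(
    preloadTags +
    "<script>" +
    "parent.Sandbox.functions[" + this.serialNumber + "]=" +
    "{eval:function(s){return eval(s)}};" +
    "top=opener=parent=null" + // cut off access back to the parent
    "<\/script>"
  );
  this.frame.document.close(); // stop the throbbing
}

Sandbox.prototype.eval = function(code) {
  var sandboxObject = Sandbox.functions[this.serialNumber];
  return sandboxObject.eval(code);
};

Sandbox.prototype.close = function() {
  document.body.removeChild(this.iframe);
  Sandbox.functions[this.serialNumber] = null;

  this.frame = null;
  this.iframe = null;
  this.serialNumber = null;
};

Sandbox.serialNumber = 0;
Sandbox.functions = new Array();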

Now that I have the sandbox to run things in, I need to set up an environment similar to the one I use on the server. My initial implementation embeds the user's code as a special attribute in the middle of the page, but it could be a special block in the header or a one-time Ajax request to fetch the code string into a variable just after load, etc.

// This code runs onload, to make sure benchmark.js is loaded completely
var sandbox = new Sandbox(['benchmark.js']); // base API library

// When the user changes a value in the page, the calculation can be made again on the fly...
var result = sandbox.eval(
  ""

  // do some other setup of the objects I need in __tmp.food

  + "function evalIndicator(food) {"
  + user_calculation // this variable contains code from the end-user
  + "\n}\n"          // newlines so a trailing // comment can't swallow the brace
  + "evalIndicator(__tmp.food)"
);

Again, I've omitted some of the other details to keep this article at a manageable length. That's the gist of how I run this on the client.

Now, after I've done the work, I probably am going to drop the client-side effort and provide updates by contacting the server using Ajax-ish requests. Why? Well, I can control the server-side environment much more closely by running the code in a separate process that I can kill off if it runs away. If someone's code accidentally (or intentionally) does something malicious on the client, I get the blame. The browser is a much less controllable environment. The code could fire off requests passing information regarding the end user off to an unknown third party or compromise something even worse if I'm not very careful. I'd have to do a very thorough job of stripping the sandbox area completely. Even then, I don't know if I can actually protect the user from a malicious attacker since there are so many unknowns regarding browser implementations. I'd need to do a lot more studying before I'd be willing to run untrusted code this way for sure.

On the other hand, this trick might work swimmingly in an environment where I trusted all the code being executed in both boxes. I'd certainly be willing to do this on a project at some point to provide calculations on both server and client using the same embedded code, which is why I'm writing the article for others. Anyway, this has been an interesting adventure in JavaScript.

Cheers.