I have a confession to make - after close to a decade covering XML, I have something of a new love ... and the name of that love is Drupal. Drupal's become one of those interesting hobbies that is rapidly becoming both a profession and a passion. It wasn't supposed to happen this way ... by rights, I should be deeply in the world of Ruby on Rails right now, or learning the latest deep programming secrets of Python, but somewhere along the line I realized one of those ugly little fundamental truths that good programmers should never actually learn - that at some point, reinventing the wheel yet again begins to lose its luster, and, indeed, becomes rather ... well ... dull.
The fundamental issue I've faced with Drupal is one that was actually brought up in a talk about successful Open Source architectures by Roy Fielding at OSCON 2008, entitled Open Architectures at REST. Fielding, of course, knows a great deal about both REST and architectures, as he was one of the key architects for the Apache Web Server and the person whose doctoral thesis made REST something beyond what you do when you go to sleep.
Fielding's talk discussed a number of characteristics that most successful open source projects have in common, what he termed Open Architectures. Such architectures reflect the model of decentralized software evolution, in which the role of the core architecture is to act as a platform on which the community can effectively build functionality specific to their needs. In this model, the role of the project team is to provide continuity of that platform and to give to the community the tools to build their own subordinate modules (extensions), as well as providing the messaging architecture that makes it possible for these extensions to play nicely with one another.
Fielding cited a number of examples of such open architectures, including Apache itself (with all of its mod_foo extensions), Firefox and its myriad XPI extensions and the Eclipse IDE with the thousands of plugins that have made Eclipse one of the most widely disseminated IDEs on the planet. In each of these cases, the role of the architecture was to ensure that the base platform was capable of supporting the extensions, often at multiple levels of abstraction, and that changes in that foundational infrastructure could be made without completely wiping out the community-developed plugins.
To this list I would also hazard to add Drupal (http://www.drupal.org) - which is one of the reasons that I've become so bullish about the software. For those not familiar with Drupal, it started out as PHP-based bulletin board software in 2000, but the idea of making it extensible was on the radar even then.
The core concept of Drupal is simple - you can represent a "content type" as a bundle of information including a title, a body, and links to that body's type, author and publication metadata. This bundle of information is known as a node. A general theme (made up of one or more CSS pages and one or more PHP-based templates) lets you establish the presentation aspect of the node.
One significant aspect of such nodes is that each node can be addressed by a RESTful URI. For instance, every node can be reached at a URL of the form http://www.myserver.com/?q=node/n, where n is the node identifier number, or nid. On some servers (given the right support, such as URL rewriting) you can even dispense with the query string notation altogether, so that the node would be given as http://www.myserver.com/node/n.
Additionally, however, that same node can have one or more aliases. For instance, suppose that the node in question was an administration page that showed the content listings of all of the nodes. This page might (and in fact in Drupal does) have the alias admin/content/node. Internally, Drupal maintains a table of such aliases to map to the underlying node.
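As a rough illustration of the two URL forms and the alias table, here is a minimal sketch in Python (chosen for brevity; Drupal itself is written in PHP, and this is not its actual code). The function name internal_path, the ALIASES dictionary and the aliases in it are all assumptions made up for the example.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical alias table: alias path -> internal node path.
# Drupal keeps this mapping in a database table; a dict stands in for it here.
ALIASES = {
    "about-us": "node/12",
    "contact": "node/15",
}

def internal_path(url: str) -> str:
    """Return the internal path for either URL form, resolving aliases."""
    parts = urlparse(url)
    query = parse_qs(parts.query)
    if "q" in query:
        # Query-string form: http://www.myserver.com/?q=node/12
        path = query["q"][0]
    else:
        # Clean form: http://www.myserver.com/node/12
        path = parts.path
    path = path.strip("/")
    # An alias maps to its underlying node path; anything else passes through.
    return ALIASES.get(path, path)
```

With this sketch, http://www.myserver.com/?q=node/12, http://www.myserver.com/node/12 and http://www.myserver.com/about-us all resolve to the same internal path, node/12.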
However, Drupal is also very RESTlike in that you can even extend these aliases. For instance, the alias blog identifies the page that lists all of the blog posts on the system (for those Drupal instances where blogs are enabled). However, if you add the author id after this, such as blog/4, then you can get the list of blog posts for just the author with the id of 4.
What makes this so powerful is that users of the system can create these aliases just as readily as the site creator (who is in fact just a specialized superuser). It also means that extensions to Drupal can take advantage of this to add new functionality without getting in the way of existing functionality. For instance, you can create your own new content types - such as an article type or even something wacky like a game character type with a name such as game_character. Adding a new game character then becomes as simple for the user as going to a URL called node/add/game_character.
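The way a path such as blog/4 or node/add/game_character extends a shorter registered path can be sketched as a longest-prefix dispatcher. This is only an illustration in Python under stated assumptions, not Drupal's real PHP implementation: the handler names, the POSTS data and the CONTENT_TYPES set are all hypothetical.

```python
# All names and data below are hypothetical, for illustration only.
POSTS = [
    (4, "Getting started with CCK"),
    (7, "Theming basics"),
    (4, "Views in practice"),
]
CONTENT_TYPES = {"article", "game_character"}

def blog_page(author_id=None):
    """List every post title, or only the titles by one author."""
    if author_id is None:
        return [title for _, title in POSTS]
    return [title for author, title in POSTS if author == int(author_id)]

def node_add_page(type_name):
    """Return a stand-in for the creation form of a known content type."""
    if type_name not in CONTENT_TYPES:
        raise LookupError(f"unknown content type: {type_name!r}")
    return f"creation form for {type_name}"

# Registered paths; any trailing segments become handler arguments.
HANDLERS = {"blog": blog_page, "node/add": node_add_page}

def dispatch(path):
    """Find the longest registered prefix and pass the rest as arguments."""
    segments = path.strip("/").split("/")
    for cut in range(len(segments), 0, -1):
        handler = HANDLERS.get("/".join(segments[:cut]))
        if handler is not None:
            return handler(*segments[cut:])
    raise LookupError(f"no handler for {path!r}")
```

Here dispatch("blog") lists every post, dispatch("blog/4") lists only author 4's posts, and adding a name to CONTENT_TYPES is enough to make node/add/ followed by that name work without registering a new path.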
The Drupal community started out in blogging, but just as the Eclipse space exploded because of easy-to-develop user extensions, so too did the Drupal community begin to go ever farther afield because of user-created modules. The Drupal core development team stayed abreast of those modules, and when a module emerged that became widely adopted, would take the module in, clean it up so that it fit as the most minimally invasive piece of functionality possible, and would then add it to the core.
Periodically, revisions would also be made to the core itself in the name of providing better architectural support, but this idea of minimal invasiveness applied there too. Backwards compatibility was maintained, but only within major levels - i.e., a plugin that worked on version 5.2 should be able to work on version 5.14, but not on 4.7. When a major version upgrade occurs (always transparently and with plenty of time in beta), this forms the floor for a new generation of modules, some of which are upgrades themselves, some of which are new.
This in turn has had some interesting ramifications. With the work split in this fashion, upgrading is no longer a necessity for people - if you've built a stable site, then you can count on occasional security updates on the older branch, though over time the number of new modules for that branch declines to zero. However, this is a process that occurs relatively slowly, and in many cases similar functionality will be enabled in modules for two or three different major releases at the same time.
Thus, at the time of this writing, most module development has stopped on the 4.7 branch, the 5.x branch is still quite active, the recently published 6.x branch is now stocking up on new modules, and the 7.x branch is in beta development. In many ways this is the essence of open source agile development - there is no true "development" phase, only a continuous (major or minor) maintenance phase.
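The compatibility rule behind these branches (a module carries across minor releases of one major version, but not across major versions) can be sketched in a few lines of Python. The function names are assumptions for illustration, and this is not how Drupal itself declares or checks module compatibility.

```python
def parse_version(version: str) -> tuple:
    """Split '5.14' into (5, 14) so that parts compare as whole numbers."""
    return tuple(int(part) for part in version.split("."))

def is_compatible(built_for: str, site_version: str) -> bool:
    """A module is expected to work only within the same major version."""
    return parse_version(built_for)[0] == parse_version(site_version)[0]
```

Parsing into integer parts matters here: read as a decimal, 5.14 would look older than 5.2, whereas as a version it is the later release.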
What makes all this effort so worthwhile is that Drupal has now reached a point where you can create a sophisticated web portal without having to know how to program precisely because of this modularity. With Drupal, I can use the Content Construction Kit (CCK) in order to build new content types, can create forums and community blogs, can use blocks and panels in order to position content in specific areas on the page (and establish parameters on those pages), can work with views to create the relevant database queries that let me create a page of gallery images or videos with specific keywords or RSS and Atom feeds, can work with taxonomies that let me build tag clouds and dynamic navigation, and so on ... and I don't need to write a single line of PHP or SQL code to make any of it happen.
I can write such code if I want to - the extensibility mechanisms for Drupal are well documented and are generally not that hard to work with - but the importance here is that I don't have to write that code. Indeed, for all that I thoroughly enjoy working with Ruby, one thing that's rubbed at me for a while is that Ruby on Rails is still based upon this paradigm that you have to write code in order to build a site, which means that Ruby will always be of use only to those people who can write Ruby code in the first place.
With Drupal, that's no longer a requirement, and it means that people can get Drupal sites up and running quickly without needing to understand the first thing about programming - and if they can't find a module that does what they need (and with nearly 4000 such modules available just on the Drupal site alone this is becoming less and less of a problem), then they can find a programmer who will create just that functionality without having to rebuild the entire site. It's one of the reasons why Drupal is beginning to become the de facto environment for smaller news organizations and PR departments.
That doesn't necessarily mean that Drupal is easy to use or that it comes without a price. There are a number of fairly radical new concepts that Drupal uses that take a certain amount of time playing with it to fully understand. Moreover, while there are fairly strict guidelines for making modules, there are modules that get published that don't necessarily play nice with your underlying data, and debugging Drupal when something doesn't work can be a challenge in its own right.
It can also become unstable (especially the 4.x branch) and tends to consume a fairly inordinate amount of memory once you start piling on the modules. This tends to be the bane of all open source community projects - the flexibility inherent in extensibility comes at the cost of module overhead and difficulty in the kind of optimization that dedicated systems can employ.
Yet if it's a choice between optimized dedicated code and modular community code, in the long term, the latter alternative will always win. At least for the foreseeable future, the cost of memory and hard drive space will continue to fall, the move towards multiple parallel processors will increase the overall throughput of such applications and network latency will continue to drop as broadband becomes the rule rather than the exception.
This means that in general flexibility trumps efficiency. When a particular module becomes critical to a large number of people within the community, it is also usually a candidate for the core, where the component is optimized to better work within the core architecture. Indeed, in some cases, such as the Flock browser (http://www.flock.com) or the Songbird media player (http://www.getsongbird.com), it was the core architecture itself that was "spun off" and optimized for community building or media playing, as opposed to the original Firefox focus on web browsing. That both of the spin-offs still retain (and actively court) community module development attests to how effective this model can be even when the core is changed.
As to Drupal, it has effectively become to web portals what Eclipse is to application development, and has the potential to significantly challenge Microsoft's SharePoint (http://www.microsoft.com/Sharepoint/default.mspx) or similar commercial portal applications in that space. It will also be a topic that O'Reilly Media will be following closely from here on out.
Kurt Cagle is Online Editor for O'Reilly Media, focusing on web services, web applications, XML, publishing systems and so forth. He lives in Victoria, BC, where he hangs out with his daughter and the mastodon at the Royal BC Museum.