I had a brief talk with leading MySQL developer Brian Aker today about one of the biggest turns in MySQL history: this morning's Drizzle announcement. Brian presented Drizzle as an irrevocable fork of MySQL. To me it represents three deliberate steps in one: one big step backward and two big steps forward. It could also be the signal of a new era in databases.
The big step backward, as is immediately obvious to anyone who has followed MySQL over the past few years, is the removal of precisely the major "enterprise" features that MySQL worked so hard to add between versions 4 and 5: stored procedures, views, and triggers. A few other features were also removed.
Aker presents this step as a return to the quick and lightweight MySQL that made it popular in the first place, a database engine that may not appeal to large corporate back offices but can easily power web sites. I see it also as a step back to the philosophy that Aker calls "Databases without business logic": let the application handle consistency and complex calculations instead of making the database do them. Trust your programmers.
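To make "databases without business logic" concrete, here is a minimal sketch of a consistency rule enforced by the application rather than by a trigger or stored procedure. It is only an illustration under stated assumptions: it uses SQLite from the Python standard library as a stand-in (not Drizzle or MySQL), and the `accounts` table, `transfer` function, and `InsufficientFunds` exception are hypothetical names invented for the example.

```python
import sqlite3


class InsufficientFunds(Exception):
    """Raised by the application when a transfer would overdraw an account."""


def make_accounts():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 0)])
    conn.commit()
    return conn


def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM accounts"))


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; the no-overdraft rule lives here, not in the database."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    # `with conn` commits on success and rolls back if an exception escapes.
    with conn:
        row = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (src,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no such account: {src}")
        if row[0] < amount:
            raise InsufficientFunds(src)
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        credited = conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        if credited.rowcount != 1:
            # Raising here rolls back the debit made just above.
            raise LookupError(f"no such account: {dst}")
```

The database still supplies the transaction, but the rule itself is ordinary application code that the programmers own, which is the trade the "trust your programmers" philosophy makes.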
The first step forward is to position MySQL to better handle the physical infrastructure of modern computing (Aker cites clouds and multicores). The second step forward is to welcome vibrant participation by the community, something that up to now has eluded the MySQL AB company (now part of Sun Microsystems).
MySQL was always free as in beer (to many classes of users), but it wasn't placed under an open license until pretty far along in its existence. Even now, the company's dual-licensing strategy drives them to compulsively hire the best contributors from the community, so that no significant base of code can build up outside of their ownership. (That said, the community has developed many wonderful utilities and some new storage engines of promise that are not owned by MySQL AB.) And they still use the powerful but proprietary tool BitKeeper (which the Linux community sidestepped long ago) to maintain their source code.
Drizzle comes out just as Margo Seltzer (a leading CS researcher and former member of the Berkeley DB team) publishes an article called "Beyond Relational Databases" in the most recent issue of Communications of the ACM (July 2008). Seltzer's complaints echo a lot of Aker's. Databases offer more features than anyone needs (like most software packages nowadays); they have correspondingly become slow, hard to administer, buggy, and expensive to deploy; and they need to slim down in order to adapt to the wide range of new hardware and applications that they have to work with.
Seltzer calls on manufacturers to make databases more modular, obviously what Drizzle is doing with its micro-kernel approach. But she also wants users to be able to choose from a menu of features. Don't need transactions? Leave 'em out and save a lot of overhead. Don't expect a lot of concurrent queries? Skip threads and run each query in its own process.
Aker does not present Drizzle as a configurable collection of options; he just promises to strip MySQL down to what he considers the essentials for modern web-driven applications. So for now, Drizzle is not the extensible database Seltzer envisions. But one can't avoid speculating about what MySQL itself could look like if the company adopted this micro-kernel approach and started adding back features (which Aker insists Drizzle will not do).
After all, forks are expensive. What company wants to maintain two completely different code bases? It looks like Aker conceives of Drizzle as a community-maintained project, with nominal support from Sun. I get the feeling this new database engine is a pet project of Aker and his community collaborators, who don't feel a need to look back or consider the long-term business needs of the larger MySQL project. But if both Drizzle and MySQL are successful over time, somebody is going to insist on coordinating them again.
Now we have to consider where MySQL is right now. Its innovative approach to multiple storage engines is definitely modular, and some of Seltzer's suggestions (such as allowing a choice between hashes and B+ trees for index storage) are present in MySQL. But I don't believe MySQL is modular in the way that it would have to be to support a separate Drizzle project. I'm sure some refactoring would be necessary to achieve the radical configurability Seltzer wants.
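The choice between hash and tree indexes is a small-scale picture of the modularity under discussion. The sketch below is purely illustrative and does not reflect how MySQL or Drizzle implement indexes: `HashIndex`, `SortedIndex`, and `make_index` are hypothetical names, and a sorted Python list stands in for a real B+ tree.

```python
import bisect


class HashIndex:
    """Equality lookups only; backed by a dict."""

    def __init__(self):
        self._rows = {}

    def insert(self, key, row_id):
        self._rows.setdefault(key, []).append(row_id)

    def lookup(self, key):
        return list(self._rows.get(key, []))


class SortedIndex:
    """Keeps entries ordered so range scans work (a stand-in for a B+ tree)."""

    def __init__(self):
        self._entries = []  # sorted list of (key, row_id) pairs

    def insert(self, key, row_id):
        bisect.insort(self._entries, (key, row_id))

    def lookup(self, key):
        return self.range(key, key)

    def range(self, low, high):
        """Return row ids with low <= key <= high, in key order."""
        # (low,) sorts before every (low, row_id), so this finds the first match.
        start = bisect.bisect_left(self._entries, (low,))
        out = []
        for key, row_id in self._entries[start:]:
            if key > high:
                break
            out.append(row_id)
        return out


# The "menu": the caller picks an index type by name.
INDEX_TYPES = {"hash": HashIndex, "btree": SortedIndex}


def make_index(kind):
    return INDEX_TYPES[kind]()
```

A caller that only ever needs equality lookups can ask for `make_index("hash")` and never pay for ordering, while one that needs range scans picks `"btree"`. Extending that kind of choice to transactions or threading is a much larger job, which is the refactoring gap described above.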
But I hope it happens. I think it could bring databases even more thoroughly into the next computing age. And if MySQL goes this route, other projects will do so too.