Last week Henrik announced that he was stepping down from Drizzle development to pursue work on Solon Voting. I must considerately step down too, for after many long months and late nights, my startup has started up: Test Noir. Between working full-time at Percona and Test Noir I have zero time for any other coding. It was a pleasure and an honor to contribute to Drizzle.
As part of Google Summer of Code 2012, Anshu Kumar and I are working to make Drizzle plugins dynamic, i.e. making plugins’ variables configurable at runtime. We have started with regex_policy, and progress is good. As I noted to Anshu, this project is part new design and part redesign, which presents many challenges. As for new design, there is the problem of maintainability: will Anshu’s code today be understandable and maintainable in another year or two? With good design, adherence to the Drizzle coding standards, documented code, and tests, I think it will be. As for redesign, regex_policy was originally designed to be static, but now we’re making it dynamic, which means we’re redesigning its modus operandi. The plugin works as-is, so we must preserve that while at the same time introducing our own changes and hopefully not introducing new bugs. A new book, Code Simplicity, reiterates the importance of making minimal changes and incremental redesign, and I agree. Making variables dynamic requires redesigning some aspects of the code, but by and large we avoid redesigning the plugin and its classes.
In addition to the end-goal—making the plugins dynamic—I think it’s very beneficial that at least two sets of eyes are going through major plugins’ code carefully and systematically because it amounts to a free code review. Many plugins are years old, and no one (to my knowledge) has really delved into them since their inception. Who knows what we’ll find.
For the record, Anshu is doing all the work and I’m just mentoring. That’s good because it forces me to see and think of things anew. Lately, I’ve spent more time on the Drizzle docs than plugin code, so it’s nice to return to C++ from so much RST. I have encouraged Anshu to blog frequently, so hopefully we’ll all get a new look at the Drizzle plugins, their code, and developing with Drizzle in general.
sudo apt-get install g++ gperf intltool libprotobuf-dev protobuf-compiler uuid-dev libpcre3-dev bison python-sphinx libboost-all-dev libcurl4-gnutls-dev libpam0g flex libcloog-ppl0
Drizzle Day 2012 happened last Friday, April 13, 2012 after PLMCE 2012 in Santa Clara, California. I also realized that my first anniversary as a Drizzle user and contributor has already passed: on April 10, 2011 I wrote my first Drizzle blog post: Compiling Drizzle 7 on Mac OS X 10.6. (This blog didn’t exist at that time so I used Hack MySQL.) It’s cliché to say but I’ve learned a lot in the last year.
First, I want to apologize to the core Drizzle developers for speaking without first knowing all the facts. In What Drizzle needs I said the project lacked and needed leadership. Although it does need a certain type of leadership as all projects do, my thinking at the time, which was reflected in the blog post, was that Drizzle was a ship without a captain or first officer, so to speak. I know now that that criticism was too harsh given what was happening to the core developers around this time last year. Furthermore, it is still too much to expect someone to work full-time on Drizzle because I also know now what most of the core developers are busy working on, and they’re quite busy indeed. So what can be done in terms of leadership? It occurred to me that for the moment Drizzle doesn’t need traditional, top-down leadership because the core developers have known each other long enough and are talented enough to be self-directing. In essence, a “hive mind” is driving Drizzle. This became apparent to me after Drizzle Day with Brian, Mark, Stewart, Patrick, and Henrik in one room all day: a harmonious team doesn’t need “direction from above”; everyone does what they can to help realize the project’s goals.
Speaking of the project’s, i.e. Drizzle’s goals, one idea seemed to be prevalent at PLMCE 2012: cloud services. In Brian Aker’s keynote he demonstrated provisioning a MySQL server via HP Cloud; and in Mårten Mickos’s keynote he talked about new paradigms: client-server (scale-up) to web (scale-out) to cloud (multi-scale). These talks along with others during PLMCE and Drizzle Day 2012 finally clarified my understanding of Drizzle as a “Lightweight SQL Database for Cloud Infrastructure and Web Applications”: Drizzle is the first multi-scale relational database server. That’s a big claim and surely many people will argue with me, so here are my reasons.
First, as Mårten said, “In the cloud you must scale both up and out.” Why do this? Because as he also said, “the whole world is going online”. To meet increasing demand, database servers will have to scale up: use modern hardware and more of it. In other words: more cores, more RAM, faster storage, and a lot more threads. In Mark Callaghan’s keynote he talked about how InnoDB is still unable to realize the full potential of modern hardware. Drizzle is supposedly designed to avoid these issues because it’s optimized for modern hardware, but this has yet to be proven. Drizzle really needs independent benchmarks (need #6).
Second, on the scale-out side, Drizzle replication has the potential to scale in ways MySQL can’t. Already it supports multi-master replication, and the fact that Drizzle replication is pluggable will open the doors to innovation. Although strong already, replication is an area where Drizzle should keep focusing efforts.
Third, Drizzle directly addresses a key aspect of the cloud paradigm: multi-tenancy. In the cloud paradigm, servers contain and serve many isolated services for many different customers. Using virtualization is one way to do this, or running multiple instances of a program, but those methods have known drawbacks. Drizzle multi-tenancy, which should be the star of the 7.2 release, directly satisfies this key aspect of the cloud paradigm, thereby making Drizzle a true database for the cloud.
So, Drizzle should work well on modern hardware, its replication system is robust and flexible, and it will have native multi-tenancy. For these reasons, I contend that Drizzle is the first multi-scale relational database server. Realizing this has helped me to stop thinking in old client-server MySQL terms and begin thinking in new cloud Drizzle terms. Granted, these considerations really only apply to the web and web applications; a company can still benefit the most from scale-up with MySQL in-house, but when it comes to Drizzle, serving web apps is the goal, which requires multi-scale because, yes, the whole world is going online.
Another significant realization I had while talking with Patrick Crews was that Drizzle adapts to an environment rather than forcing the environment to adapt to it. Why does this matter? Again, Mårten Mickos spoke about how older database servers (he didn’t name names) didn’t adapt as quickly as MySQL, therefore MySQL has led the way in cloud infrastructures. I agree, but MySQL has its own rigidities, namely: all its subsystems. For example, MySQL replication is what it is, and there’s no easy way to extend or change it. It took until MySQL 5.6 before global transaction IDs became reality, whereas Drizzle has had global transaction IDs since its first GA, 7.0. New query logging in MySQL? Forget about it; but in Drizzle it’s trivial. If those old database servers are whales, and MySQL is a dolphin, then Drizzle is a marlin, even faster than a dolphin.
Does Drizzle’s adaptability really matter? I think it does because another common prediction I heard at PLMCE was that cloud infrastructure standards are in their infancy today, so there are a few competing and incompatible ones. Mark Atwood talked about as much during his BoF. What will “the” cloud require in the future? Nobody knows, but it should be easy to write a Drizzle plugin to meet the requirements.
Finally, not to keep quoting Mårten but he just happened to say a lot that stuck with me, he noted how a database server takes 10 years to develop. I agree, and so did Brian in a tweet, noting also that Drizzle has a head start. In this respect, I should perhaps apologize again to the core Drizzle developers for constantly nagging the project about various things. I say “perhaps” because I think it’s fair to say that I contribute as much as I complain. In any case, I realize now that whereas I would like Drizzle to somehow explode onto the scene and be the talk of PLMCE 2013, I know that can’t happen. I’m not sure who said it in their keynote (probably Mårten), but they said that jumping into new technology was not good and not something people do; rather, people wade into new technology. Furthermore, the expo hall at PLMCE 2012 was really lively and there were a number of companies with MySQL or NoSQL or something-SQL servers I had never heard of. So I realized that it will be years more before Drizzle has even a fraction of the prominence that MySQL has, and it took nearly a decade (or more) for MySQL to become what it is today. That just gives us some breathing room though: time to fix the bugs and code the features that we know will make Drizzle the multi-scale relational database server of choice.
Drizzle 7.1 has been released. In my frank opinion, there’s no longer any reason to use 7.0; everyone should upgrade to 7.1 because it is far superior. Read the release announcement for the list of new and fixed things. Most importantly, imho: sweeping updates to docs.drizzle.org. This is also good news because in a few days I’m giving a presentation, Getting Started with Drizzle 7.1.
Thanks to all the people who worked on Drizzle in addition to their day jobs, their families, and their personal lives.