Why I Love Drupal’s Caching API

So I was re-reading Jeff Eaton’s excellent article on Drupal’s method of caching. I think the way the Drupal project has elected to handle caching is a model that other CMSs, especially those that expect to be used for a variety of purposes, could really learn from. Briefly, caching refers to the temporary storage and reuse of data to increase the speed of an application or site. It is most useful when the data is relatively expensive to produce and is needed relatively often.

Almost all CMSs have a caching layer. Expression Engine and WordPress (the two I work with daily) have theirs. Here’s a summary:

EE:

  • Built-in caching behavior for both templates and template tags.  Available query caching.
  • One major DB caching module (Solspace Morsels).
  • No caching abstraction or API

WordPress:

  • Caches objects and other data in MySQL. Minimally used by default. Does not cache entire pages or parts of pages by default.
  • File caching can be enabled through a configuration change.
  • Various powerful plugins are available that implement page caching.

Both of these approaches are reasonable, and work well enough to enable high traffic sites to run properly on these platforms. They also don’t require you to think that hard while developing themes or templates for either WP or EE, since you can apply caching later. But there is a certain lack of flexibility. Essentially, there’s a default behavior, and you then rely on various configurations or plugins to improve that default behavior. Let’s contrast that with Drupal:

Drupal has a caching API: a single set of functions for setting, getting and destroying cache entries that is used across the entire CMS. Two good things about this: 1. You can think actively about setting and getting cache in the normal course of developing modules, and expect unified, predictable behavior whenever you invoke the cache. 2. The caching system is an API, abstracted from the details of what your application actually does, so you can change the behavior of all of your site’s caching by simply replacing the code that implements the cache. For example, you could decide to do your caching using memcached, or through a DB, or using file-based caching. These three implementations are well known and immediately available (I think they even come with the standard Drupal 6 install.) Says Jeff Eaton:

If you’re really hoping to squeeze the most out of your server, Drupal also supports the use of alternative caching systems. By changing a single line in your site’s settings.php file, you can point it to different implementations of the standard cache_set(), cache_get(), and cache_clear_all() functions. File-based caching, integration with the open source memcached project, and other approaches are all possible. As long as you’ve used the standard Drupal caching functions, your module’s code won’t have to be altered.

Don’t get me wrong: different caching behaviors are available in WP (and maybe even EE) as well. The problem is just that fundamentally changing cache behavior there requires hacking. Drupal intentionally throws open a world of caching choices by completely abstracting its main caching mechanism.
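To make the idea concrete, here is a minimal sketch of what "swap the implementation, keep the calling code" looks like. This is not Drupal’s actual code: the CacheBackend interface, both backend classes and expensive_report() are hypothetical names I made up for illustration. The only point is that the module-style function talks to the cache through one small API and never learns where the data really lives.

```php
<?php
// Hypothetical sketch: none of these names come from Drupal itself.

// The single, small API that all calling code uses.
interface CacheBackend {
    public function get($cid);        // returns null on a cache miss
    public function set($cid, $data);
    public function clear($cid);
}

// Backend 1: keeps entries in memory for the life of the request.
class ArrayCacheBackend implements CacheBackend {
    private $items = array();
    public function get($cid) {
        return array_key_exists($cid, $this->items) ? $this->items[$cid] : null;
    }
    public function set($cid, $data) { $this->items[$cid] = $data; }
    public function clear($cid) { unset($this->items[$cid]); }
}

// Backend 2: stores each entry as a serialized flat file.
class FileCacheBackend implements CacheBackend {
    private $dir;
    public function __construct($dir) { $this->dir = $dir; }
    private function path($cid) { return $this->dir . '/' . md5($cid) . '.cache'; }
    public function get($cid) {
        $path = $this->path($cid);
        return is_file($path) ? unserialize(file_get_contents($path)) : null;
    }
    public function set($cid, $data) {
        file_put_contents($this->path($cid), serialize($data));
    }
    public function clear($cid) {
        $path = $this->path($cid);
        if (is_file($path)) { unlink($path); }
    }
}

// "Module" code: depends only on the interface, not on a backend.
// $builds counts how often the expensive work actually runs.
function expensive_report(CacheBackend $cache, &$builds) {
    $cached = $cache->get('report');
    if ($cached !== null) {
        return $cached;               // cache hit: skip the expensive work
    }
    $builds++;
    $report = array('total' => array_sum(range(1, 100)));
    $cache->set('report', $report);
    return $report;
}

$builds = 0;
$backend = new ArrayCacheBackend();   // or: new FileCacheBackend(sys_get_temp_dir())
expensive_report($backend, $builds);
expensive_report($backend, $builds);
echo $builds, "\n";                   // the report was built only once
```

Changing the one line that constructs $backend changes where everything is cached, and expensive_report() does not need to be touched. That is the property I’m praising in Drupal’s design.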

EECI2009

Image: Leslie Camacho of Ellis Labs announces the upcoming release of EE 2.0.

I was lucky enough to attend EECI2009 in the Netherlands last week. Overall, the strongest EE/CI conference I’ve attended so far, which is perhaps a good sign for the growing community around these products. It was an interesting thing to have both EE and CI people in the same building. I hadn’t thought about it all that much, but these two communities are very different, and it was sort of amusing to see all of the EE people (designer-ish, stylish, expensive bags) mix with the CI people (nerdy, large brains, beer, blunt.)

There are also a growing number of EE developers who extend and modify that platform. Chief of this tribe is no doubt Solspace’s Paul Burdick. He gave an excellent presentation on add-on development which clarified a number of issues about the future of this important activity for EE. Here, in PDF form, are the slides. Paul is responsible for a number of major add-ons, including the tag module. He is one of the original developers of EE itself, having been the main dev at Ellis Labs for quite a while. The other major emerging EE add-on developers were at the conference as well. These folks include Brandon Kelly and Leevi Graham. In true Australian form, Leevi wore an outback-looking hat for the duration of the conference. (My apologies Leevi if it was in fact some other sort of hat.)

The big news at the conference was that EE 2.0, the new CodeIgniter-based version of the product, will be released on December 1, 2009. There was much rejoicing when this long-delayed announcement was made. Now, after the rejoicing, I’m sure that there’s quite a bit of hand-wringing/scrambling going on across the community. Many of us who rely on EE for our business (Grist included) are now trying to figure out if/when/how to integrate this new development into our work. Part of what I heard from the likes of Paul made this easier to think about: it’s now clear that a) EE 1.6 will continue to be supported for quite a while yet and b) many important add-ons for EE may not be ready for EE 2 any time very soon. So there’s certainly no pressure to upgrade, and plenty of reason to wait, play with the new product and see what happens. We also heard about a planned EE 2.1 release, which sounds like a more stable basis for an enterprise upgrade.

EE Performance Guidelines from Paul Burdick at Solspace

Paul Burdick at Solspace has written a totally awesome document about EE Performance Guidelines. With his permission, I’m sharing it. Please contact Paul or Solspace directly for updates or a site evaluation.

While I’m on the subject, here are the two add-ons currently in Solspace’s new performance suite: Morsels, a new DB-based acceleration method, and Page Caching, a method to cache entire pages as flat files.

Expression Engine and Performance

Tomorrow I’m leading a session at the Expression Engine Road Show here in Seattle about EE and high traffic sites. Here’s a link to the slides from that presentation for those who are interested.

EEandPerformance.ppt (large > 2M)

The slides cover general site performance best practices, EE caching methods and how to apply EE caching, followed by a miscellany of considerations to think about when using EE in a high traffic environment.

reCaptcha for Expression Engine Member Registration

Well, due to some recent drama involving blog spam at Grist, I had the opportunity to cook up an ExpressionEngine extension that implements reCaptcha for EE signup.

EE provides a convenient hook for overriding the (rather weak) native EE captcha, so getting the captcha to appear on the signup form is simply an exercise in taking the convenient reCaptcha PHP library and invoking its recaptcha_get_html function at the appropriate time. The resulting HTML simply overrides the result of the usual captcha method from EE. Here are the relevant lines from the extension:

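// Load the reCaptcha PHP library that ships with the extension.
require_once($PREFS->ini('system_folder_path').'extensions/mp_recaptcha/recaptchalib.php');
// Signal EE to stop its own captcha routine and use the value returned here.
$EXT->end_script = TRUE;
// Return the reCaptcha widget HTML in place of the native captcha.
return recaptcha_get_html($PREFS->ini('recaptcha_public_key'));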

Now the captcha appears on the signup form, and it’s time to turn our attention to processing. There are three requirements: 1) we need to invoke a check of the reCaptcha at the appropriate moment, 2) we need to cleanly pass an error back to the signup process on failure and 3) we need to override or at least mask the native captcha check. Here’s my approach:

EE provides the member_member_register_start hook at the beginning of the registration routine. At that point it’s quite easy to do the reCaptcha check:

require_once($PREFS->ini('system_folder_path').'extensions/mp_recaptcha/recaptchalib.php');
$resp = recaptcha_check_answer($reCaptchaPrivateKey,
    $IN->IP,
    $_POST["recaptcha_challenge_field"],
    $_POST["recaptcha_response_field"]);

But how to handle the response? member_member_register_start only allows injection of logic into the registration process (i.e., you can’t affect the return value of the method in which it appears.) You can, however, affect the session and any globals. So here’s the trick I used.

In the registration form, I added:

<input type="hidden" name="captcha" value="1">

And in the method invoked when the captcha is created, I did the following:

$DB->query("INSERT INTO exp_captcha (date, ip_address, word) VALUES (UNIX_TIMESTAMP(), '".$IN->IP."', '1')");

This means that EE will always be expecting a captcha response of “1”, and will always get it UNLESS some outside force intervenes. This is where the result of the reCaptcha web service check comes in. If the result is successful, we do nothing, and allow EE to think its native captcha check went perfectly. If the result indicates failure, then I do the following in the method invoked at member_member_register_start:

$_POST['captcha'] = '';

This little change will cause EE’s native captcha check to think that it has failed, and produce its normal errors upon a captcha failure.
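If the masking trick is hard to picture, here is a small stand-alone sketch of just the logic, runnable outside of EE. Everything in it is a stand-in I wrote for illustration: native_captcha_passes() only imitates the comparison EE’s native check makes, on_register_start() imitates the method hooked to member_member_register_start, and the boolean argument stands in for the result of the reCaptcha check. None of these functions exist in EE or in the reCaptcha library.

```php
<?php
// Hypothetical stand-ins only; not EE's or recaptchalib's real functions.

// Imitates EE's native check: the posted word must match the stored word.
function native_captcha_passes(array $post, $stored_word) {
    return isset($post['captcha']) && $post['captcha'] === $stored_word;
}

// Imitates the method hooked to member_member_register_start.
// $recaptcha_is_valid stands in for the outcome of the reCaptcha check.
function on_register_start(array $post, $recaptcha_is_valid) {
    if (!$recaptcha_is_valid) {
        // Blank the hidden field so the native check fails on our behalf.
        $post['captcha'] = '';
    }
    return $post;
}

$stored_word = '1';                    // the row inserted into exp_captcha
$form_post = array('captcha' => '1');  // the hidden field on the signup form

// reCaptcha solved: the native check sees "1" and passes.
$ok = native_captcha_passes(on_register_start($form_post, true), $stored_word);

// reCaptcha failed: the native check sees an empty word and fails.
$bad = native_captcha_passes(on_register_start($form_post, false), $stored_word);

var_dump($ok, $bad);
```

The hook never has to return anything to the registration routine. It only edits the posted data, and EE’s own error handling does the rest.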

I’d be happy to provide the entire extension to anyone who is interested, but I feel like it needs a little cleaning, documenting and generalization in order to stand on its own two feet. Perhaps I’ll post it here soon. Until then, let me know if you’d like a copy.