Saturday, February 23, 2008

Quick and Dirty Database Pooling in Django and MySQL using SQLAlchemy

As our Facebook application "Just Three Words" started to get rolling, we started to optimize our database calls.

The first step was to take a look at the ORM db calls taking the most time. Django has a facility to do this, by taking a look at the raw SQL Django is running. Once we did this and replaced the obviously slow calls with custom and optimized SQL, I started to look into some sort of database connection pooling for Django.

A friend suggested I look into, among other things, SQLAlchemy. There's a very good database connection pooling piece to SQLAlchemy (docs here), and after thinking about ways to write my own db pooling code based on routines I'd researched all over the web, I figured it was quicker and easier to use what was already out there. I could get pooling in place and buy myself some time to write a customized db pooling routine, thus not only getting what I specifically needed but also learning how exactly to write a db pooling routine!

(Of course, I haven't done that yet - part of the price of quick and dirty code that works is that the pressure is off to actually create a routine that isn't q&d).

We're using MySql, so I opened up django.db.backends.mysql.base.py and applied the code from the SQLAlchemy docs to base.py:


import sqlalchemy.pool as pool
from django.db.backends import BaseDatabaseWrapper, BaseDatabaseFeatures, BaseDatabaseOperations, util
try:
import MySQLdb as Database
Database = pool.manage(Database)
except ImportError, e:
from django.core.exceptions import ImproperlyConfigured
raise ImproperlyConfigured("Error loading MySQLdb module: %s" % e)

This wasn't enough, however, as SQLAlchemy didn't like the use of **kwargs at the time the connection is created. So I changed this:

self.connection = Database.connect(**kwargs)

to this:

self.connection = Database.connect(user=kwargs['user'], db=kwargs['db'], passwd=kwargs['passwd'], charset='utf8')


and voila! Quick and dirty DB pooling hack in Django for MySql implementations!

There are problems with this, of course, the most obvious being that I'm now branched from the standard Django trunk, that this isn't a universal solution & if taken to other database implementations violates DRY, etc. But: as a quick stopgap, and as a solution to an immediate need to speed up the user experience and keep them happy, it's pretty good.

(Facebook has an 8-second timeout for calls to your server. If 8 seconds passes and your server is still ruminating the nature of it's navel, the users see the White Screen Of Death, and they drop your app like it was an unpleasantly hot rock)

As usual, post your comments, criticisms, and cliches, and feel free to tell me UR DOIN IT ALL WRONG & offer suggestions if that's what makes you happy.

3 comments:

  1. Hi! This looks like exactly the quick-hack I need to speed up django access to a remote database. BUT in practice, it doesn't apper to work for me - as in, it operates, but the connection is still around 100 times slower than to a local db. You howto doesn't /quite/ work - those imports don't appear to be necessary and I was wondering what version of django/alchemy etc. you were were using? Thx,

    ReplyDelete
  2. I'm using the latest trunk of Django, and the latest stable release of SQLAlchemy ( I have to go look, but it is the most recent release).

    My db is local; I haven't tested in a remote db, but pooling should still reduce the overhead of the individual connection creations.. I'll be able to respond in depth after I'm back form GSP...

    ReplyDelete
  3. Django has a nice way to patch in custom database backends. DATABASE_ENGINE can be the full path to any class that acts like the standard database backends do. So basically, copy backends/mysql to somewhere else (like under your django app), make your changes there, and then point your DATABASE_ENGINE setting at it.

    This hack works great for me with the latest version of Django and sqlalchemy 0.45.

    ReplyDelete