Introduction

Should you use it?

Django-cachalot is the perfect speedup tool for most Django projects. It will speed up a website with 100 000 visits per month without any problem. In fact, the more visitors you have, the faster the website becomes. That’s because every possible SQL query on the project ends up being cached.

Django-cachalot is especially efficient in the Django administration website, which is unfortunately badly optimised (use foreign keys in list_editable if you need to be convinced). It is also particularly well suited to select_related and prefetch_related operations.

However, it’s not suited for projects where there is a high number of modifications per minute on each table, like a social network with more than 50 messages per minute. Django-cachalot may still give a small speedup in such cases, but it may also slow things down a bit (in the worst-case scenario, a 20% slowdown, according to the benchmark). If you have a website like that, optimising your SQL database and queries is the number one thing you have to do.
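As a rough illustration of why the modification rate matters, the trade-off can be modelled with the two figures quoted in the feature list of this page: a first execution about 10% slower, and later executions about 7× faster. This is a deliberately simplified sketch, not the project's benchmark. The function names and the model itself are assumptions made for this example, and the model ignores the cost of invalidation, which is why it cannot reproduce the 20% worst case mentioned above.

```python
def average_cost(hit_rate, miss_cost=1.10, hit_cost=1 / 7):
    """Average query cost relative to an uncached query (1.0 = no change).

    hit_rate is the fraction of reads served from the cache. The default
    costs are the figures quoted in this page; everything else is assumed.
    """
    return (1 - hit_rate) * miss_cost + hit_rate * hit_cost


def break_even_hit_rate(miss_cost=1.10, hit_cost=1 / 7):
    """Hit rate above which caching is a net gain in this simplified model."""
    return (miss_cost - 1) / (miss_cost - hit_cost)


if __name__ == "__main__":
    for rate in (0.0, 0.05, 0.5, 0.95):
        print(f"hit rate {rate:.0%}: relative cost {average_cost(rate):.2f}")
    print(f"break-even hit rate: {break_even_hit_rate():.1%}")
```

In this model, frequent modifications lower the hit rate, because each one invalidates cached results before they can be reused. Once the hit rate falls near the break-even point, the cache stops paying for itself.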

There is also an obvious case where you don’t need django-cachalot: when the project is already fast enough (all pages load in less than 300 ms). Like any other dependency, django-cachalot is a potential source of problems (even though it’s currently bug-free). Don’t use dependencies you can avoid; a “future you” may thank you for that.

Features

  • Saves in cache the results of any SQL query generated by the Django ORM that reads data. These saved results are then returned instead of executing the same SQL query, which is faster.
  • The first execution of a query is about 10% slower; subsequent executions are much faster (7× faster on average).
  • Automatically invalidates saved results, so that you never get stale results.
  • Invalidates per table, not per object: if you change an object, all the queries done on other objects of the same model are also invalidated. It is unfortunately technically impossible to make a reliable per-object cache. Don’t be fooled by packages claiming to have that per-object feature; they are unreliable and dangerous for your data.
  • Handles everything in the ORM. You can use the most advanced features from the ORM without a single issue; django-cachalot is extremely robust.
  • Easy to control thanks to settings and a simple API. But that’s only required if you have a complex infrastructure. Most people will never use settings or the API.
  • A few bonus features like a signal triggered at each database change (including bulk changes) and a template tag for a better template fragment caching.
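The caching and per-table invalidation described in the list above can be illustrated with a toy cache in plain Python. This is only a sketch of the idea, not django-cachalot's actual implementation: the `TableCache` class, its methods and the `fake_execute` helper are hypothetical names invented for this example, and the caller passes the table names explicitly instead of having them derived from the query.

```python
class TableCache:
    """Toy query cache keyed by SQL text and invalidated per table."""

    def __init__(self, execute):
        self._execute = execute   # callable that really runs a query
        self._results = {}        # SQL text -> cached result
        self._by_table = {}       # table name -> set of cached SQL texts

    def read(self, sql, tables):
        """Return the cached result for `sql`, running the query on a miss."""
        if sql not in self._results:
            self._results[sql] = self._execute(sql)
            for table in tables:
                self._by_table.setdefault(table, set()).add(sql)
        return self._results[sql]

    def write(self, sql, tables):
        """Run a modifying query, then drop every cached read on its tables."""
        self._execute(sql)
        for table in tables:
            for cached_sql in self._by_table.pop(table, set()):
                self._results.pop(cached_sql, None)


calls = []  # records every query that actually reaches the "database"


def fake_execute(sql):
    calls.append(sql)
    return f"rows for: {sql}"


cache = TableCache(fake_execute)
cache.read("SELECT * FROM book WHERE id = 1", ["book"])  # miss: executed
cache.read("SELECT * FROM book WHERE id = 1", ["book"])  # hit: not executed
cache.read("SELECT * FROM book WHERE id = 2", ["book"])  # miss: executed
cache.write("UPDATE book SET title = 'x' WHERE id = 1", ["book"])
cache.read("SELECT * FROM book WHERE id = 2", ["book"])  # executed again
```

Note that the update only touches the row with id 1, yet the cached query for id 2 is discarded as well. That is the cost of invalidating per table, accepted here in exchange for never having to decide which objects a given query depends on.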

Comparison with similar tools

This comparison was done in December 2015. It compares django-cachalot to the other popular automatic ORM caches at the moment: django-cache-machine & django-cacheops.

Features

Feature cachalot cache-machine cacheops
Easy to install quite
Cache agnostic
Type of invalidation per table per object per query
CPU performance excellent excellent excellent
Memory performance excellent good excellent
Reliable
Useful for > 50 modifications per minute
Handles transactions
Handles Django admin save
Handles multi-table inheritance
Handles QuerySet.count
Handles QuerySet.aggregate/annotate
Handles QuerySet.update
Handles QuerySet.select_related
Handles QuerySet.extra
Handles QuerySet.values/values_list
Handles QuerySet.dates/datetimes
Handles subqueries
Handles querysets generating a SQL HAVING keyword
Handles cursor.execute
Handles the Django command flush

Explanations

“Handles [a feature]” means that the package correctly invalidates SQL queries using that feature. So if a package doesn’t handle a feature, you may get stale query results when using this feature. It does not mean that it caches a query with this feature, although django-cachalot caches all queries except random queries or those run through cursor.execute.

This comparison was done by running the test suite of cachalot against cache-machine & cacheops. This test suite is indeed relevant for other packages (such as cache-machine & cacheops) since most of it is written in a cachalot-independent way.

Similarly, the performance comparison was done using our benchmark, coupled with a memory measure.

To me, cache-machine & cacheops are not reliable for these reasons:

  • Neither cache-machine nor cacheops handles transactions, which is critical. Transactions are used a lot in Django internals: at least in any Django admin save, many-to-many relations modification, bulk creation or update, migrations, session save. If an error occurs during one of these operations, good luck finding out whether stale data is returned. The best you can do in this case is to clear the cache manually.
  • If you use a query that’s not handled, you may get stale data. It ends up ruining your database since it lets you save modifications to stale data, therefore overwriting the latest version that’s in the database. And you always end up using queries that are not handled since there is no list of unhandled queries in the documentation of each module.
  • In the case of cache-machine, another issue is that it relies on “flush lists”, which can’t work reliably when implemented in a cache like this (see cache-machine#107).
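The transaction problem from the first point above can be reproduced with a toy example in plain Python. This is a sketch under stated assumptions: `NaiveCache`, `TransactionAwareCache` and `failed_update` are hypothetical names invented here, the "database" is a dictionary, and the transaction-aware variant shows one possible approach rather than how any of the three packages is actually implemented.

```python
class NaiveCache:
    """Caches reads immediately, even inside an open transaction."""

    def __init__(self):
        self.store = {}

    def get(self, key, load):
        if key not in self.store:
            self.store[key] = load()
        return self.store[key]

    def invalidate(self, key):
        self.store.pop(key, None)


class TransactionAwareCache(NaiveCache):
    """Keeps values read inside a transaction in a scratch area."""

    def __init__(self):
        super().__init__()
        self.scratch = None   # per-transaction values; None = no transaction
        self.dirty = set()    # keys modified inside the transaction

    def begin(self):
        self.scratch, self.dirty = {}, set()

    def get(self, key, load):
        if self.scratch is None:
            return super().get(key, load)
        if key not in self.scratch:
            self.scratch[key] = load()
        return self.scratch[key]

    def invalidate(self, key):
        if self.scratch is None:
            super().invalidate(key)
        else:
            self.scratch.pop(key, None)
            self.dirty.add(key)

    def commit(self):
        for key in self.dirty:
            super().invalidate(key)   # only now touch the shared cache
        self.scratch = None

    def rollback(self):
        self.scratch = None           # uncommitted values are simply dropped


def failed_update(cache):
    """Simulate an update that is read back, then rolled back after an error."""
    db = {"title": "old"}
    snapshot = dict(db)                       # BEGIN
    if hasattr(cache, "begin"):
        cache.begin()
    db["title"] = "uncommitted"               # UPDATE inside the transaction
    cache.invalidate("title")
    cache.get("title", lambda: db["title"])   # read inside the transaction
    db.clear()
    db.update(snapshot)                       # ROLLBACK
    if hasattr(cache, "rollback"):
        cache.rollback()
    return cache.get("title", lambda: db["title"])


print(failed_update(NaiveCache()))             # value that was never committed
print(failed_update(TransactionAwareCache()))  # value actually in the database
```

The naive cache keeps serving a value that the database rolled back, which is exactly the kind of stale data that can later be saved over the real row.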

Number of lines of code

Django-cachalot tries to be as minimalist as possible, while handling most use cases. Being minimalist is essential to create maintainable projects, and having a large test suite is essential to achieve excellent quality. The statistics below speak for themselves…

Project part   cachalot   cache-machine   cacheops
Application         743             843       1662
Tests              3023             659       1491