django-cachalot

Caches your Django ORM queries and automatically invalidates them.


Introduction

Should you use it?

Django-cachalot is the perfect speedup tool for most Django projects. It will speed up a website receiving 100 000 visits per month without any problem. In fact, the more visitors you have, the faster the website becomes. That’s because every possible SQL query on the project ends up being cached.

Django-cachalot is especially efficient in the Django administration website since it’s unfortunately badly optimised (use foreign keys in list_editable if you need to be convinced).

However, it’s not suited for projects where there is a high number of modifications per minute on each table, like a social network with more than 50 messages per minute. Django-cachalot may still give a small speedup in such cases, but it may also slow things down a bit (in the worst case scenario, a 20% slowdown, according to the benchmark). If you have a website like that, optimising your SQL database and queries is the number one thing you have to do.

There is also an obvious case where you don’t need django-cachalot: when the project is already fast enough (all pages load in less than 300 ms). Like any other dependency, django-cachalot is a potential source of problems (even though it’s currently bug free). Don’t use dependencies you can avoid, a “future you” may thank you for that.

Features

  • Saves in cache the results of any SQL query generated by the Django ORM that reads data. These saved results are then returned instead of executing the same SQL query, which is faster.
  • The first execution of a query is about 10% slower; the following executions are way faster (7× faster on average).
  • Automatically invalidates saved results, so that you never get stale results.
  • Invalidates per table, not per object: if you change an object, all the queries done on other objects of the same model are also invalidated. It is unfortunately technically impossible to make a reliable per-object cache. Don’t be fooled by packages pretending to have that per-object feature: they are unreliable and dangerous for your data.
  • Handles everything in the ORM. You can use the most advanced features of the ORM without a single issue; django-cachalot is extremely robust.
  • Easy to control thanks to Settings and a simple API, though that’s only required if you have a complex infrastructure. Most people will never use the settings or the API.
  • A few bonus features, like a signal triggered at each database change (including bulk changes) and a template tag for better template fragment caching.

Comparison with similar tools

This comparison was done in December 2015. It compares django-cachalot to the other popular automatic ORM caches at the moment: django-cache-machine & django-cacheops.

Features

Feature cachalot cache-machine cacheops
Easy to install quite
Cache agnostic
Type of invalidation per table per object per query
CPU performance excellent excellent excellent
Memory performance excellent good excellent
Reliable
Useful for > 50 modifications per minute
Handles transactions
Handles Django admin save
Handles multi-table inheritance
Handles QuerySet.count
Handles QuerySet.aggregate/annotate
Handles QuerySet.update
Handles QuerySet.select_related
Handles QuerySet.extra
Handles QuerySet.values/values_list
Handles QuerySet.dates/datetimes
Handles subqueries
Handles querysets generating a SQL HAVING keyword
Handles cursor.execute
Handles the Django command flush
Explanations

“Handles [a feature]” means that the package correctly invalidates SQL queries using that feature. So if a package doesn’t handle a feature, you may get stale query results when using it. It does not mean that the package caches queries using this feature, although django-cachalot caches all queries except random queries and those run through cursor.execute.

This comparison was done by running the test suite of cachalot against cache-machine & cacheops. The test suite is relevant for those other packages since most of it is written in a cachalot-independent way.

Similarly, the performance comparison was done using our benchmark, coupled with a memory measure.

In my opinion, cache-machine & cacheops are unreliable for the following reasons:

  • Neither cache-machine nor cacheops handles transactions, which is critical. Transactions are used a lot in Django internals: at least in any Django admin save, many-to-many relation modifications, bulk creations or updates, migrations, and session saves. If an error occurs during one of these operations, good luck figuring out whether stale data is being returned. The best you can do in this case is manually clearing the cache.
  • If you use a query that’s not handled, you may get stale data. That ends up ruining your database, since it lets you save modifications based on stale data, overwriting the latest version stored in the database. And you will inevitably end up using unhandled queries, since neither module documents which queries are unhandled.
  • In the case of cache-machine, another issue is that it relies on “flush lists”, which can’t work reliably when implemented in a cache like this (see cache-machine#107).

Number of lines of code

Django-cachalot tries to be as minimalist as possible, while handling most use cases. Being minimalist is essential to create maintainable projects, and having a large test suite is essential to get an excellent quality. The statistics below speak for themselves…

Project part cachalot cache-machine cacheops
Application 743 843 1662
Tests 3023 659 1491

Quick start

Requirements

  • Django 1.11 or 2.0
  • Python 2.7, 3.4, 3.5 or 3.6
  • a cache configured as 'default' with one of these backends:
    • Redis
    • Memcached
    • Filebased
    • Locmem
  • one of these databases:
    • PostgreSQL
    • SQLite
    • MySQL (but on older versions like MySQL 5.5, django-cachalot has no effect, see MySQL limits)

Usage

  1. pip install django-cachalot
  2. Add 'cachalot', to your INSTALLED_APPS
  3. If you use multiple servers with a common cache server, double check their clock synchronisation
  4. If you modify data outside Django (typically after restoring a SQL database), use the manage.py command
  5. Be aware of the few other limits
  6. If you use django-debug-toolbar, you can add 'cachalot.panels.CachalotPanel', to your DEBUG_TOOLBAR_PANELS
  7. Enjoy!
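
As an illustration of steps 2 and 6 and of the 'default' cache requirement, here is a minimal settings sketch. The django-redis backend and its location are only an example; the DEBUG_TOOLBAR_PANELS part applies only if you use django-debug-toolbar:

# settings.py
INSTALLED_APPS = [
    # ... your other applications ...
    'cachalot',
]

CACHES = {
    'default': {
        # Any supported cache backend works; django-redis is just an example.
        'BACKEND': 'django_redis.cache.RedisCache',
        'LOCATION': 'redis://127.0.0.1:6379/0',
    },
}

# Only if you use django-debug-toolbar (step 6).
DEBUG_TOOLBAR_PANELS = [
    # ... the default panels ...
    'cachalot.panels.CachalotPanel',
]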

Settings

CACHALOT_ENABLED

Default: True
Description: If set to False, disables SQL caching but keeps invalidating to avoid stale cache.

CACHALOT_CACHE

Default: 'default'
Description: Alias of the cache from CACHES used by django-cachalot.

Warning

After modifying this setting, you should invalidate the cache using the manage.py command or the API. Indeed, only the cache configured using this setting is automatically invalidated by django-cachalot – for optimisation reasons. So when you change this setting, you end up on a cache that may contain stale data.
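
For example, here is a sketch dedicating a separate cache to django-cachalot; the 'cachalot' alias and the backend locations are made up for illustration:

# settings.py
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
    },
    'cachalot': {
        'BACKEND': 'django_redis.cache.RedisCache',
        'LOCATION': 'redis://127.0.0.1:6379/1',
    },
}

# django-cachalot stores and invalidates its keys only in this cache.
CACHALOT_CACHE = 'cachalot'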

CACHALOT_DATABASES

Default: 'supported_only'
Description: List, tuple, set or frozenset of database aliases from DATABASES against which django-cachalot will do caching. By default, the special value 'supported_only' enables django-cachalot only on supported database engines.

CACHALOT_TIMEOUT

Default: None
Description: Number of seconds during which the cache should consider data as valid. None means an infinite timeout.

Warning

Cache timeouts don’t work in a strict way on most cache backends. A cache might not keep data for the requested timeout: it can drop it from memory earlier than the specified timeout, or keep it longer even though the data is no longer returned when you request it. So don’t rely on timeouts to limit the size of your cache; you might face some unexpected behaviour. Always set the maximum cache size instead.

CACHALOT_CACHE_RANDOM

Default: False
Description: If set to True, caches random queries (those with order_by('?')).

CACHALOT_INVALIDATE_RAW

Default: True
Description: If set to False, disables automatic invalidation on raw SQL queries – read raw queries limits for more info.

CACHALOT_ONLY_CACHABLE_TABLES

Default: frozenset()
Description: Sequence of SQL table names that will be the only ones django-cachalot will cache. Only queries with a subset of these tables will be cached. The sequence being empty (as it is by default) doesn’t mean that no table can be cached: it disables this setting, so any table can be cached. CACHALOT_UNCACHABLE_TABLES has more weight than this: if you add a table to both settings, it will never be cached. Run ./manage.py invalidate_cachalot after changing this setting.

CACHALOT_UNCACHABLE_TABLES

Default: frozenset(('django_migrations',))
Description: Sequence of SQL table names that will be ignored by django-cachalot. Queries using a table mentioned in this setting will not be cached. Always keep 'django_migrations' in it, otherwise you may face some issues, especially during tests. Run ./manage.py invalidate_cachalot after changing this setting.
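
Here is a sketch combining both settings; the shop_* table names are hypothetical:

# settings.py
# Only queries touching these tables (and no others) will be cached.
CACHALOT_ONLY_CACHABLE_TABLES = frozenset((
    'shop_product',
    'shop_category',
))

# These tables are never cached, even if also listed above.
# Always keep 'django_migrations' here.
CACHALOT_UNCACHABLE_TABLES = frozenset((
    'django_migrations',
    'shop_order',
))

Remember to run ./manage.py invalidate_cachalot after changing either setting.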

CACHALOT_QUERY_KEYGEN

Default: 'cachalot.utils.get_query_cache_key'
Description: Python module path to the function that will be used to generate the cache key of a SQL query. Run ./manage.py invalidate_cachalot after changing this setting.

CACHALOT_TABLE_KEYGEN

Default: 'cachalot.utils.get_table_cache_key'
Description: Python module path to the function that will be used to generate the cache key of a SQL table. Clear your cache after changing this setting (it’s not enough to use ./manage.py invalidate_cachalot).
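
As an illustration, here is a sketch of custom key generators that simply namespace the default keys. The my_project.cachalot_keys module path and the prefix are made up; passing the arguments through untouched avoids depending on the exact signatures of the default functions:

# my_project/cachalot_keys.py
from cachalot import utils as cachalot_utils

def prefixed_query_cache_key(*args, **kwargs):
    # Delegate to the default implementation, then namespace the key.
    return 'myproject_' + cachalot_utils.get_query_cache_key(*args, **kwargs)

def prefixed_table_cache_key(*args, **kwargs):
    return 'myproject_' + cachalot_utils.get_table_cache_key(*args, **kwargs)

# settings.py
CACHALOT_QUERY_KEYGEN = 'my_project.cachalot_keys.prefixed_query_cache_key'
CACHALOT_TABLE_KEYGEN = 'my_project.cachalot_keys.prefixed_table_cache_key'

As documented above, run ./manage.py invalidate_cachalot after changing the query keygen, and clear the cache entirely after changing the table keygen.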

manage.py command

manage.py invalidate_cachalot is available to invalidate all the cache keys set by django-cachalot. If you run it without any argument, it invalidates all models on all caches and all databases. But you can specify which applications or models are invalidated, and on which cache or database.

Examples:

./manage.py invalidate_cachalot auth
Invalidates all models from the ‘auth’ application.
./manage.py invalidate_cachalot your_app auth.User
Invalidates all models from the ‘your_app’ application, but also the User model from the ‘auth’ application.
./manage.py invalidate_cachalot -c redis -p postgresql
Invalidates all models, but only for the database configured with the ‘postgresql’ alias, and only for the cache configured with the ‘redis’ alias.

Template utils

Caching template fragments can be extremely powerful to speedup a Django application. However, it often means you have to adapt your models to get a relevant cache key, typically by adding a timestamp that refers to the last modification of the object.

But even with that approach, caching template fragments leads to stale content most of the time. There’s a simple reason for that: we rarely display data from only one model; we often want to display related data, such as the number of books written by someone, a quote from one of this author’s books, similar authors, etc. In such situations, it’s impossible to cache template fragments and avoid stale rendered data.

Fortunately, django-cachalot provides an easy way to fix this issue, by simply checking when data last changed in the given models or tables. The API function get_last_invalidation does that, and a get_last_invalidation template tag is provided to use it directly in templates. It works exactly the same as the API function.

Django template tag

Example of a fairly heavy nested loop generating a lot of SQL queries (assuming no prefetching has been done):

{% load cachalot cache %}

{% get_last_invalidation 'auth.User' 'library.Book' 'library.Author' as last_invalidation %}
{% cache 3600 short_user_profile last_invalidation %}
  {{ user }} has borrowed these books:
  {% for book in user.borrowed_books.all %}
    <div class="book">
      {{ book }} ({{ book.pages.count }} pages)
      <span class="authors">
        {% for author in book.authors.all %}
          {{ author }}{% if not forloop.last %},{% endif %}
        {% endfor %}
      </span>
    </div>
  {% endfor %}
{% endcache %}

The cache_alias and db_alias keyword arguments of this template tag are also available (see cachalot.api.get_last_invalidation()).
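
For instance, the following fetches the timestamp from the 'redis' cache for the 'replica' database; both aliases are only examples:

{% get_last_invalidation 'auth.User' 'library.Book' cache_alias='redis' db_alias='replica' as last_invalidation %}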

Jinja2 statement and function

A Jinja2 extension for django-cachalot is available: simply add 'cachalot.jinja2ext.cachalot' to the 'extensions' list of the OPTIONS dict in the Django TEMPLATES setting.
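
For example, with Django’s built-in Jinja2 template backend (the DIRS value is illustrative; a django-jinja setup works similarly):

TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.jinja2.Jinja2',
        'DIRS': ['templates/jinja2'],
        'OPTIONS': {
            'extensions': ['cachalot.jinja2ext.cachalot'],
        },
    },
]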

It provides:

  • The API function get_last_invalidation, directly available as a function anywhere in Jinja2.
  • A cache Jinja2 statement equivalent to the cache template tag of Django.

The cache statement does the same thing as its Django template equivalent, except that cache_key and timeout are optional keyword arguments, and you need to add commas between arguments. When unspecified, cache_key is generated from the template filename plus the statement line number, and timeout defaults to infinite. To specify which cache should store the saved content, use the cache_alias keyword argument.

The same example as above, but for Jinja2:

{% cache get_last_invalidation('auth.User', 'library.Book', 'library.Author'),
         cache_key='short_user_profile', timeout=3600 %}
  {{ user }} has borrowed these books:
  {% for book in user.borrowed_books.all() %}
    <div class="book">
      {{ book }} ({{ book.pages.count() }} pages)
      <span class="authors">
        {% for author in book.authors.all() %}
          {{ author }}{% if not loop.last %},{% endif %}
        {% endfor %}
      </span>
    </div>
  {% endfor %}
{% endcache %}

Signal

cachalot.signals.post_invalidation is available if you need to do something just after a cache invalidation (when you modify something in a SQL table). sender is the name of the invalidated SQL table, and the keyword argument db_alias indicates which database is affected by the invalidation. Be careful when you specify sender: it is matched as an exact string, so to be safe, use Model._meta.db_table.

This signal is not triggered directly during transactions; it waits until the current transaction ends. It is also triggered when invalidating using the API or the manage.py command. Be careful when using multiple databases: if you invalidate all databases by simply calling invalidate(), this signal will be triggered once for each database and each model. If you have 3 databases and 20 models, invalidate() will trigger the signal 60 times.

Example:

from cachalot.signals import post_invalidation
from django.dispatch import receiver
from django.core.mail import mail_admins
from django.contrib.auth.models import User, Group

# This prints a message to the console after each table invalidation
def invalidation_debug(sender, **kwargs):
    db_alias = kwargs['db_alias']
    print('%s was invalidated in the DB configured as %s'
          % (sender, db_alias))

post_invalidation.connect(invalidation_debug)

# Using the `receiver` decorator is just a nicer way
# to write the same thing as `signal.connect`.
# Here we specify `sender` so that the function is executed only if
# the table invalidated is the one specified.
# We also connect it several times to be executed for several senders.
@receiver(post_invalidation, sender=User.groups.through._meta.db_table)
@receiver(post_invalidation, sender=User.user_permissions.through._meta.db_table)
@receiver(post_invalidation, sender=Group.permissions.through._meta.db_table)
def warn_admin(sender, **kwargs):
    mail_admins('User permissions changed',
                'Someone probably gained or lost Django permissions.')

Limits

High rate of database modifications

Do not use django-cachalot if your project has more than 50 database modifications per minute on most of its tables. Nothing will break, but django-cachalot will become inefficient and end up slowing your project down instead of speeding it up. Read the introduction for more details.

Redis

By default, Redis will not evict persistent cache keys (those with a None timeout) when the maximum memory has been reached. The cache keys created by django-cachalot are persistent by default, so if Redis runs out of memory, django-cachalot and all other cache.set calls will raise ResponseError: OOM command not allowed when used memory > 'maxmemory', because Redis is not allowed to delete persistent keys.

To avoid this, there are 2 solutions:

  • If you only store disposable data in Redis, you can change maxmemory-policy to allkeys-lru in your Redis configuration. Be aware that this setting is global; all your Redis databases will use it. If you don’t know what you’re doing, use the next solution or use another cache backend.
  • Increase maxmemory in your Redis configuration. You can start by setting it to a high value (for example half of your RAM), then lower it by monitoring the maximum size of the Redis database using redis-cli info memory.

For more information, read Using Redis as an LRU cache.

Memcached

By default, memcached is configured for small servers. The maximum amount of memory used by memcached is 64 MB, and the maximum memory per cache key is 1 MB. This latter limit can lead to weird unhandled exceptions such as Error: error 37 from memcached_set: SUCCESS if you execute queries returning more than 1 MB of data.

To increase these limits, set the -I and -m arguments when starting memcached. If you use Ubuntu and installed memcached from the system package, you can modify /etc/memcached.conf: add -I 10m on a new line to set the limit per cache key to 10 MB, and increase the existing -m 64 to something like -m 1000 to set the maximum cache size to 1 GB.

Locmem

Locmem is just a dict stored in a single Python process. It’s not shared between processes, so don’t use locmem with django-cachalot in a multi-process project, for example if you use RQ or Celery.

Filebased

Filebased, a simple persistent cache implemented in Django, has a small bug (#25501): it cannot cache some objects, like psycopg2 ranges. If you use range fields from django.contrib.postgres and your Django version is affected by this bug, you need to add the tables using range fields to CACHALOT_UNCACHABLE_TABLES.

MySQL

MySQL already provides something like django-cachalot by default: the MySQL query cache. Unfortunately, this built-in query cache has had no significant effect since at least MySQL 5.7. However, in MySQL 5.5 it was working so well that django-cachalot brought no improvement. So depending on the MySQL version, django-cachalot may be useless. See the current django-cachalot benchmark and compare it with an older run of the same benchmark to see the clear difference: MySQL became 4× slower since then!

Raw SQL queries

Note

Don’t worry if you don’t understand what follows. That probably means you don’t use raw queries, and are therefore not directly concerned by these potential issues.

By default, django-cachalot tries to invalidate its cache after a raw query. It detects if the raw query contains UPDATE, INSERT, DELETE, ALTER, CREATE or DROP and then invalidates the tables contained in that query by comparing with models registered by Django.

This is quite robust, so if a query is not invalidated automatically by this system, please send a bug report. In the meantime, you can use the API to manually invalidate the tables where data has changed.

However, this simple system can be overly aggressive in some very rare cases and lead to unwanted extra invalidations.

Multiple servers clock synchronisation

Django-cachalot relies on the computer clock to handle invalidation. If you deploy the same Django project on multiple machines, but with a centralised cache server, all the machines serving Django need to have their clocks as synchronised as possible. Otherwise, invalidations will happen with a latency from one server to another. A difference of even a few seconds can be harmful, so double check this!

To get a rough idea of the clock synchronisation of two servers, simply run python -c 'import time; print(time.time())' on both servers at the same time. This prints a number of seconds, and it should be almost the same on both servers, with a difference of less than 1 second. This number is independent of the time zone.

To keep your clocks synchronised, use the Network Time Protocol.

Replication server

If you use multiple databases where at least one is a replica of another, django-cachalot has no way to know that the replica is modified automatically, since that happens outside Django. The SQL queries cached for the replica will therefore not be invalidated, and you will get stale query results.

To fix this problem, you need to tell django-cachalot to also invalidate the replica when the primary database is invalidated. Suppose your primary database has the 'default' database alias in DATABASES, and your replica has the 'replica' alias. Use the signal and cachalot.api.invalidate() this way:

from cachalot.api import invalidate
from cachalot.signals import post_invalidation
from django.dispatch import receiver

@receiver(post_invalidation)
def invalidate_replica(sender, **kwargs):
    if kwargs['db_alias'] == 'default':
        invalidate(sender, db_alias='replica')

Multiple cache servers for the same database

On large projects, we often end up having multiple Django servers on several physical machines. For performance reasons, we generally decide to have a cache per server, while the database stays on a single server. But the problem with django-cachalot is that it only invalidates the cache configured using CACHALOT_CACHE, so all the other caches end up serving stale data.

To avoid this, each Django server needs to be able to communicate with the other servers’ caches in order to invalidate them when an invalidation occurs. If this is not possible in your situation, you must not use django-cachalot. If it is possible, each Django server must also have all the other caches in its CACHES setting. Then you need to manually invalidate all the other caches when an invalidation occurs. Add this to a models.py file of an installed application:

import threading

from cachalot.api import invalidate
from cachalot.signals import post_invalidation
from django.dispatch import receiver
from django.conf import settings

SIGNAL_INFO = threading.local()

@receiver(post_invalidation)
def invalidate_other_caches(sender, **kwargs):
    if getattr(SIGNAL_INFO, 'was_called', False):
        return
    db_alias = kwargs['db_alias']
    for cache_alias in settings.CACHES:
        if cache_alias == settings.CACHALOT_CACHE:
            continue
        SIGNAL_INFO.was_called = True
        try:
            invalidate(sender, db_alias=db_alias, cache_alias=cache_alias)
        finally:
            SIGNAL_INFO.was_called = False

API

Use these tools to interact with django-cachalot, especially if you face raw queries limits or if you need to create a cache key from the last table invalidation timestamp.

cachalot.api.invalidate(*tables_or_models, **kwargs)[source]

Clears what was cached by django-cachalot involving one or more SQL tables or models from tables_or_models. If tables_or_models is not specified, all tables found in the database (including those outside Django) are invalidated.

If cache_alias is specified, it only clears the SQL queries stored on this cache, otherwise queries from all caches are cleared.

If db_alias is specified, it only clears the SQL queries executed on this database, otherwise queries from all databases are cleared.

Parameters:
  • tables_or_models (tuple of strings or models) – SQL table names, models or model lookups (or a combination)
  • cache_alias (string or NoneType) – Alias from the Django CACHES setting
  • db_alias (string or NoneType) – Alias from the Django DATABASES setting
Returns: Nothing
Return type: NoneType
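
For example (the table name, model lookup and aliases below are illustrative):

from cachalot.api import invalidate

# Invalidate a model lookup and a raw table name on all caches and databases.
invalidate('auth.User', 'your_app_yourmodel')

# Invalidate everything, but only on one cache and one database.
invalidate(cache_alias='redis', db_alias='postgresql')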

cachalot.api.get_last_invalidation(*tables_or_models, **kwargs)[source]

Returns the timestamp of the most recent invalidation of the given tables_or_models. If tables_or_models is not specified, all tables found in the database (including those outside Django) are used.

If cache_alias is specified, it only fetches invalidations in this cache, otherwise invalidations in all caches are fetched.

If db_alias is specified, it only fetches invalidations for this database, otherwise invalidations for all databases are fetched.

Parameters:
  • tables_or_models (tuple of strings or models) – SQL table names, models or model lookups (or a combination)
  • cache_alias (string or NoneType) – Alias from the Django CACHES setting
  • db_alias (string or NoneType) – Alias from the Django DATABASES setting
Returns: The timestamp of the most recent invalidation
Return type: float
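
For example, here is a sketch using the timestamp as part of a low-level cache key, reusing the hypothetical library application from the template example above:

from django.core.cache import cache
from cachalot.api import get_last_invalidation

from library.models import Book  # hypothetical example model

def get_cached_book_count():
    # The key changes whenever the Book table is invalidated,
    # so stale counts are never served.
    last_invalidation = get_last_invalidation('library.Book')
    key = 'book_count:%s' % last_invalidation
    count = cache.get(key)
    if count is None:
        count = Book.objects.count()
        cache.set(key, count, 3600)
    return count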

Benchmark

Introduction

This benchmark does not intend to be exhaustive nor fair to SQL. It shows how django-cachalot behaves on an unoptimised application. On an application using perfectly optimised SQL queries only, django-cachalot may not be useful. Unfortunately, most Django apps (including Django itself) use unoptimised queries. Of course, they often lack useful indexes (even though it only requires 20 characters per index…). But what you may not know is that the ORM currently generates totally unoptimised queries [1].

Conditions

In this benchmark, a small database is generated, and each test is executed 20 times under the following conditions:

CPU Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
RAM 24634516 kB
Disk SAMSUNG MZVPW256HEGL-00000
Linux distribution Ubuntu 18.04 bionic
Python 3.6.5
Django 2.0.5
cachalot 2.0.0
sqlite 3.22.0
PostgreSQL 10.3
MySQL 5.7.22
Redis 4.0.9
memcached 1.5.6
psycopg2 2.7.4
mysqlclient 1.3.12

Note that MySQL’s query cache is active during the benchmark.

Database results

  • mysql is 1.1× slower on the first execution, then 3.9× faster
  • postgresql is 1.1× slower on the first execution, then 8.6× faster
  • sqlite is 1.1× slower on the first execution, then 4.3× faster
[Chart: _images/db.svg]

Cache results

  • filebased is 1.1× slower on the first execution, then 5.4× faster
  • locmem is 1.1× slower on the first execution, then 5.8× faster
  • memcached is 1.1× slower on the first execution, then 5.2× faster
  • pylibmc is 1.1× slower on the first execution, then 5.4× faster
  • redis is 1.1× slower on the first execution, then 5.3× faster
[Chart: _images/cache.svg]

Cache detailed results

Redis

[Chart: _images/cache_redis.svg]
[1] The ORM fetches way too much data if you don’t restrict it using .only and .defer. You can divide the execution time of most queries by 2-3 by specifying what you want to fetch. But specifying which data we want for each query is very long and unmaintainable. An automation using field usage statistics is possible and would drastically improve performance. Other performance issues occur with slicing. You can often optimise a sliced query using a subquery, like YourModel.objects.filter(pk__in=YourModel.objects.filter(…)[10000:10050]).select_related(…) instead of YourModel.objects.filter(…).select_related(…)[10000:10050]. I’ll maybe work on these issues one day.

What could still be done

  • Cache raw queries (may not be possible due to database cursors being written in C)
  • Allow setting CACHALOT_CACHE to None in order to disable django-cachalot persistence. SQL queries would only be cached during transactions, so setting ATOMIC_REQUESTS to True would cache SQL queries only during a request-response cycle. This would be useful for websites with a lot of invalidations (a social network, for example) but where the same SQL queries occur several times in a single request-response cycle, as happens in the Django admin.
  • Create a command to check clock synchronisation between remote servers

Bug reports, questions, discussion, new features

  • If you spotted a bug, please file a precise bug report on GitHub
  • If you have a question on how django-cachalot works, or simply want to discuss, chat with us on Gitter.
  • If you want to add a feature:
    • if you have an idea on how to implement it, you can fork the project and send a pull request, but please open an issue first, because someone else could already be working on it
    • if you’re sure that it’s a must-have feature, open an issue
    • if it’s just a vague idea, please ask on gitter first

How django-cachalot works

Reverse engineering

It’s a lot of Django reverse engineering combined with a strong test suite. Such a test suite is crucial for a reverse engineering project. If some important part of Django changes and breaks the expected behaviour, you can be sure that the test suite will fail.

Monkey patching

Django-cachalot modifies Django in place during execution to add a caching layer just before SQL queries are executed. When a SQL query reads data, we save the result in the cache. If that same query is executed later, we fetch the result from the cache. When we detect an INSERT, UPDATE or DELETE, we know which tables are modified, so all the previously cached queries involving those tables can be safely invalidated.
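
To illustrate the principle only (this is not django-cachalot’s actual code), here is a toy version of the caching and per-table invalidation logic:

class ToyQueryCache:
    def __init__(self):
        self.results = {}  # SQL string -> cached rows
        self.tables = {}   # table name -> set of SQL strings using it

    def fetch(self, sql, tables, run_query):
        """Return cached rows for a read query, executing it only once."""
        if sql not in self.results:
            self.results[sql] = run_query()  # the only real database hit
            for table in tables:
                self.tables.setdefault(table, set()).add(sql)
        return self.results[sql]

    def invalidate(self, table):
        """Called when an INSERT, UPDATE or DELETE touches `table`."""
        for sql in self.tables.pop(table, set()):
            self.results.pop(sql, None)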

Legacy

This work is highly inspired by johnny-cache, another easy-to-use ORM caching tool! It works with Django <= 1.5. I used it in production for 3 years; it’s an excellent module!

Unfortunately, we failed to migrate it to Django 1.6 (I was involved), mostly because the transaction system was entirely refactored.

I also noticed a few advanced invalidation issues when using QuerySet.extra and in some complex cases involving multi-table inheritance and related ManyToManyField.

What’s new in django-cachalot?

2.0.0

  • Adds Django 2.0 support
  • Drops Django 1.10 support
  • Drops Django 1.8 support (1.9 support was dropped in 1.5.0)
  • Adds a check to make sure it is used with a supported Django version
  • Fixes a bug partially breaking django-cachalot when an error occurred during the end of a transaction.atomic block, typically when using deferred constraints

1.5.0

  • Adds Django 1.11 support
  • Adds Python 3.6 support
  • Drops Django 1.9 support (but 1.8 is still supported)
  • Drops Python 3.3 support
  • Adds CACHALOT_DATABASES to specify which databases have django-cachalot enabled (by default, only supported databases are enabled)
  • Stops advising users to dynamically override cachalot settings as it cannot be thread-safe due to Django’s internals
  • Invalidates tables after raw CREATE, ALTER & DROP SQL queries
  • Allows specifying model lookups like auth.User in the API functions (previously, it could only be done in the Django template tag, not in the Jinja2 get_last_invalidation function nor in API functions)
  • Fixes the cache used by CachalotPanel if CACHALOT_CACHE is different from 'default'
  • Uploads a wheel distribution of this package to PyPI starting now, in addition to the source release
  • Improves tests

1.4.1

  • Fixes a circular import occurring when CachalotPanel is used and django-debug-toolbar is before django-cachalot in INSTALLED_APPS
  • Stops checking compatibility for caches other than CACHALOT_CACHE

1.4.0

  • Fixes a bad design: QuerySet.select_for_update was cached, which is not correct since it does not lock data in the database when results are served from the cache, making the database lock useless in some cases
  • Stops automatically invalidating other caches than CACHALOT_CACHE for consistency, performance, and usefulness reasons
  • Fixes a minor issue: the post_invalidation signal was sent during transactions when calling the invalidate command
  • Creates a gitter chat room
  • Removes the Slack team. Slack does not allow public chat, this was therefore a bad idea

1.3.0

  • Adds Django 1.10 support
  • Drops Django 1.7 support
  • Drops Python 3.2 support
  • Adds a Jinja2 extension with a cache statement and the get_last_invalidation function
  • Adds a CACHALOT_TIMEOUT setting after dozens of private & public requests, but it’s not really useful
  • Fixes a RuntimeError occurring if a DatabaseCache was used in a project, even if not used by django-cachalot
  • Allows bytes raw queries (except on SQLite where it’s not supposed to work)
  • Creates a Slack team to discuss, easier than using Google Groups

1.2.1

Mandatory update if you’re using django-cachalot 1.2.0.

This version reverts the cache keys hashing change from 1.2.0, as it was leading to a non-shared cache when Python used a random seed for hashing, which is the case by default on Python 3.3, 3.4, & 3.5, and also on 2.7 & 3.2 if you set PYTHONHASHSEED=random.

1.2.0

WARNING: This version is unsafe, it can lead to invalidation errors

  • Adds Django 1.9 support
  • Simplifies and speeds up cache keys hashing
  • Documents how to use django-cachalot with a replica database
  • Adds DummyCache to VALID_CACHE_BACKENDS
  • Updates the comparison with django-cache-machine & django-cacheops by checking features and measuring performance instead of relying on their documentations and a 2-year-old experience of them

1.1.0

Backwards incompatible changes:

  • Adds Django 1.8 support and drops Django 1.6 & Python 2.6 support
  • Merges the 3 API functions invalidate_all, invalidate_tables, & invalidate_models into a single invalidate function while optimising it

Other additions:

  • Adds a get_last_invalidation function to the API and the equivalent template tag
  • Adds a CACHALOT_ONLY_CACHABLE_TABLES setting in order to make a whitelist of the only table names django-cachalot can cache
  • Caches queries with IP addresses, floats, or decimals in parameters
  • Adds a Django check to ensure the project uses compatible cache and database backends
  • Adds a lot of tests, especially to test django.contrib.postgres
  • Adds a comparison with django-cache-machine and django-cacheops in the documentation

Fixed:

  • Removes a useless extra invalidation during each write operation to the database, leading to a small speedup during data modification and tests

  • The post_invalidation signal was triggered during transactions and was not triggered when using the API or raw write queries: both issues are now fixed

  • Fixes a very unlikely invalidation issue occurring only when an error occurred in a transaction after a transaction of another database nested in the first transaction was committed, like this:

    from django.db import transaction
    
    assert list(YourModel.objects.using('another_db')) == []
    
    try:
        with transaction.atomic():
            with transaction.atomic('another_db'):
                obj = YourModel.objects.using('another_db').create(name='test')
            raise ZeroDivisionError
    except ZeroDivisionError:
        pass
    
    # Before django-cachalot 1.1.0, this assert was failing.
    assert list(YourModel.objects.using('another_db')) == [obj]
    

1.0.3

  • Fixes an invalidation issue that could rarely occur when querying on a BinaryField with PostgreSQL, or with some geographic queries (there was a small chance that a same query with different parameters could erroneously give the same result as the previous one)
  • Adds a CACHALOT_UNCACHABLE_TABLES setting
  • Fixes a Django 1.7 migrations invalidation issue in tests (that was leading to this error half of the time: RuntimeError: Error creating new content types. Please make sure contenttypes is migrated before trying to migrate apps individually.)
  • Optimises tests when using django-cachalot by avoiding several useless cache invalidations

1.0.2

  • Fixes an AttributeError occurring when excluding through a many-to-many relation on a child model (using multi-table inheritance)
  • Stops caching queries with random subqueries – for example User.objects.filter(pk__in=User.objects.order_by('?'))
  • Optimises automatic invalidation
  • Adds a note about clock synchronisation

1.0.1

  • Fixes an invalidation issue discovered by Helen Warren that was occurring when updating a ManyToManyField after executing a queryset using .exclude on that relation. For example, Permission.objects.all().delete() was not invalidating User.objects.exclude(user_permissions=None)
  • Fixes a UnicodeDecodeError introduced with python-memcached 1.54
  • Adds a post_invalidation signal

1.0.0

Fixes a bug occurring when caching a SQL query using a non-ascii table name.

1.0.0rc

Added:

  • Adds an invalidate_cachalot command to invalidate django-cachalot from a script without having to clear the whole cache
  • Adds the benchmark introduction, conditions & results to the documentation
  • Adds a short guide on how to configure Redis as an LRU cache

Fixed:

  • Fixes a rare invalidation issue occurring when updating a many-to-many table after executing a queryset generating a HAVING SQL statement – for example, User.objects.first().user_permissions.add(Permission.objects.first()) was not invalidating User.objects.annotate(n=Count('user_permissions')).filter(n__gte=1)

  • Fixes an even rarer invalidation issue occurring when updating a many-to-many table after executing a queryset filtering nested subqueries by another subquery through that many-to-many table – for example:

    User.objects.filter(
        pk__in=User.objects.filter(
            pk__in=User.objects.filter(
                user_permissions__in=Permission.objects.all())))
    
  • Avoids setting useless cache keys by using table names instead of Django-generated table aliases

0.9.0

Added:

  • Caches all queries involving QuerySet.extra
  • Invalidates raw queries
  • Adds a simple API containing: invalidate_tables, invalidate_models, invalidate_all
  • Adds file-based cache support for Django 1.7
  • Adds a setting to choose if random queries must be cached
  • Adds 2 settings to customize how cache keys are generated
  • Adds a django-debug-toolbar panel
  • Adds a benchmark

Fixed:

  • Rewrites invalidation for a better speed & memory performance
  • Fixes a stale cache issue occurring when an invalidation is done exactly during a SQL request on the invalidated table(s)
  • Fixes a stale cache issue occurring after concurrent transactions
  • Uses an infinite timeout

Removed:

  • Simplifies cachalot_settings and forbids its use or modification

0.8.1

  • Fixes an issue with pip if Django is not yet installed

0.8.0

  • Adds multi-database support
  • Adds invalidation when altering the DB schema using migrate, syncdb, flush, loaddata commands (also invalidates South, if you use it)
  • Small optimizations & simplifications
  • Adds several tests

0.7.0

  • Adds thread-safety
  • Optimizes the amount of cache queries during transactions

0.6.0

  • Adds memcached support

0.5.0

  • Adds CACHALOT_ENABLED & CACHALOT_CACHE settings
  • Allows settings to be dynamically overridden using cachalot_settings
  • Adds some missing tests

0.4.1

  • Fixes pip install.

0.4.0 (install broken)

  • Adds Travis CI and adds compatibility for:
    • Django 1.6 & 1.7
    • Python 2.6, 2.7, 3.2, 3.3, & 3.4
    • locmem & Redis
    • SQLite, PostgreSQL, MySQL

0.3.0

  • Handles transactions
  • Adds lots of tests for complex cases

0.2.0

  • Adds a test suite
  • Fixes invalidation for data creation/deletion
  • Stops caching on queries defining select or where arguments with QuerySet.extra

0.1.0

Prototype simply caching all SQL queries reading the database and trying to invalidate them when SQL queries modify the database.

Has issues invalidating deletions and creations. Also caches QuerySet.extra queries but can’t reliably invalidate them. No transaction support, no test, no multi-database support, etc.