Django ORM and Updating Counters

Counters are common in website development. Most websites collect the number of page views with counters. E-commerce websites keep track of the quantity of a commodity with counters. However, it is hard to implement a correct counter with Django ORM.

In this post, I would like to cover three ways to update counters in Django. The first and the second have race conditions. Skip to the third implementation if you would like to know the correct solution immediately.

Case Study: A Ticket Issuing App

To demonstrate the problem, I would like to build an app to issue tickets with serial numbers. Each user must get one integer representing the number of tickets that have been issued prior to the user. Apparently, the numbers must be distinct and all requests from users must be serialized in a certain order.

The source code has been uploaded to loganchien/django-counter-demo. There are 4 views in counter/views.py. These views will be elaborated in the upcoming sections.

To run the app with 10 workers concurrently, launch the app with:

$ gunicorn -w 10 counter_demo.wsgi

To test the correctness, stress_test.py will send 400 requests to get the serial numbers. The requests are divided into 4 threads, thus each thread is responsible for 100 requests. After collecting the numbers from each threads, the test script will count the number of distinct serial numbers. The expected answer is 400. Finally, the test script will send another request to get the latest number in the database. The expected answer is 401. The source code of the test script is listed below:

#!/usr/bin/env python3

import argparse
import multiprocessing
import traceback
import urllib.request

def get_number(url):
    with urllib.request.urlopen(url) as response:
        return int(response.read().decode('utf-8'))

def get_numbers(args):
    task_id, url, num_requests = args
    response_set = set()
    try:
        for i in range(num_requests):
            value = get_number(url)
            print('task', task_id, 'received', value)
            response_set.add(value)
    except Exception:
        traceback.print_exc()
    return response_set

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('url')
    parser.add_argument('-j', '--parallel', default=4, type=int)
    parser.add_argument('-n', '--request', default=100, type=int)
    args = parser.parse_args()

    tasks = [(i, args.url, args.request) for i in range(args.parallel)]

    with multiprocessing.Pool(processes=args.parallel) as pool:
        response_sets = pool.map(get_numbers, tasks)

    answer = set()
    for i, response_set in enumerate(response_sets):
        print('task', i, 'received', len(response_set), 'numbers')
        answer.update(response_set)

    print('total', len(answer))
    print('last extra request got:', get_number(args.url))

if __name__ == '__main__':
    main()

Naive but Incorrect

At the first glance, it looks straightforward to implement a counter. We can simply load an object with get_or_create(), add num_counts by one, and save the object with save():

def counter1(request):
    counter = Counter.objects.get_or_create(pk=1)[0]
    counter.num_counts += 1
    counter.save()

    return HttpResponse(str(counter.num_counts),
                        content_type='text/plain')

However, this doesn't work:

$ ./stress_test.py http://localhost:8000/counter1
...
task 0 received 100 numbers
task 1 received 100 numbers
task 2 received 100 numbers
task 3 received 100 numbers
total 109
last extra request got: 110

This total number is incorrect because there is a race condition between get_or_create() and save(). Another worker may update num_counts in the database after get_or_create() returns.

Using F() expressions

In Django apps, programmers may refer the value in the database with F() expressions. It will be translated into SQL statements and take effects after calling the save() method. After updating the database, you should call refresh_from_db() to load the latest value from the database:

def counter2(request):
    counter = Counter.objects.get_or_create(pk=2)[0]
    counter.num_counts = F('num_counts') + 1
    counter.save()

    counter.refresh_from_db()

    return HttpResponse(str(counter.num_counts),
                        content_type='text/plain')

It seems working. The total number and the last number seem correct:

$ ./stress_test.py http://localhost:8000/counter2
...
task 0 received 100 numbers
task 1 received 100 numbers
task 2 received 100 numbers
task 3 received 100 numbers
total 400
last extra request got: 401

However, this is not correct. There is a race condition between save() and refresh_from_db(). You can observe the race condition by sleeping randomly between save() and refresh_from_db():

def counter3(request):
    counter = Counter.objects.get_or_create(pk=3)[0]
    counter.num_counts = F('num_counts') + 1
    counter.save()

    # Demonstrate race condition by sleeping randomly
    # between save() and refresh_from_db().
    time.sleep(random.randint(0, 5) / 1000)

    counter.refresh_from_db()

    return HttpResponse(str(counter.num_counts),
                        content_type='text/plain')

Now, here is the problem:

$ ./stress_test.py http://localhost:8000/counter3
...
task 0 received 100 numbers
task 1 received 100 numbers
task 2 received 100 numbers
task 3 received 100 numbers
total 260
last extra request got: 401

The value in the database is correct. However, some users incorrectly get the numbers that are created by other transactions.

Some users get the incorrect numbers because refresh_from_db() retrieves the latest version in the database. Unfortunately, AFAIK, all Django model updating functions, including save() and update(), do not return the updated values.

Using select_for_update()

The third implementation makes queries with select_for_update(). Unlike select() or get_or_create(), select_for_update() will acquire the write lock of the selected rows so that other transactions won't be able to update the row until the transaction ends. Furthermore, if two select_for_update() are competing for a row, the later one is blocked until the lock is released.

The code below starts a transaction with transaction.atomic(), load the object with select_for_update(), compute num_counts in Python, and save the updated object to database with save().

def counter4(request):
    Counter.objects.get_or_create(pk=4)

    with transaction.atomic():
        counter = Counter.objects.filter(id=4).select_for_update()[0]
        num_counts = counter.num_counts + 1
        counter.num_counts = num_counts
        counter.save()

    return HttpResponse(str(num_counts), content_type='text/plain')

Since other transactions won't be able to update the object in the middle of the transaction, the local variable num_counts is guaranteed to be the number for this user:

$ ./stress_test.py http://localhost:8000/counter4
...
task 0 received 100 numbers
task 1 received 100 numbers
task 2 received 100 numbers
task 3 received 100 numbers
total 400
last extra request got: 401

As expected, the answer is correct.

However, only PostgreSQL, Oracle, and MySQL support select_for_update(). Some backend will ignore select_for_update() silently. Thus, it is important to choose a right database backend.

Conclusion

In this post, we covered three different ways to update counters:

  • The first implementation doesn't work at all.
  • The second implementation uses F() expressions. The value in the database is correct, but this implementation won't be able to get the updated value. As a result, the second implementation can't solve the ticket issuing problem. However, if your application does not impose strong requirements on the numbers returned to the users, this is still a good solution.
  • The third implementation uses select_for_update() to acquire the write lock of the selected row. This correctly solves the ticket issuing problem. However, only PostgreSQL, Oracle, and MySQL support select_for_update().

Concurrent programming is always difficult. Even though updating counters looks simple, there are many pitfalls. But that's the reason why programming is fun.

References