In Django, One-to-Many relations are modeled by the ForeignKey
. Under
some circumstances, we may wish to collect some information (e.g. count,
min, max, avg, etc) from related objects. For example, in a blog
app, we may wish to build a table which lists the title and the number of
comments in each posts. This feature can be easily implemented with
QuerySet.annotate() and Aggregation. In this post, I will go through a
simple example that covers the basic usages of Django Aggregation.
Let's start with the Post
and Comment
model. In
example/models.py
, two models are defined. The Comment
model
has a ForeignKey
field, thus many Comment
objects may be related
to a Post
object:
from django.db import models
class Post(models.Model):
title = models.CharField(max_length=255)
class Comment(models.Model):
post = models.ForeignKey('Post', on_delete=models.CASCADE)
comment = models.CharField(max_length=255)
hidden = models.BooleanField(default=False)
num_thumbs_up = models.IntegerField(default=0)
Initialize the database with:
python manage.py makemigrations
python manage.py migrate
Start the Django shell with:
python manage.py shell
Then, build some test data with:
from example.models import Post, Comment
x = Post.objects.create(title='x')
y = Post.objects.create(title='y')
x1 = Comment.objects.create(post=x, comment='x1', num_thumbs_up=5)
x2 = Comment.objects.create(post=x, comment='x2', num_thumbs_up=10)
y1 = Comment.objects.create(post=y, comment='y1', num_thumbs_up=1)
y2 = Comment.objects.create(post=y, comment='y2', num_thumbs_up=3)
y3 = Comment.objects.create(post=y, comment='y3', num_thumbs_up=5)
y4 = Comment.objects.create(post=y, comment='y4', num_thumbs_up=7,
hidden=True)
Here comes our first example. To list posts with the number of comments of each
posts, we have to annotate our query with QuerySet.annotate() function. The
code below annotate the query with an extra field named num_comments
.
The value of num_comments
is derived from
models.Count('comment')
:
from django.db import models
ps = Post.objects.annotate(num_comments=models.Count('comment')).all()
for p in ps:
print(p.title, p.num_comments)
The output shows that there are two comments in the first post, which has
x
as its title, and four comments in the second post, which has
y
as its title:
x 2
y 4
We may add some conditions while counting the related objects. For example, we
may wish to count the good comments with 6 or more thumbs up. This can be
implemented by adding a filter
argument to the Count()
function.
The filter argument should be a Q() object, which specifies extra SQL
expressions.
In this example, comment__num_thumbs_up__gte
(separated by double
underscores) means that the num_thumbs_up
field of the related
comment
object must be greater than or equal to (gte
) 6:
ps = Post.objects \
.annotate(num_comments=models.Count('comment')) \
.annotate(num_good_comments=models.Count(
'comment', filter=models.Q(comment__num_thumbs_up__gte=6))) \
.all()
for p in ps:
print(p.title, p.num_comments, p.num_good_comments)
The output prints both num_comments
and num_good_comments
:
x 2 1
y 4 1
Q()
objects accept more than one arguments. If two (or more) arguments
are passed, only the objects that pass all tests are counted. For example, if
we would like to hide the hidden comments from the output, we can add
comment__hidden=False
to the Q()
objects:
ps = Post.objects \
.annotate(num_comments=models.Count(
'comment', filter=models.Q(comment__hidden=False))) \
.annotate(num_good_comments=models.Count(
'comment', filter=models.Q(
comment__hidden=False, comment__num_thumbs_up__gte=6))) \
.all()
for p in ps:
print(p.title, p.num_comments, p.num_good_comments)
The output no longer counts the hidden comment in post y
:
x 2 1
y 3 0
Last, we can list the minimum (Min()
), average (Avg()
), and
maximum (Max()
) value of the num_thumbs_up
field with:
ps = Post.objects \
.annotate(min_thumbs_up=models.Min('comment__num_thumbs_up')) \
.annotate(avg_thumbs_up=models.Avg('comment__num_thumbs_up')) \
.annotate(max_thumbs_up=models.Max('comment__num_thumbs_up')) \
.all()
for p in ps:
print(p.title, p.min_thumbs_up, p.avg_thumbs_up, p.max_thumbs_up)
The output shows that all comments in the first post have at least 5 thumbs up, 7.5 thumbs up on average, and at most 10 thumbs up and so forth:
x 5 7.5 10
y 1 4.0 7