Query Expressions

Query expressions describe a value or a computation that can be used as part of a filter, order by, annotation, or aggregate. There are a number of built-in expressions (documented below) that can be used to help you write queries. Expressions can be combined, or in some cases nested, to form more complex computations.

Supported arithmetic

Django supports addition, subtraction, multiplication, division, modulo arithmetic, and the power operator on query expressions, using Python constants, variables, and even other expressions.

New in Django 1.7:

Support for the power operator ** was added.

Some examples

Changed in Django 1.8:

Some of the examples rely on functionality that is new in Django 1.8.

# Find companies that have more employees than chairs.
Company.objects.filter(num_employees__gt=F('num_chairs'))

# Find companies that have at least twice as many employees
# as chairs. Both the querysets below are equivalent.
Company.objects.filter(num_employees__gt=F('num_chairs') * 2)
Company.objects.filter(
    num_employees__gt=F('num_chairs') + F('num_chairs'))

# How many chairs are needed for each company to seat all employees?
>>> company = Company.objects.filter(
...    num_employees__gt=F('num_chairs')).annotate(
...    chairs_needed=F('num_employees') - F('num_chairs')).first()
>>> company.num_employees
120
>>> company.num_chairs
50
>>> company.chairs_needed
70

# Annotate models with an aggregated value. Both forms
# below are equivalent.
Company.objects.annotate(num_products=Count('products'))
Company.objects.annotate(num_products=Count(F('products')))

# Aggregates can contain complex computations also
Company.objects.annotate(num_offerings=Count(F('products') + F('services')))

# Expressions can also be used in order_by()
Company.objects.order_by(Length('name').asc())
Company.objects.order_by(Length('name').desc())

Built-in Expressions

F() expressions

class F

An F() object represents the value of a model field or annotated column. It makes it possible to refer to model field values and perform database operations using them without actually having to pull them out of the database into Python memory.

Instead, Django uses the F() object to generate a SQL expression that describes the required operation at the database level.

This is easiest to understand through an example. Normally, one might do something like this:

# Tintin filed a news story!
reporter = Reporters.objects.get(name='Tintin')
reporter.stories_filed += 1
reporter.save()

Here, we have pulled the value of reporter.stories_filed from the database into memory and manipulated it using familiar Python operators, and then saved the object back to the database. But instead we could also have done:

from django.db.models import F
reporter = Reporters.objects.get(name='Tintin')
reporter.stories_filed = F('stories_filed') + 1
reporter.save()

Although reporter.stories_filed = F('stories_filed') + 1 looks like a normal Python assignment of value to an instance attribute, in fact it’s an SQL construct describing an operation on the database.

When Django encounters an instance of F(), it overrides the standard Python operators to create an encapsulated SQL expression; in this case, one which instructs the database to increment the database field represented by reporter.stories_filed.

Whatever value is or was on reporter.stories_filed, Python never gets to know about it - it is dealt with entirely by the database. All Python does, through Django’s F() class, is create the SQL syntax to refer to the field and describe the operation.

Note

In order to access the new value that has been saved in this way, the object will need to be reloaded:

reporter = Reporters.objects.get(pk=reporter.pk)

As well as being used in operations on single instances as above, F() can be used on QuerySets of object instances, with update(). This reduces the two queries we were using above - the get() and the save() - to just one:

reporter = Reporters.objects.filter(name='Tintin')
reporter.update(stories_filed=F('stories_filed') + 1)

We can also use update() to increment the field value on multiple objects - which could be very much faster than pulling them all into Python from the database, looping over them, incrementing the field value of each one, and saving each one back to the database:

Reporter.objects.all().update(stories_filed=F('stories_filed) + 1)

F() therefore can offer performance advantages by:

  • getting the database, rather than Python, to do work
  • reducing the number of queries some operations require

Avoiding race conditions using F()

Another useful benefit of F() is that having the database - rather than Python - update a field’s value avoids a race condition.

If two Python threads execute the code in the first example above, one thread could retrieve, increment, and save a field’s value after the other has retrieved it from the database. The value that the second thread saves will be based on the original value; the work of the first thread will simply be lost.

If the database is responsible for updating the field, the process is more robust: it will only ever update the field based on the value of the field in the database when the save() or update() is executed, rather than based on its value when the instance was retrieved.

Using F() in filters

F() is also very useful in QuerySet filters, where they make it possible to filter a set of objects against criteria based on their field values, rather than on Python values.

This is documented in using F() expressions in queries.

Func() expressions

New in Django 1.8.

Func() expressions are the base type of all expressions that involve database functions like COALESCE and LOWER, or aggregates like SUM. They can be used directly:

queryset.annotate(field_lower=Func(F('field'), function='LOWER'))

or they can be used to build a library of database functions:

class Lower(Func):
    function = 'LOWER'

queryset.annotate(field_lower=Lower(F('field')))

But both cases will result in a queryset where each model is annotated with an extra attribute field_lower produced, roughly, from the following SQL:

SELECT
    ...
    LOWER("app_label"."field") as "field_lower"

See Database Functions for a list of built-in database functions.

The Func API is as follows:

class Func(*expressions, **extra)
function

A class attribute describing the function that will be generated. Specifically, the function will be interpolated as the function placeholder within template. Defaults to None.

template

A class attribute, as a format string, that describes the SQL that is generated for this function. Defaults to '%(function)s(%(expressions)s)'.

arg_joiner

A class attribute that denotes the character used to join the list of expressions together. Defaults to ', '.

The *expressions argument is a list of positional expressions that the function will be applied to. The expressions will be converted to strings, joined together with arg_joiner, and then interpolated into the template as the expressions placeholder.

The **extra kwargs are key=value pairs that can be interpolated into the template attribute. Note that the keywords function and template can be used to replace the function and template attributes respectively, without having to define your own class. output_field can be used to define the expected return type.

Aggregate() expressions

An aggregate expression is a special case of a Func() expression that informs the query that a GROUP BY clause is required. All of the aggregate functions, like Sum() and Count(), inherit from Aggregate().

Since Aggregates are expressions and wrap expressions, you can represent some complex computations:

Company.objects.annotate(
    managers_required=(Count('num_employees') / 4) + Count('num_managers'))

The Aggregate API is as follows:

class Aggregate(expression, output_field=None, **extra)
template

A class attribute, as a format string, that describes the SQL that is generated for this aggregate. Defaults to '%(function)s( %(expressions)s )'.

function

A class attribute describing the aggregate function that will be generated. Specifically, the function will be interpolated as the function placeholder within template. Defaults to None.

The expression argument can be the name of a field on the model, or another expression. It will be converted to a string and used as the expressions placeholder within the template.

The output_field argument requires a model field instance, like IntegerField() or BooleanField(), into which Django will load the value after it’s retrieved from the database. Usually no arguments are needed when instantiating the model field as any arguments relating to data validation (max_length, max_digits, etc.) will not be enforced on the expression’s output value.

Note that output_field is only required when Django is unable to determine what field type the result should be. Complex expressions that mix field types should define the desired output_field. For example, adding an IntegerField() and a FloatField() together should probably have output_field=FloatField() defined.

Changed in Django 1.8:

output_field is a new parameter.

The **extra kwargs are key=value pairs that can be interpolated into the template attribute.

New in Django 1.8:

Aggregate functions can now use arithmetic and reference multiple model fields in a single function.

Creating your own Aggregate Functions

Creating your own aggregate is extremely easy. At a minimum, you need to define function, but you can also completely customize the SQL that is generated. Here’s a brief example:

class Count(Aggregate):
    # supports COUNT(distinct field)
    function = 'COUNT'
    template = '%(function)s(%(distinct)s%(expressions)s)'

    def __init__(self, expression, distinct=False, **extra):
        super(Count, self).__init__(
            expression,
            distinct='DISTINCT ' if distinct else '',
            output_field=IntegerField(),
            **extra)

Value() expressions

class Value(value, output_field=None)

A Value() object represents the smallest possible component of an expression: a simple value. When you need to represent the value of an integer, boolean, or string within an expression, you can wrap that value within a Value().

You will rarely need to use Value() directly. When you write the expression F('field') + 1, Django implicitly wraps the 1 in a Value(), allowing simple values to be used in more complex expressions.

The value argument describes the value to be included in the expression, such as 1, True, or None. Django knows how to convert these Python values into their corresponding database type.

The output_field argument should be a model field instance, like IntegerField() or BooleanField(), into which Django will load the value after it’s retrieved from the database. Usually no arguments are needed when instantiating the model field as any arguments relating to data validation (max_length, max_digits, etc.) will not be enforced on the expression’s output value.

Conditional expressions

New in Django 1.8.

Conditional expressions allow you to use if ... elif ... else logic in queries. Django natively supports SQL CASE expressions. For more details see Conditional Expressions.

Technical Information

Below you’ll find technical implementation details that may be useful to library authors. The technical API and examples below will help with creating generic query expressions that can extend the built-in functionality that Django provides.

Expression API

Query expressions implement the query expression API, but also expose a number of extra methods and attributes listed below. All query expressions must inherit from ExpressionNode() or a relevant subclass.

When a query expression wraps another expression, it is responsible for calling the appropriate methods on the wrapped expression.

class ExpressionNode
contains_aggregate

Tells Django that this expression contains an aggregate and that a GROUP BY clause needs to be added to the query.

resolve_expression(query=None, allow_joins=True, reuse=None, summarize=False)

Provides the chance to do any pre-processing or validation of the expression before it’s added to the query. resolve_expression() must also be called on any nested expressions. A copy() of self should be returned with any necessary transformations.

query is the backend query implementation.

allow_joins is a boolean that allows or denies the use of joins in the query.

reuse is a set of reusable joins for multi-join scenarios.

summarize is a boolean that, when True, signals that the query being computed is a terminal aggregate query.

get_source_expressions()

Returns an ordered list of inner expressions. For example:

>>> Sum(F('foo')).get_source_expressions()
[F('foo')]
set_source_expressions(expressions)

Takes a list of expressions and stores them such that get_source_expressions() can return them.

relabeled_clone(change_map)

Returns a clone (copy) of self, with any column aliases relabeled. Column aliases are renamed when subqueries are created. relabeled_clone() should also be called on any nested expressions and assigned to the clone.

change_map is a dictionary mapping old aliases to new aliases.

Example:

def relabeled_clone(self, change_map):
    clone = copy.copy(self)
    clone.expression = self.expression.relabeled_clone(change_map)
    return clone
convert_value(self, value, connection, context)

A hook allowing the expression to coerce value into a more appropriate type.

refs_aggregate(existing_aggregates)

Returns a tuple containing the (aggregate, lookup_path) of the first aggregate that this expression (or any nested expression) references, or (False, ()) if no aggregate is referenced. For example:

queryset.filter(num_chairs__gt=F('sum__employees'))

The F() expression here references a previous Sum() computation which means that this filter expression should be added to the HAVING clause rather than the WHERE clause.

In the majority of cases, returning the result of refs_aggregate on any nested expression should be appropriate, as the necessary built-in expressions will return the correct values.

get_group_by_cols()

Responsible for returning the list of columns references by this expression. get_group_by_cols() should be called on any nested expressions. F() objects, in particular, hold a reference to a column.

asc()

Returns the expression ready to be sorted in ascending order.

desc()

Returns the expression ready to be sorted in descending order.

reverse_ordering()

Returns self with any modifications required to reverse the sort order within an order_by call. As an example, an expression implementing NULLS LAST would change its value to be NULLS FIRST. Modifications are only required for expressions that implement sort order like OrderBy. This method is called when reverse() is called on a queryset.

Writing your own Query Expressions

You can write your own query expression classes that use, and can integrate with, other query expressions. Let’s step through an example by writing an implementation of the COALESCE SQL function, without using the built-in Func() expressions.

The COALESCE SQL function is defined as taking a list of columns or values. It will return the first column or value that isn’t NULL.

We’ll start by defining the template to be used for SQL generation and an __init__() method to set some attributes:

import copy
from django.db.models import ExpressionNode

class Coalesce(ExpressionNode):
    template = 'COALESCE( %(expressions)s )'

    def __init__(self, expressions, output_field, **extra):
      super(Coalesce, self).__init__(output_field=output_field)
      if len(expressions) < 2:
          raise ValueError('expressions must have at least 2 elements')
      for expression in expressions:
          if not hasattr(expression, 'resolve_expression'):
              raise TypeError('%r is not an Expression' % expression)
      self.expressions = expressions
      self.extra = extra

We do some basic validation on the parameters, including requiring at least 2 columns or values, and ensuring they are expressions. We are requiring output_field here so that Django knows what kind of model field to assign the eventual result to.

Now we implement the pre-processing and validation. Since we do not have any of our own validation at this point, we just delegate to the nested expressions:

def resolve_expression(self, query=None, allow_joins=True, reuse=None, summarize=False):
    c = self.copy()
    c.is_summary = summarize
    for pos, expression in enumerate(self.expressions):
        c.expressions[pos] = expression.resolve_expression(query, allow_joins, reuse, summarize)
    return c

Next, we write the method responsible for generating the SQL:

def as_sql(self, compiler, connection):
    sql_expressions, sql_params = [], []
    for expression in self.expressions:
        sql, params = compiler.compile(expression)
        sql_expressions.append(sql)
        sql_params.extend(params)
    self.extra['expressions'] = ','.join(sql_expressions)
    return self.template % self.extra, sql_params

def as_oracle(self, compiler, connection):
    """
    Example of vendor specific handling (Oracle in this case).
    Let's make the function name lowercase.
    """
    self.template = 'coalesce( %(expressions)s )'
    return self.as_sql(compiler, connection)

We generate the SQL for each of the expressions by using the compiler.compile() method, and join the result together with commas. Then the template is filled out with our data and the SQL and parameters are returned.

We’ve also defined a custom implementation that is specific to the Oracle backend. The as_oracle() function will be called instead of as_sql() if the Oracle backend is in use.

Finally, we implement the rest of the methods that allow our query expression to play nice with other query expressions:

def get_source_expressions(self):
    return self.expressions

def set_source_expressions(expressions):
    self.expressions = expressions

Let’s see how it works:

>>> qs = Company.objects.annotate(
...    tagline=Coalesce([
...        F('motto'),
...        F('ticker_name'),
...        F('description'),
...        Value('No Tagline')
...        ], output_field=CharField()))
>>> for c in qs:
...     print("%s: %s" % (c.name, c.tagline))
...
Google: Do No Evil
Apple: AAPL
Yahoo: Internet Company
Django Software Foundation: No Tagline