Django ORM Optimisations
- General
Django ORM Optimisations
In this blog, we will explore various optimization techniques in Django ORM such as subqueries, select_related, prefetch_related, and more. These methods not only guide us in consolidating multiple queries into a single, efficient query but also play a crucial role in minimizing the time it takes to execute these queries.
Optimization holds a crucial position in effective database management, especially in web development. As applications evolve in complexity and data volume, inefficient queries can lead to sluggish performance, adversely impacting the user experience. Through the adoption of optimization techniques, our goal is to significantly enhance the efficiency and responsiveness of our Django applications.
1. Select Related and Prefetch Related
One of the most common ways to optimize Django ORM queries is to use the select_related()
and prefetch_related()
methods. The select_related()
method retrieves related objects in a single SQL query, while prefetch_related()
retrieves related objects in separate SQL queries.
The select_related()
method is particularly useful when dealing with ForeignKey and OneToOneField relationships, as it can significantly reduce the number of queries executed, leading to faster data retrieval.
Unlike select_related()
, prefetch_related()
is more suitable for optimizing Many-to-Many and reverse ForeignKey relationships. It fetches the related objects in a separate query but does so more efficiently than the default behavior, reducing the number of queries executed.
For example, if you have a model called “Author” with a foreign key relationship to a model called “Book”, you can use the select_related()
method to retrieve author details for a specific book:
1 2 3 4 5 6 7 8 9 |
# Models class Author(models.Model): name = models.CharField(max_length=100) age = models.IntegerField() class Book(models.Model): title = models.CharField(max_length=200) author = models.ForeignKey(Author, on_delete=models.CASCADE) publication_date = models.DateField() |
1 2 |
# ORM Query author = Book.objects.select_related('author').get(pk=1) |
1 2 3 4 |
-- MySQL Query: SELECT * FROM book INNER JOIN author ON author.id = book.author_id WHERE book.id = 1; |
Similarly, for fetching all the books of a specific author we can use the prefetch_related()
method:
1 2 3 4 5 |
# ORM Query author_books = Author.objects.prefetch_related('book_set').get(name__iexact="Rahul") for book in author_books.book_set.all(): print(book.title) |
1 2 3 4 5 6 |
-- MySQL Querys: SELECT * FROM author WHERE author.name LIKE "Rahul"; SELECT * FROM book WHERE book.author_id IN (1) |
2. Values and Values List
In Django ORM, the values()
and values_list()
methods are used to retrieve specific fields from the database. These methods can be useful when you only need a subset of the data.
The values()
method returns a QuerySet that represents a list of dictionaries containing key-value pairs.
The values_list()
method returns a QuerySet that represents a list of tuples.
For example, to fetch first_name and last_name from a User model we can use the values()
method like this:
1 2 3 4 5 6 |
user = User.objects.values('first_name') print(user) # Output: <QuerySet [{'first_name': 'John'}, {'first_name': 'Jane'}]> |
To fetch data in the form of a list of tuples we can use the values_list()
method like this:
1 2 3 4 5 |
user = User.objects.values_list() print(list(user)) # Output: [(1, 'John', 'Doe', 30), (2, 'Jane', 'Doe', 25)] |
3. Aggregate and Annotate
In Django ORM, the annotate()
and aggregate()
methods perform complex queries on the database. The annotate()
method adds extra fields to each object in the query set, while the aggregate()
method performs calculations on the entire query set.
For example, if you have a model called “Order” with a field called “total”, you can use the annotate()
method to add a field called “discounted_total” that subtracts 10% from the total:
1 2 3 4 |
# ORM Query from django.db.models import F orders = Order.objects.annotate(discounted_total=F('total') * 0.9) |
1 2 3 |
-- MySQL Query SELECT *, total * 0.9 AS discounted_total FROM enterprise_order; |
This will add a field called “discounted_total” to each order object in the query set.
Similarly, if you want to calculate the total of all orders, you can use the aggregate()
method:
1 2 3 4 |
# ORM Query from django.db.models import Sum total = Order.objects.aggregate(Sum('total')) |
1 2 3 |
-- MySQL Query SELECT SUM(total) AS total FROM enterprise_order; |
This will calculate the sum of the “total” field for all the orders in the database.
4. Conditional Expressions
In Django ORM, conditional expressions provide a way to annotate QuerySets with values that depend on certain conditions. This allows for more complex queries and can be useful for calculations or aggregations that involve multiple fields.
A conditional expression takes the form of an ExpressionWrapper object, which wraps an expression that evaluates to a boolean value, along with the expressions to evaluate if the condition is true or false. The Case and When expressions are commonly used to create conditional expressions in Django.
Here is an example of how to use conditional expressions to annotate a QuerySet with a value based on a condition:
1 2 3 4 5 6 7 8 9 10 |
# Annotate a QuerySet with a value that depends on a condition user = User.objects.annotate( category=Case( When(age__lt=18, then=Value('underage')), When(age__gte=18, then=Value('adult')), default=Value('underage'), output_field=CharField() ) ) |
1 2 3 4 5 6 7 8 |
-- MySQL Query SELECT *, CASE WHEN age < 18 THEN 'underage' WHEN age >= 18 THEN 'adult' ELSE 'underage' END AS category FROM enterprise_user; |
Here, we are annotating a new field called category, which depends on age. If age < 18 => category = underage and if age >= 18 => category = adult.
The Case
expression defines the conditions and their corresponding values, and the output_field
parameter specifies the data type of the annotated field.
5. Subquery and OuterRef
In Django ORM, the subqueries can be used to perform complex queries that involve nested queries. In Django, we can do so by using Subquery()
and OuterRef()
methods.
Using subquery in filters:
1 2 3 4 5 6 7 8 9 10 |
from django.db.models import Subquery, OuterRef # Get all books whose author has published a book in the "Mystery" category books = Book.objects.filter( author__in=Subquery( Author.objects.filter( book__category='Mystery' ) ) ) |
1 2 3 4 5 6 7 8 9 10 11 12 |
-- MySQL Query SELECT * FROM book WHERE author_id IN ( SELECT author_id FROM author WHERE id IN ( SELECT author_id FROM book WHERE category = 'Mystery' ) ); |
We can also use subquery with annotate like this:
1 2 3 4 5 6 7 8 9 10 |
# Get the average rating of books by each author and annotate the results authors = Author.objects.annotate( average_rating=Subquery( Book.objects.filter( author=OuterRef('pk') ).annotate( avg_rating=Avg('rating') ).values('avg_rating')[:1] ) ) |
1 2 3 4 5 6 7 8 9 |
-- MySQL Query SELECT *, ( SELECT AVG(rating) FROM book WHERE author_id = a.id LIMIT 1 ) AS average_rating FROM author a; |
6. Raw SQL Queries
Sometimes, complex queries cannot be performed using Django’s ORM. In such cases, we can use raw SQL queries to retrieve data from the database.
We can use django.db.connection
to perform raw SQL queries. Here, is an example:
1 2 3 4 5 |
from django.db import connection cursor = connection.cursor() cursor.execute("SELECT * FROM myapp_mymodel") row = cursor.fetchone() |
This will execute raw SQL query, and retrieve the first row of result.
7. F Expressions
An F()
object represents the value of a model field or an annotated column. We can use the F expression like this:
1 2 3 4 5 6 7 8 |
# Don't for user in User.objects.all(): user.age += 1 user.save() # Do # Using F expressions with update for efficient age increment across all users User.objects.update(age=F(‘age’) + 1) |
8. Bonus Tip
These tips are not related to Django ORM, but following these while designing a database can help reduce query execution time significantly.
-
Database Indexing
Identify frequently used fields in your queries and apply appropriate database indexes. Indexing improves query performance by reducing the time taken to find matching records. Django provides support for defining indexes on model fields, leveraging the power of database indexing.
-
Partitioning
Partitioning involves dividing a large table into smaller, more manageable partitions based on specific criteria such as range, list, or hash partitioning. Partitioning can enhance query performance by allowing the DBMS to perform operations on a smaller subset of data.
9. Further Reading
For additional insights and advanced techniques, explore the following methods:
- Bulk Create
For creating objects in bulk we can use bulk_create() like this:
12345# don'tfor user in users_list:User.objects.create(user)# doUser.objects.bulk_create(users_list)
- Defer and Only
defer()
is used to exclude certain fields from being fetched immediately when querying the database.
only()
is used to fetch only the specified fields immediately, excluding all other fields.
12345# defer()queryset = MyModel.objects.defer('description', 'created_at')# only()queryset = MyModel.objects.only('name')
- Iterator Function
Django provides a QuerySet method callediterator()
that allows you to iterate over query results more efficiently, fetching rows from the database one at a time rather than loading the entire result set into memory.
1234queryset = MyModel.objects.all()for instance in queryset.iterator():print(instance.name) - Count and Exists
If you don’t need the contents of the QuerySet usecount()
andexists()
. For example:
12user_exists = User.objects.exists()user_count = User.objects.count()
10. Key Takeaways
Optimizing Django queries is an essential part of building scalable and performant web applications. In order to write efficient code, it’s essential to understand how these concepts of optimizations work in Django.
In this blog, we have covered:
- Use select_related to minimize the number of queries when accessing related objects.
- Use prefetch_related to retrieve related objects in separate queries.
- Use values and values_list to retrieve specific fields from QuerySet.
- Use annotate and aggregate to perform calculations and aggregation on QuerySet.
- Use Subquery and OuterRef to perform subqueries and reference values from the outer query.
- Use Case/When conditional expression to fetch results based on some conditions.
- The Further Reading section contains some extra useful functions.
11. References
Refer to the following resources for more information on Django ORM optimization and related topics:
- Auriga IT Blog: Optimizing Database Operations in Django
- Django Documentation: QuerySet API Reference
Related content
Auriga: Leveling Up for Enterprise Growth!
Auriga’s journey began in 2010 crafting products for India’s