Evgenij Legotskoj, Oct. 22, 2019, 11:39 a.m.

Django - Tutorial 049. Optimizing Django Performance with a Real Project

Recently I have devoted a lot of time to website optimization, and now I would like to talk about it.
This article explains how to use the select_related and prefetch_related methods of QuerySet and how they differ. I will also try to explain why Django is considered slow and why that reputation is not really deserved. Of course, Django is slower in many ways than, say, Flask, but in most projects the problem is not Django itself but the lack of optimized database queries.

So let's optimize the forum page of the EVILEG website. The Django Silk package will help us with this: it measures the number of database queries and how long they take.

Install and configure Django Silk

Install Django Silk

pip install django-silk

Add it to INSTALLED_APPS

INSTALLED_APPS = [
    ...
    'silk'
]

and also add its middleware to MIDDLEWARE

MIDDLEWARE = [
    ...
    'silk.middleware.SilkyMiddleware',
    ...
]

You also need to add the django-silk URLs so that you can view the request statistics.

from django.urls import path, include

urlpatterns = [
    path('silk/', include('silk.urls', namespace='silk'))
]

And the last step is applying the django-silk migrations

python manage.py migrate

A note on Django Silk

Do not use Django Silk on a production server, at least not with the settings shown in this article. If the site already has decent traffic, for example 1,400 visitors a day, Django Silk with these settings will simply eat all your resources. Therefore, experiment only on the development server.
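One way to follow this advice is to register Silk only when DEBUG is on. The fragment below is a minimal sketch of such a settings file, not the configuration used on EVILEG; the entries other than the two Silk lines are placeholders standing in for a real project's apps and middleware.

```python
# settings.py (sketch): enable django-silk only in development.
DEBUG = True  # assumption: the production settings set this to False

INSTALLED_APPS = [
    'django.contrib.contenttypes',  # placeholder for the project's real apps
]

MIDDLEWARE = [
    'django.middleware.common.CommonMiddleware',  # placeholder
]

if DEBUG:
    # Profiling every request is expensive, so register Silk only for local work.
    INSTALLED_APPS.append('silk')
    MIDDLEWARE.append('silk.middleware.SilkyMiddleware')
```

With this approach the silk/ URL pattern would need the same DEBUG guard, otherwise the URL configuration would reference an app that is not installed.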

Optimization Work

First, let's see how bad things are with database queries on the forum's main page. For clarity, here is how this page looks.

To do this, we just need to load the page we are interested in and look at the request statistics in Django Silk.

The first step in optimizing the EVILEG forum homepage

At the moment, all requests will be shown in debug mode.

The situation is depressing. Loading the main page of the forum takes:

  • 325 queries to the database
  • spent 155 ms
  • 568 ms full page

This is a very expensive way to build a page, especially since every query is a separate round trip to the database, and after that all the data still has to be loaded into objects.

The resource consumption is huge. I think this is one of the reasons why many people consider Django slow, when in fact they just have not figured out how to optimize their database queries.

Carrying out the optimization

Let's see what the original QuerySet looks like for the main page of the forum.

def get_queryset(self):
    return ForumTopic.objects.all()

As you can see, nothing complicated. Queries like this are what people usually write when they first start using the Django ORM. Only later do the questions begin about how to optimize Django's performance.

The point is that the initial queryset used to display this page takes only ForumTopic objects from the database and does not take the other objects that the ForeignKey fields of the ForumTopic model point to. Django is therefore forced to load each related object automatically, with a separate query, at the moment it is needed. But the programmer knows what each page requires and can tell Django in advance which related objects to pick up in one query. Let's do it with select_related.
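Before moving on, the on-demand loading pattern described above can be illustrated without Django at all. The sketch below uses Python's standard sqlite3 module and two simplified, hypothetical tables (forum_user and forum_topic, not the real EVILEG schema) to imitate what the ORM does when a template touches a foreign key on each row: one query for the list, plus one more per row.

```python
import sqlite3

# Hypothetical, simplified schema: topics with a foreign key to their author.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE forum_user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE forum_topic (
        id INTEGER PRIMARY KEY,
        title TEXT,
        user_id INTEGER REFERENCES forum_user(id)
    );
    INSERT INTO forum_user VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO forum_topic VALUES (1, 'First', 1), (2, 'Second', 2), (3, 'Third', 1);
""")

# Record every SQL statement that is executed from here on.
executed = []
conn.set_trace_callback(executed.append)

# One query for the topic list ...
topics = conn.execute(
    "SELECT id, title, user_id FROM forum_topic ORDER BY id"
).fetchall()

# ... and one more query per topic to fetch its author on demand.
authors = []
for _topic_id, _title, user_id in topics:
    row = conn.execute(
        "SELECT name FROM forum_user WHERE id = ?", (user_id,)
    ).fetchone()
    authors.append(row[0])

lazy_query_count = len(executed)  # 1 list query + 3 per-row queries
```

Three topics cost four queries here; a page with dozens of topics and several relations per topic reaches hundreds of queries in the same way.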

select_related

This method lets you collect additional objects from other tables in the same query. It combines many queries into one, which speeds up the selection and reduces the overhead of talking to the database, since the number of round trips drops sharply.

Let's try to select some data in one query using select_related. I know that for my ForumTopic model, the following fields can be selected as related:
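In SQL terms, select_related adds JOIN clauses to the main query. The following sketch shows the idea with the standard sqlite3 module and two hypothetical tables (forum_user and forum_topic are illustrative names, not the real schema); it is hand-written SQL, not the statement Django generates.

```python
import sqlite3

# Hypothetical, simplified schema: topics with a foreign key to their author.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE forum_user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE forum_topic (
        id INTEGER PRIMARY KEY,
        title TEXT,
        user_id INTEGER REFERENCES forum_user(id)
    );
    INSERT INTO forum_user VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO forum_topic VALUES (1, 'First', 1), (2, 'Second', 2), (3, 'Third', 1);
""")

# Record every SQL statement that is executed from here on.
executed = []
conn.set_trace_callback(executed.append)

# A single query returns each topic together with its author.
# A LEFT JOIN keeps topics whose foreign key is NULL.
rows = conn.execute(
    "SELECT t.title, u.name "
    "FROM forum_topic AS t "
    "LEFT JOIN forum_user AS u ON u.id = t.user_id "
    "ORDER BY t.id"
).fetchall()

join_query_count = len(executed)
```

The price of the single round trip is a wider, more complex statement, which is why the duration of the joined query itself is worth watching in Silk.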

  • article - forum article
  • answer - reply, message that was marked as an answer in the forum thread
  • section - section in which the question was asked
  • user - user who asked this question

The initial database query can be modified as follows:

def get_queryset(self, **kwargs):
    return ForumTopic.objects.all().select_related('article', 'answer', 'section', 'user')

Then look at the result in Django Silk

Performance Improvement Using Select_related

The situation with the number of queries has improved:

  • 256 queries to the database
  • spent 131 ms
  • 444 ms full page

The following figure shows a line with a new query that has 4 join operations.

As you can see, the duration of this query was 19.225 ms.

Already a good result, but I know for sure that this is not the limit. The structure of the forum's main page is quite complicated: it shows the number of posts in each topic, the last message, a link to the post marked as the solution, and it also needs the user profiles of the reply authors. This is where the prefetch_related method comes in.

prefetch_related

prefetch_related differs in that it can load not only the objects referenced by the model's ForeignKey fields, but also objects whose models have a ForeignKey pointing to the model in the main query. That is, you can load the messages of a topic with a separate query. In this situation, I want to load the following fields.

  • comments - these are messages in a topic, ForumPost model
  • comments__user - foreign key on the user who left the message
  • answer___parent - the ForumTopic foreign key of the post that was marked as the topic's solution. Theoretically, this object could be picked up through select_related, but the query would become so complex that select_related could no longer be used efficiently. Yes, the use of this method should be reasonable: performance improves, of course, but sometimes it is better to collect some data with a separate query.
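The idea behind a prefetch can be sketched outside Django: one extra query with an IN clause per relation, followed by matching the rows to their parents in Python. The example below uses the standard sqlite3 module and hypothetical forum_topic and forum_post tables; it illustrates the approach and is not the SQL Django emits.

```python
import sqlite3
from collections import defaultdict

# Hypothetical, simplified schema: posts with a foreign key to their topic.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE forum_topic (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE forum_post (
        id INTEGER PRIMARY KEY,
        topic_id INTEGER REFERENCES forum_topic(id),
        text TEXT
    );
    INSERT INTO forum_topic VALUES (1, 'First'), (2, 'Second'), (3, 'Third');
    INSERT INTO forum_post VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c');
""")

# Record every SQL statement that is executed from here on.
executed = []
conn.set_trace_callback(executed.append)

# Query 1: the main list.
topics = conn.execute("SELECT id, title FROM forum_topic ORDER BY id").fetchall()

# Query 2: all posts of all listed topics at once.
topic_ids = [topic_id for topic_id, _title in topics]
placeholders = ", ".join("?" for _ in topic_ids)
posts = conn.execute(
    f"SELECT topic_id, text FROM forum_post "
    f"WHERE topic_id IN ({placeholders}) ORDER BY id",
    topic_ids,
).fetchall()

# Match the posts to their topics in Python instead of in SQL.
posts_by_topic = defaultdict(list)
for topic_id, text in posts:
    posts_by_topic[topic_id].append(text)

comments = {title: posts_by_topic[topic_id] for topic_id, title in topics}
prefetch_query_count = len(executed)
```

The number of queries depends on the number of relations, not on the number of rows. In Django none of this is written by hand; it is enough to name the relations in prefetch_related.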

With these prefetches added, the database query looks like this:

def get_queryset(self, **kwargs):
    return ForumTopic.objects.all().select_related('article', 'answer', 'section', 'user').prefetch_related(
        'comments', 'comments__user', 'answer___parent'
    )

And in Django Silk I get the following result

Using select_related and prefetch_related

As a result, we have the following:

  • 6 queries to the database
  • spent 26 ms
  • 148 ms full page

This is a great result. The user can already feel that the page loads very quickly.

But this is not all. Note that the query with 4 join operations still takes around 17-20 ms. Can we do something about this? Of course we can, and for that we will use the only method.

only

The only method lets you pick up just the columns needed to display the page. In this case, though, you have to account for every column that is required, otherwise Django will fetch each missed column with a separate query.

So I wrote the following database query

def get_queryset(self, **kwargs):
    return ForumTopic.objects.all().select_related('article', 'answer', 'section', 'user').prefetch_related(
        'comments', 'comments__user', 'answer___parent'
    ).only(
        'user__first_name', 'user__last_name', 'section__title', 'section__title_ru', 'article__title',
        'article__title_ru'
    )

And got the following result

  • 6 queries to the database
  • spent 20 ms
  • 136 ms full page

Of course, I am quoting the best results I could get, since measurements always fluctuate somewhat, but the screenshot shows that the duration of the main query dropped from 17-19 ms to 11-13 ms. Besides that, selecting only the needed fields also reduces memory consumption, for example when the database holds very large chunks of text that are not used in rendering the page.

Now let's experiment a bit with combining select_related and prefetch_related.

Additional optimization

Having read this far, you are, I think, convinced that select_related optimizes database queries very nicely. But there is one BUT. Problems may arise with the Paginator class, which is used on my page. Paginator has to execute a count query in order to calculate the correct number of pages, and if the main query is very complicated, the count query can take quite long, comparable to the regular query itself. It can therefore be important to write a fast and efficient main query and load all other objects with prefetch_related. That is, you may have a situation where it is better to run a couple of additional queries than to overload the main query with join operations.

So I wrote the following ORM query for this page
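The reason a paginator cannot avoid the count query is plain arithmetic: the number of pages depends on the total number of rows. The function below is a simplified, hypothetical stand-in for that calculation, not Django's Paginator implementation, which also handles details such as orphans.

```python
import math


def page_count(total_rows: int, per_page: int) -> int:
    """Return how many pages are needed; an empty list still gets one page."""
    return max(1, math.ceil(total_rows / per_page))


# total_rows is what the COUNT query returns, so that query runs on every
# paginated page view in addition to the query for the current page's rows.
pages_for_forum = page_count(total_rows=325, per_page=20)
```

Because the count runs on every page view, its cost adds to the cost of the main query each time.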

from django.db.models import Prefetch  # needed for the Prefetch objects below

def get_queryset(self, **kwargs):
    return ForumTopic.objects.all().select_related('answer').prefetch_related(
        Prefetch('article', queryset=Article.objects.all().only('title', 'title_ru')),
        Prefetch('section', queryset=ForumSection.objects.all().only('slug', 'title', 'title_ru')),
        Prefetch('user', queryset=User.objects.all().only('username', 'first_name', 'last_name')),
        Prefetch('comments', queryset=ForumPost.objects.all().select_related('user').only(
            'user__username', 'user__first_name', 'user__last_name', '_parent_id'
        )),
        Prefetch('answer___parent', queryset=ForumTopic.objects.all().only('id'))
    ).only(
        'title', 'user_id', 'section_id', 'article_id', 'answer___parent_id', 'pub_date', 'lastmod', 'attachment'
    )

At the same time, I got the following performance result

  • 8 queries to the database
  • spent 14 ms
  • 141 ms full page

Of course, you can say that the gain here is not very big. The total page time even got slightly worse (by 5 ms), and there are 2 more queries to the database, but the time spent on queries fell from 20 ms to 14 ms, a speedup of about 42 percent, and that is already worth something. So if your site has very long queries that are used in pagination and contain many join operations, it may be worth moving some relations from select_related to prefetch_related. This can really help make your Django site faster.
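The 42 percent figure can be checked against the query times reported above (20 ms with only, 14 ms with the Prefetch-based query). The short calculation below assumes the percentage is measured relative to the new, faster time; measured relative to the old time, the same change is a 30 percent reduction.

```python
before_ms = 20  # time spent on queries with select_related + only
after_ms = 14   # time spent on queries with the Prefetch-based query

# Speedup relative to the new time: how much more work fits into the same time.
speedup_percent = (before_ms - after_ms) * 100 / after_ms

# Reduction relative to the old time: how much shorter the queries became.
reduction_percent = (before_ms - after_ms) * 100 / before_ms
```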

Conclusion

  • Use select_related to fetch related objects from other tables in the same query as the main one
  • Use prefetch_related to load, with one additional query per relation, the objects of other models that have a ForeignKey to the model of your main queryset
  • Use only to limit the columns that are fetched; this also speeds up queries and reduces memory consumption
  • If you use Paginator, make sure that the main query does not produce a very heavy count query; otherwise it may be better to load some select_related relations through prefetch_related

Thank you. Good article.

I found 2 typos. I highlighted them in bold.

prefetch_related
prefetch_related differs in that it can load not only the objects that are used in ForeignKey fields of the model, but also those objects, models...
It seems this should be "but also those".

Additional optimization
...That is, you may have a situation where it is better to run a couple of additional queries, through overloading the main query with join operations.
Apparently "than" was meant here.

Thanks, fixed

p

It would be worth mentioning Prefetch objects with specially constructed querysets. And caching. Besides only there is defer. In some cases in DRF you can apply select/prefetch_related automatically. And queries can be viewed in django_debug_toolbar or in shell_plus --print-sql

It would be worth mentioning Prefetch objects with specially constructed querysets

Do you mean this?

def get_queryset(self, **kwargs):
    return ForumTopic.objects.all().select_related('answer').prefetch_related(
        Prefetch('article', queryset=Article.objects.all().only('title', 'title_ru')),
        Prefetch('section', queryset=ForumSection.objects.all().only('slug', 'title', 'title_ru')),
        Prefetch('user', queryset=User.objects.all().only('username', 'first_name', 'last_name')),
        Prefetch('comments', queryset=ForumPost.objects.all().select_related('user').only(
            'user__username', 'user__first_name', 'user__last_name', '_parent_id'
        )),
        Prefetch('answer___parent', queryset=ForumTopic.objects.all().only('id'))
    ).only(
        'title', 'user_id', 'section_id', 'article_id', 'answer___parent_id', 'pub_date', 'lastmod', 'attachment'
    )

In some cases in DRF you can apply select/prefetch_related automatically

DRF could be the subject of a separate article; I did not cover DRF in this article at all

And queries can be viewed in django_debug_toolbar or in shell_plus --print-sql

As an alternative

D

Thank you very much for the article! select_related and prefetch_related were a discovery for me
