mafulechka, July 10, 2019, 3:17 a.m.

Improving performance with Qt 3D Studio 2.4

Rendering speed matters for a 3D engine, as does efficient use of system resources. The upcoming Qt 3D Studio 2.4 release significantly improves rendering performance and further reduces CPU and RAM usage. In our high-end embedded 3D example application, rendering speed increased by a whopping 565%, while RAM usage and CPU usage decreased by 20% and 51%, respectively.


Performance is a key factor for Qt and is especially important for running complex 3D applications on embedded devices. We have continuously improved resource efficiency in earlier releases of Qt 3D Studio, and with Qt 3D Studio 2.4 we have taken a big step forward in rendering performance. The exact gain depends heavily on the application and the hardware used, so we took two sample applications and two embedded hardware platforms for a closer look. The sample applications used in this post are automotive instrument clusters, but a similar improvement can be seen in any application using the Qt 3D Studio runtime.

Entry-level example with Renesas R-Car D3

The entry-level embedded device used in the measurements is the Renesas R-Car D3, which has an entry-level Imagination PowerVR GE8300 graphics processing unit (GPU) and a single ARM Cortex-A53 CPU core. It runs Linux.

The example application is the low-end cluster available at https://git.qt.io/public-demos/qt3dstudio/tree/master/LowEndCluster . The low-end cluster example is well optimized.

To keep the application as lightweight as possible, only the ADAS (advanced driver-assistance systems) view is rendered in real-time 3D. The other parts of the instrument cluster are created using Qt Quick. This makes a real-time 3D user interface possible even on entry-level hardware such as the Renesas R-Car D3.

High-end example with NVIDIA Tegra X2

The high-end embedded device used in the measurements is the NVIDIA Jetson TX2 development board, equipped with the Tegra X2 SoC, which has a 256-core NVIDIA Pascal™ graphics processing unit (GPU), a dual-core 64-bit NVIDIA Denver 2 CPU, and a quad-core ARM Cortex-A57 MPCore CPU. It runs Linux.

The example application is the Kria cluster available at https://git.qt.io/public-demos/qt3dstudio/tree/master/kria-cluster-3d-demo . The Kria cluster example is intentionally heavy, with large and not fully optimized textures, high resolution, and so on.

In the high-end example, all gauges and other elements are rendered in real-time 3D by the Qt 3D Studio runtime. Very few Qt Quick parts are used, and those are brought into the 3D user interface as shared textures via QML streams.

Improved rendering performance

The biggest improvement in Qt 3D Studio 2.4 is rendering performance: the same application renders more frames per second (FPS) on the same hardware. As always with Qt, we aim for a stable 60 FPS, but on embedded devices there is often simply not enough performance. Once thermal management and the device's various other use cases are taken into account, it is usually not advisable to run at the very edge of the SoC's graphics capabilities. In an application such as an instrument cluster, performance must stay smooth under all operating conditions, including when the system is under maximum load. To meet our measurement goals on the high-end example, we disabled vsync, which allows the system to draw as many frames as it can. In a typical real-world application vsync is always enabled, so any headroom above 60 FPS translates into saved processing resources.

The graphs below show the measured frames per second for the high-end example on the NVIDIA TX2 (vsync off) and the low-end example on the Renesas R-Car D3 (vsync on):

High-end example: With the new Qt 3D Studio 2.4 we see a whopping 565% improvement in rendering performance. With Qt 3D Studio 2.3 the application only ran at 20 FPS, whereas with Qt 3D Studio 2.4 it runs at 133 FPS. This was measured with vsync disabled, purely to gauge the capabilities of the new runtime. In practice 60 FPS is enough, and the extra capacity can be used to drive a larger (or an additional) screen, to run a more complex application, or simply left unused so that the SoC does not run at maximum capacity and power is saved.

Low-end example: The improvement is 46%, because the frame rate is capped at 60 FPS with Qt Quick. With Qt 3D Studio 2.3 the application reached 41 FPS, and with the new 2.4 runtime it easily reaches 60 FPS. As on the more powerful high-end hardware, the SoC's excess capacity can be used to run a more complex 3D user interface, or simply left unused.

Improved CPU usage

The total CPU usage of an application is the sum of several things, one of which is the load caused by the 3D engine. In embedded applications, it is important that the application's use of 3D does not cause excessive CPU usage. If an application exceeds the allowed CPU capacity, it will not be able to render at the target FPS and stuttering may occur on the screen.

The graphs below show the measured CPU load for the high-end example on the NVIDIA TX2 and the low-end example on the Renesas R-Car D3:

High-end example: With the new Qt 3D Studio 2.4 we see a significant 51% reduction in CPU load compared to Qt 3D Studio 2.3, while at the same time the frame rate increases from 20 FPS to 133 FPS. The total load with the 2.3 runtime is 167% (out of a total of 400%), and with the 2.4 runtime it drops to 81%. Note that the increased rendering speed itself adds to the CPU load: with vsync enabled and the frame rate limited to 60 FPS, the CPU load is 74%.

Low-end example: We see only a slight 5% improvement in CPU load, mainly because the application is mostly Qt Quick, but this comes with the frame rate going from 41 FPS to 60 FPS at the same time. It should also be noted that the R-Car D3's CPU is not very powerful, so raising the frame rate of the whole application affects the overall CPU load.

Improved memory usage

Graphics assets, and 3D assets in particular, usually account for most of an application's RAM usage. There are optimizations that can be made, such as avoiding unnecessary levels of detail and using texture compression. We did not use any specific optimization techniques for this post. The measurements were made with the same application, with no changes other than using different versions of Qt 3D Studio and their runtimes.

The graphs below show the measured RAM usage for the high-end example on the NVIDIA TX2 and the low-end example on the Renesas R-Car D3:

High-end example: With the new Qt 3D Studio 2.4 we see a 48 MB reduction compared to Qt 3D Studio 2.3. This reduces the overall memory usage of the application by 20%.

Low-end example: In the simpler example, RAM usage drops by 9 MB with the new 2.4 runtime. As a percentage, this reduces the overall memory usage of the application by 15%.

How was this achieved?

The improvements are really big, especially for embedded systems, so you might wonder what has changed in the new version. We went back to the same runtime architecture as in the Qt 3D Studio 1.x releases instead of running on top of Qt 3D. The core logic of the 3D engine remains the same as before, but it now runs directly on top of OpenGL rather than on Qt 3D. This provides significantly better performance, especially on embedded devices but also on more powerful desktop systems. By running the Qt 3D Studio engine directly on top of OpenGL, we avoid rendering overhead and simplify the architecture. The simpler architecture means less internal signaling, fewer objects in memory, and less need for synchronization between multiple rendering threads. All this allowed us to make further optimizations compared to Qt 3D Studio 1.x and, of course, to bring the new features developed in the Qt 3D Studio 2.x releases to the OpenGL-based runtime.

The change of 3D runtime does not require any changes in most projects. Just change the import statement (import QtStudio3D.OpenGL 2.4 instead of import QtStudio3D 2.3) and recompile with the new Qt 3D Studio 2.4. The API and the parts of the 3D engine that the application sees are exactly the same as before (same materials, shaders, etc.) and therefore work the same way. In the rare cases where some modifications are needed, for example for some custom materials, they are quite small.

Get Qt 3D Studio 2.4

If you haven't tried the Qt 3D Studio 2.4 pre-releases yet, you should definitely give them a try. The third beta is currently available, and the final release is due soon, scheduled for before the end of June. Qt 3D Studio is available under both commercial and open source licenses.
