Optimise OpenGL renderer by utilizing OpenGL's state machine #8166

aglitchman · 2023-10-20T15:42:54Z

Is your feature request related to a problem? Please describe (REQUIRED):
While studying the OpenGL renderer of the Defold engine, I came to the conclusion that it is written as if it followed the OpenGL handbook to the letter (I hope that doesn't sound rude!). I mean that every draw call starts by activating (binding) an object/texture/program and ends by deactivating (unbinding) it, even if the next draw call uses the same texture, the same program, the same matrices, etc.

As we know, OpenGL is stateful: state that has been set stays in effect until it is changed. Therefore, you can specify texture parameters once, set a program's constants once, and so on.

How can we use this fact? In modern games, 50-100 draw calls per frame is becoming normal, because it is impossible to draw shadows, outlines, other effects and the objects themselves in 1-3 draw calls. I have found that in my game the Defold engine makes about 5000 API calls for about 100 draw calls, i.e. roughly 50 API calls per draw call. In a single frame! That may not sound like much, but consider how modern OpenGL works:

On desktop, it runs as a client-server system and commands are passed asynchronously. Graphics API calls therefore do not go directly to the graphics card, and passing each command most likely costs extra CPU time.
On the web, each API call is a long chain of calls: Wasm -> JavaScript -> Browser -> [OpenGL] or [ANGLE -> DirectX] or [ANGLE -> Metal] (depending on the platform).

So I came to the conclusion that we can reduce the number of OpenGL API calls if we:

Do not set the same texture filtering parameters on every draw call.
Cache the values of uniform constants and update them only when they have actually changed. This is especially noticeable when rendering models, e.g. when you need to draw 50 models with the same material that has 20 constants with almost the same values.
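
To make the second point concrete, here is a minimal sketch of such a cache. It is not the engine's actual code: `UniformCache`, `UploadFn`, `Set` and `Invalidate` are hypothetical names, and the upload callback stands in for a real call such as `glUniform4fv` so that the sketch compiles without a GL context. Since uniform values belong to the program object in OpenGL, the assumption here is one cache instance per program.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper (not Defold's API): remembers the last value uploaded
// to each uniform location and skips uploads that would not change it.
class UniformCache {
public:
    // Assumption: in a real renderer this callback would wrap glUniform4fv
    // or a similar call. Here it is injected so the sketch needs no GL context.
    using UploadFn =
        std::function<void(int location, const float* data, std::size_t count)>;

    explicit UniformCache(UploadFn upload) : m_Upload(std::move(upload)) {}

    // Returns true if the value was uploaded, false if the call was skipped.
    bool Set(int location, const float* data, std::size_t count) {
        assert(data != nullptr && count > 0);
        std::vector<float>& cached = m_Values[location];
        // Bitwise comparison: identical bits mean the upload is redundant.
        if (cached.size() == count &&
            std::memcmp(cached.data(), data, count * sizeof(float)) == 0) {
            return false;
        }
        cached.assign(data, data + count);
        m_Upload(location, data, count);
        return true;
    }

    // Forget everything, e.g. after the program is relinked or the context
    // is recreated, so that the next Set() always uploads.
    void Invalidate() { m_Values.clear(); }

private:
    UploadFn m_Upload;
    std::unordered_map<int, std::vector<float>> m_Values;
};
```

With 50 models sharing a material, only the constants that differ between models would reach the driver; the rest are filtered out on the CPU at the cost of one comparison each. Whether that comparison is cheaper than the API call is exactly what would need to be measured.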

Describe the solution you'd like (REQUIRED):
Reduce the number of OpenGL API calls by setting texture parameters, constants and similar state once instead of on every draw call.

aglitchman · 2023-10-20T15:43:11Z

Later I will make a PR in which the constants are cached, so we can discuss it and see whether this optimization makes sense or is an overcomplication 🙌🏻

aglitchman · 2023-11-30T12:49:16Z

I found the Emscripten "Optimizing WebGL" article, which describes a similar idea. To sum up: every OpenGL function call has overhead, so you need to reuse OpenGL state as much as possible. That is, don't reset buffers and textures after each draw call if the same buffers and textures are used in the next one, and so on.

https://emscripten.org/docs/optimizing/Optimizing-WebGL.html#avoid-redundant-calls
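
The same idea applied to texture bindings could look roughly like the sketch below. Again this is only an illustration under assumptions: `TextureBindCache`, `BindFn`, `kMaxUnits` and `kUnknownTexture` are hypothetical names, the limit of 16 units is arbitrary, and the callback stands in for the real `glActiveTexture` + `glBindTexture` pair.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>

// Assumption: 16 texture units is enough for this illustration.
constexpr std::size_t kMaxUnits = 16;
// Sentinel for "binding unknown". Texture id 0 is a valid binding (it means
// "no texture"), so 0 cannot be used as the sentinel.
constexpr std::uint32_t kUnknownTexture = 0xFFFFFFFFu;

// Hypothetical helper (not Defold's API): tracks which texture is bound to
// each unit and only forwards the bind when it would change that state.
class TextureBindCache {
public:
    // Assumption: in a real renderer this callback would call
    // glActiveTexture(GL_TEXTURE0 + unit) followed by glBindTexture(...).
    using BindFn = std::function<void(std::size_t unit, std::uint32_t texture)>;

    explicit TextureBindCache(BindFn bind) : m_Bind(std::move(bind)) {
        m_Bound.fill(kUnknownTexture);
    }

    // Returns true if a bind was issued, false if it was redundant.
    bool Bind(std::size_t unit, std::uint32_t texture) {
        assert(unit < kMaxUnits);
        if (m_Bound[unit] == texture) {
            return false; // already bound by an earlier draw call
        }
        m_Bound[unit] = texture;
        m_Bind(unit, texture);
        return true;
    }

    // Call when something outside the cache may have changed the bindings.
    void Invalidate() { m_Bound.fill(kUnknownTexture); }

private:
    BindFn m_Bind;
    std::array<std::uint32_t, kMaxUnits> m_Bound;
};
```

Note that with such a cache there is no unbind after each draw call: the texture simply stays bound until a different one is requested for that unit.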

aglitchman · 2023-11-30T12:54:03Z

By the way, I have already run some tests with 200 draw calls of 3D models (the models are behind the screen but are not cut off by frustum culling, so as not to load the GPU). On an ARM Chromebook, rendering these 200 effectively empty draw calls takes 40% of the total frame time. 🤯
With this PR (#8152) applied, the percentage dropped to 32%.

aglitchman added the "feature request" label Oct 20, 2023

aglitchman linked a pull request Oct 20, 2023 that will close this issue

Add a local storage of uniform constants values to the OpenGL renderer #8167

Open

14 tasks

britzl added the "engine" and "render" labels Oct 21, 2023
