vo_gpu_next: use emulated formats only as fallback #13682

sfan5 · 2024-03-10T19:21:08Z

how to check: ./build/mpv --vo=gpu-next --msg-level=vf=trace,vo=debug --force-window --idle video.mp4
compare libplacebo gpu formats table to vo format table

fixes mpv-android/mpv-android#855

kasper93 · 2024-03-10T20:36:45Z

I don't understand this patch. Why does it select gbrpf32 as the best format in the first place? This requires software conversion before uploading to the GPU.

The fact that it is fmt->emulated on the libplacebo side is not important in this case at all. It only means that libplacebo will internally upload the data differently; in practice there is little to no performance impact. Maybe libplacebo shouldn't expose f32 formats, or mpv should be smart enough not to select them, but fmt->emulated is not the right tool to make that distinction.

sfan5 · 2024-03-11T14:35:58Z

This is the relevant information in that case:

[vo/gpu-next/libplacebo:debug] GPU texture formats:
[vo/gpu-next/libplacebo:debug]     NAME                 TYPE   SIZE COMP CAPS         EMU DEPTH         HOST_BITS     GLSL_TYPE  GLSL_FMT   FOURCC
[vo/gpu-next/libplacebo:debug]     rgba8                UNORM  4    RGBA S-LRbBV--H-- n   {8  8  8  8 } {8  8  8  8 } vec4       rgba8      AB24  
[vo/gpu-next/libplacebo:debug]     r8                   UNORM  1    R    S-LRbBV----- n   {8  0  0  0 } {8  0  0  0 } float      r8         R8    
[vo/gpu-next/libplacebo:debug]     rg8                  UNORM  2    RG   S-LRbBV----- n   {8  8  0  0 } {8  8  0  0 } vec2       rg8        GR88  
[vo/gpu-next/libplacebo:debug]     bgra8                UNORM  4    BGRA S-LRbBV----- n   {8  8  8  8 } {8  8  8  8 } vec4       rgba8      AR24  
[vo/gpu-next/libplacebo:debug]     r8u                  UINT   1    R    S--R-BV----- n   {8  0  0  0 } {8  0  0  0 } uint       r8ui             
[vo/gpu-next/libplacebo:debug]     rg8u                 UINT   2    RG   S--R-BV----- n   {8  8  0  0 } {8  8  0  0 } uvec2      rg8ui            
[vo/gpu-next/libplacebo:debug]     rgba8u               UINT   4    RGBA S--R-BV----- n   {8  8  8  8 } {8  8  8  8 } uvec4      rgba8ui          
[vo/gpu-next/libplacebo:debug]     r16u                 UINT   2    R    S--R-BV----- n   {16 0  0  0 } {16 0  0  0 } uint       r16ui            
[vo/gpu-next/libplacebo:debug]     rg16u                UINT   4    RG   S--R-BV----- n   {16 16 0  0 } {16 16 0  0 } uvec2      rg16ui           
[vo/gpu-next/libplacebo:debug]     rgba16u              UINT   8    RGBA S--R-BV----- n   {16 16 16 16} {16 16 16 16} uvec4      rgba16ui         
[vo/gpu-next/libplacebo:debug]     rgb8                 UNORM  3    RGB  S-LRbBV----- y   {8  8  8  0 } {8  8  8  0 } vec3                  BG24  
[vo/gpu-next/libplacebo:debug]     r16f                 FLOAT  4    R    S-LRbB------ y   {16 0  0  0 } {32 0  0  0 } float      r16f             
[vo/gpu-next/libplacebo:debug]     rg16f                FLOAT  8    RG   S-LRbB------ y   {16 16 0  0 } {32 32 0  0 } vec2       rg16f            
[vo/gpu-next/libplacebo:debug]     rgba16f              FLOAT  16   RGBA S-LRbB------ y   {16 16 16 16} {32 32 32 32} vec4       rgba16f          
[vo/gpu-next/libplacebo:debug]     rgb16f               FLOAT  12   RGB  S-L--------- y   {16 16 16 0 } {32 32 32 0 } vec3                        
[vo/gpu-next/libplacebo:debug]     rgb8u                UINT   3    RGB  S-----V----- y   {8  8  8  0 } {8  8  8  0 } uvec3                       
[vo/gpu-next/libplacebo:debug]     rgb16u               UINT   6    RGB  S-----V----- y   {16 16 16 0 } {16 16 16 0 } uvec3                       
[vo/gpu-next/libplacebo:debug]     r32f                 FLOAT  4    R    ------V----- y   {32 0  0  0 } {32 0  0  0 } float      r32f             
[vo/gpu-next/libplacebo:debug]     rg32f                FLOAT  8    RG   ------V----- y   {32 32 0  0 } {32 32 0  0 } vec2       rg32f            
[vo/gpu-next/libplacebo:debug]     rgb32f               FLOAT  12   RGB  ------V----- y   {32 32 32 0 } {32 32 32 0 } vec3                        
[vo/gpu-next/libplacebo:debug]     rgba32f              FLOAT  16   RGBA ------V----- y   {32 32 32 32} {32 32 32 32} vec4       rgba32f          
[vo/gpu-next:v] Assuming 60.000004 FPS for display sync.
[vf:trace] VO reports supported formats:
[vf:trace]   yuv444p        (2)
[vf:trace]   yuv420p        (2)
[vf:trace]   gray           (2)
[vf:trace]   nv12           (2)
[vf:trace]   argb           (2)
[vf:trace]   bgra           (2)
[vf:trace]   abgr           (2)
[vf:trace]   rgba           (2)
[vf:trace]   bgr24          (1)
[vf:trace]   rgb24          (1)
[vf:trace]   0rgb           (2)
[vf:trace]   bgr0           (2)
[vf:trace]   0bgr           (2)
[vf:trace]   rgb0           (2)
[vf:trace]   yap8           (2)
[vf:trace]   y1             (2)
[vf:trace]   gbrp1          (2)
[vf:trace]   gbrp2          (2)
[vf:trace]   gbrp3          (2)
[vf:trace]   gbrp4          (2)
[vf:trace]   gbrp5          (2)
[vf:trace]   gbrp6          (2)
[vf:trace]   yuv422p        (2)
[vf:trace]   yuv410p        (2)
[vf:trace]   yuv411p        (2)
[vf:trace]   yuvj422p       (2)
[vf:trace]   nv21           (2)
[vf:trace]   yuv440p        (2)
[vf:trace]   yuvj440p       (2)
[vf:trace]   yuva420p       (2)
[vf:trace]   ya8            (2)
[vf:trace]   gbrp           (2)
[vf:trace]   yuva422p       (2)
[vf:trace]   yuva444p       (2)
[vf:trace]   nv16           (2)
[vf:trace]   gbrap          (2)
[vf:trace]   yuvj411p       (2)
[vf:trace]   gbrpf32        (1)
[vf:trace]   gbrapf32       (1)
[vf:trace]   nv24           (2)
[vf:trace]   nv42           (2)
[vf:trace]   vuya           (2)
[vf:trace]   vuyx           (2)

You'd normally map yuv420p10 to 3 planes of r16, but the GPU doesn't support this.
mpv then decides the next "better" format based on which loses the least information. So it arrives at gbrpf32.

There's nothing wrong with this logic in principle IMO.

> The fact it is fmt->emulated on libplacebo side is not important in this case at all. It only means that libplacebo internally will upload data differently, in practice there is little to no performance impact.

Dunno about Vulkan but my understanding of this matches what is written in the header:

    // If `emulated` is true, then this format doesn't actually exist on the
    // GPU as an uploadable texture format - and any apparent support is being
    // emulated (typically using compute shaders in the upload path).

As in: it's not about how libplacebo uploads it, but how the graphics driver processes it.

Makes use of the query_format fallback logic introduced in the previous commit. Fallback formats are likely to be extraordinarily slow. The trigger for this issue was an Android device (of course) where there are no reasonable formats to map yuv420p10le to, so mpv would pick 32-bit float RGB as the next best format, forcing a slow path in both swscale and the GLES driver. With the new logic mpv will convert to 8-bit YUV, which enables playback at a reasonable frame rate.
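The selection behaviour described above can be sketched in a few lines. This is a toy model rather than mpv's actual code: the `Candidate` type, the `pick_format` function and the bit-depth-only loss metric are hypothetical, and the `(2)`/`(1)` values from the trace are assumed to be per-format priorities where `1` marks a fallback. Real format selection also weighs chroma subsampling, colorspace and other properties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    depth: int     # bits per component the format can hold
    priority: int  # assumed meaning: 2 = regular, 1 = fallback (emulated)


def pick_format(src_depth, candidates, use_priority):
    """Pick the candidate that loses the fewest bits of precision.

    With use_priority=True, a higher-priority tier always wins over a
    lower one, and precision loss is only compared within a tier.
    """
    def key(c):
        tier = -c.priority if use_priority else 0
        loss = max(0, src_depth - c.depth)     # bits thrown away
        excess = max(0, c.depth - src_depth)   # tie-breaker: wasted bits
        return (tier, loss, excess)

    return min(candidates, key=key)


# A hypothetical subset of the formats listed in the trace above.
CANDIDATES = [
    Candidate("yuv420p", 8, 2),
    Candidate("yuv444p", 8, 2),
    Candidate("gbrpf32", 32, 1),
]

# 10-bit source: loss-only selection picks the float format,
# priority-aware selection stays with 8-bit YUV.
without_priority = pick_format(10, CANDIDATES, use_priority=False).name
with_priority = pick_format(10, CANDIDATES, use_priority=True).name
```

In this model `without_priority` is `gbrpf32` and `with_priority` is `yuv420p`, which mirrors the before/after behaviour the commit message describes.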

kasper93 · 2024-03-14T20:10:59Z

> Dunno about Vulkan but my understanding of this matches what is written in the header:

Yes, it would upload data to the texture in a slightly roundabout way, but the "emulation" part alone shouldn't be that detrimental to performance.

> As in: it's not about how libplacebo uploads it, but how the graphics driver processes it.

I saw this issue some time ago, and at the time I was under the impression that CPU conversion is to blame, or more specifically selecting a 32f format for it. I understand that rejecting it based on fmt->emulated works, but I'm not sure it is the direct cause of the issue and hence the right way to do so.

I'm not sure what to suggest here, because for this I would need to understand where exactly the bottleneck is. If it really is the emulated formats, it would be better to fix this in libplacebo and not expose them if they are broken on certain. But if the root cause is different, such rejection should be done when it matters.

sfan5 · 2024-03-14T21:15:04Z

I mean you're probably right that the bottleneck is the conversion from yuv420p10 to gbrpf32 and not necessarily the emulated upload.
But in this case the float stuff is the only non-8-bit format, which makes it the obvious target for preventing this from happening in the first place.

maybe @haasn has an opinion?

haasn · 2024-03-14T21:23:13Z

CPU conversion from yuv420p10 to gbrpf32 is absolutely terrible and should be avoided at all costs, considering that it also requires chroma scaling and YUV conversion in addition to the very slow floating-point path.
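For a rough sense of scale, the per-frame memory footprint of the formats involved can be worked out directly. The sketch below assumes the 3840x2160 frame size from the logs in this thread, that yuv420p10 stores each sample in a 16-bit word, and that gbrpf32 consists of three full-resolution float32 planes; the helper name `frame_bytes` is made up for illustration.

```python
def frame_bytes(width, height, planes):
    """Sum the plane sizes of one frame.

    planes: list of (horizontal_subsampling, vertical_subsampling,
    bytes_per_sample) tuples, one per plane.
    """
    total = 0
    for hsub, vsub, bytes_per_sample in planes:
        total += (width // hsub) * (height // vsub) * bytes_per_sample
    return total


W, H = 3840, 2160

# Luma at full resolution, two chroma planes subsampled 2x2.
yuv420p = frame_bytes(W, H, [(1, 1, 1), (2, 2, 1), (2, 2, 1)])
yuv420p10 = frame_bytes(W, H, [(1, 1, 2), (2, 2, 2), (2, 2, 2)])
# Three full-resolution planes of 32-bit floats.
gbrpf32 = frame_bytes(W, H, [(1, 1, 4)] * 3)

print(yuv420p, yuv420p10, gbrpf32)
# 12441600 24883200 99532800
```

Under these assumptions a gbrpf32 frame is four times the size of the yuv420p10 source and eight times the size of an 8-bit yuv420p frame, before counting the cost of the conversion itself.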

It would be far better to pick yuv420p8 and rely on swscale dithering.
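As an illustration of what dithering buys when dropping from 10 to 8 bits, here is a minimal ordered-dither sketch. It is not swscale's actual algorithm; the 2x2 threshold matrix and the function name `dither_10_to_8` are assumptions chosen to keep the example small.

```python
# Thresholds 0..3, one per position in a repeating 2x2 tile.
BAYER_2X2 = [[0, 2], [3, 1]]


def dither_10_to_8(plane):
    """Reduce a plane of 10-bit samples (0..1023) to 8 bits (0..255).

    A position-dependent threshold is added before the two low bits are
    dropped, so the average over each 2x2 tile preserves the fractional
    part that plain truncation would discard.
    """
    out = []
    for y, row in enumerate(plane):
        out_row = []
        for x, value in enumerate(row):
            threshold = BAYER_2X2[y % 2][x % 2]
            out_row.append(min(255, (value + threshold) >> 2))
        out.append(out_row)
    return out


# A flat 10-bit area at 514, which is 128.5 in 8-bit terms.
flat = [[514, 514], [514, 514]]
dithered = dither_10_to_8(flat)
print(dithered)
# [[128, 129], [129, 128]]
```

Plain truncation would turn every sample into 128, while the dithered tile averages 128.5, which is why an 8-bit fallback with dithering can still look reasonable.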

kasper93 · 2024-03-15T00:21:04Z

I just tested again on my end (without this patch).

gpu-next (bad):

V mpv     : [autoconvert:info] Converting yuv420p10 -> gbrpf32
V mpv     : [ffmpeg:v] swscaler: Lanczos scaler, from yuv420p10le to gbrpf32le using C
V mpv     : [vf:v] [out] 3840x2160 gbrpf32 rgb/bt.2020/pq/full/display CL=uhd crop=3840x2160+0+0

gpu (good):

V mpv     : [vo/gpu:v] Reported display depth: 8
V mpv     : [vo/gpu:v] Texture for plane 0: 3840x2160
V mpv     : [vo/gpu:v] Texture for plane 1: 1920x1080
V mpv     : [vo/gpu:v] Texture for plane 2: 1920x1080
V mpv     : [vo/gpu:v] Testing FBO format rgba16f
V mpv     : [vo/gpu:v] Using FBO format rgba16f.

Solution: do what vo_gpu is doing. It clearly uses an rgba16f FBO correctly, instead of the gbrpf32 insanity that forces the whole scaling step into unoptimized C code.

haasn · 2024-03-15T08:40:05Z

> I just tested again on my end, (without this patch).
>
> gpu-next (bad):
>
> V mpv     : [autoconvert:info] Converting yuv420p10 -> gbrpf32
> V mpv     : [ffmpeg:v] swscaler: Lanczos scaler, from yuv420p10le to gbrpf32le using C
> V mpv     : [vf:v] [out] 3840x2160 gbrpf32 rgb/bt.2020/pq/full/display CL=uhd crop=3840x2160+0+0
>
> gpu (good):
>
> V mpv     : [vo/gpu:v] Reported display depth: 8
> V mpv     : [vo/gpu:v] Texture for plane 0: 3840x2160
> V mpv     : [vo/gpu:v] Texture for plane 1: 1920x1080
> V mpv     : [vo/gpu:v] Texture for plane 2: 1920x1080
> V mpv     : [vo/gpu:v] Testing FBO format rgba16f
> V mpv     : [vo/gpu:v] Using FBO format rgba16f.
>
> Solution: do what vo_gpu is doing. Clearly it correctly uses rgba16f FBO, instead of gbrpf32 insanity, that forces whole scaling in not optimized C code.

That log tells us nothing, as FBO format has nothing to do with the upload texture format. Check the [out] line.

kasper93 · 2024-03-15T09:56:50Z

> That log tells us nothing, as FBO format has nothing to do with the upload texture format. Check the [out] line.

gpu:

V mpv     : [vd:v] Using software decoding.
V mpv     : [vd:v] Decoder format: 3840x2160 yuv420p10 bt.2020-ncl/bt.2020/pq/limited/auto CL=uhd crop=3840x2160+0+0
V mpv     : [vd:v] Using container aspect ratio.
V mpv     : [vf:v] [in] 3840x2160 yuv420p10 bt.2020-ncl/bt.2020/pq/limited/display CL=uhd crop=3840x2160+0+0
V mpv     : [vf:v] [userdeint] 3840x2160 yuv420p10 bt.2020-ncl/bt.2020/pq/limited/display CL=uhd crop=3840x2160+0+0
V mpv     : [vf:v] [userdeint] (disabled)
V mpv     : [vf:v] [autorotate] 3840x2160 yuv420p10 bt.2020-ncl/bt.2020/pq/limited/display CL=uhd crop=3840x2160+0+0
V mpv     : [vf:v] [autorotate] (disabled)
V mpv     : [vf:v] [convert] 3840x2160 yuv420p10 bt.2020-ncl/bt.2020/pq/limited/display CL=uhd crop=3840x2160+0+0
V mpv     : [vf:v] [convert] (disabled)
V mpv     : [vf:v] [out] 3840x2160 yuv420p10 bt.2020-ncl/bt.2020/pq/limited/display CL=uhd crop=3840x2160+0+0
V mpv     : event: video-reconfig
V mpv     : [cplayer:info] VO: [gpu] 3840x2160 yuv420p10
V mpv     : [cplayer:v] VO: Description: Shader-based GPU Renderer
V mpv     : [vo/gpu:v] reconfig to 3840x2160 yuv420p10 bt.2020-ncl/bt.2020/pq/limited/display CL=uhd crop=3840x2160+0+0
V mpv     : [vo/gpu:v] Resize: 2992x1344
V mpv     : [vo/gpu:v] Window size: 2992x1344 (Borders: l=0 t=0 r=0 b=0)
V mpv     : [vo/gpu:v] Video source: 3840x2160 (1:1)
V mpv     : [vo/gpu:v] Video display: (0, 0) 3840x2160 -> (301, 0) 2389x1344
V mpv     : [vo/gpu:v] Video scale: 0.622135/0.622222
V mpv     : [vo/gpu:v] OSD borders: l=301 t=0 r=302 b=0
V mpv     : [vo/gpu:v] Video borders: l=301 t=0 r=302 b=0
V mpv     : [vo/gpu:v] Reported display depth: 8
V mpv     : [vo/gpu:v] Texture for plane 0: 3840x2160
V mpv     : [vo/gpu:v] Texture for plane 1: 1920x1080
V mpv     : [vo/gpu:v] Texture for plane 2: 1920x1080
V mpv     : [vo/gpu:v] Testing FBO format rgba16f
V mpv     : [vo/gpu:v] Using FBO format rgba16f.
V mpv     : event: video-reconfig

gpu-next:

V mpv     : [vd:v] Using software decoding.
V mpv     : [vd:v] Decoder format: 3840x2160 yuv420p10 bt.2020-ncl/bt.2020/pq/limited/auto CL=uhd crop=3840x2160+0+0
V mpv     : [vd:v] Using container aspect ratio.
V mpv     : [vf:v] [in] 3840x2160 yuv420p10 bt.2020-ncl/bt.2020/pq/limited/display CL=uhd crop=3840x2160+0+0
V mpv     : [vf:v] [userdeint] 3840x2160 yuv420p10 bt.2020-ncl/bt.2020/pq/limited/display CL=uhd crop=3840x2160+0+0
V mpv     : [vf:v] [userdeint] (disabled)
V mpv     : [vf:v] [autorotate] 3840x2160 yuv420p10 bt.2020-ncl/bt.2020/pq/limited/display CL=uhd crop=3840x2160+0+0
V mpv     : [vf:v] [autorotate] (disabled)
V mpv     : [vf:v] [convert] 3840x2160 yuv420p10 bt.2020-ncl/bt.2020/pq/limited/display CL=uhd crop=3840x2160+0+0
V mpv     : [autoconvert:info] Converting yuv420p10 -> gbrpf32
V mpv     : [ffmpeg:v] swscaler: Lanczos scaler, from yuv420p10le to gbrpf32le using C
V mpv     : [vf:v] [out] 3840x2160 gbrpf32 rgb/bt.2020/pq/full/display CL=uhd crop=3840x2160+0+0
V mpv     : event: video-reconfig
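The comparison between the two logs above comes down to the `[vf:v] [in]` and `[vf:v] [out]` lines. A small script along these lines could extract them from a saved log; the function names and the sample strings are illustrative, and the regular expression only assumes the line layout visible in the logs in this thread.

```python
import re

# Matches e.g. "[vf:v] [out] 3840x2160 gbrpf32 rgb/bt.2020/..."
VF_LINE_RE = re.compile(r"\[vf:v\] \[(in|out)\] (\d+)x(\d+) (\S+)")


def filter_chain_formats(log_text):
    """Return {'in': fmt, 'out': fmt} for the last filter chain in the log."""
    found = {}
    for match in VF_LINE_RE.finditer(log_text):
        found[match.group(1)] = match.group(4)
    return found


def was_converted(log_text):
    """True if the filter chain output format differs from its input."""
    formats = filter_chain_formats(log_text)
    return "in" in formats and "out" in formats and formats["in"] != formats["out"]


# Lines copied from the logs above, truncated after the pixel format.
GPU_LOG = (
    "V mpv     : [vf:v] [in] 3840x2160 yuv420p10 bt.2020-ncl\n"
    "V mpv     : [vf:v] [out] 3840x2160 yuv420p10 bt.2020-ncl\n"
)
GPU_NEXT_LOG = (
    "V mpv     : [vf:v] [in] 3840x2160 yuv420p10 bt.2020-ncl\n"
    "V mpv     : [autoconvert:info] Converting yuv420p10 -> gbrpf32\n"
    "V mpv     : [vf:v] [out] 3840x2160 gbrpf32 rgb\n"
)

gpu_formats = filter_chain_formats(GPU_LOG)
gpu_next_formats = filter_chain_formats(GPU_NEXT_LOG)
```

For these samples, `was_converted(GPU_LOG)` is false and `was_converted(GPU_NEXT_LOG)` is true, matching the observation that only gpu-next inserts a software conversion.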

sfan5 · 2024-03-15T10:00:09Z

IIRC the difference here was that vo_gpu would use r16ui for mapping planes while libplacebo apparently can't/doesn't.
Incidentally 10-bit never worked on my phone GPU so maybe there's something that makes the uint formats less suitable for rendering? (although I wholly believe that to just be a driver bug)

haasn · 2024-03-15T10:56:00Z

> IIRC the difference here was that vo_gpu would use r16ui for mapping planes while libplacebo apparently can't/doesn't.

Then that sounds like the actual bug here, not the choice of format.

sfan5 · 2024-03-15T15:43:03Z

@haasn #13706

video: support priority order for VO format selection

02eda79

sfan5 force-pushed the fallbackfmt branch from 29d0c08 to e462cf3 Compare March 11, 2024 16:44

sfan5 added the "priority:on-ice" (may be revisited later) label Mar 15, 2024

sfan5 marked this pull request as draft March 16, 2024 14:39

sfan5 closed this Jun 22, 2024
