
ABR transcoding workloads underperform due to ill-defined synchronization via the input surface #219

Open · dvrogozh opened this issue Jun 9, 2018 · 2 comments


dvrogozh commented Jun 9, 2018

@XinfengZhang, @xhaihao, @onabiull: for your attention

libva/va/va.h, line 3568 in b3be72a:

    VAStatus vaSyncSurface (...)

This is the only synchronization function defined in VAAPI. For encoders we use the input surface as the synchronization object. As a result, if multiple operations are scheduled against the same input surface, all of them (the encoders) will wait for the completion of the last submitted operation. This leads to a few issues (the pattern is sketched below the list):

  1. Once the first operation completes, the caller keeps waiting instead of kicking off the next operation right away, i.e. we slow down overall processing.
  2. Since all threads wait on the same vaSyncSurface, they are woken up at the same time and we step into a thundering herd: they all suddenly request CPU time.
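
To make the problem concrete, here is a minimal sketch of the pattern under discussion (error handling and the full encode parameter setup are omitted; `enc_ctx[]` and `pic_bufs[]` are placeholders for fully set up encode contexts and their parameter buffers):

```c
/* Sketch of the current pattern: N encoders consume the same decoded
 * frame, and the only way to wait for completion is to sync on that
 * shared input surface. */
#include <va/va.h>

#define NUM_ENCODERS 4

static void submit_and_wait(VADisplay dpy,
                            VAContextID enc_ctx[NUM_ENCODERS],
                            VABufferID  pic_bufs[NUM_ENCODERS],
                            VASurfaceID input_surface)
{
    /* Submit the same input surface to every encoder. */
    for (int i = 0; i < NUM_ENCODERS; i++) {
        vaBeginPicture(dpy, enc_ctx[i], input_surface);
        vaRenderPicture(dpy, enc_ctx[i], &pic_bufs[i], 1);
        vaEndPicture(dpy, enc_ctx[i]);
    }

    /* Each consumer (typically one thread per encoder) can only wait on
     * the shared input surface. vaSyncSurface() returns when the last
     * of the NUM_ENCODERS operations touching this surface finishes, so
     * the encoder that finished first still cannot proceed, and all
     * waiters wake up at the same moment (thundering herd). */
    vaSyncSurface(dpy, input_surface);
}
```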

Please change VAAPI to eliminate this issue. There are a few ways to fix it:

  1. The approach which https://github.com/intel/media-driver follows is to avoid vaSyncSurface for encoders entirely and rely on vaMapBuffer instead (sketched after this list). I believe we additionally need the following in this case:
    1.1. Synchronization via the input surface should still be possible for backward compatibility.
    1.2. Synchronization via vaMapBuffer should be explicitly allowed and noted in the VAAPI documentation.
    1.3. VAAPI should have a queryable capability: whether the encoder supports synchronization via vaMapBuffer or not.
  2. Another variant is to have a separate call for encoder synchronization, i.e. the encoder waits for the bitstream to be ready, so a vaSyncBuffer(va_bitstream) call should be provided.
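
For illustration, a sketch of what option 1 looks like from the application side, assuming the driver guarantees that vaMapBuffer on the coded buffer blocks until that particular encode has finished writing it (which is exactly the behaviour that needs to be documented and made queryable):

```c
/* Option 1 sketch: wait on the per-encoder coded buffer instead of the
 * shared input surface. Each encoder has its own coded buffer, so each
 * waiter unblocks as soon as its own bitstream is ready. Assumes the
 * driver treats vaMapBuffer() on a coded buffer as a sync point. */
#include <string.h>
#include <va/va.h>

static size_t fetch_bitstream(VADisplay dpy, VABufferID coded_buf,
                              unsigned char *out, size_t out_size)
{
    VACodedBufferSegment *seg = NULL;
    size_t total = 0;

    /* The proposal is to document that this call alone is a valid
     * synchronization point for the encode that writes coded_buf. */
    if (vaMapBuffer(dpy, coded_buf, (void **)&seg) != VA_STATUS_SUCCESS)
        return 0;

    /* Walk the list of coded segments and copy the bitstream out. */
    for (; seg; seg = (VACodedBufferSegment *)seg->next) {
        if (total + seg->size <= out_size) {
            memcpy(out + total, seg->buf, seg->size);
            total += seg->size;
        }
    }

    vaUnmapBuffer(dpy, coded_buf);
    return total;
}
```

Option 2 would look the same from the application's point of view, just with an explicit vaSyncBuffer(dpy, va_bitstream) call instead of relying on the mapping semantics.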

The described issue affects one of the key usage scenarios, ABR transcoding, meaning there is a pipeline like:

<decoder of 1920x1080> -> <encoding of 1920x1080 w/ bitrate1>
                      -> <encoding of 1920x1080 w/ bitrate2>
                      -> <vp: downscale to 1280x720> -> encoding
                      -> <vp: downscale to 720x480> -> encoding

This issue is also tracked for mediasdk as a consumer of VAAPI: Intel-Media-SDK/MediaSDK#287.

XinfengZhang commented

Adding more people to the discussion: @sreerenjb @uartie @mypopydev


fhvwy commented Jul 1, 2018

To note another approach to what you have above: you could sync on the reconstructed frame instead (i.e. the output surface of the encoder, rather than the input surface). This is actually how I originally thought it was meant to work, before I ran on drivers other than i965 (http://git.videolan.org/?p=ffmpeg.git;a=commit;h=94f446c628bb885561eb028b0b01362e02ab67f5 - the i965 driver essentially ignores all of this because vaEndPicture() is completely synchronous for encoding there).
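
As a minimal sketch of that variant, assuming the application keeps hold of the reconstructed surface it put into the picture parameter buffer (CurrPic.picture_id in the H.264 case); this is not behaviour that current drivers document:

```c
/* Sketch: sync on the reconstructed (output) surface rather than the
 * shared input surface. The recon surface is private to one encode
 * operation, so waiting on it waits only for that encode. The picture
 * parameter buffer (one of param_bufs) is assumed to carry
 *     pic_param.CurrPic.picture_id = recon_surface;   (H.264 example) */
#include <va/va.h>

static void encode_one_frame(VADisplay dpy, VAContextID enc_ctx,
                             VASurfaceID input_surface,
                             VASurfaceID recon_surface,
                             VABufferID  param_bufs[], int num_param_bufs)
{
    vaBeginPicture(dpy, enc_ctx, input_surface);
    vaRenderPicture(dpy, enc_ctx, param_bufs, num_param_bufs);
    vaEndPicture(dpy, enc_ctx);

    /* Wait for this encode only: the recon surface is not shared
     * between the ABR branches, unlike the input surface. */
    vaSyncSurface(dpy, recon_surface);
}
```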

I'm in favour of a change to sync on the buffer mapping or on the reconstructed frame (or of allowing either), since both make more sense than what we have at the moment. I don't think adding a new call is necessary, but I'm not against it if you have some reason for preferring it.

Making it backward compatible is not very nice, though (and if it isn't backward compatible then we would need a major version bump, i.e. libva 3). I think it can be done by making a sync on the input surface of an encoder in a new driver block until no encoder is using that surface (to match the old behaviour; new clients should never touch this surface while an encode is in progress, so I don't think it's a problem to enforce that), while also adding a new config attribute for encoders indicating that they support the new behaviour, so that we can fall back to the old method on an old (or not yet updated) driver.
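
As a rough sketch of that fallback check, using a hypothetical attribute name purely for illustration (no such attribute exists in VAAPI at the time of writing):

```c
/* Sketch of the fallback check. VAConfigAttribEncSyncToOutput is a
 * hypothetical attribute name invented for illustration; nothing like
 * it exists in va.h today. */
#include <stdbool.h>
#include <va/va.h>

/* Hypothetical attribute number, only so that the sketch compiles. */
#define VAConfigAttribEncSyncToOutput ((VAConfigAttribType)999)

static bool encoder_supports_output_sync(VADisplay dpy,
                                         VAProfile profile,
                                         VAEntrypoint entrypoint)
{
    VAConfigAttrib attrib = {
        .type = VAConfigAttribEncSyncToOutput,
    };

    if (vaGetConfigAttributes(dpy, profile, entrypoint,
                              &attrib, 1) != VA_STATUS_SUCCESS)
        return false;

    /* Old (or not yet updated) drivers report the attribute as not
     * supported, in which case the client falls back to syncing on the
     * input surface as today. */
    return attrib.value != VA_ATTRIB_NOT_SUPPORTED && attrib.value != 0;
}
```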
