H264 ffmpeg with zerolatency but play with green blocks #2424

coconutLatte · 2023-02-24T09:33:57Z

Your environment.

Version: v3.1.55
Browser: Chrome 110.0.5481.177

What did you do?

Hi,
I'm an software engineer working on remote desktop solution in website. Due to the low latency, we chose webrtc as network to transport video.
In server-side encoder, we encode video by ffmpeg, codec is h264, x264opts set tune=zerolatency.
proxy h264 stream like example https://github.com/pion/example-webrtc-applications/tree/master/play-from-disk-h264.
But the website doesn't play normal, screen seems splited into 4 parts transversally, some parts will be green flashing...
I already dump h264 data from encoder, play it by ffplay, it goes well. But website runs not good...

the most magic thing is, if I not set x264opts tune=zerolatency, It shows well in web, if set, got green block.

What did you expect?

video plays well on website

What happened?

video play with green block

The text was updated successfully, but these errors were encountered:

coconutLatte · 2023-02-24T09:34:15Z

this file is h264 data with tune=zerolatency, and play well by ffplay
(github cannot upload .264 file, can download it and play by ffplay)

Sean-Der · 2023-02-24T19:06:04Z

Hi @coconutLatte !

Would you mind sharing the exact ffmpeg command you ran? I will reproduce + fix against the example-webrtc-applications then

coconutLatte · 2023-02-25T06:42:29Z

Hi @Sean-Der !
Thanks for replay.

First of all, I'm not using ffmpeg command to encode it. I encode it by C code
av_opt_set(context->priv_data, "tune", "zerolatency", 0);
But I can recurrence the zerolatency problem by using ffmpeg command.

There are two h264 file below video_raw.264 and video_zerolatency.264 (github not support 264 ext. I add .jpg ext, please delete it)

video_zerolatency.264 is converted from video_raw.264 by running ffmpeg command line ffmpeg -i video_raw.264 -vcodec h264 -tune zerolatency video_zerolatency.264

using video_raw.264 to play in https://github.com/pion/example-webrtc-applications/tree/master/play-from-disk-h264 is normal.

using video_zerolatency.264 to play in https://github.com/pion/example-webrtc-applications/tree/master/play-from-disk-h264 is full of green blocks. like below

coconutLatte · 2023-02-27T05:57:42Z

Hi @Sean-Der !
There are something new we found just now.

When turn on tune=zerolatency in ffmpeg, the I frame being splited in to slices, which start with 00 00 01.

But in pion/rtp/codecs/h264_packet.go (version what pion/webrtc v3.1.55 import is v1.7.13) fulfill h264 packet always add prefix 00 00 00 01 (func annexbNALUStartCode)

maybe when h264reader read one nal, we should know finally packet it with 00 00 00 01 or 00 00 01 in head
I don't know if this matters? raw messge is 00 00 01 but change it to 00 00 00 01

Fruneng · 2023-02-27T12:45:08Z

Does it caused by RTP STAP-A Packet in Non-interleaved mode?
https://www.rfc-editor.org/rfc/rfc6184#section-5.4

retamia · 2023-03-14T07:05:21Z

FFmpeg tune zerolatency will enable multi-slice encoding, and one frame of image will have many h264 nal units. H.264 stream example In the example code, each nal unit sends an rtp packet, and the rtp timestamp will be added with a duration, which causes the image not the same frame when decoding

CastriOnlive · 2023-05-12T13:43:43Z

Is there any fix on this, i have tested many configurations but if activating -tune zerolatency i don't know how to decode later the Nal units to avoid the green screen, has anyone achieved this with pion. Is vp8 the only alternative for zerolatency?

retamia · 2023-06-25T07:34:53Z

@VictorCPH ((nal.Data[1] & 0x80) >> 7) == 1is the slice start flag, which means that the subsequent slices are all in the same frame, until the next slice start flag. Checking the slice start flag will make your code more robust.

trey-hakanson-skydio · 2024-04-05T19:47:55Z

@Sean-Der I'm running into a similar issue, and would appreciate any tips on how to debug! I'm using this sample mp4 file, but I've seen the same issue with any I've tried so it shouldn't really matter. If I use the command mentioned in play-from-disk-h264, it everything works as expected:

ffmpeg -i $INPUT_FILE -an -c:v libx264 -bsf:v h264_mp4toannexb -b:v 2M -max_delay 0 -bf 0 output.h264

But, when I add -tune zerolatency, I encounter the issue described above:

ffmpeg -i $INPUT_FILE -an -c:v libx264 -bsf:v h264_mp4toannexb -b:v 2M -max_delay 0 -bf 0 -tune zerolatency output.h264

Digging into the generated h264 files, the only real difference I see is that when using zerolatency multiple NALUs of type 5 (IDR slice) are generated, as opposed to 1 otherwise:

ffmpeg	ffmepg with zerolatency

My initial thought was that the following logic in pion/rtp would be problematic if the sample provided to WriteSample did not include the SPS, PPS, and all IDR slices so they could all be included in the same RTP packet. But after reading rfc6184 5.2 more closely, I'm less convinced I'm on the right track. pion/rtp loads the SPS and PPS into the STAP-A NALU and then sends the IDR slices as separate FU-A NALUs, which should be fine and handled by the browser's decoder. Sending all the NALUs corresponding to the iframe does at least cause them to have the same RTP timestamp, which helped a little: the browser rendered a full iframe with no green bars, but didn't start playing back based on non-IDR NALUs afterwards.

Still digging into this on my end, but would appreciate any thoughts 🙂

basicfu · 2024-04-06T16:08:04Z

Hello, have you resolved this issue? Currently, I also need zero latency remote desktop stuck here @coconutLatte

trey-hakanson-skydio · 2024-04-07T05:30:03Z

Ok I think I've figured it out; after staring at the NALs for a while, I realized that tune=zerolatency doesn't just affect the IDR slices, it also affects the non-IDR slices. The slicing seems to be consistent at the frame level: if I have 10 IDR slices for a frame, I will also have 10 non-IDR slices in each frame before the next I frame. There's definitely a more intelligent way to do this based on some metadata in the NALUs (maybe frame_num in the slice header?), but to validate that this was the right idea I did some naive buffering to ensure all NALUs from the same frame were in the same sample. Without doing this buffering, the ticker logic in the example doesn't work quite right because you're sending n NALUs of the same frame with different RTP timestamps, and the browser can't figure out how to decode them. I've included the janky buffer logic I used below. I used H264 Naked to look at the NALUs in my sample file and determine 10 was the correct NALUs per frame. I'm going to see if I can get something more robust working based on NALU metadata.

diff --git a/play-from-disk-h264/main.go b/play-from-disk-h264/main.go
index cd05bcc..17f9bfe 100644
--- a/play-from-disk-h264/main.go
+++ b/play-from-disk-h264/main.go
@@ -10,6 +10,7 @@ package main
 import (
 	"context"
 	"errors"
+	"flag"
 	"fmt"
 	"io"
 	"os"
@@ -23,13 +24,24 @@ import (
 )
 
 const (
-	audioFileName     = "output.ogg"
-	videoFileName     = "output.h264"
 	oggPageDuration   = time.Millisecond * 20
 	h264FrameDuration = time.Millisecond * 33
 )
 
+var (
+	audioFileName     string
+	videoFileName     string
+	sessionDescriptor string
+	nalsPerSample     int
+)
+
 func main() { //nolint
+	flag.StringVar(&audioFileName, "ain", "output.ogg", "audio file to process")
+	flag.StringVar(&videoFileName, "vin", "output.h264", "video file to process")
+	flag.StringVar(&sessionDescriptor, "sd", "", "session descriptor")
+	flag.IntVar(&nalsPerSample, "nps", 1, "session descriptor")
+	flag.Parse()
+
 	// Assert that we have an audio or video file
 	_, err := os.Stat(videoFileName)
 	haveVideoFile := !os.IsNotExist(err)
@@ -41,6 +53,11 @@ func main() { //nolint
 		panic("Could not find `" + audioFileName + "` or `" + videoFileName + "`")
 	}
 
+	// Assert that we have a session descriptor
+	if sessionDescriptor == "" {
+		panic("Session descriptor must be provided")
+	}
+
 	// Create a new RTCPeerConnection
 	peerConnection, err := webrtc.NewPeerConnection(webrtc.Configuration{
 		ICEServers: []webrtc.ICEServer{
@@ -62,7 +79,10 @@ func main() { //nolint
 
 	if haveVideoFile {
 		// Create a video track
-		videoTrack, videoTrackErr := webrtc.NewTrackLocalStaticSample(webrtc.RTPCodecCapability{MimeType: webrtc.MimeTypeH264}, "video", "pion")
+		videoTrack, videoTrackErr := webrtc.NewTrackLocalStaticSample(webrtc.RTPCodecCapability{
+			MimeType:    webrtc.MimeTypeH264,
+			SDPFmtpLine: "level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42e01f",
+		}, "video", "pion")
 		if videoTrackErr != nil {
 			panic(videoTrackErr)
 		}
@@ -106,17 +126,36 @@ func main() { //nolint
 			// * avoids accumulating skew, just calling time.Sleep didn't compensate for the time spent parsing the data
 			// * works around latency issues with Sleep (see https://github.com/golang/go/issues/44343)
 			ticker := time.NewTicker(h264FrameDuration)
-			for ; true; <-ticker.C {
-				nal, h264Err := h264.NextNAL()
-				if errors.Is(h264Err, io.EOF) {
-					fmt.Printf("All video frames parsed and sent")
-					os.Exit(0)
-				}
-				if h264Err != nil {
-					panic(h264Err)
+			for {
+				<-ticker.C
+
+				nals := 0
+				buffer := make([]byte, 0)
+
+				for nals < nalsPerSample {
+					nal, h264Err := h264.NextNAL()
+					if errors.Is(h264Err, io.EOF) {
+						fmt.Printf("All video frames parsed and sent")
+						os.Exit(0)
+					} else if h264Err != nil {
+						panic(h264Err)
+					}
+
+					if nal.UnitType == h264reader.NalUnitTypeSPS {
+						// no-op
+					} else if nal.UnitType == h264reader.NalUnitTypePPS {
+						// no-op
+					} else {
+						nals += 1
+					}
+
+					if len(buffer) != 0 { // append start code as delimiter after first NAL
+						buffer = append(buffer, []byte{0, 0, 1}...)
+					}
+					buffer = append(buffer, nal.Data...)
 				}
 
-				if h264Err = videoTrack.WriteSample(media.Sample{Data: nal.Data, Duration: h264FrameDuration}); h264Err != nil {
+				if h264Err = videoTrack.WriteSample(media.Sample{Data: buffer, Duration: h264FrameDuration}); h264Err != nil {
 					panic(h264Err)
 				}
 			}
@@ -218,7 +257,7 @@ func main() { //nolint
 
 	// Wait for the offer to be pasted
 	offer := webrtc.SessionDescription{}
-	signal.Decode(signal.MustReadStdin(), &offer)
+	signal.Decode(sessionDescriptor, &offer)
 
 	// Set the remote SessionDescription
 	if err = peerConnection.SetRemoteDescription(offer); err != nil {

Edit: looking at metadata in NALU slice header doesn't really seem feasible: where frame_num will be in the slice header seems to depend on a lot of things like active SPS/PPS, profile, etc. Parsing is probably best left to the decoder. In my case, I'm getting the NALUs from the encoder on a per frame basis, along with a frame number, which should be enough to munge the RTP timestamp on the sample.

kevmo314 · 2024-04-18T12:45:25Z

Have you tried disabling sliced-threads? I ran across something very similar a while back ago: https://groups.google.com/g/discuss-webrtc/c/3tLWL9yyjsA

Sean-Der added bug Something isn't working difficulty:hard labels May 9, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

H264 ffmpeg with zerolatency but play with green blocks #2424

H264 ffmpeg with zerolatency but play with green blocks #2424

coconutLatte commented Feb 24, 2023

coconutLatte commented Feb 24, 2023 •

edited

Sean-Der commented Feb 24, 2023

coconutLatte commented Feb 25, 2023 •

edited

coconutLatte commented Feb 27, 2023

Fruneng commented Feb 27, 2023

retamia commented Mar 14, 2023 •

edited

CastriOnlive commented May 12, 2023 •

edited

retamia commented Jun 25, 2023

trey-hakanson-skydio commented Apr 5, 2024 •

edited

basicfu commented Apr 6, 2024

trey-hakanson-skydio commented Apr 7, 2024 •

edited

kevmo314 commented Apr 18, 2024

H264 ffmpeg with zerolatency but play with green blocks #2424

H264 ffmpeg with zerolatency but play with green blocks #2424

Comments

coconutLatte commented Feb 24, 2023

Your environment.

What did you do?

What did you expect?

What happened?

coconutLatte commented Feb 24, 2023 • edited

Sean-Der commented Feb 24, 2023

coconutLatte commented Feb 25, 2023 • edited

coconutLatte commented Feb 27, 2023

Fruneng commented Feb 27, 2023

retamia commented Mar 14, 2023 • edited

CastriOnlive commented May 12, 2023 • edited

retamia commented Jun 25, 2023

trey-hakanson-skydio commented Apr 5, 2024 • edited

basicfu commented Apr 6, 2024

trey-hakanson-skydio commented Apr 7, 2024 • edited

kevmo314 commented Apr 18, 2024

coconutLatte commented Feb 24, 2023 •

edited

coconutLatte commented Feb 25, 2023 •

edited

retamia commented Mar 14, 2023 •

edited

CastriOnlive commented May 12, 2023 •

edited

trey-hakanson-skydio commented Apr 5, 2024 •

edited

trey-hakanson-skydio commented Apr 7, 2024 •

edited