H264 ffmpeg with zerolatency but play with green blocks #2424
Comments
Hi @coconutLatte ! Would you mind sharing the exact ffmpeg command you ran? I will reproduce and fix against example-webrtc-applications then.
Hi @Sean-Der ! First of all, I'm not using an ffmpeg command to encode it; I encode it with C code. There are two h264 files below, one encoded with `tune=zerolatency` and one without.
Hi @Sean-Der ! When `tune=zerolatency` is turned on, one frame is split across several NAL units. But in pion/rtp/codecs/h264_packet.go (the version that pion/webrtc v3.1.55 imports is v1.7.13), assembling the h264 packet always adds a start-code prefix; maybe when h264reader reads one NAL, we should know which frame it belongs to, so we can finally packet it together.
Is this caused by RTP STAP-A packets in non-interleaved mode?
FFmpeg's `tune zerolatency` enables multi-slice encoding, so one frame of video consists of many h264 NAL units. In the example code, each NAL unit is sent as its own RTP packet and the RTP timestamp is advanced by one frame duration each time, so the slices of a single frame no longer share a timestamp and the decoder does not treat them as the same frame.
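The multi-slice behavior is easy to see by splitting the Annex B stream on start codes and printing each NAL unit's type: with `tune=zerolatency` there are several slice NALs (types 1 or 5) between consecutive SPS/PPS pairs. A minimal stdlib-only Go sketch; the sample bytes here are fabricated for illustration:

```go
package main

import (
	"bytes"
	"fmt"
)

// splitAnnexB splits an Annex B H.264 byte stream on 3-byte 00 00 01 start
// codes and returns the raw NAL units.
func splitAnnexB(stream []byte) [][]byte {
	var nals [][]byte
	for _, seg := range bytes.Split(stream, []byte{0, 0, 1}) {
		// A 4-byte 00 00 00 01 start code leaves a stray 00 on the end of
		// the previous segment; NAL payloads don't end in 0x00, so trim it.
		seg = bytes.TrimRight(seg, "\x00")
		if len(seg) == 0 {
			continue
		}
		nals = append(nals, seg)
	}
	return nals
}

func main() {
	// Fabricated stream: SPS (type 7), PPS (type 8), then two IDR slices
	// (type 5) of the same frame, as produced by multi-slice encoding.
	stream := []byte{
		0, 0, 0, 1, 0x67, 0x42, // SPS
		0, 0, 0, 1, 0x68, 0xCE, // PPS
		0, 0, 1, 0x65, 0x88, // IDR slice 1
		0, 0, 1, 0x65, 0x99, // IDR slice 2
	}
	for _, nal := range splitAnnexB(stream) {
		fmt.Printf("nal_unit_type=%d\n", nal[0]&0x1F) // type is the low 5 bits of the header octet
	}
}
```

If the slices of one frame are sent as separate samples, each advances the RTP timestamp, which is exactly the mismatch described above.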
Is there any fix for this? I have tested many configurations, but with `-tune zerolatency` active I don't know how to decode the NAL units later to avoid the green screen. Has anyone achieved this with pion? Is VP8 the only alternative for zero latency?
@VictorCPH |
@Sean-Der I'm running into a similar issue, and would appreciate any tips on how to debug! I'm using a sample mp4 file, but I've seen the same issue with every file I've tried, so the input shouldn't really matter. If I use the command mentioned in the example:

```
ffmpeg -i $INPUT_FILE -an -c:v libx264 -bsf:v h264_mp4toannexb -b:v 2M -max_delay 0 -bf 0 output.h264
```

everything plays fine. But when I add `-tune zerolatency`:

```
ffmpeg -i $INPUT_FILE -an -c:v libx264 -bsf:v h264_mp4toannexb -b:v 2M -max_delay 0 -bf 0 -tune zerolatency output.h264
```

I see the green artifacts. Digging into the generated h264 files, the only real difference I see is that with `-tune zerolatency` each frame is split across multiple slice NAL units.
My initial thought was that the following logic in pion/rtp would be problematic if the sample provided to WriteSample did not include the SPS, PPS, and all IDR slices, so they could all be included in the same RTP packet. But after reading RFC 6184 §5.2 more closely, I'm less convinced I'm on the right track: pion/rtp loads the SPS and PPS into a STAP-A NALU and then sends the IDR slices as separate FU-A NALUs, which should be fine and handled by the browser's decoder. Sending all the NALUs corresponding to the iframe does at least cause them to have the same RTP timestamp, which helped a little: the browser rendered a full iframe with no green bars, but didn't start playing back from the non-IDR NALUs afterwards. Still digging into this on my end, but would appreciate any thoughts 🙂
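For reference, the STAP-A aggregation mentioned above has a simple wire format (RFC 6184 §5.7.1): one NAL header octet with type 24, then a 16-bit big-endian size followed by the NAL unit, repeated. A hedged stdlib-only sketch (the SPS/PPS bytes are fabricated, and this is not pion's actual packetizer code):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const stapAType = 24 // RFC 6184: STAP-A NAL unit type

// buildSTAPA aggregates several NAL units (e.g. SPS and PPS) into a single
// STAP-A payload: one header octet, then a 16-bit size + NAL unit for each.
func buildSTAPA(nals ...[]byte) []byte {
	// The aggregation header's NRI field should be the maximum NRI of the
	// aggregated NAL units, per RFC 6184.
	var nri byte
	for _, n := range nals {
		if v := n[0] & 0x60; v > nri {
			nri = v
		}
	}
	out := []byte{nri | stapAType}
	for _, n := range nals {
		var size [2]byte
		binary.BigEndian.PutUint16(size[:], uint16(len(n)))
		out = append(out, size[:]...)
		out = append(out, n...)
	}
	return out
}

func main() {
	sps := []byte{0x67, 0x42, 0x00, 0x1F} // fabricated SPS
	pps := []byte{0x68, 0xCE, 0x3C, 0x80} // fabricated PPS
	fmt.Printf("% 02x\n", buildSTAPA(sps, pps))
}
```

Since the whole STAP-A is one RTP packet, the SPS and PPS necessarily share a timestamp; the IDR slices sent as FU-A packets afterwards must carry the same timestamp too for the decoder to treat them as one access unit.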
Hello, have you resolved this issue? I also need a zero-latency remote desktop and am stuck here. @coconutLatte
Ok I think I've figured it out; after staring at the NALs for a while, I realized that with `-tune zerolatency` each frame is split across multiple slice NALs, so several NALs have to be written as one sample. This patch adds an `-nps` (NALs per sample) flag for that:

```diff
diff --git a/play-from-disk-h264/main.go b/play-from-disk-h264/main.go
index cd05bcc..17f9bfe 100644
--- a/play-from-disk-h264/main.go
+++ b/play-from-disk-h264/main.go
@@ -10,6 +10,7 @@ package main
 import (
 	"context"
 	"errors"
+	"flag"
 	"fmt"
 	"io"
 	"os"
@@ -23,13 +24,24 @@ import (
 )
 
 const (
-	audioFileName     = "output.ogg"
-	videoFileName     = "output.h264"
 	oggPageDuration   = time.Millisecond * 20
 	h264FrameDuration = time.Millisecond * 33
 )
 
+var (
+	audioFileName     string
+	videoFileName     string
+	sessionDescriptor string
+	nalsPerSample     int
+)
+
 func main() { //nolint
+	flag.StringVar(&audioFileName, "ain", "output.ogg", "audio file to process")
+	flag.StringVar(&videoFileName, "vin", "output.h264", "video file to process")
+	flag.StringVar(&sessionDescriptor, "sd", "", "session descriptor")
+	flag.IntVar(&nalsPerSample, "nps", 1, "NAL units per sample")
+	flag.Parse()
+
 	// Assert that we have an audio or video file
 	_, err := os.Stat(videoFileName)
 	haveVideoFile := !os.IsNotExist(err)
@@ -41,6 +53,11 @@ func main() { //nolint
 		panic("Could not find `" + audioFileName + "` or `" + videoFileName + "`")
 	}
 
+	// Assert that we have a session descriptor
+	if sessionDescriptor == "" {
+		panic("Session descriptor must be provided")
+	}
+
 	// Create a new RTCPeerConnection
 	peerConnection, err := webrtc.NewPeerConnection(webrtc.Configuration{
 		ICEServers: []webrtc.ICEServer{
@@ -62,7 +79,10 @@ func main() { //nolint
 	if haveVideoFile {
 		// Create a video track
-		videoTrack, videoTrackErr := webrtc.NewTrackLocalStaticSample(webrtc.RTPCodecCapability{MimeType: webrtc.MimeTypeH264}, "video", "pion")
+		videoTrack, videoTrackErr := webrtc.NewTrackLocalStaticSample(webrtc.RTPCodecCapability{
+			MimeType:    webrtc.MimeTypeH264,
+			SDPFmtpLine: "level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42e01f",
+		}, "video", "pion")
 		if videoTrackErr != nil {
 			panic(videoTrackErr)
 		}
@@ -106,17 +126,36 @@ func main() { //nolint
 			// * avoids accumulating skew, just calling time.Sleep didn't compensate for the time spent parsing the data
 			// * works around latency issues with Sleep (see https://github.com/golang/go/issues/44343)
 			ticker := time.NewTicker(h264FrameDuration)
-			for ; true; <-ticker.C {
-				nal, h264Err := h264.NextNAL()
-				if errors.Is(h264Err, io.EOF) {
-					fmt.Printf("All video frames parsed and sent")
-					os.Exit(0)
-				}
-				if h264Err != nil {
-					panic(h264Err)
+			for {
+				<-ticker.C
+
+				nals := 0
+				buffer := make([]byte, 0)
+
+				for nals < nalsPerSample {
+					nal, h264Err := h264.NextNAL()
+					if errors.Is(h264Err, io.EOF) {
+						fmt.Printf("All video frames parsed and sent")
+						os.Exit(0)
+					} else if h264Err != nil {
+						panic(h264Err)
+					}
+
+					if nal.UnitType == h264reader.NalUnitTypeSPS {
+						// no-op
+					} else if nal.UnitType == h264reader.NalUnitTypePPS {
+						// no-op
+					} else {
+						nals += 1
+					}
+
+					if len(buffer) != 0 { // append start code as delimiter after first NAL
+						buffer = append(buffer, []byte{0, 0, 1}...)
+					}
+					buffer = append(buffer, nal.Data...)
 				}
-				if h264Err = videoTrack.WriteSample(media.Sample{Data: nal.Data, Duration: h264FrameDuration}); h264Err != nil {
-					panic(h264Err)
-				}
+				if wErr := videoTrack.WriteSample(media.Sample{Data: buffer, Duration: h264FrameDuration}); wErr != nil {
+					panic(wErr)
+				}
 			}
@@ -218,7 +257,7 @@ func main() { //nolint
 	// Wait for the offer to be pasted
 	offer := webrtc.SessionDescription{}
-	signal.Decode(signal.MustReadStdin(), &offer)
+	signal.Decode(sessionDescriptor, &offer)
 
 	// Set the remote SessionDescription
 	if err = peerConnection.SetRemoteDescription(offer); err != nil {
```

Edit: looking at metadata in the NALU slice header doesn't really seem feasible.
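The edit above notes that parsing the slice header seemed infeasible; still, for reference, the first syntax element after the NAL header in a slice is `first_mb_in_slice`, an Exp-Golomb value that is 0 only for the first slice of a frame, which gives a frame-boundary heuristic that avoids hard-coding a slices-per-frame count. A hedged sketch (it ignores emulation-prevention bytes, which is usually harmless for the first bytes of a slice header but not strictly correct; the sample bytes are fabricated):

```go
package main

import "fmt"

// readUE decodes one unsigned Exp-Golomb value (ue(v), H.264 §9.1) from
// buf starting at bit offset *pos, advancing *pos past the value.
func readUE(buf []byte, pos *int) uint {
	bit := func() uint {
		b := uint(buf[*pos/8]>>(7-uint(*pos%8))) & 1
		*pos++
		return b
	}
	zeros := 0
	for bit() == 0 { // count leading zero bits up to the stop bit
		zeros++
	}
	v := uint(1)
	for i := 0; i < zeros; i++ { // read the same number of suffix bits
		v = v<<1 | bit()
	}
	return v - 1
}

// startsNewFrame reports whether a slice NAL begins a new access unit,
// i.e. whether its first_mb_in_slice (the first ue(v) after the one-byte
// NAL header) is zero.
func startsNewFrame(nal []byte) bool {
	pos := 8 // skip the NAL header octet
	return readUE(nal, &pos) == 0
}

func main() {
	// Fabricated slice headers: first_mb_in_slice = 0 encodes as bit '1';
	// first_mb_in_slice = 4 encodes as '00101'.
	first := []byte{0x65, 0b10000000}  // IDR slice, first_mb_in_slice = 0
	second := []byte{0x65, 0b00101000} // IDR slice, first_mb_in_slice = 4
	fmt.Println(startsNewFrame(first), startsNewFrame(second))
}
```

With this, a sender could accumulate NALs into a buffer and flush the sample whenever the next slice reports `first_mb_in_slice == 0`, instead of relying on a fixed `-nps` value.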
Have you tried disabling it?
Your environment.
What did you do?
Hi,
I'm a software engineer working on a remote desktop solution for the web. Because we need low latency, we chose WebRTC to transport the video.
In the server-side encoder we encode video with ffmpeg: the codec is h264, with x264opts set to `tune=zerolatency`. We proxy the h264 stream like the example https://github.com/pion/example-webrtc-applications/tree/master/play-from-disk-h264.
But the website doesn't play it normally: the screen appears split into 4 horizontal parts, and some parts flash green...
I already dumped the h264 data from the encoder and played it with `ffplay`; it plays fine. But the website does not... The strangest part is that if I do not set x264opts `tune=zerolatency`, it shows fine on the web; if I set it, I get green blocks.
What did you expect?
Video plays well on the website.
What happened?
Video plays with green blocks.