Better parsing of srt subtitles to remove double newlines/breaks #31
Comments
Another example
I'm also having this issue right now, torn between writing my own converter and pre-patching the srt file to get rid of these line breaks.
I ended up writing a pre-patch that sanitizes my srt files before reading them with webvtt. It uses a mix of plain replace calls and regex to remove the extra line breaks, and I keep expanding the regex whenever I run into another formatting mess.
Hi @shubhank008, could you share the replace / regex that you used? Running into the same issues!
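For anyone landing here before a patch is shared: below is a minimal sketch of such a pre-patch. It is not the regex mentioned above. It assumes the "double line breaks" are extra blank lines scattered between the index, timing, and text lines of a cue, and that cue text never contains a bare number immediately followed by a timing line. The names `sanitize_srt` and `TIMING` are made up for this illustration. Instead of collapsing newlines blindly, it drops every blank line and then re-inserts exactly one blank line before each cue, detected as a numeric index line followed by a timing line.

```python
import re

# An SRT timing line, e.g. "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(
    r"^\d{1,2}:\d{2}:\d{2}[,.]\d{1,3}\s*-->\s*\d{1,2}:\d{2}:\d{2}[,.]\d{1,3}"
)


def sanitize_srt(text: str) -> str:
    """Rebuild an SRT string with exactly one blank line between cues.

    Assumption: stray blank lines are the only damage in the file.
    """
    # Strip a leading BOM and normalize Windows / old Mac line endings.
    text = text.lstrip("\ufeff").replace("\r\n", "\n").replace("\r", "\n")
    # Drop every blank line; cue boundaries are re-detected below.
    lines = [ln.strip() for ln in text.split("\n") if ln.strip()]

    cues = []
    i = 0
    while i < len(lines):
        # A cue starts with a numeric index followed by a timing line.
        starts_cue = (
            lines[i].isdigit()
            and i + 1 < len(lines)
            and TIMING.match(lines[i + 1]) is not None
        )
        if starts_cue:
            cues.append([lines[i], lines[i + 1]])
            i += 2
        else:
            # Text line: attach it to the current cue.
            # Anything before the first cue is discarded.
            if cues:
                cues[-1].append(lines[i])
            i += 1

    return "\n\n".join("\n".join(cue) for cue in cues) + "\n"


messy = (
    "1\r\n\r\n00:00:01,000 --> 00:00:02,500\r\n\r\nHello\r\n\r\nworld\r\n"
    "\r\n\r\n2\r\n\r\n00:00:03,000 --> 00:00:04,000\r\n\r\nBye\r\n"
)
clean = sanitize_srt(messy)
```

The cleaned string can then be written to a new .srt file and read with `webvtt.from_srt` as usual. If your files are broken in some other way, the cue detection will likely need adjusting.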
I am getting a Malformed exception on some of my srt files because they contain weird double line breaks, which I think break your parser.
I tried fixing it by replacing 2 or 3 consecutive line breaks with a single line break, but that wasn't as accurate as a regex or a proper parsing approach would be. I would appreciate it if you could add this.
Example subtitle (part of it)