Emoji processing is very slow #63

Open
GLEB-M opened this issue Nov 25, 2022 · 11 comments
@GLEB-M

GLEB-M commented Nov 25, 2022

FlowDocumentExtensions.cs, SubstituteGlyphs method

        while (cur.CompareTo(range_end) < 0)
        {
            TextPointer next = cur.GetNextInsertionPosition(LogicalDirection.Forward);
            if (next == null)
                break;

            string replace_text = null;
            var replace_range = new TextRange(cur, next);
            if (replace_range.Text.Length > 0 && EmojiData.MatchStart.Contains(replace_range.Text[0]))  

This code is very slow! With as few as 100 characters of text, lag occurs.

Reading replace_range.Text, i.e. getting the text from a range for every single character, is not a fast operation for a rich edit control.
What about optimizing this?

samhocevar added the bug label Jan 2, 2023
samhocevar added a commit that referenced this issue Jan 3, 2023:
Try to restrict the range on which operations are performed. There are a lot of corner cases that may not work properly, but at least it’s a lot faster now.
@samhocevar
Owner

I tried improving the substitution logic in febfcdd. Can you maybe give it a try?

@mike-ward

Would it make sense to first check the entire string for a known pattern (if one exists), and loop through the string only when a match is found?
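A minimal sketch of such a pre-check, under stated assumptions: `EmojiPrecheck`, `MightContainEmoji`, and `CandidateRegex` are hypothetical names, and the pattern is only a stand-in for the library's real emoji pattern. It matches UTF-16 surrogate pairs, which covers many emoji but not all of them, and it also matches non-emoji characters outside the Basic Multilingual Plane, so it can only serve as a cheap filter before the real scan.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmojiPrecheck
{
    // Assumption: a stand-in pattern, NOT the library's emoji regex.
    // It matches any UTF-16 surrogate pair (high surrogate + low surrogate).
    private static readonly Regex CandidateRegex =
        new Regex(@"[\uD800-\uDBFF][\uDC00-\uDFFF]", RegexOptions.Compiled);

    // Returns true when the text might contain an emoji and therefore
    // deserves the full (expensive) substitution pass.
    public static bool MightContainEmoji(string text)
    {
        return !string.IsNullOrEmpty(text) && CandidateRegex.IsMatch(text);
    }
}
```

A caller would read the text once, call `MightContainEmoji`, and return early when it yields false, so plain text never enters the per-character loop.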

@mike-ward

Another possible option would be to store a hash of strings encountered that have no emoji sequences.
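One way this could look, as a rough sketch: `NoEmojiCache`, `NeedsProcessing`, and `ScanCount` are hypothetical names, and the emoji detector is passed in as a delegate because the real check is not shown here. The sketch stores whole strings in a `HashSet<string>`; storing only hash values would use less memory at the cost of possible collisions. The set is unbounded, which a real implementation would likely need to address.

```csharp
using System;
using System.Collections.Generic;

public sealed class NoEmojiCache
{
    // Strings already scanned and found to contain no emoji.
    private readonly HashSet<string> known_plain =
        new HashSet<string>(StringComparer.Ordinal);

    // Assumption: the caller supplies the (expensive) emoji detector.
    private readonly Func<string, bool> contains_emoji;

    // Number of times the expensive detector actually ran.
    public int ScanCount { get; private set; }

    public NoEmojiCache(Func<string, bool> containsEmoji)
    {
        contains_emoji = containsEmoji ?? throw new ArgumentNullException(nameof(containsEmoji));
    }

    // Returns true when the text contains emoji and needs substitution.
    public bool NeedsProcessing(string text)
    {
        if (string.IsNullOrEmpty(text) || known_plain.Contains(text))
            return false;

        ScanCount++;
        if (contains_emoji(text))
            return true;

        known_plain.Add(text); // remember the negative result
        return false;
    }
}
```

In an editor the text changes on every keystroke, so the hit rate of such a cache may be low there; it seems more promising for display-only controls that show the same strings repeatedly.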

@GLEB-M
Author

GLEB-M commented Jan 4, 2023

> I tried improving the substitution logic in febfcdd. Can you maybe give it a try?

Initially I thought the problem was in rendering, but after debugging I realized it was not. Iterating through the richedit characters and getting a range for each one is very slow! For a while I thought this could be fixed simply, but then I realized that the logic for replacing text with emoji needs to change radically. My attempts at that were not successful, though, because of the nuances of emoji processing.

@samhocevar
Owner

Yes, TextRange is really too high level for what I am doing. I did not realise the approach would be so slow. But I believe several optimisations are possible:

  • perform asynchronous Emoji replacements (won’t help with overall CPU usage, but will improve responsiveness)
  • do not call SubstituteGlyphsInRange on text runs that have already been processed
  • read more than one character at a time when scanning for Emoji in a Run or FlowDocument.
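The second item above could be sketched roughly as follows. `ProcessedRunTracker` and `ShouldProcess` are hypothetical names, and the `run` parameter is typed as `object` so the sketch stays independent of WPF; in practice it would presumably be a `Run` or similar text element. A `ConditionalWeakTable` is used so that tracking a run does not keep it alive after it is removed from the document.

```csharp
using System;
using System.Runtime.CompilerServices;

public sealed class ProcessedRunTracker
{
    // Maps each run to the text it had when it was last processed.
    // Entries disappear automatically once the run is garbage collected.
    private readonly ConditionalWeakTable<object, string> processed =
        new ConditionalWeakTable<object, string>();

    // Returns true when the run is new or its text changed since the
    // last call, and records the current text as processed.
    public bool ShouldProcess(object run, string current_text)
    {
        if (run == null) throw new ArgumentNullException(nameof(run));

        if (processed.TryGetValue(run, out string last_text) && last_text == current_text)
            return false; // unchanged since last pass: skip it

        processed.Remove(run);             // no-op when the run is not tracked yet
        processed.Add(run, current_text);
        return true;
    }
}
```

The substitution pass would then call `ShouldProcess` for each run and skip those that return false, so typing in one paragraph does not trigger a rescan of the others.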

@Swindler95

Swindler95 commented Feb 14, 2023

Hi, it's not perfect yet, but I modified the SubstituteGlyphsInRange method so that it no longer iterates through the entire string of characters. It only processes the emojis.

private static readonly Regex emojiRegex = new Regex(EmojiData.MatchOne.ToString(), RegexOptions.None);

internal static void SubstituteGlyphsInRange(TextRange range, double default_font_size, Brush default_foreground, DependencyObject parent, SubstituteOptions options)
{
	// Get the parent RichTextBox
	RichTextBox rtb = parent as RichTextBox;

	// Check if ColonSyntax option is enabled
	var colon_syntax = (options & SubstituteOptions.ColonSyntax) != 0;

	// Check if ColorBlend option is enabled
	var color_blend = (options & SubstituteOptions.ColorBlend) != 0;

	// Get the caret position in the RichTextBox
	TextPointer caret = rtb?.CaretPosition;

	// Get the text within the specified text range
	string text = range.Text;

	// Search for emoji matches in the text using the emojiRegex regular expression
	var matches = emojiRegex.Matches(text);

	// Iterate over each match found
	foreach (Match match in matches)
	{
		// Get the start and end position of the match
		var start = range.Start.GetPositionAtOffset(match.Index);
		var end = start.GetPositionAtOffset(match.Length);

		// Create a text range to replace with the emoji
		var replace_range = new TextRange(start, end);
		
		// Get the text of the emoji
		var replace_text = match.Value;

		// Check whether the caret lies inside the range to be replaced or at its end
		bool caret_was_next = caret != null && start.CompareTo(caret) < 0 && end.CompareTo(caret) >= 0;

		// Get the font size and foreground color of the text range to be replaced
		var font_size = replace_range.GetPropertyValue(TextElement.FontSizeProperty);
		var foreground = replace_range.GetPropertyValue(TextElement.ForegroundProperty);

		// Replace the text range with an EmojiInline
		replace_range.Text = "";
		Inline inline = new EmojiInline(start)
		{
			FontSize = (double)(font_size ?? default_font_size),
			Foreground = color_blend ? (Brush)(foreground ?? default_foreground) : Brushes.Black,
			Text = replace_text,
		};

		// If the caret was inside or at the end of the replaced range, move it to the end of the inserted emoji
		if (caret_was_next)
			caret = inline.ContentEnd;
	}

	// If the parent RichTextBox is not null, update the caret position
	if (rtb != null)
		rtb.CaretPosition = caret;
}

The first search for an emoji is a bit slow. But after that, regardless of the size of the text, it remains fast.
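A note on the loop above: `Regex.Matches` reports indices into the original `range.Text`, but each iteration edits the document, so positions computed for later matches may no longer line up. `TextPointer.GetPositionAtOffset` also counts symbols (element tags included), not only characters, so a character index and a symbol offset can differ in formatted text. One common way around the first problem is to apply replacements from the last match to the first. The sketch below shows that idea on a plain `StringBuilder`, which only stands in for the document here; `ReverseReplace` and `ReplaceFromEnd` are hypothetical names, and whether this explains any misbehaviour of the method is an assumption.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class ReverseReplace
{
    // Replaces every regex match, working from the last match to the
    // first so that the indices of not-yet-processed matches stay valid.
    public static string ReplaceFromEnd(string text, Regex regex, Func<string, string> replacement)
    {
        var builder = new StringBuilder(text);
        MatchCollection matches = regex.Matches(text);

        for (int i = matches.Count - 1; i >= 0; --i)
        {
            Match match = matches[i];
            // Everything before match.Index is still untouched here.
            builder.Remove(match.Index, match.Length);
            builder.Insert(match.Index, replacement(match.Value));
        }
        return builder.ToString();
    }
}
```

Applied to the method above, the same ordering would mean iterating over `matches` backwards; the character-to-symbol offset conversion would still need separate handling.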

@GLEB-M
Author

GLEB-M commented Feb 14, 2023

Hi, Swindler95

var matches = emojiRegex.Matches(text); // where is the emojiRegex object defined?

@Swindler95

Hi, @GLEB-M

Sorry, I forgot to add the variable:
private static readonly Regex emojiRegex = new Regex(EmojiData.MatchOne.ToString(), RegexOptions.None);

@GLEB-M
Author

GLEB-M commented Feb 15, 2023

Swindler95, I tried to use your code, but it doesn't work properly.

[image]

@Swindler95

@GLEB-M

I said it wasn't perfect 😄. In fact, I'm only using this new method for input, because I need to be able to type a large amount of text with few emojis, and it works very well for that. With the old SubstituteGlyphsInRange, the more text there is, the longer processing takes, until it becomes really unusable. For displaying text, however, I've developed a new class called "RichTextBlock" that uses the old SubstituteGlyphsInRange, which is more stable and displays all emojis correctly.

I'll continue working on improving this method, but if anyone else wants to improve it, I'm open to it. 😉

@TrabacchinLuigi

TrabacchinLuigi commented Jun 26, 2023

I did some profiling on this project and I see that SubstituteGlyphsInRange is called a huge number of times. I was thinking: one improvement could be to render a line and then cache it, re-rendering everything only if the container is resized horizontally (vertically for top-down languages?). I'm not sure what the HLSL NuGet package is doing, but if we could store those cached bitmaps in graphics card memory, that would be even faster and more memory-efficient.
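The caching idea could be sketched like this, with heavy assumptions: `LineRenderCache`, `Get`, and `RenderCount` are hypothetical names, the renderer is passed in as a delegate, and the rendered result is a generic type rather than a real bitmap or GPU resource. The sketch only covers the invalidation rule described above (drop everything when the width changes) and leaves out eviction and graphics memory entirely.

```csharp
using System;
using System.Collections.Generic;

public sealed class LineRenderCache<TRendered>
{
    private readonly Dictionary<string, TRendered> cache =
        new Dictionary<string, TRendered>(StringComparer.Ordinal);

    // Assumption: the caller supplies the expensive renderer (line, width) -> result.
    private readonly Func<string, double, TRendered> render;

    // Width the cached entries were rendered for; NaN means "nothing cached yet".
    private double cached_width = double.NaN;

    // Number of times the expensive renderer actually ran.
    public int RenderCount { get; private set; }

    public LineRenderCache(Func<string, double, TRendered> render)
    {
        this.render = render ?? throw new ArgumentNullException(nameof(render));
    }

    public TRendered Get(string line, double width)
    {
        // A horizontal resize invalidates every cached line.
        if (width != cached_width)
        {
            cache.Clear();
            cached_width = width;
        }

        if (!cache.TryGetValue(line, out TRendered rendered))
        {
            rendered = render(line, width);
            RenderCount++;
            cache[line] = rendered;
        }
        return rendered;
    }
}
```

Keying on the line text means that identical lines share one cached result, which may or may not be desirable if per-line formatting differs.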
