Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suboptimal codegen with StartsWith(literal) #110019

Open
MihaZupan opened this issue Nov 20, 2024 · 4 comments
Open

Suboptimal codegen with StartsWith(literal) #110019

MihaZupan opened this issue Nov 20, 2024 · 4 comments
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Milestone

Comments

@MihaZupan
Copy link
Member

When switching from the BitConverter.ToUInt64 hack to the more readable StartsWith(literal), the codegen slightly regresses (doesn't matter for actual perf in this case).

private static readonly ulong s_http10Bytes = BitConverter.ToUInt64("HTTP/1.0"u8);
private static readonly ulong s_http11Bytes = BitConverter.ToUInt64("HTTP/1.1"u8);

private static int Parse1(ReadOnlySpan<byte> line)
{
    if (line.Length < 12 || line[8] != ' ') return 1;

    ulong first8Bytes = BitConverter.ToUInt64(line);
    if (first8Bytes == s_http11Bytes) return 2;
    if (first8Bytes == s_http10Bytes) return 3;

    return 4;
}

private static int Parse2(ReadOnlySpan<byte> line)
{
    if (line.Length < 12 || line[8] != ' ') return 1;

    if (line.StartsWith("HTTP/1.1"u8)) return 2;
    if (line.StartsWith("HTTP/1.0"u8)) return 3;

    return 4;
}

https://www.diffchecker.com/X4PhzKVx/

Patterns like

-cmp       rcx,rax
-jne       short M01_L00
+cmp       rcx,[rax]
+sete      al
+movzx     eax,al
+test      eax,eax
+je        short M01_L00
@MihaZupan MihaZupan added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Nov 20, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Nov 20, 2024
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@EgorBo EgorBo added this to the 10.0.0 milestone Nov 20, 2024
@EgorBo EgorBo removed the untriaged New issue has not been triaged by the area owner label Nov 20, 2024
@EgorBo EgorBo self-assigned this Nov 20, 2024
@jakobbotsch
Copy link
Member

This probably needs some form of late forward substitution. Perhaps we should look into removing single-def single-use locals during rationalization (I have some code lying around to do this...).

@jakobbotsch
Copy link
Member

The problem is that we end up with IR like

x = (len == 4 ? SequenceEqual(a, b, 4) : SequenceEqual(a, b, len));
if (x != 0)
{
  passed:
  // do something
}

and later we are able to show that len == 4 is impossible, but the local sticks around.

It might be that we can teach qmark expansion to do some jump threading directly for cases like this, avoiding the intermediate local. Essentially qmark expansion would need to realize that the x != 0 is a last use of x and then instead of the above pseudocode produce IR corresponding to

if (SequenceEqual(a, b, 4) || SequenceEqual(a, b, len))
  goto passed;
...

It requires liveness, but we do have liveness available when qmark expansion runs. However making use of it might interfere with @AndyAyersMS's future plans to move qmark expansion backwards...

@jakobbotsch
Copy link
Member

Although the problem also happens with DOTNET_JitProfileValues=0, so the qmark thing does not explain it fully. In that case we end up (after inlining StartsWith) with something corresponding to the following:

bool x;
if ("HTTP/1.1"u8.Length > line.Length)
  x = false;
else
  x = SequenceEqual(a, b, c);
if (x)
{
  passed:
  // do something
}

RBO is able to prove that "HTTP/1.1"u8.Length > line.Length is false given the manual test made at the beginning of the function, but we still have that intermediate local around that results in this codegen.

We either need some smarter jump threading here (changing the x = SequenceEqual(a, b, c) to something like if (SequenceEqual(a, b, c)) goto passed;); or we need some form of late forward substitution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

No branches or pull requests

3 participants