-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Suboptimal lowering for list patterns that can be handled by .StartsWith/.EndsWith
#77908
Comments
I don't know why the jit would not be able to optimize this. |
A microbenchmark would help us understand the impact here. It would be useful also to compare with cases where each of the list patterns had the same element patterns except one e.g. |
Given sample code: using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
static class Example
{
static bool Test1(ReadOnlySpan<byte> data)
{
if (data.Length < 8)
return false;
return data
is [0x01, 0x02, 0x03, 0x04, ..]
or [0x05, 0x06, 0x07, 0x08, ..]
or [0x09, 0x0A, 0x0B, 0x0C, ..]
or [0x0D, 0x0E, 0x0F, 0x10, ..]
or [0x11, 0x12, 0x13, 0x14, ..]
or [0x15, 0x16, 0x17, 0x18, ..]
or [0x19, 0x1A, 0x1B, 0x1C, ..];
}
static bool Test2(ReadOnlySpan<byte> data)
{
if (data.Length < 8)
return false;
return
data.StartsWith((ReadOnlySpan<byte>)[0x01, 0x02, 0x03, 0x04]) ||
data.StartsWith((ReadOnlySpan<byte>)[0x05, 0x06, 0x07, 0x08]) ||
data.StartsWith((ReadOnlySpan<byte>)[0x09, 0x0A, 0x0B, 0x0C]) ||
data.StartsWith((ReadOnlySpan<byte>)[0x0D, 0x0E, 0x0F, 0x10]) ||
data.StartsWith((ReadOnlySpan<byte>)[0x11, 0x12, 0x13, 0x14]) ||
data.StartsWith((ReadOnlySpan<byte>)[0x15, 0x16, 0x17, 0x18]) ||
data.StartsWith((ReadOnlySpan<byte>)[0x19, 0x1A, 0x1B, 0x1C]);
}
static bool Test3(ReadOnlySpan<byte> data)
{
if (data.Length < 8)
return false;
return
Match(data, 0x01, 0x02, 0x03, 0x04) ||
Match(data, 0x05, 0x06, 0x07, 0x08) ||
Match(data, 0x09, 0x0A, 0x0B, 0x0C) ||
Match(data, 0x0D, 0x0E, 0x0F, 0x10) ||
Match(data, 0x11, 0x12, 0x13, 0x14) ||
Match(data, 0x15, 0x16, 0x17, 0x18) ||
Match(data, 0x19, 0x1A, 0x1B, 0x1C);
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
static bool Match(ReadOnlySpan<byte> bytes, byte a, byte b, byte c, byte d) =>
MemoryMarshal.Read<uint>(bytes) == (a | (b << 8) | (c << 16) | (d << 24));
} Godbolt's disasm: https://godbolt.org/z/3T9naoevs The cost of evaluating early "not match" condition in the original form should be marginally cheaper than the "improved" variant The simulated costs on Skylake (with https://uica.uops.info) are: I understand that this requires Roslyn to make certain assumptions about what the runtime accepting CIL can or cannot do, but it would be nice to just write idiomatic patterns and know that the lowering strategy will pick a close to optimal way to handle them. In the disasm, there are comparison sequences which I think could be merged by the JIT and will submit a separate issue at dotnet/runtime. Thanks. |
Given
it appears that Roslyn compiles such patterns to
This is not efficient as it results in a multitude of byte-by-byte comparisons which are not required here (and no, JIT will not optimize these away because it is a complex set of conditions).
It would be great if Roslyn could emit the following instead:
which results in a much more efficient codegen (and could be a subject to further and better JIT optimization).
They key defining factor is that primitive comparisons up to a machine word size would be more efficient if simply merged together (or even beyond that, to an extent), which is what happens when .StartsWith is called - it's just a single
uint
comparison (even if the codegen is not perfect currently). It is massively cheaper than doing byte-by-byte reads.The text was updated successfully, but these errors were encountered: