Skip to content

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Oct 14, 2025

Contributes to #120407

It looks like even a simple

System.Console("Hello world");

may trigger ICU load unnecessarily via string.ToUpperInvariant calls:

image

Another source of these string.ToUpperInvariants are EventSource (over EventSources' names to generate guids) - we plan to remove those, though.

For most app the ICU most likely will be loaded anyway, but IMO if it doesn't take too much of efforts, we should delay/avoid it. In this PR it is achieved by introducing a few extra null check.

Perf-traces for a hello world:

Also, perf improvements for the InvariantGlobalization mode: EgorBot/runtime-utils#521

@Copilot Copilot AI review requested due to automatic review settings October 14, 2025 01:11
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR optimizes string case conversion to avoid unnecessary ICU loading during simple operations like string.ToUpperInvariant() and string.ToLowerInvariant(). The change addresses performance concerns where even basic console output could trigger ICU initialization unnecessarily.

Key changes:

  • Introduced new static methods ToUpperInvariant and ToLowerInvariant in TextInfo that bypass TextInfo.Invariant property access
  • Modified the ChangeCaseCommon methods to accept nullable TextInfo instances, using null to represent invariant culture
  • Updated all call sites to use the new pattern where null TextInfo indicates invariant culture processing

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
src/libraries/System.Private.CoreLib/src/System/String.Manipulation.cs Updated ToUpperInvariant and ToLowerInvariant to call new static TextInfo methods
src/libraries/System.Private.CoreLib/src/System/Globalization/TextInfo.cs Added static ToUpperInvariant/ToLowerInvariant methods and refactored ChangeCaseCommon to accept nullable TextInfo

@github-actions github-actions bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Oct 14, 2025
@EgorBo EgorBo marked this pull request as draft October 14, 2025 02:09
@jkotas jkotas added area-System.Globalization tenet-performance Performance related issue and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Oct 14, 2025
@jkotas
Copy link
Member

jkotas commented Oct 14, 2025

cc @tarekgh

@EgorBo
Copy link
Member Author

EgorBo commented Oct 14, 2025

Ah, it's actually a bit more complicated because bool GlobalizationMode.InvariantNoLoad despite the name actually triggers the ICU (on non BROWSER targets).

It returns Settings.Invariant where Settings has an explicit static cctor explicitly loading the ICU. it probably can be fixed if it won't break the IL linker behavior.

@jkotas
Copy link
Member

jkotas commented Oct 14, 2025

I think you would have to frontload the speculative ascii-only path before any GlobalizationMode.Invariant checks.

@EgorBo EgorBo force-pushed the avoid-icu-string-toupper branch from c6898c0 to 32d8761 Compare October 14, 2025 04:39
@dotnet-policy-service dotnet-policy-service bot added the linkable-framework Issues associated with delivering a linker friendly framework label Oct 14, 2025
@jkotas jkotas self-requested a review October 14, 2025 04:43
@EgorBo EgorBo force-pushed the avoid-icu-string-toupper branch from 9887f4d to d2a96e8 Compare October 14, 2025 09:37
@EgorBo
Copy link
Member Author

EgorBo commented Oct 14, 2025

@EgorBot -amd -arm --envvars DOTNET_SYSTEM_GLOBALIZATION_INVARIANT:1

using BenchmarkDotNet.Attributes;

public class Benchmarks
{
    [Benchmark]
    [Arguments(".NET Runtime uses third-party libraries or other " +
               "resources that may be distributed under licenses " +
               "different than the .NET Runtime software.")]

    [Arguments("runtime uses third-party libraries or other " +
               "resources that may be distributed under licenses")]

    [Arguments("runtime")]
    public string ToLower(string data) => data.ToLowerInvariant();
}

@EgorBo
Copy link
Member Author

EgorBo commented Oct 14, 2025

I think you would have to frontload the speculative ascii-only path before any GlobalizationMode.Invariant checks.

Right, I thought I could just avoid loading ICU at all triggered by if (GlobalizationMode.Invariant) but looks a lot of places rely on this condition to actually load it and crash badly by invoking ICU APIs on a non-loaded ICU.

I've slightly changed the code so the ASCII path is always executed first (for Invariant mode or when IsAsciiCasingSameAsInvariant is true) and if the input is full ASCII ICU won't be loaded for these paths.

Previously we were unconditionally using the special path for invariant mode and that path was quite slow for full ASCII as it was a scalar impl. So presumably my PR improves ASCII inputs for the Invariant mode but regresses non-ASCII ones. I assume the inputs are more likely to be ASCII in the Invariant mode so it sounds like a reasonable trade-off.

@EgorBo
Copy link
Member Author

EgorBo commented Oct 14, 2025

Flamegraphs for before-after for a hello world:

Perf improvements for ASCII only inputs in the InvariantGlobalization mode: EgorBot/runtime-utils#521

@tarekgh
Copy link
Member

tarekgh commented Oct 14, 2025

        public string ToLower(string str)
        {
            ArgumentNullException.ThrowIfNull(str);

            if (GlobalizationMode.Invariant)
            {

Is this check needed anymore?


Refers to: src/libraries/System.Private.CoreLib/src/System/Globalization/TextInfo.cs:176 in 5fa74a0. [](commit_id = 5fa74a0, deletion_comment = False)

@tarekgh
Copy link
Member

tarekgh commented Oct 14, 2025

Would this affect how trimming works? I recall that trimming used to depend on GlobalizationMode.Invariant, but that’s no longer as straightforward as before. It would be good to double-check this with the trimming team.

Copy link
Member

@tarekgh tarekgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@tarekgh tarekgh added this to the 11.0.0 milestone Oct 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-System.Globalization linkable-framework Issues associated with delivering a linker friendly framework tenet-performance Performance related issue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants