Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hoist preference enums to shorten imports #5915

Open
sffc opened this issue Dec 18, 2024 · 20 comments
Open

Hoist preference enums to shorten imports #5915

sffc opened this issue Dec 18, 2024 · 20 comments
Assignees
Labels
C-locale Component: Locale identifiers, BCP47 S-small Size: One afternoon (small bug fix or enhancement)

Comments

@sffc
Copy link
Member

sffc commented Dec 18, 2024

Currently if you want to use HourCycle you need to do

use icu_locale_core::preferences::extensions::unicode::keywords::HourCycle;

which is a big much.

I'd be fine with icu::locale::preferences::HourCycle, Or if you want to namespace it with unicode, icu::locale::unicode::HourCycle? I think at least "extensions" and "keywords" are unnecessary components of the module import path for most users. Actually I'd be fine also with icu::locale::enums::HourCycle.

@zbraniecki said:

hmm, I think the taxonomy I had in mind was focused on ability to introduce other preferences reflecting:

  • non extensions (for e.g. TimePattern)
  • other extensions (for e.g. transform)

I can see an argument that those are not going to come in any reasonable future and if they do, they will likely either be more esoteric (so no exposition on top level) or won't collide in names.
As a result, I'm ok with shortening the path to it, I'd suggest keeping the sturucture but exposing all extensions::unicode::* on icu::locale::preferences. How does it sound to you?

I don't like "enums" - they're a Rust type, so it feels like "icu::locale::vectors::*".
I'm ok with icu::locale::preferences::HourCycle

@sffc sffc added C-locale Component: Locale identifiers, BCP47 discuss-priority Discuss at the next ICU4X meeting labels Dec 18, 2024
@sffc sffc added this to the ICU4X 2.0 ⟨P1⟩ milestone Dec 18, 2024
@Manishearth
Copy link
Member

  • @sffc these imports are annoying to users
  • @Manishearth they annoy me too, I know that HourCycle is a unicode extension keyword but it's a pain to remember that I need to put it under preferences/extensions/unicode/keywords
  • @zbraniecki I'm aligned with the simplification. I want to point out 2 types of extensions, an AST extension and the strongly typed extension. Like Value vs HourCycle. My main intention was prevention of collision. Value exists in multiple sub-modules.
  • @sffc My argument only applies to the strongly typed extensions. They are enums. They are defined in their own table in UTS 35 and we can put that table anywhere.
  • @zbraniecki Makes sense.
  • @Manishearth We should be careful about collision between Value types. That might need more scrutiny. So maybe we want a holistic design.
  • @sffc I was coding yesterday in Conformance where serde_json::Value was already imported, so I had to qualify our extensions::unicode::Value. But, I think these are niche types. I don't think we need to decide the answer for both the strongly typed and the AST types at the same time.
  • @zbraniecki Not sure if I'm fully aligned on module qualification being a bad thing.

What should the namespace be called?

  1. icu::locale::preferences::HourCycle
  2. icu::locale::unicode::HourCycle
  3. icu::locale::enums::HourCycle
  4. icu::locale::HourCycle

Discussion:

  • @zbraniecki A bit woried about collisions. But, the use case for these are in options bags. So if there were 2 different emoji or transform, it would collide not only here but also elsewhere. So I hope that across all of the i18n industry, we try to keep the preference unique. But, in ECMA-402 we already have cases where the key of the option bag has a different meaning depending on the class it's used on.
  • @zbraniecki Should we call it prefs instead of preferences?
  • @zbraniecki Or can we put it under ::u? It gives me more confidence that we wouldn't collide.
  • @sffc I don't think it's worth harming the user experience now to protect against future name collisions. I think such collisions would be unlikely, and if they occur, we can just give them a different name.
  • @sffc As far as prefs vs preferences, we already had a bikeshed ballot that chose preferences.
  • @sffc We used the enums naming in icu::datetime::fieldsets and I think that is clear naming. It seems like a good place to go.
  • @Manishearth To go over pros and cons: top-level is fine but cluttered. enums seems fine. I don't like unicode because users shouldn't see that in an identifier; seems inside baseball. Everything we do is "unicode". preferences and enums is also fine.
  • @zbraniecki: naming modules after the incedental structures we use is an antipattern imo, so not enums. Half are not enums, they're structs. Prefer preferences::HourCycle.
  • @sffc I'd prefer enums or types or something but preferences is OK
  • @zbraniecki I think preferences is a better user experience.

Conclusion: icu::locale::preferences::HourCycle

LGTM: @Manishearth @zbraniecki (@sffc)

@Manishearth Manishearth removed the discuss-priority Discuss at the next ICU4X meeting label Dec 19, 2024
@Manishearth Manishearth added the S-small Size: One afternoon (small bug fix or enhancement) label Jan 7, 2025
@sffc
Copy link
Member Author

sffc commented Jan 13, 2025

In #5950, @jedel1043 proposed re-exporting the enums from the crates where they are used.

I'm not necessarily opposed to this, but it does have various other implications, and we should verify as a TC whether we want to do this or not.

@sffc sffc added the discuss-priority Discuss at the next ICU4X meeting label Jan 13, 2025
@jedel1043
Copy link
Contributor

jedel1043 commented Jan 13, 2025

I think it's just more convenient for users to have all structs related to a bag of settings on a well known location. icu_locale_core::preferences::extensions::unicode::keywords::CollationCaseFirst is a mouthful, and I don't expect users to know that the preference option they need is specifically on icu_locale_core.

@zbraniecki
Copy link
Member

I think the idea is that you'll just do icu::locale::preferences::CollactionCaseFirst which is less mouthful.

@jedel1043
Copy link
Contributor

Hmmm, maybe that's enough then. I'm slightly worried that we are forcing users of the standalone icu_collator to have to add icu_locale_core as a dependency just to pull a config struct from it, but I guess they'll probably have that dependency imported already, since it's a core crate.

@sffc
Copy link
Member Author

sffc commented Jan 14, 2025

Various factors that make me slightly in favor of the re-export:

  • With the new preferences structs, clients no longer need to always depend on icu_locale_core like they used to
  • The enums are already protected under icu_locale_core's semantic versioning policy, which is the same as in all of the other component crates, so I don't think there's a semver risk to the re-export?
  • Less risk of version drift: the version of the enum you get from icu_collator is always the right one, even if you have a dependency on a different (incompatible) version of icu_locale_core

Caveats:

  • Raises questions about whether the enums should be renamed (icu_collation::CollationCaseFirst seems a bit repetitive?).

I am still in favor of moving the enums within icu_locale_core regardless of whether we re-export from icu_collator and elsewhere.

@Manishearth Manishearth self-assigned this Jan 14, 2025
@zbraniecki
Copy link
Member

Is there any risk that you'd end up with the same preference from different minor revisions of icu_locale_core when vendored from icu_collator vs icu_datetime for example?

@Manishearth
Copy link
Member

No, it won't be possible to have different minor versions of icu_locale_core in tree.

@zbraniecki
Copy link
Member

ok, then I'm mildly supportive of re-exporting to avoid asking devs to add icu_locale_core to Cargo.toml

@sffc
Copy link
Member Author

sffc commented Jan 14, 2025

Ok, so if we re-export them, do we re-export with the same names as they have in icu_locale_core, or shall we rename them on a case-by-case basis such as CollationCaseFirst becoming icu_collator::CaseFirst as proposed in #5950?

@Manishearth
Copy link
Member

Case by case renamings to remove prefixes seems good. We don't need to e.g. rename HourCycle.

@sffc
Copy link
Member Author

sffc commented Jan 16, 2025

@robertbastian points out that this can eventually clutter the icu_collator namespace and docs page.

Can we export these in a preferences module? That seems clean and transferable.

icu_collator::preferences::CollationCaseFirst

And maybe also move CollatorPreferences into that module?

@Manishearth
Copy link
Member

Yes, that works for me.

@Manishearth
Copy link
Member

In general I have a slight preference for having preferences live at the toplevel where it's not too cluttery. But I'm okay with being consistent too.

@robertbastian
Copy link
Member

I'm happy with

use icu_locale_core::preferences::unicode::HourCycle;

The extensions level can go, because everything in preferences is an extension. The keywords level can also go, because everything in there is a keyword (and the error contains an error for these things, that's fine). I strongly want to keep the unicode level, because this mirrors the extensions::unicode hierarchy, and in the future we might want to introduce strongly typed transform preferences, which would then go in preferences::transform (like we have extensions::transform). I also disagree with the proposal to drop the preferences level, because we do also have the extensions level, and I see preferences as a strongly typed version of that.

@sffc
Copy link
Member Author

sffc commented Jan 21, 2025

I see the enums like HourCycle as being floating concepts in the dictionary with no real owner.

Unicode Locales have a Unicode Extension Keyword -u-hc, a serialized representation of HourCycle. It is a client of the enum HourCycle.

It follows that every crate that uses this enum should export it, ideally under the same name. Since preferences is the name we've given to type-safe locales, it makes sense that it is exported as icu::locale::preferences::HourCycle and also as icu::datetime::preferences::HourCycle.

icu::locale::preferences::unicode is redundant as a namespace because both icu (International Components for Unicode) and locale (Unicode Locale) already indicate that everything in this crate is in the Unicode namespace.

@robertbastian
Copy link
Member

icu::locale::preferences::unicode is redundant as a namespace because both icu (International Components for Unicode) and locale (Unicode Locale) already indicate that everything in this crate is in the Unicode namespace.

Do you also want to dissolve the icu::locale::extensions::unicode module, or do you want to keep it inconsistent?

@sffc
Copy link
Member Author

sffc commented Jan 21, 2025

The unicode in icu::locale::extensions::unicode unambiguously refers to the "Unicode BCP-47 U Extension" https://unicode.org/reports/tr35/tr35.html#u_Extension.

There is no such thing as a "Unicode BCP-47 U Preference".

@sffc
Copy link
Member Author

sffc commented Jan 21, 2025

(note: I may support changing extensions::unicode to extensions::u but that isn't being proposed in this issue)

@Manishearth
Copy link
Member

  • @Manishearth We shouldn't nest more than 4 levels deep for things we expect people to import. I see the appeal of sticking things in a submodule, but I think it is not necessary.
  • @Manishearth In component crates, the preferences module is unlikely to get cluttered; it contains one or two preference structs and one or two enums. I don't think a submodule there is necessary.
  • @Manishearth I think a re-export is nice because some of these are shared between crates. I don't want users to be confused between one HourCycle and another HourCycle. I think they definitely should be shared, but not necessarily re-exported. If there are two different hour cycle types, they should have different names.
  • @sffc I think the three questions where we have disagreement in the WG is:
    1. Whether the "source of truth" for the enums is icu::locale::preferences::HourCycle or icu::locale::preferences::something::else::HourCycle
    2. Whether the enums are re-exported in component crates
    3. Whether the enums are renamed: does CalendarAlgorithm become icu::calendar::preferences::Algorithm? Or does CollatorCaseFirst become icu::collator::preferences::CaseFirst?
  • @Manishearth On (3), I think it's fine to repeat "calendar" and "collator" in the enum names. It is better to make it obvious that the enums are the same type.
  • @sffc On (1) and (2), I prefer to export the enums as icu_cratename::preferences::EnumName where cratename is locale and zero or more component crates.
  • @Manishearth That seems consistent.
  • @hsivonen Curerntly we have CollatorCaseFirst and a CaseFirst, what does this say about the relationship of the two?
  • @sffc it's a doc(inline) reexport. It's renamed and exported. We would change it back to CollatorCaseFirst. Not sure about whether to keep doc(inline).
  • @Manishearth It's confusing for the enums to have doc(inline) and not point to their source.
  • @Manishearth: note it's not doc(inline) , it's a cross-crate reexport so automatically inlined. perhaps we should no_inline, or document it as a reexport. This might be worth fixing in rustdoc.
  • @sffc I think we should try to avoid the doc inlining.
  • @robertbastian I agree
  • @Manishearth: yes, but stretch/2.x issue.
  • @robertbastian See my comment in Hoist preference enums to shorten imports #5915 (comment)
  • @Manishearth I could see the transform enums being under transform, but the unicode ones could still be up one level? Ultimately transform extensions are also defined by Unicode.
  • @robertbastian But this is not how the untyped extensions work.
  • @Manishearth I think the untyped extensions module is for power users, and I don't necesarily agree with the way it is layed out for external ICU4X users
  • @hsivonen How did we end up in a situation where 3 of the collator enums that have a two-letter Unicode extension are from locale and re-exported, but the other ones are still plain enums in icu_collator? How is this left partway?
  • @sffc What is partway?
  • @hsivonen https://unicode.org/reports/tr35/tr35-collation.html#Setting_Options vs. https://github.com/unicode-org/icu4x/blob/main/components/collator/src/lib.rs#L325
  • @sffc Maybe it's because ECMA-402 lists some of these as preferences and others as options. https://tc39.es/ecma402/#sec-intl.collator
  • @Manishearth We should discuss in an issue and maybe fix it in 2.0. If the CollatorOptions becomes empty, that's fine.
  • @Manishearth The extensions::transform vs extensions::unicode exist because of Key and Value types. That doesn't apply to preferences.
  • @sffc If transform wants to add an hour cycle enum, it should be exported as TransformHourCycle. Following this line of reasoning, the Unicode Extension hour cycle enum could be UnicodeHourCycle or UHourCycle. But, I propose making it HourCycle because it is the hour cycle that most people will need.

Proposal for 2.0:

  1. Re-affirm previous conclusion that the enums are exported as icu::locale::preferences::EnumName
  2. Do not re-export anything (roll back the re-export in icu_collator).

Agreed: @Manishearth, @sffc

Disagrees: @robertbastian, who prefers icu::locale::preferences::unicode::EnumName. Manish is fine with this as long as the goals below also happen for 2.0

Proposal for 2.0 Stretch:

  1. Re-export the enums in the component crates in which they are used as icu::component::preferences::EnumName
  2. Ensure that the re-exports point back to their original definition in icu::locale by either using no_inline or by manually adding this information to the doc string.
  3. Do not rename the enums when re-exported.

Agreed: @Manishearth, @sffc, @robertbastian

Side issue: @hsivonen to fix icu_collator's use of options vs preferences.

@Manishearth Manishearth removed the discuss-priority Discuss at the next ICU4X meeting label Jan 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-locale Component: Locale identifiers, BCP47 S-small Size: One afternoon (small bug fix or enhancement)
Projects
None yet
Development

No branches or pull requests

5 participants