How to define "canonical time zone identifier" in the spec? #8

Open
justingrant opened this issue Mar 23, 2023 · 16 comments
Comments

@justingrant
Collaborator

This proposal should define what "canonical time zone identifier" means in ECMAScript. Creating the text for this definition is part of Step 4 as described in the README.

Unfortunately "A canonical identifier is a Zone in TZDB" is too vague because it depends on the TZDB build options. Some build options, like the default MAKEFILE run without options, generate Links (like Atlantic/Reykjavik => Africa/Abidjan) that are not appropriate for ECMAScript use. So a more precise definition is needed that will push implementers to avoid those aggressive Links.

One possible starting point for this precise definition is capturing whatever Firefox is doing now as described in this comment and later comments. Another possible starting point is to align with the output of the global-tz fork of TZDB.

What both of these seem to have in common is using zone.tab to back out the controversial, problematic merges like Atlantic/Reykjavik => Africa/Abidjan, Europe/Stockholm => Europe/Berlin, Europe/Zagreb => Europe/Belgrade. So if an identifier is listed as a Zone in zone.tab, then it should be canonical in ECMAScript.

What I'm unsure about is whether we want or need to use the old data in backzone as well. We'll be looking to @anba (from Firefox), @Yqwed (from Android), and others for advice.

Once we understand what we want "canonical" to mean, the next step will be crafting specific spec text for that definition suitable for a PR into the ECMAScript spec. One possible definition would include specific MAKEFILE options (for building TZDB) in the spec. Another possibility is describing the requirement in words (e.g. "every identifier that's a Zone in zone.tab must be canonical in ECMAScript") and leaving it up to implementers to translate that into build options. The global-tz README could be a starting point for that kind of text.
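
To make the "describe it in words" option more concrete, here is a minimal sketch of deriving a primary-identifier list from zone.tab-formatted text. The sample rows are only an illustrative excerpt in zone.tab's tab-separated layout, and `parseZoneTab` is a hypothetical helper, not proposed spec text or any implementation's actual code.

```javascript
// Sketch only: a few rows in zone.tab's tab-separated layout
// (country code, coordinates, TZ identifier, optional comment).
// The rows are an illustrative excerpt, not the full file.
const sampleZoneTab = [
  '# country-code\tcoordinates\tTZ\tcomments',
  'CI\t+0519-00402\tAfrica/Abidjan',
  'DE\t+5230+01322\tEurope/Berlin\tGermany (most areas)',
  'DE\t+4742+00841\tEurope/Busingen\tBusingen',
  'IS\t+6409-02151\tAtlantic/Reykjavik',
  'NO\t+5955+01045\tEurope/Oslo',
].join('\n');

// Hypothetical helper: map each country code to the identifiers listed for it.
function parseZoneTab(text) {
  const byCountry = new Map();
  for (const line of text.split('\n')) {
    if (line === '' || line.startsWith('#')) continue; // skip blanks and comments
    const [countryCode, , id] = line.split('\t');
    if (!byCountry.has(countryCode)) byCountry.set(countryCode, []);
    byCountry.get(countryCode).push(id);
  }
  return byCountry;
}

const zonesByCountry = parseZoneTab(sampleZoneTab);
// Under an "every zone.tab identifier is primary" rule, the primary list
// is simply every identifier that appears in the file.
const primaryIds = [...zonesByCountry.values()].flat().sort();
console.log(primaryIds);
```

With this rule, Atlantic/Reykjavik and Europe/Oslo stay in the primary list even when the default TZDB build turns them into Links.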

@justingrant
Collaborator Author

justingrant commented Jun 28, 2023

Here's the text we now have in the 402 section of the Temporal proposal. Unless we hear implementer or user concerns here in this issue, we'll assume that this text below is OK.

If CLDR ends up defining a clearer requirement, we can revise the text below in a normative PR to ECMA-402, outside of the scope of this proposal.

    <emu-note>
      <p>
        The IANA Time Zone Database offers build options that affect which time zone identifiers are primary.
        The default build options merge different countries' time zones, for example *"Atlantic/Reykjavik"* being a Link to the Zone *"Africa/Abidjan"*.
        Geographically and politically distinct locations are likely to introduce divergent time zone rules in a future version of the IANA Time Zone Database.
        Therefore, it is recommended that ECMAScript implementations instead use build options such as <code>PACKRATDATA=backzone PACKRATLIST=zone.tab</code> or a similar alternative that ensures at least one primary identifier for each <a href="https://www.iso.org/glossary-for-iso-3166.html">ISO 3166-1 Alpha-2</a> country code.
      </p>
    </emu-note>
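
As a rough illustration of the goal stated in the note, the following sketch checks whether a set of primary identifiers covers every country code in a zone.tab-style table. The function name and both data sets are hypothetical; they are small samples chosen to mirror the examples in this thread.

```javascript
// Hypothetical inputs: the identifiers an implementation treats as primary,
// and a country-code -> identifiers table in the spirit of zone.tab.
function countryCodesWithoutPrimary(zonesByCountry, primaryIds) {
  const primary = new Set(primaryIds);
  const missing = [];
  for (const [countryCode, ids] of Object.entries(zonesByCountry)) {
    if (!ids.some((id) => primary.has(id))) missing.push(countryCode);
  }
  return missing;
}

const zonesByCountry = {
  CI: ['Africa/Abidjan'],
  IS: ['Atlantic/Reykjavik'],
  DE: ['Europe/Berlin', 'Europe/Busingen'],
  NO: ['Europe/Oslo'],
};

// With merged data, Reykjavik and Oslo are Links, so only their targets are primary.
const mergedPrimary = ['Africa/Abidjan', 'Europe/Berlin'];
// With per-country data, every identifier in the table stays primary.
const perCountryPrimary = [
  'Africa/Abidjan',
  'Atlantic/Reykjavik',
  'Europe/Berlin',
  'Europe/Busingen',
  'Europe/Oslo',
];

console.log(countryCodesWithoutPrimary(zonesByCountry, mergedPrimary)); // [ 'IS', 'NO' ]
console.log(countryCodesWithoutPrimary(zonesByCountry, perCountryPrimary)); // []
```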

@anba
Contributor

anba commented Jun 28, 2023

Do we want/need to make a clearer distinction between resolving links to zones and actual time zone transition data? For example ICU doesn't provide any time zone transition data for any time zone rule from backzone:

js> new Temporal.TimeZone("Atlantic/Reykjavik").id
"Atlantic/Reykjavik"
js> new Temporal.TimeZone("Atlantic/Reykjavik").getNextTransition("1800-01-01T00:00:00Z").toString()
"1912-01-01T00:16:08Z"

This is the first transition from Africa/Abidjan, not from Atlantic/Reykjavik.

Kind of related: https://mm.icann.org/pipermail/tz/2023-June/032998.html

@justingrant
Collaborator Author

My weakly-held opinion is that it would be ideal if ICU used backzone data, and I think that it's reasonable to encourage them to do so. But they may have good reasons not to do this. For example, if it would add 10MB to browser downloads, then I'd probably want to live with the current weird pre-1970 data. I'd want to understand more about why they've chosen to not use backzone, especially if it is intentional on their part.

But I'm also not sure that inaccuracy of pre-1970 data is a huge problem. First, it's been this way for quite a while and doesn't seem to be generating as many bugs as the Calcutta/Kyiv issue. I suspect this is because almost all ECMAScript programs deal with post-1970 data only.

And for those that do deal with pre-1970 data, it's likely that all they care about is dates, for which Temporal.PlainDate can be used, which avoids TZDB completely. So I suspect that this problem gets better with Temporal, and I'd probably want to wait until Temporal is widely adopted to see if it really needs to be fixed.

That's why I suspect that the current spec text, which makes a recommendation but does not require backzone, is probably OK for now. The recommendation was written to give implementers (or ICU!) leeway to choose alternate ways to build TZDB as long as there's at least one Zone per country. Do you think this is OK? Or do you think that we should use different spec text?

If at some point later we get ICU and implementers to align on a particular solution, then I think we can (outside of this proposal) open a normative PR against 402 to change that text. For example, once ICU surfaces the IANA canonical ID for all CLDR zones, then I'd support changing that text to refer to ICU instead of IANA build options.

@anba
Contributor

anba commented Jun 28, 2023

The recommendation was written to give implementers (or ICU!) leeway to choose alternate ways to build TZDB as long as there's at least one Zone per country. Do you think this is OK? Or do you think that we should use different spec text?

My concerns were about differences between time zone id canonicalisation and time zone transition rules. ICU basically builds TZDB two times, using different build options (*):

  • The information for canonical time zone identifiers is built using PACKRATDATA=backzone PACKRATLIST=zone.tab.
  • Whereas the time zone transition rules are built using the default TZDB options, i.e. without specifying PACKRATDATA and PACKRATLIST.

So we end up with a Zone per country, but the Zone's transition rules are always from a non-backzone time zone.


(*) This isn't really what happens, the canonical time zone identifiers are actually derived from CLDR (plus ICU specific overrides), but let's ignore that difference for now.
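
The split described above can be pictured with a toy model in which identifier canonicalization and transition rules come from two different tables. Everything here is a hypothetical sketch (the table names and helpers are invented for illustration); only the identifiers and the transition value are taken from the example earlier in this thread.

```javascript
// Toy model: identifier canonicalization and transition rules
// are looked up in two separately built tables.
const canonicalIds = new Set(['Africa/Abidjan', 'Atlantic/Reykjavik']);

// Rules table built with default options: Reykjavik is only a Link here.
const ruleLinks = new Map([['Atlantic/Reykjavik', 'Africa/Abidjan']]);
const firstTransitionByZone = new Map([
  ['Africa/Abidjan', '1912-01-01T00:16:08Z'],
]);

function canonicalize(id) {
  if (!canonicalIds.has(id)) throw new RangeError(`unknown time zone: ${id}`);
  return id; // already canonical in the identifier table
}

function firstTransition(id) {
  const ruleZone = ruleLinks.get(id) ?? id; // follow the Link for rule lookup
  return firstTransitionByZone.get(ruleZone);
}

console.log(canonicalize('Atlantic/Reykjavik'));    // Atlantic/Reykjavik
console.log(firstTransition('Atlantic/Reykjavik')); // 1912-01-01T00:16:08Z
```

The identifier survives as its own entry, but the rule lookup silently returns the Link target's data.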

@Yqwed

Yqwed commented Jun 28, 2023

Therefore, it is recommended that ECMAScript implementations instead use build options such as PACKRATDATA=backzone PACKRATLIST=zone.tab or a similar alternative that ensures at least one primary identifier for each ISO 3166-1 Alpha-2 country code.

Does IANA guarantee that? AFAIU from IANA's perspective (or Paul's) the only thing that backzone does is that it clarifies pre-1970 transitions. Also, I don't think IANA guarantees "at least one time zone per country".

Do you really want to use geopolitically sensitive term as "country" in the spec?

Also

or a similar alternative that ensures at least one primary identifier for each

backzone does not introduce new time zones, I think that's inaccurate.

The IANA Time Zone Database offers build options that affect which time zone identifiers are primary.

I don't think IANA makes such claims. They also don't use terms as primary.

Personally, I find this note confusing.

@justingrant
Collaborator Author

Do you really want to use geopolitically sensitive term as "country" in the spec?

The spec deliberately avoids "country" and instead says "ISO 3166-1 Alpha-2 country code", so I think it's OK because we're delegating the difficult geopolitical decision of "what is a country" to ISO, and simply agreeing to follow their decisions.

Is there different wording that you think would work better?

Does IANA guarantee that?

My understanding is that zone.tab guarantees at least one Zone per ISO 3166-1 Alpha-2 country code, so PACKRATLIST=zone.tab provides the guarantee that ECMAScript is looking for.

My concerns were about differences between time zone id canonicalisation and time zone transition rules. ICU basically builds TZDB two times, using different build options (*):

  • The information for canonical time zone identifiers is built using PACKRATDATA=backzone PACKRATLIST=zone.tab.
  • Whereas the time zone transition rules are built using the default TZDB options, i.e. without specifying PACKRATDATA and PACKRATLIST.

So we end up with a Zone per country, but the Zone's transition rules are always from a non-backzone time zone.

(*) This isn't really what happens, the canonical time zone identifiers are actually derived from CLDR (plus ICU specific overrides), but let's ignore that difference for now.

Yep, this matches my understanding of how ICU works. I agree that this is a problem, and is something that we should engage with ICU to see if they're willing to change it to be more consistent.

I think the first priority is to get CLDR to report the current IANA canonical ID. Once that change is made, then I think we should engage with ICU (perhaps as part of the planning for the API to expose the IANA canonical ID in ICU's APIs?) to figure out how to solve the backzone data issues.

What do you think?

@Yqwed

Yqwed commented Jun 30, 2023

My understanding is that zone.tab guarantees at least one Zone per ISO 3166-1 Alpha-2 country code, so PACKRATLIST=zone.tab provides the guarantee that ECMAScript is looking for.

zone.tab (and zone1970.tab) groups existing time zones. Theory page does not even mention country. Stephen Colebourne suggested adding "at least one time zone per ISO country" guarantee, but AFAIK it is still only a proposal.

current IANA canonical ID

Please use different term. IANA has no canonical IDs and it might mean different thing to different people.

or a similar alternative that ensures at least one primary identifier for each

Build options do not change set of existing / available / acceptable time zones. Also neither this spec defines time zones nor implementers can create their owns ones.

Is there different wording that you think would work better?

I don't understand why spec needs to mention country / country codes in the first place. What is the idea behind proposed note?

@justingrant
Collaborator Author

I don't understand why spec needs to mention country / country codes in the first place. What is the idea behind proposed note?

The goal is to emphasize that TZDB's default build options, which currently merge multiple countries into one Zone, are not recommended for ECMAScript, because it'd mean that time zone changes in one country can affect time zone data in a completely different country. For example, it'd be bad if I show up at the wrong time for my meeting in Reykjavik next year just because Cote d'Ivoire changed its time zone rules.

Instead, we want implementations to follow CLDR's approach: do not deprecate all time zones for a particular country code, even if all those Zones are turned into Links in the IANA TZDB. Also, because most (all?) major ECMAScript implementations rely on CLDR (via ICU), our assumption is that this recommendation won't require implementations to make any changes.

Without mentioning country codes, I'm not sure how we can make the same points above. Do you have other text in mind that could do it better?

My understanding is that zone.tab guarantees at least one Zone per ISO 3166-1 Alpha-2 country code, so PACKRATLIST=zone.tab provides the guarantee that ECMAScript is looking for.

zone.tab (and zone1970.tab) groups existing time zones. Theory page does not even mention country. Stephen Colebourne suggested adding "at least one time zone per ISO country" guarantee, but AFAIK it is still only a proposal.

As I understand it, zone.tab is the data source for programs that allow users to map from an ISO 3166-1 Alpha-2 country code to one or more time zones that are used with that country code. I suspect for backwards compatibility reasons, it doesn't use the same time zone ID (which may be a Zone or a Link) for multiple country codes. Which is good for ECMAScript use.

current IANA canonical ID

Please use different term. IANA has no canonical IDs and it might mean different thing to different people.

Yep, in the ECMAScript spec we've been calling this "primary time zone identifier" and "non-primary time zone identifier". Do you like these terms?

Here's the relevant part of the ECMA-262 current spec that defines these terms: https://tc39.es/ecma262/#sec-time-zone-identifiers. And here's the proposed spec text in the Temporal proposal that describes how the IANA TZDB relates to these terms: https://tc39.es/proposal-temporal/#sec-use-of-iana-time-zone-database.

Would you like to see changes in either of those spec text sections? If yes, what changes do you think are needed?

@Yqwed

Yqwed commented Jul 10, 2023

The goal is to emphasize that TZDB's default build options, which currently merge multiple countries into one Zone, are not recommended for ECMAScript, because it'd mean that time zone changes in one country can affect time zone data in a completely different country. For example, it'd be bad if I show up at the wrong time for my meeting in Reykjavik next year just because Cote d'Ivoire changed its time zone rules.

I am sure you know that it's not how timezones work after the merge :)

Instead, we want implementations to follow CLDR's approach: do not deprecate all time zones for a particular country code, even if all those Zones are turned into Links in the IANA TZDB. Also, because most (all?) major ECMAScript implementations rely on CLDR (via ICU), our assumption is that this recommendation won't require implementations to make any changes.

Link-ed timezones are not really deprecated. Or at least I haven't seen anything like that.

As I understand it, zone.tab is the data source for programs that allow users to map from an ISO 3166-1 Alpha-2 country code to one or more time zones that are used with that country code. I suspect for backwards compatibility reasons, it doesn't use the same time zone ID (which may be a Zone or a Link) for multiple country codes. Which is good for ECMAScript use.

Theory page explicitly says that:

Each main entry in the database represents a timezone for a set of civil-time clocks that have all agreed since 1970

and you've seen outcome of the recent merges. And there is nothing about zone.tab file at all. So saying "zone.tab guarantees" is quite a stretch unless you ask Paul.

Without mentioning country codes, I'm not sure how we can make the same points above. Do you have other text in mind that could do it better?

Is your goal to treat Europe/Berlin and Europe/Oslo as different entities?

Would you like to see changes in either of those spec text sections? If yes, what changes do you think are needed?

Yeah, it sounds good. One can nit-pick whether GMT / GMT+0 / GMT-0 / Etc/GMT / Etc/GMT+0 / Etc/GMT-0 / Etc/UTC are non-primary for UTC (I am referring to is either a primary time zone identifier or a non-primary time zone identifier), but probably no one would really care.

@justingrant
Collaborator Author

Without mentioning country codes, I'm not sure how we can make the same points above. Do you have other text in mind that could do it better?

Is your goal to treat Europe/Berlin and Europe/Oslo as different entities?

Yes. Even if Germany and Norway use the same time zone rules today, they might not do so in the future. So we believe that there's value in allowing end-users and developers to use the most granular time zone identifier that corresponds to a desired location. If that per-country ID is persisted somewhere, then it's immune to time zone rule changes that happen in another country. It's still not a perfect solution, because rules can change in any country, but by retaining per-country-code IDs (and exposing them as primary in ECMAScript APIs like Intl.supportedValuesOf('timeZone')), we at least reduce the scope that can be broken by future time zone rule changes.

For example, it'd be bad if I show up at the wrong time for my meeting in Reykjavik next year just because Cote d'Ivoire changed its time zone rules.

I am sure you know that it's not how timezones work after the merge :)

Yep, per above I'm less concerned about the current state (which after the merge is identical between merged IDs) and more concerned about making persisted IDs tolerate future changes. So "show up at the wrong time for my meeting in Reykjavik next year just because Cote d'Ivoire changed its time zone rules" really is the failure case that we're trying to prevent. For example, imagine a calendar app that serialized a Temporal.ZonedDateTime instance into a database row that represents a conference session in Iceland next year. For such a timestamp, 2024-08-10T10:00+00:00[Atlantic/Reykjavik] or 2024-08-10T10:00+00:00[Africa/Abidjan] might yield the same results with today's rules, but if Cote d'Ivoire decided to change its timezone next year, I'd much rather have the former timestamp string persisted than the latter.
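
The failure case can be sketched with a toy simulation. The offset tables below are hypothetical (the "after change" table is a made-up future rule change, not a prediction), and `timeZoneIdOf` and `stillValid` are illustrative helpers rather than Temporal APIs.

```javascript
// Extract the bracketed time zone annotation from a persisted string.
function timeZoneIdOf(persisted) {
  const match = /\[([^\]=]+)\]/.exec(persisted);
  if (!match) throw new RangeError(`no time zone annotation in ${persisted}`);
  return match[1];
}

// Hypothetical offsets: today's shared rule, and a made-up future change
// in which only Africa/Abidjan moves to +01:00.
const offsetsToday = { 'Africa/Abidjan': '+00:00', 'Atlantic/Reykjavik': '+00:00' };
const offsetsAfterChange = { 'Africa/Abidjan': '+01:00', 'Atlantic/Reykjavik': '+00:00' };

// A persisted string stays consistent if its offset still matches its zone's rule.
function stillValid(persisted, offsets) {
  const persistedOffset = /([+-]\d{2}:\d{2})\[/.exec(persisted)[1];
  return offsets[timeZoneIdOf(persisted)] === persistedOffset;
}

const perCountry = '2024-08-10T10:00+00:00[Atlantic/Reykjavik]';
const merged = '2024-08-10T10:00+00:00[Africa/Abidjan]';

console.log(stillValid(perCountry, offsetsToday), stillValid(merged, offsetsToday)); // true true
console.log(stillValid(perCountry, offsetsAfterChange), stillValid(merged, offsetsAfterChange)); // true false
```

Only the string persisted with the merged identifier is affected by the hypothetical change in the other country.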

Was that what you meant? Or did I misunderstand what you were trying to say?

Each main entry in the database represents a timezone for a set of civil-time clocks that have all agreed since 1970

and you've seen outcome of the recent merges. And there is nothing about zone.tab file at all. So saying "zone.tab guarantees" is quite a stretch unless you ask Paul.

Yeah, I agree. But backwards compatibility does seem to be really important to Paul, and he's bent over backwards to ensure that there are build options that can replicate the global-tz fork. And even if zone.tab were to be somehow eliminated (which seems unlikely), we're insulated by CLDR from direct impact. So the worst possible outcome if zone.tab goes bad (where "bad" means either it's removed or it stops having one row per country code) would be that we'd need to change the ECMAScript spec to point to CLDR instead of zone.tab.

So I'm not really worried about it for the time being. What do you think?

@Yqwed

Yqwed commented Jul 11, 2023

I have a feeling that these problems stem from treating Link as "use TARGET, LINK-NAME is deprecated from now on". IANA does not enforce that, but I do agree that saying Link Europe/Berlin XXX exists as links for backward compatibility is weird. My point is to be careful with any assumption around IANA other than "set of existing timezones" and "no guarantees for any data before 1970".

For example, imagine a calendar app that serialized a Temporal.ZonedDateTime instance into a database row that represents a conference session in Iceland next year. For such a timestamp, 2024-08-10T10:00+00:00[Atlantic/Reykjavik] or 2024-08-10T10:00+00:00[Africa/Abidjan]

Links issue again. It probably would've worked fine if no extra "canonicalisation" step was taken.

Yeah, I agree. But backwards compatibility does seem to be really important to Paul, and he's bent over backwards to ensure that there are build options that can replicate the global-tz fork. And even if zone.tab were to be somehow eliminated (which seems unlikely), we're insulated by CLDR from direct impact. So the worst possible outcome if zone.tab goes bad (where "bad" means either it's removed or it stops having one row per country code) would be that we'd need to change the ECMAScript spec to point to CLDR instead of zone.tab.

Are you sure that something 1) over which you have no control and 2) which does not promise anything should be used in the spec?

Are country codes needed when there are primary timezone identifiers?

@justingrant
Collaborator Author

Links issue again. It probably would've worked fine if no extra "canonicalisation" step was taken.

Yep, this proposal will reduce the scope of the problem considerably by stopping canonicalization in all cases where IDs are provided by users. But canonicalized cases will still remain where ECMAScript is providing the IDs:

  • The system's current time zone, as output by Temporal.Now.timeZoneId() and Intl.DateTimeFormat.resolvedOptions().timeZone.
  • The list of available time zones output by Intl.supportedValuesOf('timeZone').
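
For reference, these two cases look like this in code. This is a small sketch; the values printed depend entirely on the host's time zone data and settings, so no particular output is assumed.

```javascript
// The two places where an implementation still chooses which identifier to report.
const systemTimeZone = new Intl.DateTimeFormat().resolvedOptions().timeZone;
const availableTimeZones = Intl.supportedValuesOf('timeZone');

console.log(systemTimeZone);
console.log(availableTimeZones.length, availableTimeZones.slice(0, 3));

// Temporal.Now.timeZoneId() reports the system zone where Temporal is available.
if (typeof Temporal !== 'undefined') {
  console.log(Temporal.Now.timeZoneId());
}
```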

Given that canonicalization is still exposed in these two cases, I don't think we should be following IANA's merges because those IDs will be used in ECMAScript programs and sometimes persisted, leading to the "meeting in Reykjavik" problem noted above.

I'm not sure how the "show me this user's current time zone" and "show me a list of time zone IDs I can localize and then use in a GUI time zone chooser" cases can work as expected without ignoring the merges in IANA's default build options. Do you have another solution in mind?

Are country codes needed when there are primary timezone identifiers?

I'm not sure I understand the question. Could you explain a bit more what you mean?

Are you sure that something 1) over which you have no control and 2) which does not promise anything should be used in the spec?

The current spec language (in https://tc39.es/proposal-temporal/#sec-use-of-iana-time-zone-database) offers a lot of wiggle room in case zone.tab breaks backwards compatibility:

it is recommended that ECMAScript implementations instead use build options such as PACKRATDATA=backzone PACKRATLIST=zone.tab or a similar alternative that ensures at least one primary identifier for each ISO 3166-1 Alpha-2 country code.

I think this language is fairly clear that "at least one primary identifier for each ISO 3166-1 Alpha-2 country code" is the goal and there can be many ways to achieve that goal, only one of which is using zone.tab. So even if zone.tab stops satisfying the "at least one primary identifier for each ISO 3166-1 Alpha-2 country code" recommendation, I'm not worried that implementers will get it wrong. Especially given that all major current implementations are going through CLDR and ICU (which have similar output to the zone.tab list and which also won't follow IANA's merges).

What do you think?

@Yqwed

Yqwed commented Jul 12, 2023

Given that canonicalization is still exposed in these two cases, I don't think we should be following IANA's merges because those IDs will be used in ECMAScript programs and sometimes persisted, leading to the "meeting in Reykjavik" problem noted above.

As I understand it, a merge according to IANA means no more than "rules for timezone X are exactly the same as for Y". So when they added Link Europe/Berlin Europe/Oslo it wasn't "from now on use Europe/Berlin, forget about Europe/Oslo". The only meaning is "Europe/Oslo and Europe/Berlin had the same rules after 1970"*. That said, I don't think that "following IANA's merges" is meaningful in any context other than questions like "do we care about pre-1970 offsets?".
As you're altering the spec and suggesting to use IANA, you can clarify these points.

I'm not sure how the "show me this user's current time zone" and "show me a list of time zone IDs I can localize and then use in a GUI time zone chooser" cases can work as expected without ignoring the merges in IANA's default build options. Do you have another solution in mind?

I don't think that timezone merges and timezone picker UI are related. On Android we maintain such file. It is probably somewhat close to proposed primary timezones list grouped by country code (it is not a spec and we don't use country word on UIs).

Are country codes needed when there are primary timezone identifiers?

I'm not sure I understand the question. Could you explain a bit more what you mean?

Oh, you give some explanation on how primary IDs are defined. It will be hard to do it w/o mentioning them then.

What do you think?

Text looks fine. Though

it is recommended that ECMAScript implementations instead use build options such as PACKRATDATA=backzone PACKRATLIST=zone.tab or a similar alternative that ensures at least one primary identifier for each ISO 3166-1 Alpha-2 country code.

is incorrect. You assume that Link is deprecation / soft removal, but it is not. It is very likely that TZif files on your machine (if it is UNIX/MacOS) were built w/o PACKRATDATA=backzone flag, but TZ=Europe/Oslo date will work just fine.
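
The same point can be checked from ECMAScript: an identifier that is a Link in the underlying data is still accepted as input. A minimal sketch, assuming a host with IANA time zone data (`isAcceptedTimeZone` is a hypothetical helper, not an existing API):

```javascript
// Returns true when the implementation accepts `id` as a time zone identifier,
// whether it is stored as a Zone or as a Link in the underlying data.
function isAcceptedTimeZone(id) {
  try {
    new Intl.DateTimeFormat('en', { timeZone: id });
    return true;
  } catch (error) {
    if (error instanceof RangeError) return false; // unknown identifier
    throw error;
  }
}

console.log(isAcceptedTimeZone('Europe/Oslo'));
console.log(isAcceptedTimeZone('Europe/Berlin'));
console.log(isAcceptedTimeZone('Not/A_Zone')); // false
```

Acceptance as input is a separate question from which identifier the implementation reports back as primary.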

Especially given that all major current implementations are going through CLDR and ICU (which have similar output to the zone.tab list and which also won't follow IANA's merges).

"won't follow merges" - is it regarding pre-1970 data?


(*) Of course, it raises questions like "Why Berlin then? Why not vice-versa?". But it is orthogonal to this discussion.

@Yqwed

Yqwed commented Jul 13, 2023

Oh, I forgot to mention that there is no guarantee that a timezone is mapped to one single country. Be careful with assumptions!

@justingrant
Collaborator Author

justingrant commented Jul 14, 2023

I don't think that timezone merges and timezone picker UI are related. On Android we maintain such file. It is probably somewhat close to proposed primary timezones list grouped by country code (it is not a spec and we don't use country word on UIs).

Thanks, this helps me understand your feedback better. I *do* think that timezone merges and timezone picker UI are related, because in order to build a simple timezone picker UI in a web app (for example, a "what time zone do you want to use for your results?" dropdown box in a reporting UI), a developer needs a way to get a list of IDs that are available to be chosen. Those IDs then need to be localized into the user's language and shown in a picker UI, like a dropdown box.

There are a few ways for an ECMAScript developer to get that list of available IDs into their web app:

  1. They can get a list from somewhere and ship the list in their app. This is problematic because the developer must keep the list up to date.
  2. ECMAScript could offer a new API that returns a list of all the Zone and Link names in that system's TZDB. This is problematic because the resulting UI would have multiple entries for the same time zone, for example two entries for Ukraine because both Europe/Kiev and Europe/Kyiv would be in the list of IDs.
  3. The developer could use the existing Intl.supportedValuesOf('timeZone') API, which returns a list of primary time zone identifiers.
  4. A combination of (2) and (3): ECMAScript could add a new API exposing all valid IDs, and then the developer could pick their own way to decide which of those IDs should be used in the UI picker. This seems like the worst of all options, because it's more work for the developer, it's less consistent across different ECMAScript programs, and it might cause outdated IDs like US/Indiana and Europe/Kiev to be persisted.

I assume that (3) is the best solution, because it avoids the problems of 1/2/4, because it's the easiest DX, and because the API already exists. Do you agree? And if not, what alternative do you think would work better?
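
A minimal sketch of option (3), assuming a host that supports `Intl.supportedValuesOf` and the `'longGeneric'` value of `timeZoneName`. The `timeZonePickerOptions` helper and its label format are hypothetical; the labels themselves come from the host's locale data, so no particular output is assumed.

```javascript
// Sketch of option (3): build picker entries from the primary identifiers
// the implementation reports, with a localized display name for each.
function timeZonePickerOptions(locale) {
  const now = new Date();
  return Intl.supportedValuesOf('timeZone').map((id) => {
    const parts = new Intl.DateTimeFormat(locale, {
      timeZone: id,
      timeZoneName: 'longGeneric',
    }).formatToParts(now);
    const name = parts.find((part) => part.type === 'timeZoneName');
    // Fall back to the raw identifier if no localized name is produced.
    return { id, label: name ? `${name.value} (${id})` : id };
  });
}

const options = timeZonePickerOptions('en');
console.log(options.length);
console.log(options.slice(0, 3));
```

The `id` of the chosen entry is what an app would persist, which is why it matters which identifiers the implementation treats as primary.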

A challenge with (3) is that ECMAScript needs to pick which IDs are primary and which are non-primary. In a perfect world, primary IDs could just be the Zone names in a TZDB built with the default build options. But those defaults merge multiple countries together, which is generally not what most implementations prefer. Instead, most implementations seem to be converging on a list of canonical IDs that is identical to (or very close to) the list of IDs in zone.tab.

This is true for CLDR's list, which is identical to zone.tab except for 19 outdated IDs like Asia/Calcutta and Asia/Saigon. We're working with CLDR to extend CLDR data so that ECMAScript could surface the current IDs for those 19. See https://unicode-org.atlassian.net/browse/CLDR-14453 (which will hopefully result in CLDR exposing the current IANA Zone name for those 19) and tc39/ecma402#806 (which will hopefully result in ECMAScript overriding CLDR's outdated IDs until a long-term CLDR solution is available).
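
A comparison like this can be sketched as a simple list diff. Both arrays below are hypothetical excerpts chosen from identifiers mentioned in this thread, not the real contents of zone.tab or CLDR, and `diffIdLists` is an illustrative helper.

```javascript
// Hypothetical excerpts: identifiers from zone.tab vs. identifiers an
// implementation reports as primary (for example via CLDR data).
const zoneTabIds = ['Asia/Ho_Chi_Minh', 'Asia/Kolkata', 'Europe/Kyiv', 'Europe/Oslo'];
const implementationIds = ['Asia/Calcutta', 'Asia/Saigon', 'Europe/Kiev', 'Europe/Oslo'];

function diffIdLists(expected, actual) {
  const expectedSet = new Set(expected);
  const actualSet = new Set(actual);
  return {
    missing: expected.filter((id) => !actualSet.has(id)),    // in zone.tab only
    unexpected: actual.filter((id) => !expectedSet.has(id)), // implementation only
  };
}

console.log(diffIdLists(zoneTabIds, implementationIds));
// In a real check, `actual` could come from Intl.supportedValuesOf('timeZone').
```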

Firefox uses backzone instead of zone.tab, but FF's list is also quite similar to zone.tab's and I'm checking to see if FF can move to use zone.tab instead.

The Android list you sent seems pretty closely aligned to zone.tab too. How is Android's list maintained?

So my current opinion is that every ID in zone.tab should be primary. What do you think?

it is recommended that ECMAScript implementations instead use build options such as PACKRATDATA=backzone PACKRATLIST=zone.tab or a similar alternative that ensures at least one primary identifier for each ISO 3166-1 Alpha-2 country code.

is incorrect. You assume that Link is deprecation / soft removal, but it is not. It is very likely that TZif files on your machine (if it is UNIX/MacOS) were built w/o PACKRATDATA=backzone flag, but TZ=Europe/Oslo date will work just fine.

Yeah, it's an interesting question about whether PACKRATDATA=backzone PACKRATLIST=zone.tab or just PACKRATLIST=zone.tab is best. Honestly I don't have a strong opinion about that. My only fairly strong opinion is that the IDs in zone.tab should be canonical. Whether we bring in older data from backzone is unclear; I can see pros and cons of doing that.

Especially given that all major current implementations are going through CLDR and ICU (which have similar output to the zone.tab list and which also won't follow IANA's merges).

"won't follow merges" - is it regarding pre-1970 data?

By "won't follow merges" I mean ignoring changes like turning Atlantic/Reykjavik and Europe/Oslo into non-primary identifiers, even though the default build options of the current TZDB make these Links.

I don't have a strong opinion about pre-1970 data, because very little software written in ECMAScript deals with time-of-day info before 1970. I care a lot more about minimizing the impact of future changes to TZDB. For example, if Cote D'Ivoire changed its time zone offset but Iceland didn't (or vice versa), then it'd be really helpful if both countries had different primary IDs in ECMAScript.

Also, pre-1970 is already broken in all major ECMAScript implementations, where "broken" means that data from backzone is already ignored for IDs like Europe/Oslo that are merged in TZDB's default build options. For example, time zone data for Norway and Germany are identical in all major browsers. Example:

Temporal.ZonedDateTime.from('1800-01-01[Europe/Berlin]').offset
// => '+00:53:28'
Temporal.ZonedDateTime.from('1800-01-01[Europe/Oslo]').offset
// => '+00:53:28' 
// Europe/Oslo is 0:43:00 in backzone file.
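
The same comparison can be approximated without Temporal. This is a sketch that assumes the host supports `timeZoneName: 'longOffset'`; `offsetAt` is a hypothetical helper, and the exact strings printed depend on the implementation's data.

```javascript
// Report the UTC offset an implementation uses for `id` at a given instant.
function offsetAt(id, date) {
  const parts = new Intl.DateTimeFormat('en', {
    timeZone: id,
    timeZoneName: 'longOffset',
  }).formatToParts(date);
  return parts.find((part) => part.type === 'timeZoneName').value;
}

const instant = new Date('1800-01-01T00:00:00Z');
const berlin = offsetAt('Europe/Berlin', instant);
const oslo = offsetAt('Europe/Oslo', instant);

console.log(berlin, oslo);
// Identical values suggest Oslo's pre-1970 backzone data is not being used.
console.log(berlin === oslo);
```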

This omission of backzone data does occasionally cause some user complaints, but not nearly as many as the complaints that come from Chrome and Safari using outdated IDs like Asia/Calcutta, Europe/Kiev, and Asia/Saigon.

So if the best answer is to omit backzone, I'd be OK with that (and we can update the recommendation in the spec accordingly). What do you think?

no guarantee that a timezone is mapped to one single country.

In the current zone.tab, no IDs listed in that file are used across multiple country codes. This is why I am proposing using zone.tab. I realize that zone1970.tab does use the same ID for multiple country codes, which is why I'm proposing not to use that file to determine ECMAScript's primary IDs. Or were you suggesting something different and I'm not understanding what you mean?

@Yqwed

Yqwed commented Jul 17, 2023

Thanks, this helps me understand your feedback better. I do think that timezone merges and timezone picker UI are related

We probably have different assumptions / guarantees around IANA data and APIs provided to users and their behaviour. Stock Android's timezone picker does not care about merges and we've made no code changes when we followed ICU's decision to accept merges.

I assume that (3) is the best solution, because it avoids the problems of 1/2/4, because it's the easiest DX, and because the API already exists. Do you agree? And if not, what alternative do you think would work better?

Timezone picker is not a trivial problem, and a dropdown list is a simple yet not always elegant solution. If you see it as a UI-centric problem to make the majority of developers happy, then (3) sounds good. Though I have no data on browser update saturation and how bad version skew issues might be.

And I am of course biased - I like Android's current "select region > select timezone" approach more than long dropdown list :)

The Android list you sent seems pretty closely aligned to zone.tab too. How is Android's list maintained?

It is maintained manually. We used to validate it against zone.tab, but since aosp/1909188 it is not.

By "won't follow merges" I mean ignoring changes like turning Atlantic/Reykjavik and Europe/Oslo into non-primary identifiers, even though the default build options of the current TZDB make these Links.

So your understanding of IANA data is different from Paul Eggert's (not blaming!).

In the current zone.tab, no IDs listed in that file are used across multiple country codes. This is why I am proposing using zone.tab. I realize that zone1970.tab does use the same ID for multiple country codes, which is why I'm proposing not to use that file to determine ECMAScript's primary IDs. Or were you suggesting something different and I'm not understanding what you mean?

Nope, I just mentioned it because it came in an internal conversation. I don't think you rely on that, but something to keep in mind.

So if the best answer is to omit backzone, I'd be OK with that (and we can update the recommendation in the spec accordingly). What do you think?

I don't have an opinion on that. What matters is how Link-s are treated, and maybe even that is not important with proposed Intl.supportedValuesOf('timeZone') behaviour?
