Commit d95474c

Update to Unicode 17.0 (except emoji)
1 parent 38ce360 commit d95474c

30 files changed: +1767 -713 lines changed

CHANGELOG.md

Lines changed: 8 additions & 0 deletions
@@ -1,5 +1,13 @@
 # Changelog
 
+## Unicode v1.21.0
+
+This is the changelog for Unicode v1.21.0 released on September 10, 2025. For older changelogs please consult the release tag on [GitHub](https://github.com/elixir-unicode/unicode/tags)
+
+### Enhancements
+
+* Updates to [Unicode 17.0](https://unicode.org/versions/Unicode17.0.0/) data.
+
 ## Unicode v1.20.0
 
 This is the changelog for Unicode v1.20.0 released on September 11, 2024. For older changelogs please consult the release tag on [GitHub](https://github.com/elixir-unicode/unicode/tags)

README.md

Lines changed: 2 additions & 2 deletions
@@ -20,7 +20,7 @@ The Elixir standard library does not provide introspection beyond that required
 
 ### Unicode version
 
-As of [unicode version 1.20.0](https://hex.pm/packages/unicode/1.20.0) published on September 11th, 2024, [Unicode 16.0](https://www.unicode.org/versions/Unicode16.0.0/) forms the underlying data.
+As of [unicode version 1.21.0](https://hex.pm/packages/unicode/1.21.0) published on September 10th, 2025, [Unicode 17.0](https://www.unicode.org/versions/Unicode17.0.0/) forms the underlying data.
 
 ## Additional Unicode libraries
 
@@ -187,7 +187,7 @@ The package can be installed by adding `unicode` to your list of dependencies in
 ```elixir
 def deps do
   [
-    {:unicode, "~> 1.20"}
+    {:unicode, "~> 1.21"}
   ]
 end
 ```

data/blocks.txt

Lines changed: 11 additions & 3 deletions
@@ -1,6 +1,6 @@
-# Blocks-16.0.0.txt
-# Date: 2024-02-02
-# © 2024 Unicode®, Inc.
+# Blocks-17.0.0.txt
+# Date: 2025-08-01
+# © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
 #
@@ -228,6 +228,7 @@ FFF0..FFFF; Specials
 108E0..108FF; Hatran
 10900..1091F; Phoenician
 10920..1093F; Lydian
+10940..1095F; Sidetic
 10980..1099F; Meroitic Hieroglyphs
 109A0..109FF; Meroitic Cursive
 10A00..10A5F; Kharoshthi
@@ -279,11 +280,13 @@ FFF0..FFFF; Specials
 11AB0..11ABF; Unified Canadian Aboriginal Syllabics Extended-A
 11AC0..11AFF; Pau Cin Hau
 11B00..11B5F; Devanagari Extended-A
+11B60..11B7F; Sharada Supplement
 11BC0..11BFF; Sunuwar
 11C00..11C6F; Bhaiksuki
 11C70..11CBF; Marchen
 11D00..11D5F; Masaram Gondi
 11D60..11DAF; Gunjala Gondi
+11DB0..11DEF; Tolong Siki
 11EE0..11EFF; Makasar
 11F00..11F5F; Kawi
 11FB0..11FBF; Lisu Supplement
@@ -304,12 +307,14 @@ FFF0..FFFF; Specials
 16B00..16B8F; Pahawh Hmong
 16D40..16D7F; Kirat Rai
 16E40..16E9F; Medefaidrin
+16EA0..16EDF; Beria Erfe
 16F00..16F9F; Miao
 16FE0..16FFF; Ideographic Symbols and Punctuation
 17000..187FF; Tangut
 18800..18AFF; Tangut Components
 18B00..18CFF; Khitan Small Script
 18D00..18D7F; Tangut Supplement
+18D80..18DFF; Tangut Components Supplement
 1AFF0..1AFFF; Kana Extended-B
 1B000..1B0FF; Kana Supplement
 1B100..1B12F; Kana Extended-A
@@ -318,6 +323,7 @@ FFF0..FFFF; Specials
 1BC00..1BC9F; Duployan
 1BCA0..1BCAF; Shorthand Format Controls
 1CC00..1CEBF; Symbols for Legacy Computing Supplement
+1CEC0..1CEFF; Miscellaneous Symbols Supplement
 1CF00..1CFCF; Znamenny Musical Notation
 1D000..1D0FF; Byzantine Musical Symbols
 1D100..1D1FF; Musical Symbols
@@ -336,6 +342,7 @@ FFF0..FFFF; Specials
 1E2C0..1E2FF; Wancho
 1E4D0..1E4FF; Nag Mundari
 1E5D0..1E5FF; Ol Onal
+1E6C0..1E6FF; Tai Yo
 1E7E0..1E7FF; Ethiopic Extended-B
 1E800..1E8DF; Mende Kikakui
 1E900..1E95F; Adlam
@@ -367,6 +374,7 @@ FFF0..FFFF; Specials
 2F800..2FA1F; CJK Compatibility Ideographs Supplement
 30000..3134F; CJK Unified Ideographs Extension G
 31350..323AF; CJK Unified Ideographs Extension H
+323B0..3347F; CJK Unified Ideographs Extension J
 E0000..E007F; Tags
 E0100..E01EF; Variation Selectors Supplement
 F0000..FFFFF; Supplementary Private Use Area-A

data/case_folding.txt

Lines changed: 34 additions & 6 deletions
@@ -1,6 +1,6 @@
-# CaseFolding-16.0.0.txt
-# Date: 2024-04-30, 21:48:11 GMT
-# © 2024 Unicode®, Inc.
+# CaseFolding-17.0.0.txt
+# Date: 2025-07-30, 23:54:36 GMT
+# © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
 #
@@ -18,15 +18,15 @@
 # The data supports both implementations that require simple case foldings
 # (where string lengths don't change), and implementations that allow full case folding
 # (where string lengths may grow). Note that where they can be supported, the
-# full case foldings are superior: for example, they allow "MASSE" and "Maße" to match.
+# full case foldings are superior: for example, they allow "FUSS" and "Fuß" to match.
 #
 # All code points not listed in this file map to themselves.
 #
 # NOTE: case folding does not preserve normalization formats!
 #
 # For information on case folding, including how to have case folding
-# preserve normalization formats, see Section 3.13 Default Case Algorithms in
-# The Unicode Standard.
+# preserve normalization formats, see the
+# "Conformance" / "Default Case Algorithms" section of the core specification.
 #
 # ================================================================================
 # Format
@@ -1243,7 +1243,10 @@ A7C7; C; A7C8; # LATIN CAPITAL LETTER D WITH SHORT STROKE OVERLAY
 A7C9; C; A7CA; # LATIN CAPITAL LETTER S WITH SHORT STROKE OVERLAY
 A7CB; C; 0264; # LATIN CAPITAL LETTER RAMS HORN
 A7CC; C; A7CD; # LATIN CAPITAL LETTER S WITH DIAGONAL STROKE
+A7CE; C; A7CF; # LATIN CAPITAL LETTER PHARYNGEAL VOICED FRICATIVE
 A7D0; C; A7D1; # LATIN CAPITAL LETTER CLOSED INSULAR G
+A7D2; C; A7D3; # LATIN CAPITAL LETTER DOUBLE THORN
+A7D4; C; A7D5; # LATIN CAPITAL LETTER DOUBLE WYNN
 A7D6; C; A7D7; # LATIN CAPITAL LETTER MIDDLE SCOTS S
 A7D8; C; A7D9; # LATIN CAPITAL LETTER SIGMOID S
 A7DA; C; A7DB; # LATIN CAPITAL LETTER LAMBDA
@@ -1616,6 +1619,31 @@ FF3A; C; FF5A; # FULLWIDTH LATIN CAPITAL LETTER Z
 16E5D; C; 16E7D; # MEDEFAIDRIN CAPITAL LETTER O
 16E5E; C; 16E7E; # MEDEFAIDRIN CAPITAL LETTER AI
 16E5F; C; 16E7F; # MEDEFAIDRIN CAPITAL LETTER Y
+16EA0; C; 16EBB; # BERIA ERFE CAPITAL LETTER ARKAB
+16EA1; C; 16EBC; # BERIA ERFE CAPITAL LETTER BASIGNA
+16EA2; C; 16EBD; # BERIA ERFE CAPITAL LETTER DARBAI
+16EA3; C; 16EBE; # BERIA ERFE CAPITAL LETTER EH
+16EA4; C; 16EBF; # BERIA ERFE CAPITAL LETTER FITKO
+16EA5; C; 16EC0; # BERIA ERFE CAPITAL LETTER GOWAY
+16EA6; C; 16EC1; # BERIA ERFE CAPITAL LETTER HIRDEABO
+16EA7; C; 16EC2; # BERIA ERFE CAPITAL LETTER I
+16EA8; C; 16EC3; # BERIA ERFE CAPITAL LETTER DJAI
+16EA9; C; 16EC4; # BERIA ERFE CAPITAL LETTER KOBO
+16EAA; C; 16EC5; # BERIA ERFE CAPITAL LETTER LAKKO
+16EAB; C; 16EC6; # BERIA ERFE CAPITAL LETTER MERI
+16EAC; C; 16EC7; # BERIA ERFE CAPITAL LETTER NINI
+16EAD; C; 16EC8; # BERIA ERFE CAPITAL LETTER GNA
+16EAE; C; 16EC9; # BERIA ERFE CAPITAL LETTER NGAY
+16EAF; C; 16ECA; # BERIA ERFE CAPITAL LETTER OI
+16EB0; C; 16ECB; # BERIA ERFE CAPITAL LETTER PI
+16EB1; C; 16ECC; # BERIA ERFE CAPITAL LETTER ERIGO
+16EB2; C; 16ECD; # BERIA ERFE CAPITAL LETTER ERIGO TAMURA
+16EB3; C; 16ECE; # BERIA ERFE CAPITAL LETTER SERI
+16EB4; C; 16ECF; # BERIA ERFE CAPITAL LETTER SHEP
+16EB5; C; 16ED0; # BERIA ERFE CAPITAL LETTER TATASOUE
+16EB6; C; 16ED1; # BERIA ERFE CAPITAL LETTER UI
+16EB7; C; 16ED2; # BERIA ERFE CAPITAL LETTER WASSE
+16EB8; C; 16ED3; # BERIA ERFE CAPITAL LETTER AY
 1E900; C; 1E922; # ADLAM CAPITAL LETTER ALIF
 1E901; C; 1E923; # ADLAM CAPITAL LETTER DAALI
 1E902; C; 1E924; # ADLAM CAPITAL LETTER LAAM
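
Note on data/case_folding.txt: the header comment changed in this commit distinguishes simple folding from full folding, where string lengths may grow. The following is a minimal sketch of how rows in the `code; status; mapping; # name` layout could drive full folding. It is not the `unicode` package's implementation: `CaseFoldingSketch`, `parse_line/1`, `full_table/1`, and `fold/2` are hypothetical names introduced for illustration. Of the five sample rows, only the `16EA0` row appears in this diff; the other four are taken from the published CaseFolding.txt. The sketch assumes that full folding uses rows with status `C` (common) and `F` (full).

```elixir
defmodule CaseFoldingSketch do
  # Hypothetical illustration module; not part of the unicode package.

  # Sample rows in CaseFolding.txt layout. Only the 16EA0 row is from this diff.
  @sample """
  0046; C; 0066; # LATIN CAPITAL LETTER F
  0053; C; 0073; # LATIN CAPITAL LETTER S
  0055; C; 0075; # LATIN CAPITAL LETTER U
  00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S
  16EA0; C; 16EBB; # BERIA ERFE CAPITAL LETTER ARKAB
  """

  def sample, do: @sample

  # Parse one row into {codepoint, status, [mapped codepoints]}.
  # Comment-only and blank lines yield nil.
  def parse_line(line) do
    [data | _comment] = String.split(line, "#", parts: 2)

    case data |> String.split(";") |> Enum.map(&String.trim/1) do
      [code, status, mapping | _rest] when code != "" ->
        targets =
          mapping
          |> String.split(" ", trim: true)
          |> Enum.map(&String.to_integer(&1, 16))

        {String.to_integer(code, 16), status, targets}

      _other ->
        nil
    end
  end

  # Build a %{codepoint => [codepoints]} table from the C and F rows
  # (assumed here to be the rows that make up full case folding).
  def full_table(text) do
    for line <- String.split(text, "\n"),
        {code, status, mapping} <- [parse_line(line)],
        status in ["C", "F"],
        into: %{} do
      {code, mapping}
    end
  end

  # Fold a string; code points missing from the table map to themselves.
  def fold(string, table) do
    string
    |> String.to_charlist()
    |> Enum.flat_map(fn cp -> Map.get(table, cp, [cp]) end)
    |> List.to_string()
  end
end

table = CaseFoldingSketch.full_table(CaseFoldingSketch.sample())
IO.inspect(CaseFoldingSketch.fold("Fuß", table) == CaseFoldingSketch.fold("FUSS", table))
```

With this sample table, "Fuß" and "FUSS" fold to the same string because the `F` row expands ß to two code points, which is the string-length growth the header comment mentions. A simple-folding table would skip `F` rows and would not match the two.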
