Here are the differences from PCRE2 that I've run into:
Operators
No support for \K.
No support for conditionals.
Does support bounded quantifiers (such as ? and {2,5}) in lookbehind.
Does not support recursion (?R) (haven't run into this one, but Wikipedia lists it).
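For anyone porting a PCRE2 pattern that relies on \K, here is a minimal sketch of two common workarounds: a capture group, or moving the prefix into a lookbehind. It uses Python's built-in re module (which also lacks \K) as a stand-in, not ICU itself; the sample text and the pattern are made up for illustration. The two rewrites are not always equivalent to \K, since a lookbehind can see text consumed by a previous match.

```python
import re

# Hypothetical sample input, for illustration only.
text = "price: 42 USD, price: 7 EUR"

# PCRE2 original, which relies on \K to drop the prefix from the match:
#   price: \K\d+

# Workaround 1: capture the part that \K would have kept.
with_group = [m.group(1) for m in re.finditer(r"price: (\d+)", text)]

# Workaround 2: move the prefix into a lookbehind so it is not part of the match.
with_lookbehind = re.findall(r"(?<=price: )\d+", text)

print(with_group)
print(with_lookbehind)
```

Both rewrites use only capture groups and lookbehind, which ICU supports, so the same patterns should carry over.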
Flags
These haven't caused issues for me, but they are differences.
Doesn't support the g flag, because there is no non-global mode. The same goes for u: there is no non-Unicode mode.
Doesn't support the U, A, J, or D flags.
Supports w flag:
UREGEX_UWORD Controls the behavior of \b in a pattern. If set, word boundaries are found according to the definitions of word found in Unicode UAX 29, Text Boundaries. By default, word boundaries are identified by means of a simple classification of characters as either “word” or “non-word”, which approximates traditional regular expression behavior. The results obtained with the two options can be quite different in runs of spaces and other non-word characters.
Java allows quantifiers (*, +, etc) on zero length tests. ICU does not. Occurrences of these in patterns are most likely unintended user errors, but it is an incompatibility with Java. https://unicode-org.atlassian.net/browse/ICU-6080
ICU recognizes all Unicode properties known to ICU, which is all of them. Java is restricted to just a few.
ICU case insensitive matching works with all Unicode characters, and, within string literals, does full Unicode matching (where matching strings may be different lengths.) Java does ASCII only by default, with Unicode aware case folding available as an option.
ICU has an extended syntax for set [bracket] expressions, including additional operators. Added for improved compatibility with the original ICU implementation, which was based on ICU UnicodeSet pattern syntax.
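The point above about full Unicode case-insensitive matching, where matching strings may be different lengths, can be illustrated with a small sketch. This uses Python rather than ICU, purely to show the difference between simple (one-to-one) case matching and full case folding. Python's re.IGNORECASE stands in for the simple behaviour and str.casefold() for the full one; the example strings are my own.

```python
import re

a = "straße"   # 6 characters
b = "STRASSE"  # 7 characters

# Simple case matching compares character by character, so "ß" can never
# match "SS" and strings of different lengths cannot match each other.
simple_match = re.fullmatch(a, b, re.IGNORECASE) is not None

# Full case folding maps "ß" to "ss", so the folded strings compare equal
# even though the originals differ in length.
full_match = a.casefold() == b.casefold()

print(simple_match, full_match)
```

An engine that does full matching inside string literals would treat these two strings as equal under case-insensitive matching, which is the behaviour the quoted text attributes to ICU.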
Flavor Request
Please support ICU. This is the format supported natively by Apple devices, and is used in, e.g., Siri Shortcuts.