| 1 | + Author: Richard Carlsson <carlsson.richard(at)gmail(dot)com> |
| 2 | + Status: Draft |
| 3 | + Type: Standards Track |
| 4 | + Created: 7-Jan-2025 |
| 5 | + Erlang-Version: OTP-28.0 |
| 6 | + Post-History: |
| 7 | +**** |
| 8 | +EEP 75: Based Floating Point Literals |
| 9 | +---- |
| 10 | + |
| 11 | +Abstract |
| 12 | +======== |
| 13 | + |
| 14 | +This EEP adds floating point literals using bases other than ten, such
| 15 | +as 16 or 2, for exact text representation, as also found in Ada and |
| 16 | +C99/C++17. |
| 17 | + |
| 18 | +Rationale |
| 19 | +========= |
| 20 | + |
| 21 | +Computers represent floating point numbers in binary, but such numbers
| 22 | +are typically printed using base ten, for example `0.314159265e1`. To
| 23 | +preserve the exact bit-level value when writing a number as text and
| 24 | +later reading it back again, it is better to use a base that matches
| 25 | +the internally used base, such as 16 for a compact but still exact
| 26 | +representation, or 2 for visualizing or writing down the exact
| 27 | +internal format. One particular case where such exact representations
| 28 | +are useful is in code generating tools.
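
As a rough illustration of the round-trip argument above, the following Python sketch compares a short base-ten rendering with a base-16 one. Python is used only because it already ships a hexadecimal float format; that format is the C99-style notation, not the Erlang syntax proposed here, and the variable names are made up for the example.

```python
import math

x = math.pi

# Base 16: float.hex() writes out every bit of the double, using the
# C99-style notation (NOT the Erlang syntax proposed in this EEP).
hex_text = x.hex()
roundtrip_hex = float.fromhex(hex_text)

# Base ten with nine significant digits, as in the example in the text.
dec_text = "0.314159265e1"
roundtrip_dec = float(dec_text)

print(hex_text, "reads back exactly:", roundtrip_hex == x)
print(dec_text, "reads back exactly:", roundtrip_dec == x)
```

The hexadecimal text reads back to the identical value, while the nine-digit decimal text does not. A decimal rendering with enough digits would also round-trip; the base-16 form is simply a compact way to get exactness by construction.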
| 29 | + |
| 30 | +Some other languages have support for floating point literals in other |
| 31 | +bases. Notably, [C99/C++17 floating-point literals][C99] can be |
| 32 | +written in hexadecimal, as e.g. `0xf.ffp8`, where the `p` indicates |
| 33 | +that the exponent is a power of 2, and in the [Ada][Ada] programming |
| 34 | +language, the corresponding syntax is `16#F.FF#E8`. The latter should |
| 35 | +look familiar to Erlang users, because it is from Ada that the |
| 36 | +`<Base>#<Numeral>` syntax was borrowed. (Ada however requires a final |
| 37 | +`#` even for integers, e.g. `16#fffe#`.) Ada also allows base 2 in a |
| 38 | +floating point number, e.g. `2#0.1111_1111#E8`, using underscores as |
| 39 | +separators just like Erlang does. |
| 40 | + |
| 41 | +Where Erlang differs from Ada is that Ada does not allow the base to |
| 42 | +be larger than 16. Hence, in Ada, `2#111#`, `7#10#`, and `16#7#` are |
| 43 | +all the same number, but `17#7#` is illegal, while Erlang allows any |
| 44 | +base up to 36 in its integer literals, e.g. `36#z`. It should also be |
| 45 | +noted that the letter `p`, which marks the exponent in C99 hexadecimal
| 46 | +literals, is a valid digit in bases above 25 in Erlang, whereas C99
| 47 | +only allows digits up to `f` (and could not use `e` for exponents in |
| 48 | +hex). Because the Ada notation requires a `#` character before the |
| 49 | +exponent, it has no ambiguity between digits and exponent indicator. |
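
The digit/exponent ambiguity described above can be checked with a few lines of Python. This is only a sketch: `digit_value` and `is_digit_in_base` are hypothetical helper names, and the digit alphabet (`0-9` followed by `a-z`, case-insensitive) is the one used by Erlang's existing integer literals.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def digit_value(ch):
    """Return the value (0-35) of one digit character, or None."""
    index = DIGITS.find(ch.lower())
    return index if len(ch) == 1 and index >= 0 else None

def is_digit_in_base(ch, base):
    """True if ch is a valid digit in the given base (2..36)."""
    value = digit_value(ch)
    return value is not None and value < base

# Find the lowest base in which each exponent marker is also a digit.
for marker in "ep":
    lowest = next(b for b in range(2, 37) if is_digit_in_base(marker, b))
    print(marker, "is a digit from base", lowest, "upwards")
```

Without a separator such as `#`, a scanner could not tell whether a trailing `e` or `p` continues the digits or starts the exponent in those bases.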
| 50 | + |
| 51 | +Staying with the Ada notation then seems to be the wise choice, both |
| 52 | +for consistency and because it makes it trivial to keep allowing any |
| 53 | +base up to 36 also for floating point literals in Erlang. |
| 54 | + |
| 55 | +Examples: |
| 56 | + |
| 57 | + 2#0.111 |
| 58 | + 2#0.10101#e8 |
| 59 | + 16#ff.ff |
| 60 | + 16#fefe.fefe#e16 |
| 61 | + 32#vrv.vrv#e15 |
| 62 | + |
| 63 | +It should be noted that both the base and the exponent are always |
| 64 | +interpreted in base ten. Only the digits between the two `#` |
| 65 | +characters are interpreted using the given base. Because Erlang uses |
| 66 | +the `#` character at the start of constructs such as maps `#{...}`,
| 67 | +we do not want to allow a final trailing `#` in a number, like Ada |
| 68 | +does. If there is a second `#`, it must be followed by the exponent. |
| 69 | + |
| 70 | +Specification |
| 71 | +=============
| 72 | + |
| 73 | +In addition to the current based notation: |
| 74 | + |
| 75 | + base # based_numeral |
| 76 | + |
| 77 | +(borrowing Ada's terminology), where `base` is a decimal number and |
| 78 | +`based_numeral` is a sequence of digits in `0-9` and `a-z` or `A-Z`, |
| 79 | +optionally separated with `_`, we extend the parser to also allow: |
| 80 | + |
| 81 | + base # based_numeral.based_numeral [ # exponent ] |
| 82 | + |
| 83 | +where `exponent` is the letter `e` or `E` followed by an optionally |
| 84 | +signed decimal number, exactly as in ordinary decimal floating point |
| 85 | +literals. |
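
The grammar above could be read by a scanner along the following lines. This is a minimal Python sketch, not the reference implementation. The text does not spell out what the exponent scales by, so the sketch ASSUMES, following Ada, that the value is multiplied by `base` raised to the exponent. The name `parse_based_float` is hypothetical, and `_` is only handled between digits of the two numerals, not in the base or the exponent.

```python
import re
from fractions import Fraction

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

# based_numeral: digits 0-9, a-z, A-Z, optionally separated by single '_'
_NUM = r"[0-9a-zA-Z]+(?:_[0-9a-zA-Z]+)*"
_LITERAL = re.compile(
    rf"(?P<base>[0-9]+)#(?P<int>{_NUM})\.(?P<frac>{_NUM})"
    rf"(?:#[eE](?P<exp>[+-]?[0-9]+))?"
)

def parse_based_float(text):
    """Parse 'base # numeral.numeral [ # exponent ]' into a float."""
    m = _LITERAL.fullmatch(text)
    if m is None:
        raise ValueError(f"not a based float literal: {text!r}")
    base = int(m["base"])            # the base is always decimal
    if not 2 <= base <= 36:
        raise ValueError(f"base out of range: {base}")
    int_digits = m["int"].replace("_", "").lower()
    frac_digits = m["frac"].replace("_", "").lower()
    mantissa = 0
    for ch in int_digits + frac_digits:
        value = DIGITS.index(ch)
        if value >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        mantissa = mantissa * base + value
    # The exponent is always decimal. ASSUMPTION: it is a power of
    # `base`, as in Ada; it is absent when there is no second '#'.
    exponent = int(m["exp"]) if m["exp"] else 0
    # Exact rational arithmetic, rounded once when converting to float.
    scale = Fraction(base) ** (exponent - len(frac_digits))
    return float(mantissa * scale)

print(parse_based_float("2#0.111"))
print(parse_based_float("16#ff.ff"))
print(parse_based_float("2#0.10101#e8"))
```

Note that a trailing `#` without an exponent, as in `16#ff.ff#`, is rejected, matching the rule that a second `#` must be followed by the exponent.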
| 86 | + |
| 87 | +Reference Implementation |
| 88 | +------------------------ |
| 89 | + |
| 90 | +A [reference implementation][GitHub branch] exists in the |
| 91 | +`hexbinfloat` branch of the author's `otp` repository on GitHub, with a
| 92 | +[GitHub pull request][GitHub PR] to the Erlang/OTP repository. |
| 93 | + |
| 94 | +[C99]: https://en.cppreference.com/w/cpp/language/floating_literal |
| 95 | + "C99 and C++17 Floating-point Literal" |
| 96 | + |
| 97 | +[Ada]: https://ada-lang.io/docs/arm/AA-2/AA-2.4#242--based-literals |
| 98 | + "Ada Based Literals" |
| 99 | + |
| 100 | +[GitHub branch]: https://github.com/richcarl/otp/tree/hexbinfloat |
| 101 | + "Reference implementation branch on GitHub" |
| 102 | + |
| 103 | +[GitHub PR]: https://github.com/erlang/otp/pull/9106 |
| 104 | + "GitHub Pull Request" |
| 105 | + |
| 106 | +Copyright |
| 107 | +========= |
| 108 | + |
| 109 | +This document is placed in the public domain or under the CC0-1.0-Universal |
| 110 | +license, whichever is more permissive. |
| 111 | + |
| 112 | +[EmacsVar]: <> "Local Variables:" |
| 113 | +[EmacsVar]: <> "mode: indented-text" |
| 114 | +[EmacsVar]: <> "indent-tabs-mode: nil" |
| 115 | +[EmacsVar]: <> "sentence-end-double-space: t" |
| 116 | +[EmacsVar]: <> "fill-column: 70" |
| 117 | +[EmacsVar]: <> "coding: utf-8" |
| 118 | +[EmacsVar]: <> "End:" |
| 119 | +[VimVar]: <> " vim: set fileencoding=utf-8 expandtab shiftwidth=4 softtabstop=4: " |