Conversation
|
Holding this here for a short while. I need to add additional tests in other parsers/funcs |
| } | ||
|
|
||
| i, err := strconv.ParseInt(stringValue, 10, 64) | ||
| i, err := strconv.ParseInt(stringValue, 0, 64) |
There was a problem hiding this comment.
I have been there. I would discourage doing this.
YAML spec is not linked to what Go supports with base 0
Add tests for 0x42 0X42 0o42 0b11 0B11
I feel like at least 0B11 0X42 are invalid in YAML, while valid in Go
Also 1_000 is valid for YAML spec 1.2, but not 1.1
0775 is the decimal number 775 is YAML 1.2, and the octal number 775 in YAML spec 1.1
Go parse 0775 as the octal number 775
1e3 is valid in YAML, but invalid for strconv.ParseInt but valid for strconv.ParseFloat
Please read strconv.ParseInt documentation and check what YAML spec would expect
There was a problem hiding this comment.
func_to_int is fine to use go conventions, but all very good points for the yaml parser.
I'll revisit the implementation.
Thank you for the in-depth details! While I'm familiar with all these markup languages, I definitely haven't gotten into the finer details like octal parsing for example
There was a problem hiding this comment.
It's been a while, but I eventually got back to this. I've added some special hanging which I think covers some edge cases.
Do you have any thoughts or test cases you think I've missed? I'm no YAML expert
09642a1 to
01434f5
Compare
| `, | ||
| }.run) | ||
|
|
||
| t.Run("base numbers", func(t *testing.T) { |
There was a problem hiding this comment.
Here are some edge cases you could consider
-
Explicit plus sign (+) with integers / prefixed bases
The implementation strips a leading sign before prefix detection, but no tests cover the+form. -
+42->42 -
+0x10->16 -
+0o10->8 -
+0b10->2 -
-42is tested, but not negative hex/octal/binary. -
-0x10->-16 -
-0o10->-8 -
-0b10->-2 -
Numeric separators
_ -
1_000->1000 -
0xFF_FF->65535 -
0b1010_1010->170
Go’s strconv.ParseInt / ParseFloat won’t accept _ directly, so this needs either normalization or a documented rejection + tests.
| return "", model.NewFloatValue(f), nil | ||
| case unstable.Integer: | ||
| i64, err := strconv.ParseInt(string(n.Data), 10, 64) | ||
| i64, err := strconv.ParseInt(string(n.Data), 0, 64) |
There was a problem hiding this comment.
I'm not an expert in TOML, but I took a quick look at the spec
- Please add tests for these:
Valid
42, -42, 0x12, 0o7, 0b111, 12_000, -12_000
Valid but as string: "042"
Invalid
-0x12, -0o7, -0b111, 042
All these values are valid with base 0 with ParseInt but not in TOML
Here are also feedbacks provided by an LLM
# ==========================================================
# HEXADECIMAL, OCTAL, AND BINARY BASES
# ==========================================================
# --- Valid (Case Insensitivity for Prefixes and Values) ---
hex_lowercase = 0xff12 # Standard hex
hex_uppercase = 0XFF12 # Uppercase prefix and values are VALID
bin_lowercase = 0b1101
bin_uppercase = 0B1101 # Uppercase 'B' is VALID
oct_lowercase = 0o755
oct_uppercase = 0O755 # Uppercase 'O' is VALID
# --- Invalid ---
# Negative signs are NOT supported for non-decimal bases
invalid_hex = -0xFF12
invalid_oct = -0o755
invalid_bin = -0b1101
# ==========================================================
# SCIENTIFIC NOTATION (FLOATS)
# ==========================================================
# --- Valid ---
standard_sci = 1e6 # 1,000,000
positive_exp = 5e+10 # 50,000,000,000
negative_exp = -1.5E-4 # -0.00015 (Uppercase 'E' is VALID)
large_float = 6_022.140e23 # Avogadro's number
# --- Invalid ---
# Decimal points must always have digits on both sides
invalid_point_1 = .1e2
invalid_point_2 = 1.e2
# Underscores are NOT allowed in the exponent part
invalid_underscore = 1e1_0
Fixes #525