1 file changed, +8 -4 lines changed
@@ -2450,10 +2450,14 @@ \section{Name tokenisation codec}
a format within a format, as the multiple byte streams $B_{pos,type}$
are serialised into a single byte stream.
- The serialised data stream starts with two unsigned little endiand 32-bit
- integers holding the total size of uncompressed name buffer and the
- number of read names. This is followed the array elements
- themselves.
+ The serialised data stream starts with two unsigned little endian
+ 32-bit integers holding the total size of the uncompressed name buffer and
+ the number of read names. This is followed by the array elements
+ themselves. Note the uncompressed size is calculated as the sum of
+ all name lengths including a termination byte per name (e.g. the nul
+ char). This is irrespective of whether the implementation produces
+ data in this form or whether it returns separate name and name-length
+ arrays.
Token types, $ttype$ holds one of the token ID values
in the list above, plus special values to indicate certain additional
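
The header layout described in the added lines of this hunk can be sketched as follows. This is a minimal illustration, assuming names are held as byte strings without their terminator; the function names `name_buffer_size`, `pack_header` and `unpack_header` are hypothetical and are not taken from any reference implementation. It covers only the two leading 32-bit integers, not the array elements that follow them.

```python
import struct


def name_buffer_size(names):
    """Uncompressed size: sum of name lengths plus one termination byte each."""
    return sum(len(name) + 1 for name in names)


def pack_header(names):
    """Pack two unsigned little-endian 32-bit integers: size, then name count."""
    return struct.pack("<II", name_buffer_size(names), len(names))


def unpack_header(buf):
    """Read the uncompressed size and name count back from the first 8 bytes."""
    usize, nnames = struct.unpack_from("<II", buf, 0)
    return usize, nnames


# Illustrative names only (lengths 5, 7 and 2, plus one terminator each).
names = [b"read1", b"read2/1", b"r3"]
header = pack_header(names)
usize, nnames = unpack_header(header)
```

The size is computed from the name lengths alone, so the same value results whether the caller stores names as one terminated buffer or as separate name and name-length arrays.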