string.drop_start behaves wrong on the JS target #924

@Nicd

Description

import gleam/io
import gleam/string

pub fn main() {
  io.println(string.drop_start("广州abcdefghijklmn", 0))
  io.println(string.drop_start("广州abcdefghijklmn", 1))
  io.println(string.drop_start("广州abcdefghijklmn", 2))
  io.println(string.drop_start("广州abcdefghijklmn", 3))
}

outputs on the JS target:

广州abcdefghijklmn
bcdefghijklmn
efghijklmn
fghijklmn

So each of the first two characters is counted as 3, which is its length in UTF-8 bytes rather than in UTF-16 code units. unsafe_byte_slice is used here with byte offsets:

unsafe_byte_slice(string, prefix_size, byte_size(string) - prefix_size)

It calls string_byte_slice, which, contrary to its name, operates not on bytes but on UTF-16 code units:

export function string_byte_slice(string, index, length) {

Thus UTF-8 byte offsets are applied as UTF-16 code unit offsets, and the wrong part of the string is sliced off.
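
To make the mismatch concrete, here is a minimal standalone sketch in plain JavaScript. It is not the stdlib's actual code: sliceByCodeUnits, utf8ByteSize, dropStartBuggy and dropStartFixed are hypothetical names used for illustration only, and Array.from splits by code point, which only approximates grapheme splitting. It assumes the slice helper behaves like String.prototype.slice, i.e. indexes by UTF-16 code units.

```javascript
// Hypothetical stand-in for the FFI helper: String.prototype.slice
// indexes by UTF-16 code units, not by bytes.
function sliceByCodeUnits(string, index, length) {
  return string.slice(index, index + length);
}

// Length of the string when encoded as UTF-8, in bytes.
function utf8ByteSize(string) {
  return new TextEncoder().encode(string).length;
}

// Reproduces the reported behaviour: the prefix size is measured in
// UTF-8 bytes but then used as a UTF-16 code unit index.
function dropStartBuggy(string, count) {
  // Assumption: code points approximate graphemes for this input.
  const prefix = Array.from(string).slice(0, count).join("");
  const prefixSize = utf8ByteSize(prefix);
  return sliceByCodeUnits(string, prefixSize, utf8ByteSize(string) - prefixSize);
}

// One possible correction: measure the prefix in the same unit that
// slice uses (UTF-16 code units, which is what .length reports).
function dropStartFixed(string, count) {
  const prefix = Array.from(string).slice(0, count).join("");
  return string.slice(prefix.length);
}

const input = "广州abcdefghijklmn";
for (let i = 0; i <= 3; i++) {
  console.log(i, dropStartBuggy(input, i), dropStartFixed(input, i));
}
```

With the input from this report, dropStartBuggy yields the four lines shown above, while dropStartFixed drops exactly 0, 1, 2 and 3 characters. An alternative would be to keep byte offsets and slice an encoded byte array instead, at the cost of encoding and decoding the string.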

Labels: help wanted
Type: Bug