Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

msgpack: optimize #95

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 11 additions & 5 deletions msgpack/field.go
Original file line number Diff line number Diff line change
Expand Up @@ -74,6 +74,8 @@ func collectFields(fields []*field, t reflect.Type, visited map[reflect.Type]boo
}

if len(index) == d {
_ = fields[len(fields)-1] // bounds check hint to compiler

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on that? I've seen bounds checks (namely the std lib) to guard a specific length of a slice, but not ever like this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit is very old so I don't remember that, but IIRC scope is for i := 0; i < len(fields); i++.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, but when do you expect len(fields) to go out of bounds? Usually I've seen bound checks like these guard a slice of unknown length when we're explicitly accessing certain indexes like this:

func someFunc(s []string) {
  _ = s[1]
  s[0] = "hey"
  s[1] = "test!"
}

While what you've done here is you're not guarding an explicit length like: _ = fields[3] but guarding the size which is restricted by the language itself, that is: the last index has to be len(slice)-1, there's no other way. It seems completely redundant to me and confusing.

Am I missing something? :)

Copy link
Member Author

@zchee zchee Sep 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll re-detect where is BCE optimized using -gcflags='-d=ssa/check_bce/debug=3'.

But in context of language specifications, the fields value is not array, actual type is slice. It means, the gc compiler (still can't optimization? I don't know) might be imagine will growing slice. And actually we also can grow fields slice after the BCE'ed place.

e.g., encoding/binary package is return byte slice immediately after the BCE place, but this msgpack packages logic is used for loop (and etc) after the BCE place, therefore works BCE optimization. (above is my current imagine. I'll verify again that the BCE is working properly when I have time anyway.)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting!

Still, fields is not local to the collect function and might be manipulated after you do the bounds check so you won't really eliminate the BCE done on fields[j] = fields[i], I've verified that with this whole file:

  • before manual bounds check:
$ go run -gcflags="-d=ssa/check_bce/debug=1" field.go
# command-line-arguments
./field.go:83:12: Found IsInBounds
./field.go:87:19: Found IsSliceInBounds
./field.go:100:10: Found IsInBounds
  • after:
> go run -gcflags="-d=ssa/check_bce/debug=1" field.go
# command-line-arguments
./field.go:78:17: Found IsInBounds
./field.go:84:12: Found IsInBounds
./field.go:88:19: Found IsSliceInBounds
./field.go:101:10: Found IsInBounds

You can see you've just introduced an extra check there and haven't got rid of the ones performed in the loop.

Correct me If I'm wrong and thanks a lot for answering, this is quiet interesting! :)


// There is another field with same name and same depth
// Remove that field and skip this field
j := 0
Expand Down Expand Up @@ -101,12 +103,16 @@ func collectFields(fields []*field, t reflect.Type, visited map[reflect.Type]boo
// Parse empty field tag
if e := sf.Tag.Get("empty"); e != "" {
switch sf.Type.Kind() {
case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
bits := 0
if sf.Type.Kind() != reflect.Int {
bits = sf.Type.Bits()
case reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
v, err := strconv.ParseInt(e, 10, sf.Type.Bits())
if err != nil {
panic(fmt.Errorf("msgpack: error parsing field empty field %s.%s: %w", t.Name(), sf.Name, err))
}
v, err := strconv.ParseInt(e, 10, bits)
f.empty = reflect.New(sf.Type).Elem()
f.empty.SetInt(v)

case reflect.Int:
v, err := strconv.ParseInt(e, 10, 0)
if err != nil {
panic(fmt.Errorf("msgpack: error parsing field empty field %s.%s: %w", t.Name(), sf.Name, err))
}
Expand Down
222 changes: 111 additions & 111 deletions msgpack/unpack.go
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,13 @@ func (t Type) String() string {
// ErrDataSizeTooLarge is the data size too large error.
var ErrDataSizeTooLarge = errors.New("msgpack: data size too large")

// ExtensionMap specifies functions for converting MessagePack extensions to Go
// values.
//
// The key is the MessagePack extension type.
// The value is a function that converts the extension data to a Go value.
type ExtensionMap map[int]func([]byte) (interface{}, error)

// Decoder reads MessagePack objects from an io.Reader.
type Decoder struct {
extensions ExtensionMap
Expand All @@ -77,13 +84,6 @@ func NewDecoder(r io.Reader) *Decoder {
}
}

// ExtensionMap specifies functions for converting MessagePack extensions to Go
// values.
//
// The key is the MessagePack extension type.
// The value is a function that converts the extension data to a Go value.
type ExtensionMap map[int]func([]byte) (interface{}, error)

// SetExtensions specifies functions for converting MessagePack extensions to Go
// values.
func (d *Decoder) SetExtensions(extensions ExtensionMap) {
Expand Down Expand Up @@ -172,7 +172,7 @@ func (d *Decoder) Unpack() error {
f := formats[code]
d.t = f.t

d.n, err = f.n(d, code)
d.n, err = f.fn(d, code)
if err != nil {
return d.fatal(err)
}
Expand Down Expand Up @@ -243,185 +243,185 @@ func (d *Decoder) skipCount() int {

var formats = [256]*struct {
t Type
n func(d *Decoder, code byte) (uint64, error)
fn func(d *Decoder, code byte) (uint64, error)
more bool
}{
fixIntCodeMin: {
t: Int,
n: func(d *Decoder, code byte) (uint64, error) { return uint64(code), nil },
t: Int,
fn: func(d *Decoder, code byte) (uint64, error) { return uint64(code), nil },
},
fixMapCodeMin: {
t: MapLen,
n: func(d *Decoder, code byte) (uint64, error) { return uint64(code) - uint64(fixMapCodeMin), nil },
t: MapLen,
fn: func(d *Decoder, code byte) (uint64, error) { return uint64(code) - uint64(fixMapCodeMin), nil },
},
fixArrayCodeMin: {
t: ArrayLen,
n: func(d *Decoder, code byte) (uint64, error) { return uint64(code) - uint64(fixArrayCodeMin), nil },
t: ArrayLen,
fn: func(d *Decoder, code byte) (uint64, error) { return uint64(code) - uint64(fixArrayCodeMin), nil },
},
fixStringCodeMin: {
t: String,
n: func(d *Decoder, code byte) (uint64, error) { return uint64(code) - uint64(fixStringCodeMin), nil },
fn: func(d *Decoder, code byte) (uint64, error) { return uint64(code) - uint64(fixStringCodeMin), nil },
more: true,
},
negFixIntCodeMin: {
t: Int,
n: func(d *Decoder, code byte) (uint64, error) { return uint64(int64(int8(code))), nil },
},
nilCode: {
t: Nil,
n: func(d *Decoder, code byte) (uint64, error) { return 0, nil },
t: Nil,
fn: func(d *Decoder, code byte) (uint64, error) { return 0, nil },
},
unusedCode: {
t: Invalid,
fn: func(d *Decoder, code byte) (uint64, error) {
return 0, fmt.Errorf("msgpack: unknown format code %x", code)
},
},
falseCode: {
t: Bool,
n: func(d *Decoder, code byte) (uint64, error) { return 0, nil },
t: Bool,
fn: func(d *Decoder, code byte) (uint64, error) { return 0, nil },
},
trueCode: {
t: Bool,
n: func(d *Decoder, code byte) (uint64, error) { return 1, nil },
t: Bool,
fn: func(d *Decoder, code byte) (uint64, error) { return 1, nil },
},
binary8Code: {
t: Binary,
fn: (*Decoder).read1,
more: true,
},
binary16Code: {
t: Binary,
fn: (*Decoder).read2,
more: true,
},
binary32Code: {
t: Binary,
fn: (*Decoder).read4,
more: true,
},
ext8Code: {
t: Extension,
fn: (*Decoder).read1,
more: true,
},
ext16Code: {
t: Extension,
fn: (*Decoder).read2,
more: true,
},
ext32Code: {
t: Extension,
fn: (*Decoder).read4,
more: true,
},
float32Code: {
t: Float,
n: func(d *Decoder, code byte) (uint64, error) {
fn: func(d *Decoder, code byte) (uint64, error) {
n, err := d.read4(code)
return math.Float64bits(float64(math.Float32frombits(uint32(n)))), err
},
},
float64Code: {
t: Float,
n: (*Decoder).read8,
t: Float,
fn: (*Decoder).read8,
},
uint8Code: {
t: Uint,
n: (*Decoder).read1,
t: Uint,
fn: (*Decoder).read1,
},
uint16Code: {
t: Uint,
n: (*Decoder).read2,
t: Uint,
fn: (*Decoder).read2,
},
uint32Code: {
t: Uint,
n: (*Decoder).read4,
t: Uint,
fn: (*Decoder).read4,
},
uint64Code: {
t: Uint,
n: (*Decoder).read8,
t: Uint,
fn: (*Decoder).read8,
},
int8Code: {
t: Int,
n: func(d *Decoder, code byte) (uint64, error) {
fn: func(d *Decoder, code byte) (uint64, error) {
n, err := d.read1(code)
return uint64(int64(int8(n))), err
},
},
int16Code: {
t: Int,
n: func(d *Decoder, code byte) (uint64, error) {
fn: func(d *Decoder, code byte) (uint64, error) {
n, err := d.read2(code)
return uint64(int64(int16(n))), err
},
},
int32Code: {
t: Int,
n: func(d *Decoder, code byte) (uint64, error) {
fn: func(d *Decoder, code byte) (uint64, error) {
n, err := d.read4(code)
return uint64(int64(int32(n))), err
},
},
int64Code: {
t: Int,
n: (*Decoder).read8,
},
string8Code: {
t: String,
n: (*Decoder).read1,
more: true,
},
string16Code: {
t: String,
n: (*Decoder).read2,
more: true,
},
string32Code: {
t: String,
n: (*Decoder).read4,
more: true,
},
binary8Code: {
t: Binary,
n: (*Decoder).read1,
more: true,
},
binary16Code: {
t: Binary,
n: (*Decoder).read2,
more: true,
},
binary32Code: {
t: Binary,
n: (*Decoder).read4,
more: true,
},
array16Code: {
t: ArrayLen,
n: (*Decoder).read2,
},
array32Code: {
t: ArrayLen,
n: (*Decoder).read4,
},
map16Code: {
t: MapLen,
n: (*Decoder).read2,
},
map32Code: {
t: MapLen,
n: (*Decoder).read4,
t: Int,
fn: (*Decoder).read8,
},
fixExt1Code: {
t: Extension,
n: func(d *Decoder, code byte) (uint64, error) { return 1, nil },
fn: func(d *Decoder, code byte) (uint64, error) { return 1, nil },
more: true,
},
fixExt2Code: {
t: Extension,
n: func(d *Decoder, code byte) (uint64, error) { return 2, nil },
fn: func(d *Decoder, code byte) (uint64, error) { return 2, nil },
more: true,
},
fixExt4Code: {
t: Extension,
n: func(d *Decoder, code byte) (uint64, error) { return 4, nil },
fn: func(d *Decoder, code byte) (uint64, error) { return 4, nil },
more: true,
},
fixExt8Code: {
t: Extension,
n: func(d *Decoder, code byte) (uint64, error) { return 8, nil },
fn: func(d *Decoder, code byte) (uint64, error) { return 8, nil },
more: true,
},
fixExt16Code: {
t: Extension,
n: func(d *Decoder, code byte) (uint64, error) { return 16, nil },
fn: func(d *Decoder, code byte) (uint64, error) { return 16, nil },
more: true,
},
ext8Code: {
t: Extension,
n: (*Decoder).read1,
string8Code: {
t: String,
fn: (*Decoder).read1,
more: true,
},
ext16Code: {
t: Extension,
n: (*Decoder).read2,
string16Code: {
t: String,
fn: (*Decoder).read2,
more: true,
},
ext32Code: {
t: Extension,
n: (*Decoder).read4,
string32Code: {
t: String,
fn: (*Decoder).read4,
more: true,
},
unusedCode: {
t: Invalid,
n: func(d *Decoder, code byte) (uint64, error) {
return 0, fmt.Errorf("msgpack: unknown format code %x", code)
},
array16Code: {
t: ArrayLen,
fn: (*Decoder).read2,
},
array32Code: {
t: ArrayLen,
fn: (*Decoder).read4,
},
map16Code: {
t: MapLen,
fn: (*Decoder).read2,
},
map32Code: {
t: MapLen,
fn: (*Decoder).read4,
},
negFixIntCodeMin: {
t: Int,
fn: func(d *Decoder, code byte) (uint64, error) { return uint64(int64(int8(code))), nil },
},
}

Expand Down Expand Up @@ -457,13 +457,13 @@ func (d *Decoder) fatal(err error) error {
return err
}

func (d *Decoder) read1(format byte) (uint64, error) {
func (d *Decoder) read1(byte) (uint64, error) {
b, err := d.r.ReadByte()

return uint64(b), err
}

func (d *Decoder) read2(format byte) (uint64, error) {
func (d *Decoder) read2(byte) (uint64, error) {
p, err := d.r.Peek(2)
if err != nil {
return 0, err
Expand All @@ -473,7 +473,7 @@ func (d *Decoder) read2(format byte) (uint64, error) {
return uint64(p[1]) | uint64(p[0])<<8, nil
}

func (d *Decoder) read4(format byte) (uint64, error) {
func (d *Decoder) read4(byte) (uint64, error) {
p, err := d.r.Peek(4)
if err != nil {
return 0, err
Expand All @@ -483,7 +483,7 @@ func (d *Decoder) read4(format byte) (uint64, error) {
return uint64(p[3]) | uint64(p[2])<<8 | uint64(p[1])<<16 | uint64(p[0])<<24, nil
}

func (d *Decoder) read8(format byte) (uint64, error) {
func (d *Decoder) read8(byte) (uint64, error) {
p, err := d.r.Peek(8)
if err != nil {
return 0, err
Expand Down