Skip to content

Commit

Permalink
avoid breaking intrinsics when obfuscating names
Browse files Browse the repository at this point in the history
We obfuscate import paths as well as their declared names.
The compiler treats some packages and APIs in special ways,
and the way it detects those is by looking at import paths and names.

In the past, we have avoided obfuscating some names like embed.FS or
reflect.Value.MethodByName for this reason. Otherwise,
go:embed or the linker's deadcode elimination might be broken.

This matching by path and name also happens with compiler intrinsics.
Intrinsics allow the compiler to rewrite some standard library calls
with small and efficient assembly, depending on the target GOARCH.
For example, math/bits.TrailingZeros32 gets replaced with ssa.OpCtz32,
which on amd64 may result in using the TZCNTL instruction.

We never noticed that we were breaking many of these intrinsics.
The intrinsics for funcs declared in the runtime and its dependencies
still worked properly, as we do not obfuscate those packages yet.
However, for other packages like math/bits and sync/atomic,
the intrinsics were being entirely disabled due to obfuscated names.

Skipping intrinsics is particularly bad for performance,
and it also leads to slightly larger binaries:

			 │      old      │                 new                 │
			 │     bin-B     │     bin-B      vs base              │
	Build-16   5.450Mi ± ∞ ¹   5.333Mi ± ∞ ¹  -2.15% (p=0.029 n=4)

Finally, the main reason we noticed that intrinsics were broken
is that apparently GOARCH=mips fails to link without them,
as some symbols end up being not defined at all.
This patch fixes builds for the MIPS family of architectures.

Rather than building and linking all of std for every GOARCH,
test that intrinsics work by asking the compiler to print which
intrinsics are being applied, and checking that math/bits gets them.

This fix is relatively unfortunate, as it means we stop obfuscating
about 120 function names and a handful of package paths.
However, fixing builds and intrinsics is much more important.
We can figure out better ways to deal with intrinsics in the future.

Fixes #646.
  • Loading branch information
mvdan committed Jan 26, 2023
1 parent 598d518 commit 0ec363d
Show file tree
Hide file tree
Showing 5 changed files with 191 additions and 10 deletions.
145 changes: 144 additions & 1 deletion go_std_tables.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

20 changes: 14 additions & 6 deletions main.go
Original file line number Diff line number Diff line change
Expand Up @@ -839,7 +839,7 @@ func replaceAsmNames(buf *bytes.Buffer, remaining []byte) {
name := string(remaining[:nameEnd])
remaining = remaining[nameEnd:]

if lpkg.ToObfuscate {
if lpkg.ToObfuscate && !compilerIntrinsicsFuncs[lpkg.ImportPath+"."+name] {
newName := hashWithPackage(lpkg, name)
if flagDebug { // TODO(mvdan): remove once https://go.dev/issue/53465 if fixed
log.Printf("asm name %q hashed with %x to %q", name, curPkg.GarbleActionID, newName)
Expand Down Expand Up @@ -959,7 +959,11 @@ func transformCompile(args []string) ([]string, error) {
}
tf.handleDirectives(file.Comments)
file = tf.transformGoFile(file)
if newPkgPath != "" {
// newPkgPath might be the original ImportPath in some edge cases like
// compilerIntrinsics; we don't want to use slashes in package names.
// TODO: when we do away with those edge cases, only check the string is
// non-empty.
if newPkgPath != "" && newPkgPath != curPkg.ImportPath {
file.Name.Name = newPkgPath
}

Expand Down Expand Up @@ -1052,7 +1056,7 @@ func (tf *transformer) handleDirectives(comments []*ast.CommentGroup) {

func (tf *transformer) transformLinkname(localName, newName string) (string, string) {
// obfuscate the local name, if the current package is obfuscated
if curPkg.ToObfuscate {
if curPkg.ToObfuscate && !compilerIntrinsicsFuncs[curPkg.ImportPath+"."+localName] {
localName = hashWithPackage(curPkg, localName)
}
if newName == "" {
Expand Down Expand Up @@ -1092,7 +1096,7 @@ func (tf *transformer) transformLinkname(localName, newName string) (string, str
}
panic(err) // shouldn't happen
}
if lpkg.ToObfuscate {
if lpkg.ToObfuscate && !compilerIntrinsicsFuncs[lpkg.ImportPath+"."+foreignName] {
// The name exists and was obfuscated; obfuscate the new name.
newForeignName := hashWithPackage(lpkg, foreignName)
newPkgPath := pkgPath
Expand Down Expand Up @@ -1825,7 +1829,8 @@ func (tf *transformer) transformGoFile(file *ast.File) *ast.File {

// The Go toolchain needs to detect symbols from these packages,
// so we are not obfuscating their package paths or declared names.
switch pkg.Path() {
path := pkg.Path()
switch path {
case "embed":
// FS is detected by the compiler for //go:embed.
return name == "FS"
Expand All @@ -1849,7 +1854,6 @@ func (tf *transformer) transformGoFile(file *ast.File) *ast.File {
return true
}

path := pkg.Path()
lpkg, err := listPackage(path)
if err != nil {
panic(err) // shouldn't happen
Expand Down Expand Up @@ -1895,6 +1899,10 @@ func (tf *transformer) transformGoFile(file *ast.File) *ast.File {
case *types.TypeName:
debugName = "type"
case *types.Func:
if compilerIntrinsicsFuncs[path+"."+name] {
return true
}

sign := obj.Type().(*types.Signature)
if sign.Recv() == nil {
debugName = "func"
Expand Down
20 changes: 19 additions & 1 deletion scripts/gen-go-std-tables.sh
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
# We can rewrite this bash script in Go if a dependency on bash and coreutils
# is a problem during development.

goroot=$(go env GOROOT)
go_version=$(go env GOVERSION) # not "go version", to exclude GOOS/GOARCH

runtime_and_deps=$(go list -deps runtime)
Expand All @@ -11,12 +12,17 @@ runtime_and_deps=$(go list -deps runtime)
# This resulting list is what we need to "go list" when obfuscating the runtime,
# as they are the packages that we may be missing.
runtime_linknamed=$(comm -23 <(
sed -rn 's@//go:linkname .* ([^.]*)\.[^.]*@\1@p' $(go env GOROOT)/src/runtime/*.go | grep -vE '^main|^runtime\.' | sort -u
sed -rn 's@//go:linkname .* ([^.]*)\.[^.]*@\1@p' "${goroot}"/src/runtime/*.go | grep -vE '^main|^runtime\.' | sort -u
) <(
# Note that we assume this is constant across platforms.
go list -deps runtime | sort -u
))

compiler_intrinsics_table="$(sed -rn 's@.*\b(addF|alias)\("([^"]*)", "([^"]*)",.*@\2 \3@p' "${goroot}"/src/cmd/compile/internal/ssagen/ssa.go | sort -u)"
compiler_intrinsics_paths="$(while read path name; do
echo ${path}
done <<<"${compiler_intrinsics_table}" | sort -u)"

gofmt >go_std_tables.go <<EOF
// Code generated by scripts/gen-go-std-tables.sh; DO NOT EDIT.
Expand All @@ -35,4 +41,16 @@ $(for path in ${runtime_linknamed}; do
echo "\"${path}\"",
done)
}
var compilerIntrinsicsPkgs = map[string]bool{
$(for path in ${compiler_intrinsics_paths}; do
echo "\"${path}\": true,"
done)
}
var compilerIntrinsicsFuncs = map[string]bool{
$(while read path name; do
echo "\"${path}.${name}\": true,"
done <<<"${compiler_intrinsics_table}")
}
EOF
3 changes: 3 additions & 0 deletions shared.go
Original file line number Diff line number Diff line change
Expand Up @@ -182,6 +182,9 @@ func (p *listedPackage) obfuscatedImportPath() string {
case "runtime", "reflect", "embed":
return p.ImportPath
}
if compilerIntrinsicsPkgs[p.ImportPath] {
return p.ImportPath
}
if !p.ToObfuscate {
return p.ImportPath
}
Expand Down
13 changes: 11 additions & 2 deletions testdata/script/crossbuild.txtar
Original file line number Diff line number Diff line change
@@ -1,15 +1,24 @@
# A fairly average Go build, importing some std libraries.
# We always build for a foreign GOOS.
# GOOS=windows, unless the host is also windows; then linux.
# GOARCH=arm, unless the host is also arm; then amd64.
# Windows and ARM are both interesting,
# and it helps with coverage as we mainly test on linux/amd64.
#
# We also ensure that intrinsics work as expected.
# The compiler replaces calls to some functions with intrinsics in its ssa stage,
# and it recognizes which functions via the package path and func name.
# If we obfuscate those package paths without adjusting the compiler,
# intrinsics aren't applied, causing performance loss or build errors.
# We use the math/bits package, as its Len64 intrinsic is present in both arm
# and arm64, and it is not part of the runtime nor its dependencies.
[!windows] env GOOS=windows
[windows] env GOOS=linux
[!arm] env GOARCH=arm
[arm] env GOARCH=arm64
garble build -gcflags=math/bits=-d=ssa/intrinsics/debug=1
stderr 'intrinsic substitution for Len64.*BitLen64'

# A fairly average Go build, importing some std libraries.
garble build
-- go.mod --
module test/main

Expand Down

0 comments on commit 0ec363d

Please sign in to comment.