Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cloning functions #115

Open
db7 opened this issue Dec 9, 2019 · 9 comments
Open

cloning functions #115

db7 opened this issue Dec 9, 2019 · 9 comments
Labels
Milestone

Comments

@db7
Copy link
Contributor

db7 commented Dec 9, 2019

Is there a simple way how to clone functions?

I'd like to duplicate a function like this:

func duplicate(m *ir.Module, foo string) {
    for _, f := range m.Funcs {                                             
        if f.GlobalIdent.GlobalName == foo {                      
            DuplicateFunc(m, f, foo + "_v2")
        }                                              
    } 
}

And afterwards I'd do some minor changes in the function code.

@db7 db7 changed the title duplicating functions cloning functions Dec 9, 2019
@mewmew
Copy link
Member

mewmew commented Dec 9, 2019

Hi @db7,

The way to clone functions would be to do a deep copy of the *ir.Func structure. There should be several Go packages which provide deep copy (e.g. mohae/deepcopy), so use the one you prefer.

That's at least in theory. In practice, when I tried this out, there were quite a few issues. I'll post the source code I've been using to test this so you can get more insight into the various approaches tried. In the end, we probably need to do this manually for the ir structures to resolve issues with cyclic references, etc.

Given the LLVM IR assembly file foo.ll:

define i32 @f(i32 %x, i32 %y) {
	%result = add i32 %x, %y
	ret i32 %result
}

The following source code may be used to clone the function @f to a new function @g, and made independent modifications to @g. The code below is a proof of concept to get a feel for what approach may be taken.

// Note, the proper solution would be to do something like
// https://godoc.org/github.com/go-toolsmith/astcopy but for llir/llvm/ir, e.g.
// ircopy.

package main

import (
	//"bytes"
	//"encoding/json" // unmarshal error: `json: cannot unmarshal object into Go struct field FuncType.Sig.RetType of type types.Type`
	"fmt"
	"log"

	"github.com/llir/llvm/asm"
	"github.com/llir/llvm/ir"
	//"github.com/mohae/deepcopy" // this library didn't work as we get an infinite recursion during deepcopy.Copy; `goroutine stack exceeds 1000000000-byte limit`
	//"github.com/ybriffa/deepcopy" // this library didn't work as it cannot handle enums; `value of type uint8 cannot be converted to type ir.FuncAttribute`
	//"github.com/rogpeppe/rog-go/exp/deepcopy" // cannot use this library, build error; deepcopy.go:218:25: undefined: unsafe.Unreflect
	//"github.com/jinzhu/copier" // seems to work (but the copy is shallow). however, not deep copy, so changes to instructions are mirrored to both functions.
	//"github.com/mitchellh/copystructure" // this library didn't work as we get an infinite recursion during copystructure.Copy; `goroutine stack exceeds 1000000000-byte limit`
	//"github.com/getlantern/deepcopy" // unmarshal error: `json: cannot unmarshal object into Go struct field FuncType.Sig.RetType of type types.Type`
)

func main() {
	// parse LLVM IR module.
	m, err := asm.ParseFile("foo.ll")
	if err != nil {
		log.Fatalf("%+v", err)
	}
	// locate function "@f".
	const funcName = "f"
	f, ok := findFunc(m, funcName)
	if !ok {
		log.Fatalf("unable to locate function %q", funcName)
	}
	// create clone of function "@f" with new function name "@g".
	g, err := dupFunc(f, "g")
	if err != nil {
		log.Fatalf("unable to clone function %q; %v", f.Name(), err)
	}
	modifyFunc(g)
	// append new function "@g" to LLVM IR module.
	m.Funcs = append(m.Funcs, g)
	// print LLVM IR module to standard output.
	fmt.Println(m)
}

// modifyFunc modified g by setting the name of each basic block and swapping
// the order of operands to add instructions.
func modifyFunc(g *ir.Func) *ir.Func {
	for i, block := range g.Blocks {
		blockName := fmt.Sprintf("block_%d", i)
		block.SetName(blockName)
		for _, inst := range block.Insts {
			switch inst := inst.(type) {
			case *ir.InstAdd:
				// swap X and Y.
				inst.X, inst.Y = inst.Y, inst.X
			}
		}
	}
	return g
}

// findFunc tries to locate the function with the given name in the LLVM IR
// module. The boolean return value indicates success.
func findFunc(m *ir.Module, funcName string) (*ir.Func, bool) {
	for _, f := range m.Funcs {
		if f.Name() == funcName {
			return f, true
		}
	}
	return nil, false
}

// dupFunc returns a clone of the given function with a new function name.
func dupFunc(origFunc *ir.Func, newFuncName string) (*ir.Func, error) {
	// Attempt using deepcopy.
	//
	//newFunc := deepcopy.Copy(origFunc).(*ir.Func)

	// Attempt using copier.
	//
	//newFunc := new(ir.Func)
	//if err := copier.Copy(newFunc, origFunc); err != nil {
	//	return nil, err
	//}

	// Attempt using copystructure.
	//v, err := copystructure.Copy(origFunc)
	//if err != nil {
	//	return nil, err
	//}
	//newFunc := v.(*ir.Func)

	// Use json marshal/unmarshal as workaround for deep copy.
	// This should work but may be slow.
	//newFunc := new(ir.Func)
	//if err := deepcopyWorkaround(newFunc, origFunc); err != nil {
	//	return nil, err
	//}

	// Attempt using getlantern/deepcopy.
	//newFunc := new(ir.Func)
	//if err := deepcopy.Copy(newFunc, origFunc); err != nil {
	//	return nil, err
	//}

	newFunc := cloneFunc(origFunc)

	newFunc.SetName(newFuncName)
	return newFunc, nil
}

func cloneFunc(origFunc *ir.Func) *ir.Func {
	newFunc := new(ir.Func)
	*newFunc = *origFunc
	newFunc.Blocks = make([]*ir.Block, len(origFunc.Blocks))
	for i, origBlock := range origFunc.Blocks {
		newBlock := cloneBlock(origBlock)
		newFunc.Blocks[i] = newBlock
	}
	return newFunc
}

func cloneBlock(origBlock *ir.Block) *ir.Block {
	newBlock := new(ir.Block)
	*newBlock = *origBlock
	newBlock.Insts = make([]ir.Instruction, len(origBlock.Insts))
	for j, origInst := range origBlock.Insts {
		newInst := cloneInst(origInst)
		newBlock.Insts[j] = newInst
	}
	newTerm := cloneTerm(origBlock.Term)
	newBlock.Term = newTerm
	return newBlock
}

func cloneInst(origInst ir.Instruction) ir.Instruction {
	var newInst ir.Instruction
	switch origInst := origInst.(type) {
	case *ir.InstAdd:
		v := new(ir.InstAdd)
		*v = *origInst
		newInst = v
	default:
		panic(fmt.Errorf("support for instruction %T not yet implemented", origInst))
	}
	return newInst
}

func cloneTerm(origTerm ir.Terminator) ir.Terminator {
	var newTerm ir.Terminator
	switch origTerm := origTerm.(type) {
	case *ir.TermRet:
		v := new(ir.TermRet)
		*v = *origTerm
		newTerm = v
	default:
		panic(fmt.Errorf("support for terminator %T not yet implemented", origTerm))
	}
	return newTerm
}

/*
func deepcopyWorkaround(dst, src interface{}) error {
	buf := &bytes.Buffer{}
	enc := json.NewEncoder(buf)
	if err := enc.Encode(src); err != nil {
		return err
	}
	dec := json.NewDecoder(buf)
	if err := dec.Decode(dst); err != nil {
		return err
	}
	return nil
}
*/

@mewmew mewmew added this to the util milestone Dec 16, 2019
@mewmew mewmew added the util label Dec 16, 2019
@db7
Copy link
Contributor Author

db7 commented Dec 19, 2019

@mewmew, thanks for the detailed reply. Indeed, I got issues with infinite recursions with all approaches you have tried (including json encoding).

I've noticed you marked this issue with "util". Do you intend to implement something on this direction as part of this project?

@mewmew
Copy link
Member

mewmew commented Dec 19, 2019

@mewmew, thanks for the detailed reply. Indeed, I got issues with infinite recursions with all approaches you have tried (including json encoding).

I know, the self-referential values are problematic. We need to do some brain storming and figure out how to implement this in a way that can handle such recursions.

I've noticed you marked this issue with "util". Do you intend to implement something on this direction as part of this project?

Indeed, the intention is to implement it as part of the irutil repository. Having the ability to clone and arbitrarily modify functions is functionality that will be of common use and thus belongs in irutil.

However, I still don't know of a good way to implement this, so will leave this issue open to invite others to contribute to the discussion. Once we've determined how to implement this, I'd be more than happy to put a solid implementation in irutil.

So for now, lets continue to brain storm and try different approaches. Let me know if you find or think of something that could work!

@dannypsnl
Copy link
Member

Looks into the LLVM implementation: https://llvm.org/doxygen/IndirectionUtils_8cpp_source.html#l00286, it basically creates new instances from old values one by one. I also have no idea how to clone automatically correctly.

@mewmew
Copy link
Member

mewmew commented Jan 20, 2020

I haven't tested it yet but the deep-copy project was recently announced, which generates deep copy functions for a given type. With some luck it could suit our needs.

@dannypsnl
Copy link
Member

Do we have any updates here?

@mewmew
Copy link
Member

mewmew commented Jun 13, 2021

Do we have any updates here?

Not yet. Deep copy is difficult to handle correctly. And writing it manually will be difficult to update whenever we update the llir/llvm/ir struct representation. Therefore, we would like to find an automated approach. Either by code generation (after parsing the type definitions of llir/llvm/ir) or by using reflection (e.g. as is done in the deep-copy library).

More experimentation is needed to figure out what approach works best, both for future maintenance, and for other aspects such as preventing infinite cyclic recursion and also in part performance.

@dannypsnl
Copy link
Member

https://github.com/jinzhu/copier have a try?

@mewmew
Copy link
Member

mewmew commented Jul 31, 2021

https://github.com/jinzhu/copier have a try?

@db7, have you tried using the copier package for deep copy? I think it should be possible to adapt the code example outlined in #115 (comment) to verify if this library works or not.

@dannypsnl, if you wish, you can also try to adapt the code example of #115 (comment) to see if the copier library works. Most deep copy libraries we've tried got stuck with infinite recursion problems.

I'm out hiking, and had a stop at a local cafe :)

Wish you both a good summer vacation!

Cheers,
Robin

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants