Commit

move scripting to lua
brentp committed Dec 23, 2015
1 parent 89d6ba8 commit 097b0fa
Showing 4 changed files with 96 additions and 122 deletions.
65 changes: 28 additions & 37 deletions README.md
@@ -30,7 +30,7 @@ Usage
After downloading the [binary for your system](https://github.com/brentp/vcfanno/releases/) (see section below) usage looks like:

```Shell
./vcfanno -js example/custom.js example/conf.toml example/query.vcf.gz
./vcfanno -lua example/custom.lua example/conf.toml example/query.vcf.gz
```

Where conf.toml looks like:
@@ -44,9 +44,9 @@ ops=["first", "first", "min"]
[[annotation]]
file="fitcons.bed"
columns = [4, 4]
names=["fitcons_mean", "js_sum"]
# note the 2nd op here is javascript that has access to `vals`
ops=["mean", "js:sum=0;for(i=0;i<vals.length;i++){sum+=vals[i]}; vals"]
names=["fitcons_mean", "lua_sum"]
# note the 2nd op here is lua that has access to `vals`
ops=["mean", "lua:function sum(t) local sum = 0; for i=1,#t do sum = sum + t[i] end return sum / #t end"]
[[annotation]]
file="example/ex.bam"
@@ -77,7 +77,7 @@ from this directory.
Then, you can annotate with:

```Shell
GOMAXPROCS=4 ./vcfanno -js example/custom.js example/conf.toml example/query.vcf.gz > annotated.vcf
GOMAXPROCS=4 ./vcfanno -lua example/custom.lua example/conf.toml example/query.vcf.gz > annotated.vcf
```

An example INFO field row before annotation (pos 98683):
@@ -87,7 +87,7 @@ AB=0.282443;ABP=56.8661;AC=11;AF=0.34375;AN=32;AO=45;CIGAR=1X;TYPE=snp

and after:
```
AB=0.2824;ABP=56.8661;AC=11;AF=0.3438;AN=32;AO=45;CIGAR=1X;TYPE=snp;AC_AFR=0;AC_AMR=0;AC_EAS=0;fitcons_mean=0.061;js_sum=0.061
AB=0.2824;ABP=56.8661;AC=11;AF=0.3438;AN=32;AO=45;CIGAR=1X;TYPE=snp;AC_AFR=0;AC_AMR=0;AC_EAS=0;fitcons_mean=0.061;lua_sum=0.061
```

Operations
@@ -98,7 +98,7 @@ in the query VCF. However, it is possible that there will be multiple annotation
from a single annotation file--in this case, the op determines how the many values
are `reduced`. Valid operations are:

+ js:$javascript // see section below for more details
+ lua:$lua // see section below for more details
+ mean
+ max
+ min
@@ -129,17 +129,8 @@ Development
===========

This, and the associated go libraries ([vcfgo](https://github.com/brentp/vcfgo),
[irelate](https://github.com/brentp/irelate), [xopen](https://github.com/brentp/xopen)) are
under active development. The following are on our radar (most have been completed):

- [x] allow annotating with bam fields, e.g. QUAL and SEQ.
- [ ] decompose, normalize, and get allelic primitives for variants on the fly
(we have code to do this, it just needs to be integrated)
- [ ] allow custom golang ops when using api.
- [x] improve test coverage for vcfanno (still need more tests for bam)
- [x] embed otto js engine to allow custom ops.
- [x] support for annotating BED files.

[irelate](https://github.com/brentp/irelate), [xopen](https://github.com/brentp/xopen),
[goluaez](https://github.com/brentp/goluaez) are under active development.

Additional Usage
================
@@ -172,38 +163,38 @@ REF/ALT are not required.
Set to the number of processes that `vcfanno` can use during annotation. `vcfanno` parallelizes well
up to 15 or so cores.

-js
---
-lua
----

custom in ops (javascript). For use when the built-in `ops` don't supply the needed reduction.
custom in ops (lua). For use when the built-in `ops` don't supply the needed reduction.

we embed the javascript engine [otto](https://github.com/robertkrimen/otto) so that it's
we embed the lua engine [go-lua](https://github.com/yuin/gopher-lua) so that it's
possible to create a custom op if it is not provided. For example if the users wants to

"js:sum=0;for(i=0;i<vals.length;i++){sum+=vals[i]};sum"
"lua:function sum(t) local sum = 0; for i=1,#t do sum = sum + t[i] end return sum end"

where the last value (in this case sum) is returned as the annotation value. It is encouraged
to instead define javascript functions in separate `.js` file and point to it when calling
`vcfanno` using the `-js` flag. So, in an external file, "some.js", instead put:

```javascript
function sum(vals) {
s = 0;
for(i=0; i<vals.length; i++){
s+=vals[i]
}
return s
}
to instead define lua functions in separate `.lua` file and point to it when calling
`vcfanno` using the `-lua` flag. So, in an external file, "some.lua", instead put:

```lua
function sum(t)
local sum = 0
for i=1,#t do
sum = sum + t[i]
end
return sum
end
```

And then the above custom op would be: "js:sum(vals)". (note that there's a sum op provided
And then the above custom op would be: "lua:sum(vals)". (note that there's a sum op provided
by `vcfanno` which will be faster).

The variables `vals`, `chrom`, `start`, `end` from the current variant will all be available
in the javascript code.
in the lua code.


See [example/conf.toml](https://github.com/brentp/vcfanno/blob/master/example/conf.toml)
and [example/custom.js](https://github.com/brentp/vcfanno/blob/master/example/custom.js)
and [example/custom.lua](https://github.com/brentp/vcfanno/blob/master/example/custom.lua)
for more examples.
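
For comparison with the Lua `sum(t)` function in the README hunk above, here is a minimal Go sketch of the same reduction over a `[]interface{}` of values. The name `sumVals` and the set of numeric types it accepts are assumptions made for illustration; this is not vcfanno's built-in `sum` op.

```go
package main

import "fmt"

// sumVals is a hypothetical helper (not part of vcfanno). It adds up the
// numeric entries of vals, mirroring the Lua sum(t) function in the README.
// Entries of any other type are skipped rather than treated as an error.
func sumVals(vals []interface{}) float64 {
	total := 0.0
	for _, v := range vals {
		switch n := v.(type) {
		case float64:
			total += n
		case float32:
			total += float64(n)
		case int:
			total += float64(n)
		}
	}
	return total
}

func main() {
	// Mixed int and float64 inputs, as an annotation column might yield.
	fmt.Println(sumVals([]interface{}{0.5, 1, 2.5}))
}
```

Skipping non-numeric entries is one possible design choice; a stricter version could return an error instead.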

115 changes: 47 additions & 68 deletions api/api.go
@@ -10,10 +10,10 @@ import (

"github.com/biogo/hts/sam"
"github.com/brentp/bix"
"github.com/brentp/goluaez"
"github.com/brentp/irelate/interfaces"
"github.com/brentp/irelate/parsers"
"github.com/brentp/vcfgo"
"github.com/robertkrimen/otto"
)

const LEFT = "left_"
@@ -42,8 +42,8 @@ type Source struct {
// 0-based index of the file order this source is from.
Index int
mu sync.Mutex
Js *otto.Script
Vm *otto.Otto
code string
Vm *goluaez.State
}

// IsNumber indicates that we expect the Source to return a number given the op
@@ -59,26 +59,18 @@ type Annotator struct {
PostAnnos []*PostAnnotation
}

// JsOp uses Otto to run a javascript snippet on a list of values and return a single value.
// It makes the chrom, start, end, and values available to the js interpreter.
func (s *Source) JsOp(v interfaces.IVariant, js *otto.Script, vals []interface{}) string {
s.mu.Lock()
s.Vm.Set("chrom", v.Chrom())
s.Vm.Set("start", v.Start())
s.Vm.Set("end", v.End())
s.Vm.Set("vals", vals)
//s.Vm.Set("info", v.Info.String())
value, err := s.Vm.Run(js)
// LuaOp uses go-lua to run a lua snippet on a list of values and return a single value.
// It makes the chrom, start, end, and values available to the lua interpreter.
func (s *Source) LuaOp(v interfaces.IVariant, code string, vals []interface{}) string {
value, err := s.Vm.Run(code, map[string]interface{}{
"chrom": v.Chrom(),
"start": v.Start(),
"end": v.End(),
"vals": vals})
if err != nil {
return fmt.Sprintf("js-error: %s", err)
return fmt.Sprintf("lua-error: %s", err)
}
val, err := value.ToString()
s.mu.Unlock()
if err != nil {
log.Println("js-error:", err)
val = fmt.Sprintf("error:%s", err)
}
return val
return fmt.Sprintf("%v", value)
}

type PostAnnotation struct {
@@ -87,17 +79,17 @@ type PostAnnotation struct {
Name string
Type string

Js *otto.Script
code string

mu sync.Mutex
Vm *otto.Otto
Vm *goluaez.State
}

// NewAnnotator returns an Annotator with the sources, seeded with some javascript.
// If ends is true, it will annotate the 1 base ends of the interval as well as the
// interval itself. If strict is true, when overlapping variants, they must share
// the ref allele and at least 1 alt allele.
func NewAnnotator(sources []*Source, js string, ends bool, strict bool, postannos []PostAnnotation) *Annotator {
func NewAnnotator(sources []*Source, lua string, ends bool, strict bool, postannos []PostAnnotation) *Annotator {
for _, s := range sources {
if e := checkSource(s); e != nil {
log.Fatal(e)
@@ -110,40 +102,31 @@ func NewAnnotator(sources []*Source, js string, ends bool, strict bool, postanno
Ends: ends,
PostAnnos: make([]*PostAnnotation, len(postannos)),
}
var err error
for i := range postannos {
postannos[i].Vm = otto.New()
if strings.HasPrefix(postannos[i].Op, "js:") {
var err error
postannos[i].Js, err = postannos[i].Vm.Compile(postannos[i].Op, postannos[i].Op[3:])
if err != nil {
log.Fatalf("error parsing customjs:%s", err)
}
postannos[i].Vm, err = goluaez.NewState(lua)
if err != nil {
log.Fatalf("error parsing custom lua:%s", err)
}
if strings.HasPrefix(postannos[i].Op, "lua:") {
postannos[i].code = postannos[i].Op[4:]
} else if _, ok := Reducers[postannos[i].Op]; !ok {
log.Fatalf("unknown op from %s: %s", postannos[i].Name, postannos[i].Op)
}
if js != "" {
_, err := postannos[i].Vm.Run(js)
if err != nil {
log.Fatalf("error parsing customjs:%s", err)
}
}
a.PostAnnos[i] = &postannos[i]
}
for _, src := range a.Sources {
src.Vm = otto.New() // create a new vm for each source and lock in the source
if strings.HasPrefix(src.Op, "js:") {
src.Vm, err = goluaez.NewState(lua) // create a new vm for each source and lock in the source
if err != nil {
log.Fatalf("error parsing custom lua:%s", err)
}
if strings.HasPrefix(src.Op, "lua:") {
var err error
src.Js, err = src.Vm.Compile(src.Op, src.Op[3:])
src.code = src.Op[4:]
if err != nil {
log.Fatalf("error parsing op: %s for file %s", src.Op, src.File)
}
}
if js != "" {
_, err := src.Vm.Run(js)
if err != nil {
log.Fatalf("error parsing customjs:%s", err)
}
}
}
return &a
}
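
The hunk above detects a custom op with `strings.HasPrefix(op, "lua:")` and keeps the remainder via `op[4:]` (previously `op[3:]` for `"js:"`). A minimal sketch of the same split, with the prefix length derived from a constant so the slice index cannot drift from the prefix. `splitOp` and `luaPrefix` are hypothetical names, not identifiers from the commit.

```go
package main

import (
	"fmt"
	"strings"
)

// luaPrefix marks an op whose remainder is Lua source, as in the diff.
const luaPrefix = "lua:"

// splitOp is a hypothetical helper. It reports whether op is a custom Lua op
// and, if so, returns the code that follows the prefix.
func splitOp(op string) (code string, isLua bool) {
	if strings.HasPrefix(op, luaPrefix) {
		return op[len(luaPrefix):], true
	}
	return "", false
}

func main() {
	code, ok := splitOp("lua:sum(vals)")
	fmt.Printf("%q %v\n", code, ok)

	code, ok = splitOp("mean") // a built-in op has no Lua code
	fmt.Printf("%q %v\n", code, ok)
}
```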
@@ -299,14 +282,14 @@ func (src *Source) AnnotateOne(v interfaces.IVariant, vals []interface{}, prefix
if len(vals) == 0 {
return
}
if src.Js != nil {
jsval := src.JsOp(v, src.Js, vals)
if jsval == "true" || jsval == "false" && strings.Contains(src.Op, "_flag(") {
if jsval == "true" {
if src.code != "" {
luaval := src.LuaOp(v, src.code, vals)
if luaval == "true" || luaval == "false" && strings.Contains(src.Op, "_flag(") {
if luaval == "true" {
v.Info().Set(prefix+src.Name, true)
}
} else {
v.Info().Set(prefix+src.Name, jsval)
v.Info().Set(prefix+src.Name, luaval)
}
} else {
val := Reducers[src.Op](vals)
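
A note on the condition `luaval == "true" || luaval == "false" && strings.Contains(src.Op, "_flag(")` in the hunk above: in Go, `&&` binds tighter than `||`, so the `_flag(` check applies only to the `"false"` branch. The sketch below contrasts the expression as written with a fully parenthesized alternative. Both function names are hypothetical, and which grouping the author intended is not stated in the commit.

```go
package main

import (
	"fmt"
	"strings"
)

// asWritten reproduces the condition from the diff. Because && has higher
// precedence than ||, it parses as:
//   luaval == "true" || (luaval == "false" && contains "_flag(")
func asWritten(luaval, op string) bool {
	return luaval == "true" || luaval == "false" && strings.Contains(op, "_flag(")
}

// grouped is a hypothetical alternative that requires "_flag(" in the op for
// both boolean strings.
func grouped(luaval, op string) bool {
	return (luaval == "true" || luaval == "false") && strings.Contains(op, "_flag(")
}

func main() {
	op := "lua:sum(vals)" // an op without "_flag("
	fmt.Println(asWritten("true", op), grouped("true", op)) // prints: true false
}
```

The two forms differ only when the result is `"true"` and the op does not contain `_flag(`.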
@@ -336,7 +319,7 @@ func (src *Source) UpdateHeader(r HeaderUpdater, ends bool, htype string) {
}
if (strings.HasSuffix(src.File, ".bam") && src.Field == "") || src.IsNumber() {
ntype = "Float"
} else if src.Js != nil {
} else if src.code != "" {
if strings.Contains(src.Op, "_flag(") {
ntype, number = "Flag", "0"
} else {
@@ -372,7 +355,7 @@ func (a *Annotator) PostAnnotate(info interfaces.Info) error {
for _, post := range a.PostAnnos {
// built in function
vals = vals[:0]
if post.Js != nil {
if post.code != "" {
for _, field := range post.Fields {
val, _ := info.Get(field)
// ignore the error as it means the field is not present.
@@ -385,30 +368,26 @@ func (a *Annotator) PostAnnotate(info interfaces.Info) error {
}
post.mu.Lock()
for i, val := range vals {
post.Vm.Set(post.Fields[i], val)
post.Vm.SetGlobal(post.Fields[i], val)
}
value, e := post.Vm.Run(post.Js)
value, e := post.Vm.Run(post.code)
post.mu.Unlock()
if e != nil {
err = e
}
val, e := value.ToString()
if e == nil {
if post.Type == "Flag" {
if !(strings.ToLower(val) == "false" || val == "0" || val == "") {
e := info.Set(post.Name, true)
if e != nil {
err = e
}
}

} else {
if e := info.Set(post.Name, val); e != nil {
val := fmt.Sprintf("%v", value)
if post.Type == "Flag" {
if !(strings.ToLower(val) == "false" || val == "0" || val == "") {
e := info.Set(post.Name, true)
if e != nil {
err = e
}
}

} else {
err = e
if e := info.Set(post.Name, val); e != nil {
err = e
}
}

} else {
Expand Down
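
The `PostAnnotate` hunk above now formats the Lua result with `fmt.Sprintf("%v", value)` and sets a `Flag` unless the string is `"false"` (case-insensitive), `"0"`, or empty. A minimal sketch of that check in isolation; `flagFromValue` is a hypothetical name, and the values goluaez actually returns are not shown in the diff, so the inputs here are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// flagFromValue is a hypothetical helper that applies the same test as the
// diff: format the value with %v, then treat "false", "0", and "" as unset.
func flagFromValue(value interface{}) bool {
	val := fmt.Sprintf("%v", value)
	return !(strings.ToLower(val) == "false" || val == "0" || val == "")
}

func main() {
	// Assumed sample results; a Go nil formats as "<nil>" under %v.
	for _, v := range []interface{}{true, false, 0, 1.5, "", "FALSE", nil} {
		fmt.Printf("%#v -> %v\n", v, flagFromValue(v))
	}
}
```

One consequence worth checking against goluaez's behavior: if a Go `nil` were ever returned, `%v` would render it as `<nil>`, which this test treats as set.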