
krcurtis/bgenpy


bgenpy

Python module to read/write BGEN files

At its simplest, you can extract full variant data with:

import bgenpy
bgen = bgenpy.Reader("my.bgen")

attributes = bgen.attributes()
samples = bgen.samples()

for info in bgen:
    print(info[0])

Read variant header information

If you only need a list of variants, it is faster to read just the variant header information:

import bgenpy
bgen = bgenpy.Reader("my.bgen")

bgen.seek_first_variant()
try:
    while True:
        info = bgen.read_minimal_variant()
except StopIteration:
    pass
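
The header-only loop can also be wrapped in a generator so that callers can use an ordinary for loop. This is a minimal sketch, not part of bgenpy: iter_minimal_variants is a hypothetical helper, and FakeReader is a hypothetical stand-in that mimics only the two Reader methods used above, so the example runs without a BGEN file.

```python
def iter_minimal_variants(reader):
    """Yield header-only variant records from a reader, first to last."""
    reader.seek_first_variant()
    while True:
        try:
            info = reader.read_minimal_variant()
        except StopIteration:
            # End of file. Return explicitly: a StopIteration that escapes
            # a generator body is turned into RuntimeError (PEP 479).
            return
        yield info


class FakeReader:
    """Hypothetical stand-in for bgenpy.Reader, for demonstration only."""

    def __init__(self, records):
        self._records = list(records)
        self._pos = 0

    def seek_first_variant(self):
        self._pos = 0

    def read_minimal_variant(self):
        if self._pos >= len(self._records):
            raise StopIteration
        record = self._records[self._pos]
        self._pos += 1
        return record


variants = list(iter_minimal_variants(FakeReader(["rs1", "rs2", "rs3"])))
print(variants)
```

With a real file you would pass bgenpy.Reader("my.bgen") in place of FakeReader, assuming the Reader behaves as in the loop above.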

Create your own index

You can record the file offsets of the variants you care about while scanning:

import bgenpy
bgen = bgenpy.Reader("my.bgen")

bgen.seek_first_variant()
interesting_variant_offsets = []
try:
    while True:
        offset = bgen.offset()
        info = bgen.read_minimal_variant()
        # is_interesting is your own predicate; it is not part of bgenpy
        if is_interesting(info):
            interesting_variant_offsets.append(offset)
except StopIteration:
    pass
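
To reuse the offsets in a later session they have to be stored somewhere. One option is a small JSON file, sketched below. The helper names save_offsets and load_offsets and the file name are hypothetical, and the sketch assumes that bgen.offset() returns a value that converts cleanly to int (for example a byte position); bgenpy itself does not provide these helpers.

```python
import json


def save_offsets(offsets, path):
    """Write a list of variant offsets to a JSON file."""
    with open(path, "w") as fh:
        # Assumption: each offset is an integer-like byte position.
        json.dump([int(o) for o in offsets], fh)


def load_offsets(path):
    """Read back a list of variant offsets written by save_offsets."""
    with open(path) as fh:
        return [int(o) for o in json.load(fh)]
```

Call save_offsets(interesting_variant_offsets, "my.offsets.json") after the scan, and load_offsets("my.offsets.json") in the later session. The offsets are only meaningful for the exact BGEN file they were recorded from.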

And later jump to those variants (after re-opening the file):

import bgenpy
bgen = bgenpy.Reader("my.bgen")

# interesting_variant_offsets is the list of offsets recorded earlier
for offset in interesting_variant_offsets:
    bgen.seek_to_variant_offset(offset)
    info = bgen.read_full_variant()

zstd library

Building this module requires the zstd compression library. Once zstd is available, set these environment variables before building:

export ZSTD_INC=/to/where/is/zstd-1.1.0/lib
export ZSTD_LIB=/to/where/is/zstd-1.1.0/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ZSTD_LIB
