rootpy.io.file.walk gets slower the longer you "walk" #809

Open
burneyy opened this issue Aug 21, 2019 · 1 comment

burneyy commented Aug 21, 2019

Hi everybody,

I am not sure whether the following issue is specific to rootpy or a general ROOT/PyROOT problem, but either way I would be very grateful for ideas!

Consider the following code (you can find the script and the corresponding test ROOT-file scurves.root here):

#!/usr/bin/env python
from rootpy.io import root_open
import rootpy.ROOT as ROOT
import time

_print_interval = 1000
_last_time = time.time()
_obj_cnt = 0
def print_time_past():
    global _print_interval, _last_time, _obj_cnt
    _obj_cnt += 1
    if _obj_cnt%_print_interval == 0:
        ms_past = (time.time()-_last_time)*1000.
        print(f"Took {ms_past:.0f}ms for the last {_print_interval} objects.")
        _last_time = time.time()

def walk():
    fname = "scurves.root"

    with root_open(fname, "READ") as root_file:
        for path, dirs, objects in root_file.walk(""):
            for obj_name in objects:
                obj = root_file[path+"/"+obj_name] #This line is critical!
                #Do something with object
                print_time_past()

walk()

When you run it, you will notice that the time needed per batch of 1000 objects keeps growing as the walk progresses, producing output like this:

Took 3816ms for the last 1000 objects.
Took 2334ms for the last 1000 objects.
Took 2174ms for the last 1000 objects.
Took 2253ms for the last 1000 objects.
Took 2355ms for the last 1000 objects.
Took 2582ms for the last 1000 objects.
Took 2583ms for the last 1000 objects.
Took 2652ms for the last 1000 objects.
Took 2750ms for the last 1000 objects.
Took 2909ms for the last 1000 objects.
Took 2778ms for the last 1000 objects.
Took 2902ms for the last 1000 objects.
Took 3109ms for the last 1000 objects.
Took 3269ms for the last 1000 objects.
Took 3412ms for the last 1000 objects.
Took 3410ms for the last 1000 objects.
Took 3880ms for the last 1000 objects.
Took 4391ms for the last 1000 objects.
Took 4919ms for the last 1000 objects.
Took 4758ms for the last 1000 objects.
Took 4143ms for the last 1000 objects.
Took 4400ms for the last 1000 objects.
Took 4303ms for the last 1000 objects.
Took 4446ms for the last 1000 objects.
Took 4438ms for the last 1000 objects.
Took 5050ms for the last 1000 objects.
Took 4827ms for the last 1000 objects.
Took 5275ms for the last 1000 objects.
Took 6279ms for the last 1000 objects.
Took 5203ms for the last 1000 objects.
Took 5862ms for the last 1000 objects.
Took 5982ms for the last 1000 objects.
Took 6300ms for the last 1000 objects.
Took 5883ms for the last 1000 objects.
Took 6926ms for the last 1000 objects.
Took 8219ms for the last 1000 objects.
Took 7514ms for the last 1000 objects.
Took 6585ms for the last 1000 objects.
Took 6371ms for the last 1000 objects.
Took 6994ms for the last 1000 objects.
Took 8177ms for the last 1000 objects.
Took 8137ms for the last 1000 objects.
Took 7065ms for the last 1000 objects.
Took 7374ms for the last 1000 objects.
Took 7479ms for the last 1000 objects.
Took 7564ms for the last 1000 objects.
Took 8076ms for the last 1000 objects.
Took 9441ms for the last 1000 objects.
Took 8773ms for the last 1000 objects.
Took 10552ms for the last 1000 objects.
Took 9495ms for the last 1000 objects.
Took 10253ms for the last 1000 objects.
Took 13477ms for the last 1000 objects.
Took 16660ms for the last 1000 objects.
Took 18195ms for the last 1000 objects.

I believe it is related to memory usage when the object is actually retrieved from the ROOT file, because when I comment out the line

obj = root_file[path+"/"+obj_name] #This line is critical!

each batch of 1000 objects takes about the same time no matter how far you "walk".
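One way to check the memory hypothesis is to record the peak memory of the process next to the batch timing. The following is a minimal sketch using only the standard library; the names `BatchProbe` and `peak_rss_mb` are hypothetical helpers, not part of rootpy or ROOT. It assumes a Unix-like system, since the `resource` module is not available on Windows, and it reports the peak resident set size, which can only grow over the lifetime of the process.

```python
import sys
import time

try:
    import resource  # Unix only; absent on Windows
except ImportError:
    resource = None


def peak_rss_mb():
    """Return the peak resident set size in MiB, or None if unavailable."""
    if resource is None:
        return None
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


class BatchProbe:
    """Record elapsed time and peak memory once per `interval` calls to tick()."""

    def __init__(self, interval=1000, clock=time.perf_counter, memory=peak_rss_mb):
        self.interval = interval
        self.clock = clock      # injectable so the probe can be tested
        self.memory = memory
        self.count = 0
        self.records = []       # tuples of (objects_seen, elapsed_ms, peak_rss_mb)
        self._last = clock()

    def tick(self):
        """Count one object; return a record when a batch completes, else None."""
        self.count += 1
        if self.count % self.interval != 0:
            return None
        now = self.clock()
        record = (self.count, (now - self._last) * 1000.0, self.memory())
        self.records.append(record)
        self._last = now
        return record
```

In the script above, `probe = BatchProbe()` could replace the module-level globals and `probe.tick()` could replace `print_time_past()`, printing the returned record whenever it is not `None`. If the memory column rises together with the batch time while the critical line is active, and stays flat when it is commented out, that would support the hypothesis.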

Tested with:
Python 3.6.5
rootpy 1.0.1
ROOT 6.14/04

Looking forward to your ideas and suggestions. Thanks in advance!

burneyy commented Oct 9, 2019

As discussed here, the problem seems to be that the TDirectory objects remain in memory as you walk through the file.
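A rough way to observe this directly is to count how many directory objects are attached to the open file in memory at a given moment. The sketch below is a diagnostic only, not a fix, and `count_resident_directories` is a hypothetical helper name. It relies on two ROOT methods, `TDirectory::GetList()` (the list of objects currently in memory for a directory) and `TObject::InheritsFrom()`, and it assumes that the rootpy file object exposes them unchanged from the underlying `TFile`.

```python
def count_resident_directories(directory):
    """Count the TDirectory objects currently held in memory below `directory`.

    `directory` only needs to provide GetList(), and the listed objects
    InheritsFrom(), as ROOT's TDirectory and TObject do.
    """
    in_memory = directory.GetList()
    if not in_memory:  # null pointer in PyROOT, or nothing loaded yet
        return 0
    count = 0
    for obj in in_memory:
        if obj.InheritsFrom("TDirectory"):
            # Count this directory plus everything resident beneath it.
            count += 1 + count_resident_directories(obj)
    return count


# Hypothetical use inside the walk loop of the script above
# (assumes `root_file` behaves like a ROOT TFile):
#     if _obj_cnt % _print_interval == 0:
#         print(count_resident_directories(root_file))
```

If the printed count keeps rising from batch to batch, directories that were already visited are still resident, which matches the explanation above.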
