Hello, your SVN River is working great. While testing it I ran into the following situation:
In an SVN repository that is to be indexed, there is a revision in which a hero committed 6 GB of data (not binary, but plaintext SQL dumps -.-).
Reducing the bulk_size option down to 1 doesn't help, since it is still a single revision that has to be indexed, and indexing 6 GB of data leads to OutOfMemoryError (heap space) even on an 8 GB machine with a 7 GB heap.
At the moment I avoid the problem by letting the river index up to revision x-1 and then defining a start_revision of x+1, roughly as sketched below.
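For illustration, re-registering the river for the second half of the split might look something like the following _meta document (a sketch only: the type name, the URL, and the repos field name are placeholders; only start_revision is taken from the description above, with 1235 standing in for x+1):

```json
{
  "type": "svn",
  "svn": {
    "repos": "http://example.com/svn/repo",
    "start_revision": 1235
  }
}
```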
But this workaround doesn't feel right.
Maybe some new "river options" could help (see the sketch after this list), like:
a) max_bulk_size_in_mb
b) File-Extension Filters
c) Folder Filters
d) Revision Filters
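To make the suggestion concrete, the river configuration could then grow into something like this (purely hypothetical: none of these options exist yet, and every field name below except bulk_size is made up):

```json
{
  "type": "svn",
  "svn": {
    "repos": "http://example.com/svn/repo",
    "bulk_size": 200,
    "max_bulk_size_in_mb": 50,
    "excluded_extensions": [".sql", ".dump"],
    "excluded_folders": ["/trunk/dumps"],
    "excluded_revisions": [1234]
  }
}
```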
Hi, and thanks for the feedback.
It should be fairly simple to implement some filters and a max_size option; I'll get to it as soon as I can.
On the other hand, I'm not fond of filtering out entire revisions or folders. They should be there so the history of the repository can still be browsed.
Since the resulting index is far from sufficient for browsing the repositories easily, a huge amount of functionality is left to the front-end (if you want something like ViewSVN). So I think having a trace of every revision/change is mandatory. The content, however, isn't, so we could replace the 6 GB of text with just a warning message.
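For instance, the indexer could check the entry size up front and fall back to a short warning string, along these lines (a sketch only, assuming the river reads content through SVNKit's SVNRepository; the class, the method, and MAX_CONTENT_BYTES are illustrative, not actual plugin code):

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import org.tmatesoft.svn.core.SVNDirEntry;
import org.tmatesoft.svn.core.SVNException;
import org.tmatesoft.svn.core.io.SVNRepository;

public class ContentFetcher {
    // Hypothetical cut-off; could be wired to a max_bulk_size_in_mb option.
    private static final long MAX_CONTENT_BYTES = 50L * 1024 * 1024;

    // Returns the file content, or a short warning when the file is too
    // large to be loaded into the heap safely.
    static String fetchContentOrWarning(SVNRepository repository, String path,
                                        long revision) throws SVNException {
        // info() only fetches the entry's metadata, so the size check is cheap.
        SVNDirEntry entry = repository.info(path, revision);
        if (entry != null && entry.getSize() > MAX_CONTENT_BYTES) {
            return "[content skipped: " + entry.getSize()
                    + " bytes exceeds the configured limit]";
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        repository.getFile(path, revision, null, out);
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```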
And you are absolutely right, heap consumption is a concern, as I foolishly load the entire revision (content included) into memory. Maybe I'll try to index the file content separately from the metadata; I don't know yet.
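One possible shape for that, again as a sketch of the idea rather than the plugin's actual code: walk a revision's changed paths once and index each file's metadata and content as separate documents, so only one file's content is ever held in memory at a time (indexMetadata and indexContent are hypothetical helper stubs):

```java
import java.util.Map;
import org.tmatesoft.svn.core.SVNException;
import org.tmatesoft.svn.core.SVNLogEntry;
import org.tmatesoft.svn.core.SVNLogEntryPath;

public class RevisionIndexer {

    // Walk a revision's changes one file at a time. Metadata is always
    // indexed, so the revision stays visible in the history even when a
    // single oversized file's content is skipped.
    @SuppressWarnings("unchecked")
    void indexRevision(SVNLogEntry logEntry) throws SVNException {
        Map<String, SVNLogEntryPath> changes = logEntry.getChangedPaths();
        for (SVNLogEntryPath change : changes.values()) {
            indexMetadata(logEntry.getRevision(), change);
            if (change.getType() != SVNLogEntryPath.TYPE_DELETED) {
                indexContent(change.getPath(), logEntry.getRevision());
            }
        }
    }

    // Hypothetical helpers: in a real river these would add bulk items,
    // with indexContent using a size guard like the one sketched above.
    void indexMetadata(long revision, SVNLogEntryPath change) { /* ... */ }
    void indexContent(String path, long revision) throws SVNException { /* ... */ }
}
```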