
URE performance #27

Open
linas opened this issue Nov 19, 2016 · 11 comments

Comments


linas commented Nov 19, 2016

The current URE design has a performance bottleneck that may be constraining system performance. This is a feature-enhancement request to eventually fix that. It's low-priority; I wanted to note it here so that it doesn't get lost.

To provide isolation for the application of rules to the focus set, the current URE creates a brand-new, stand-alone atomspace, copies the focus set into it, and copies the rules into it. This is done every time the forward chainer is run (and also the backward chainer, I presume).

The problem is that creating atoms and adding them to the atomspace is slow: currently, one can add about 100K atoms/second to the atomspace. The current PLN ruleset has more than 2K atoms, thus requires approx 20 milliseconds to set up the atomspace. This limits PLN performance to no more than 50 invocations of PLN per second, and probably less.

The Relex2logic ruleset currently consists of about 1.5K atoms: 750 nodes and 800 links approx.

A single parsed sentence from RelEx typically has about 500 atoms in it. Since these are the focus set, they must also be copied into the atomspace; I assume R2L generates a similar number of atoms, which then have to be passed into PLN. Thus, atomspace creation in the URE limits performance to no more than the following (see the sketch after the list):

-- 5+15 = 20 millisecs for R2L
-- 20+5 = 25 millisecs for PLN
-- 20+25 = 45 millisecs for a single sentence, with one run of R2L and one run of PLN.
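
For concreteness, here is that arithmetic as a small C++ sketch. The atom counts and the ~100K atoms/second insertion rate are the rough figures quoted above, not measured values:

```cpp
#include <cstdio>

// Rough cost model: per-invocation setup time is just the number of atoms
// that must be copied into the fresh atomspace, divided by the insertion rate.
int main()
{
    const double atoms_per_sec  = 100000.0;  // ~100K atom insertions/second (quoted above)
    const double sentence_atoms = 500.0;     // typical parsed sentence (focus set)
    const double r2l_rule_atoms = 1500.0;    // ~1.5K atoms in the R2L ruleset
    const double pln_rule_atoms = 2000.0;    // ~2K atoms in the PLN ruleset

    double r2l_ms = 1000.0 * (sentence_atoms + r2l_rule_atoms) / atoms_per_sec; // ~20 ms
    double pln_ms = 1000.0 * (sentence_atoms + pln_rule_atoms) / atoms_per_sec; // ~25 ms

    printf("R2L setup:  %.0f ms\n", r2l_ms);
    printf("PLN setup:  %.0f ms\n", pln_ms);
    printf("Total per sentence: %.0f ms\n", r2l_ms + pln_ms);                   // ~45 ms
    return 0;
}
```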

This is a hefty overhead. I don't think it's impacting any current work, but it will eventually cause problems...


linas commented Nov 19, 2016

An obvious "quick fix" would be to alter the pattern matcher callbacks to simply not consider any atoms that are not in a focus-set. This is quite easy to do. It's somewhat similar to the attention-based variant of the pattern matcher, which avoids looking at atoms with low AV.

A focus-set variant of the matcher could provide the same determinism that the current chainers have, while surely running faster. It would also avoid some of the indeterminism associated with the AV subsystem.
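
A minimal sketch of that quick fix, using a simplified, hypothetical callback interface (the real pattern-matcher callback class and its method signatures differ; `FocusSetMatchCB`, `node_match` and `link_match` here are just illustrative):

```cpp
#include <unordered_set>
#include <utility>

// Hypothetical stand-ins for the real Handle / callback types.
using Handle = unsigned long;

struct PatternMatchCallback {
    virtual ~PatternMatchCallback() = default;
    virtual bool node_match(Handle pattern_node, Handle grounding_node) = 0;
    virtual bool link_match(Handle pattern_link, Handle grounding_link) = 0;
};

// Focus-set variant: reject any proposed grounding that lies outside the
// focus set, so the matcher never wanders into the rest of the atomspace.
// No copying into a scratch atomspace is needed.
class FocusSetMatchCB : public PatternMatchCallback {
public:
    explicit FocusSetMatchCB(std::unordered_set<Handle> focus)
        : _focus(std::move(focus)) {}

    bool node_match(Handle, Handle grounding_node) override {
        return _focus.count(grounding_node) != 0;
    }
    bool link_match(Handle, Handle grounding_link) override {
        return _focus.count(grounding_link) != 0;
    }
private:
    std::unordered_set<Handle> _focus;
};
```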

@ngeiswei

The forward chainer is rather messy compared to the new backward chainer. As soon as I start running the large experiments on the BC, I intend to re-organize the FC code to be cleaner and ultimately faster.


linas commented Nov 19, 2016

I wasn't worried about the mess per se, more about the copy-in and copy-out of the focus set. This affects the backward chainer too, not just the forward chainer. (Right? Because the BC also does this copy-in?)

I deleted my other comments and moved them to issue #26.

@ngeiswei

The BC does no copying. There is little in common between the BC and the FC. The BC has been re-written from scratch and, although not optimized at a low level, is more thoughtfully designed. I'll soon revisit and simplify the FC, so I'd advise not spending too much time on it yet.


ivan-ushakov commented Apr 7, 2018

Am I right that the problem is still here? I mean, DefaultPatternMatchCB::eval_term() still creates a new atomspace each time, and the profiler shows that it has a huge impact on performance.


ngeiswei commented Apr 9, 2018

Thanks @ivan-ushakov for your interest; however, as I said, the FC is soon going to be redesigned. It will be merged into the BC to form a mixed chainer, and its main routines will be exposed so that users can easily build their own chainer. So unless you or other users have a strong immediate incentive to have an optimized FC, I wouldn't bother.


ngeiswei commented Apr 9, 2018

@ivan-ushakov That said, any profiling results you have would still be of interest, even if they are about the FC, so feel free to paste them here. If you happen to do some profiling on the BC, that would be of even more interest. Thanks!


linas commented Apr 9, 2018

There is supposed to be a pool of empty atomspaces, ready to be used, so grabbing one should be very fast ... is this not the case? Maybe there is some overhead for the first one or two being created, but after that, they are supposed to be recycled ... Curtis implemented this precisely to avoid the heavy overhead of creating them over and over.
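
For reference, the recycling scheme being described might look roughly like the sketch below; `AtomSpacePool`, `grab()` and `give_back()` are made-up names for illustration, not the actual API that was added:

```cpp
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-in for the real AtomSpace class.
struct AtomSpace {
    void clear() { /* wipe contents so the space can be reused */ }
};

// A simple recycling pool: instead of constructing a fresh scratch atomspace
// for every chainer invocation, grab one from the pool and return it when done.
class AtomSpacePool {
public:
    std::unique_ptr<AtomSpace> grab() {
        std::lock_guard<std::mutex> lock(_mtx);
        if (_free.empty())
            return std::make_unique<AtomSpace>();   // cold path: allocate
        auto as = std::move(_free.back());          // hot path: recycle
        _free.pop_back();
        return as;
    }
    void give_back(std::unique_ptr<AtomSpace> as) {
        as->clear();                                // empty it before reuse
        std::lock_guard<std::mutex> lock(_mtx);
        _free.push_back(std::move(as));
    }
private:
    std::mutex _mtx;
    std::vector<std::unique_ptr<AtomSpace>> _free;
};
```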


linas commented Apr 9, 2018

(And, no, we probably do not have a unit test to make sure that this pool is working as designed. Writing such a unit test should be easy -- first, measure the time to create 100 normal atomspaces, then measure the time to create 100 temp atomspaces -- the second run should be at least 10x, maybe 1000x, faster than the first.)
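
A sketch of what such a test could look like, using std::chrono and a toy stand-in pool. The real test would exercise the actual atomspace constructor and whatever pool API exists; every name below is a placeholder:

```cpp
#include <chrono>
#include <cstdio>
#include <vector>

// Toy stand-ins: a "normal" atomspace whose constructor is the expensive part,
// and a pool that recycles instances.  Real class names and the real pool API differ.
struct AtomSpace { AtomSpace() { /* heavy setup in the real thing */ } };

struct TransientPool {
    std::vector<AtomSpace*> free_list;
    AtomSpace* grab() {
        if (free_list.empty()) return new AtomSpace();
        AtomSpace* as = free_list.back();
        free_list.pop_back();
        return as;
    }
    void give_back(AtomSpace* as) { free_list.push_back(as); }
};

template <typename F>
static double time_ms(F&& f)
{
    auto t0 = std::chrono::steady_clock::now();
    f();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

int main()
{
    TransientPool pool;
    pool.give_back(pool.grab());   // prime the pool so the timed loop hits the recycle path

    double fresh = time_ms([] {
        for (int i = 0; i < 100; i++) { AtomSpace* as = new AtomSpace(); delete as; }
    });
    double pooled = time_ms([&] {
        for (int i = 0; i < 100; i++) { AtomSpace* as = pool.grab(); pool.give_back(as); }
    });

    printf("100 fresh atomspaces:  %.3f ms\n", fresh);
    printf("100 pooled atomspaces: %.3f ms\n", pooled);
    // Per the comment above, the pooled case should come out at least 10x faster.
    return 0;
}
```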


ngeiswei commented Apr 9, 2018

I misread @ivan-ushakov's message; indeed, it's not about FC profiling per se. @ivan-ushakov, could you be more specific? How and what did you profile?

@ivan-ushakov

I'm talking about the method DefaultPatternMatchCB::eval_term() and this comment inside it:

Evaluation of the link requires working with an atomspace of some sort, so that the atoms can be communicated to scheme or python for the actual evaluation. We don't want to put the proposed grounding into the "real" atomspace, because the grounding might be insane. So we put it here. This is probably not very efficient, but will do for now...
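
As a rough illustration of the pattern that comment describes (every name below is a hypothetical stand-in, not the actual DefaultPatternMatchCB code):

```cpp
// Hypothetical stand-ins for the real types; only the shape of the pattern matters here.
struct Atom {};

struct AtomSpace {
    explicit AtomSpace(AtomSpace* parent = nullptr) : _parent(parent) {}  // scratch/overlay space
    Atom* add(const Atom& a) { _copy = a; return &_copy; }                // copy the atom in
private:
    AtomSpace* _parent;
    Atom _copy;
};

// Placeholder for handing the grounded term to scheme/python for evaluation.
bool evaluate_in(AtomSpace&, Atom*) { return true; }

// The pattern the quoted comment describes: the proposed grounding is copied
// into a throw-away scratch atomspace (so a possibly "insane" grounding never
// lands in the real atomspace), evaluated there, and then discarded.
// Constructing that scratch space on every call is the cost showing up in the profile.
bool eval_term_sketch(AtomSpace& real, const Atom& proposed_grounding)
{
    AtomSpace scratch(&real);                     // fresh scratch space each call
    Atom* copy = scratch.add(proposed_grounding);
    return evaluate_in(scratch, copy);            // scratch is destroyed on return
}

int main()
{
    AtomSpace real;
    Atom candidate;
    return eval_term_sketch(real, candidate) ? 0 : 1;
}
```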

Also, about caching: I found that the CACHED_IMPLICATOR key is not enabled by default. So am I right that we don't have this cache in a usual build?

@linas linas transferred this issue from opencog/atomspace Jul 26, 2019
ngeiswei added a commit to ngeiswei/ure that referenced this issue Nov 10, 2020