IntrusivePtr Overview

Christian Kreibich edited this page Sep 10, 2020 · 8 revisions

IntrusivePtr is a type of smart pointer that Zeek uses more comprehensively starting with release 3.2. Associated benefits/goals:

  • Reduce the number of reference-counting bugs that the "manual" reference counting approach tends to produce (these cause memory leaks or crashes that can be hard to hunt down).
  • Help developers by encoding reference-ownership semantics in the API itself, rather than requiring them to read comments or the implementation to figure them out (a task that has to be repeated frequently and is prone to misremembering).
  • Eliminate memory leaks that may occur during the stack unwinding caused by runtime interpreter exceptions thrown after a Zeek scripting error (e.g. accessing an uninitialized field/variable).

(Note that the examples below don't use the new zeek:: namespace scoping everywhere you might need or want to in your own code.)

Usage Examples

IntrusivePtr may be used with any type of object that already stores its own reference count and provides Ref() and Unref() functions, but Zeek most often uses it for BroObj and its derivatives that are part of the script interpreter and its execution, such as Stmt, Expr and Val.

Creating New IntrusivePtr Objects

We're already familiar with the old way of using new to allocate a new object:

auto myval = new StringVal("hello world");
// ... and eventually decrement our reference when done with it
Unref(myval);

It's helpful to transition to creating an IntrusivePtr right from the start:

auto myval = make_intrusive<StringVal>("hello world");
// The destructor will automatically `Unref()` whenever `myval` goes out of scope

The make_intrusive function will forward all its arguments to the constructor you would have otherwise used for new allocation. There are also several operator overloads for IntrusivePtr that make it natural to use like a regular pointer: *, ->, !, etc.

Note that a nullptr may also be implicitly converted to an IntrusivePtr for convenience.
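
To make the mechanics above concrete, here is a minimal, self-contained sketch of how an intrusive smart pointer of this kind can work. It is not Zeek's implementation: `Obj`, `live_objects`, `demo_scope_exit` and the reduced `IntrusivePtr`/`make_intrusive` shown here are simplified stand-ins assumed for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Simplified stand-ins for illustration only; these are NOT Zeek's real classes.
int live_objects = 0;  // hypothetical counter of objects currently alive

struct Obj {
    int ref_cnt = 1;  // a freshly allocated object starts with one reference
    Obj() { ++live_objects; }
    virtual ~Obj() { --live_objects; }
};

void Ref(Obj* o) { ++o->ref_cnt; }
void Unref(Obj* o) { if (o && --o->ref_cnt == 0) delete o; }

struct StringVal : Obj {
    std::string str;
    explicit StringVal(std::string s) : str(std::move(s)) {}
};

struct AdoptRef {};  // tag: take over a reference that already exists

template <class T>
class IntrusivePtr {
public:
    IntrusivePtr() = default;
    IntrusivePtr(std::nullptr_t) {}            // implicit conversion from nullptr
    IntrusivePtr(AdoptRef, T* p) : ptr(p) {}   // no ref-count change
    IntrusivePtr(const IntrusivePtr& o) : ptr(o.ptr) { if (ptr) Ref(ptr); }
    IntrusivePtr(IntrusivePtr&& o) noexcept : ptr(std::exchange(o.ptr, nullptr)) {}
    IntrusivePtr& operator=(IntrusivePtr o) noexcept { std::swap(ptr, o.ptr); return *this; }
    ~IntrusivePtr() { Unref(ptr); }            // Unref() ignores null pointers

    T* operator->() const { return ptr; }
    T& operator*() const { return *ptr; }
    bool operator!() const { return ptr == nullptr; }

private:
    T* ptr = nullptr;
};

// Forwards all arguments to T's constructor and adopts the initial reference.
template <class T, class... Ts>
IntrusivePtr<T> make_intrusive(Ts&&... args) {
    return IntrusivePtr<T>{AdoptRef{}, new T(std::forward<Ts>(args)...)};
}

// Returns the number of live objects after the smart pointer left scope.
int demo_scope_exit() {
    {
        auto myval = make_intrusive<StringVal>("hello world");
        assert(myval->str == "hello world");
        assert(live_objects == 1);
    }  // destructor calls Unref(); the count reaches zero and the object is deleted
    return live_objects;
}
```

Because the reference count lives inside the object, wrapping an existing raw pointer never needs a separate control block, which is what makes the incremental transition described later on this page practical.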

Passing IntrusivePtr Objects

Here's an example comparing RecordVal::Assign() APIs:

// Old API:
void Assign(int field, Val* new_val);
// New API:
void Assign(int field, IntrusivePtr<Val> new_val);

It's easy to see the ambiguity in the old API: does the caller need to increment the reference count or not?

The new API implicitly claims ownership of a reference no matter what. The only two choices the caller has are whether to copy or move their object during the call.

auto addr_val = make_intrusive<AddrVal>(src_addr);
auto rv = make_intrusive<RecordVal>(conn_id);

// Copy constructor increments ref-count automatically
rv->Assign(0, addr_val);

// Move constructor takes ownership without ref-count increment,
// but subsequent accesses to `addr_val` in this scope would be a bug
rv->Assign(2, std::move(addr_val));

So use std::move() if you no longer need to access the object in the current scope; otherwise just let the copy constructor manage the reference counting for you. Ultimately, using std::move() is only a minor performance optimization (it skips one increment/decrement pair) and is not strictly necessary from a correctness standpoint.
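
The effect of copying versus moving on the reference count can be observed with a small self-contained sketch. The `Val`, `RecordVal` and `IntrusivePtr` below are reduced stand-ins assumed for illustration (not Zeek's real classes), and `Counts` and `demo_copy_vs_move` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Simplified stand-ins for illustration only; NOT Zeek's real classes.
struct Val {
    int ref_cnt = 1;
    virtual ~Val() = default;
};
void Ref(Val* v) { ++v->ref_cnt; }
void Unref(Val* v) { if (v && --v->ref_cnt == 0) delete v; }

struct AdoptRef {};

template <class T>
class IntrusivePtr {
public:
    IntrusivePtr() = default;
    IntrusivePtr(AdoptRef, T* p) : ptr(p) {}
    IntrusivePtr(const IntrusivePtr& o) : ptr(o.ptr) { if (ptr) Ref(ptr); }
    IntrusivePtr(IntrusivePtr&& o) noexcept : ptr(std::exchange(o.ptr, nullptr)) {}
    IntrusivePtr& operator=(IntrusivePtr o) noexcept { std::swap(ptr, o.ptr); return *this; }
    ~IntrusivePtr() { Unref(ptr); }
    T* get() const { return ptr; }
    T* operator->() const { return ptr; }
private:
    T* ptr = nullptr;
};

struct RecordVal : Val {
    std::vector<IntrusivePtr<Val>> fields;
    explicit RecordVal(std::size_t n) : fields(n) {}
    // By-value parameter: the callee always ends up owning one reference.
    void Assign(int field, IntrusivePtr<Val> new_val) { fields[field] = std::move(new_val); }
};

struct Counts { int after_copy; int after_move; bool moved_from_is_null; };

Counts demo_copy_vs_move() {
    IntrusivePtr<Val> addr_val{AdoptRef{}, new Val};            // ref_cnt == 1
    IntrusivePtr<RecordVal> rv{AdoptRef{}, new RecordVal(3)};
    Val* raw = addr_val.get();  // observe only, no ref-count change

    rv->Assign(0, addr_val);             // copy: ref_cnt goes 1 -> 2
    int after_copy = raw->ref_cnt;

    rv->Assign(2, std::move(addr_val));  // move: ref_cnt stays at 2
    int after_move = raw->ref_cnt;

    // In this sketch a moved-from pointer is null; do not dereference it.
    return {after_copy, after_move, addr_val.get() == nullptr};
}
```

Both calls leave the record holding one reference per assigned field; the move simply transfers the caller's existing reference instead of creating a new one and dropping the old one later.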

For APIs that don't need to engage in reference counting (i.e. they never store the pointer or mutate the pointed-to object), consider using a const-reference parameter:

void foo(const RecordVal& rv);
void bar(const IntrusivePtr<RecordVal>&);

Either of those is a bit clearer than passing a raw RecordVal*, due to the ownership ambiguity mentioned previously.

Returning IntrusivePtr Objects

There are two main scenarios:

  • Returning an IntrusivePtr by value means the caller takes ownership of a +1 reference, which the IntrusivePtr manages automatically. If it goes out of scope, the destructor calls Unref(); if you instead std::move() it into another API that consumes an IntrusivePtr, you've effectively passed on ownership of that reference.

  • Returning a const IntrusivePtr& may be OK for some "getter"-type functions (those that simply return a reference to a data member that is already an IntrusivePtr). It's again a simple case that doesn't need much thought for a caller to handle correctly: they can access the object as usual, and if they need to pass it around or store it, the other APIs taking an IntrusivePtr will automatically increment the reference count via the copy constructor.
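
The two return styles can be compared in a short sketch. Everything here is a simplified stand-in assumed for illustration: `Holder`, `GetVal()`, `BuildVal()`, `Observed` and `demo_returns` are hypothetical names, and the reduced `Val`/`IntrusivePtr` are not Zeek's real classes.

```cpp
#include <cassert>
#include <utility>

// Simplified stand-ins for illustration only; NOT Zeek's real classes.
struct Val {
    int ref_cnt = 1;
    virtual ~Val() = default;
};
void Ref(Val* v) { ++v->ref_cnt; }
void Unref(Val* v) { if (v && --v->ref_cnt == 0) delete v; }

struct AdoptRef {};

template <class T>
class IntrusivePtr {
public:
    IntrusivePtr() = default;
    IntrusivePtr(AdoptRef, T* p) : ptr(p) {}
    IntrusivePtr(const IntrusivePtr& o) : ptr(o.ptr) { if (ptr) Ref(ptr); }
    IntrusivePtr(IntrusivePtr&& o) noexcept : ptr(std::exchange(o.ptr, nullptr)) {}
    IntrusivePtr& operator=(IntrusivePtr o) noexcept { std::swap(ptr, o.ptr); return *this; }
    ~IntrusivePtr() { Unref(ptr); }
    T* operator->() const { return ptr; }
private:
    T* ptr = nullptr;
};

// Hypothetical class that owns a value and exposes it in both styles.
struct Holder {
    IntrusivePtr<Val> member{AdoptRef{}, new Val};

    // Getter: hands out a reference to the member, no ref-count change.
    const IntrusivePtr<Val>& GetVal() const { return member; }

    // Factory: the caller receives its own +1 reference by value.
    IntrusivePtr<Val> BuildVal() const { return IntrusivePtr<Val>{AdoptRef{}, new Val}; }
};

struct Observed { int via_getter; int after_store; int fresh; };

Observed demo_returns() {
    Holder h;
    const IntrusivePtr<Val>& borrowed = h.GetVal();
    int via_getter = borrowed->ref_cnt;       // still 1: nothing was copied

    IntrusivePtr<Val> stored = borrowed;      // copy constructor: 1 -> 2
    int after_store = stored->ref_cnt;

    IntrusivePtr<Val> fresh = h.BuildVal();   // caller owns the only reference
    return {via_getter, after_store, fresh->ref_cnt};
}
```

Note that a reference obtained from such a getter is only valid while the owning object is alive; storing a copy, as `stored` does here, is what keeps the value alive independently.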

Calling Functions / Queuing Events

These operations are basically the same as before, except rather than the argument list storing Val*, they pass along IntrusivePtr. There are also some variadic templates for convenience of forwarding individual arguments to the zeek::Args constructor (which is an alias for std::vector<IntrusivePtr<Val>>).

// Example of calling script-layer functions
auto func = global_scope()->Lookup("foo")->ID_Val()->AsFunc();
func->Call(make_intrusive<StringVal>("1st arg"), make_intrusive<StringVal>("2nd arg"));
// Example of enqueuing script-layer events
mgr.Enqueue(my_event, make_intrusive<StringVal>("1st arg"), make_intrusive<StringVal>("2nd arg"));
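
One way such a variadic convenience wrapper can be written is sketched below. This is an assumption-laden illustration rather than Zeek's actual code: `CallWithArgs`, `Call`, `demo_call` and the reduced `Val`/`IntrusivePtr`/`make_intrusive` are hypothetical stand-ins, and the "function call" just joins its string arguments.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for illustration only; NOT Zeek's real classes.
struct Val {
    int ref_cnt = 1;
    std::string str;
    explicit Val(std::string s) : str(std::move(s)) {}
    virtual ~Val() = default;
};
void Ref(Val* v) { ++v->ref_cnt; }
void Unref(Val* v) { if (v && --v->ref_cnt == 0) delete v; }

struct AdoptRef {};

template <class T>
class IntrusivePtr {
public:
    IntrusivePtr() = default;
    IntrusivePtr(AdoptRef, T* p) : ptr(p) {}
    IntrusivePtr(const IntrusivePtr& o) : ptr(o.ptr) { if (ptr) Ref(ptr); }
    IntrusivePtr(IntrusivePtr&& o) noexcept : ptr(std::exchange(o.ptr, nullptr)) {}
    IntrusivePtr& operator=(IntrusivePtr o) noexcept { std::swap(ptr, o.ptr); return *this; }
    ~IntrusivePtr() { Unref(ptr); }
    T* operator->() const { return ptr; }
private:
    T* ptr = nullptr;
};

template <class T, class... Ts>
IntrusivePtr<T> make_intrusive(Ts&&... args) {
    return IntrusivePtr<T>{AdoptRef{}, new T(std::forward<Ts>(args)...)};
}

// Same shape as the alias described in the text.
using Args = std::vector<IntrusivePtr<Val>>;

// Hypothetical core entry point: takes the already-built argument list.
std::string CallWithArgs(const Args& args) {
    std::string out;
    for (const auto& a : args) {
        if (!out.empty()) out += ",";
        out += a->str;
    }
    return out;
}

// Variadic convenience wrapper: packs individual arguments into an Args
// vector and delegates. The vector holds its own reference to each value
// and releases it when the vector is destroyed.
template <class... Ts>
std::string Call(Ts&&... ts) {
    Args args{std::forward<Ts>(ts)...};
    return CallWithArgs(args);
}

std::string demo_call() {
    return Call(make_intrusive<Val>("1st arg"), make_intrusive<Val>("2nd arg"));
}
```

The wrapper and the core function deliberately have different names in this sketch; giving a forwarding-reference template the same name as the `const Args&` overload would make the template the better match for a non-const `Args` lvalue unless it is constrained.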

Transitionary Usage Examples

In a perfect world where every API has already moved to IntrusivePtr, the above guidelines are all there is to it. However, part of why Zeek uses IntrusivePtr rather than another type of smart pointer is that it allows for an incremental transition. This creates an in-between state of APIs that adds interfacing complexity at the boundary with areas that either haven't been transitioned to IntrusivePtr yet or that we don't expect to transition at all (e.g. possibly generated code, or very low-level code we don't expect to be frequently travelled).

Creating IntrusivePtr Objects from Raw Pointers

There is a special constructor to help:

Val* foo()
    { return new StringVal("hi"); }

void corge(IntrusivePtr<Val> v)
    { /* Does something with "v" */ }

Val* val = foo();
IntrusivePtr v1{AdoptRef{}, val};
corge(std::move(v1));

Here, foo() returned a raw pointer (with +1 ref-count) for us to manage and we create an IntrusivePtr to adopt ownership of that reference (no need to explicitly Unref(val) later on in this example). You can go on to use v1 in any of the typical ways described earlier.

Val* bar()
    { return some_preexisting_val; }

void corge(IntrusivePtr<Val> v)
    { /* Does something with "v" */ }

Val* val = bar();
IntrusivePtr v2{NewRef{}, val};
corge(std::move(v2));

Here, bar() returned a raw pointer (with no ref-count increment for us to manage), but we can still create an IntrusivePtr from it when helpful (i.e. to pass to other APIs that consume IntrusivePtr). The NewRef{} tag takes care of calling Ref() in the constructor to increment the ref-count, and an Unref() will automatically occur whenever that v2 object is destructed (or in this case, the std::move leaves that up to the corge parameter/implementation to take care of).
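
The difference between the two tags shows up directly in the reference count. The following sketch uses reduced stand-ins assumed for illustration (not Zeek's real classes); `Counts` and `demo_adopt_vs_newref` are hypothetical helper names.

```cpp
#include <cassert>
#include <utility>

// Simplified stand-ins for illustration only; NOT Zeek's real classes.
struct Val {
    int ref_cnt = 1;
    virtual ~Val() = default;
};
void Ref(Val* v) { ++v->ref_cnt; }
void Unref(Val* v) { if (v && --v->ref_cnt == 0) delete v; }

struct AdoptRef {};  // take over a +1 the caller already holds
struct NewRef {};    // add a new +1 on top of whatever exists

template <class T>
class IntrusivePtr {
public:
    IntrusivePtr() = default;
    IntrusivePtr(AdoptRef, T* p) : ptr(p) {}                      // no increment
    IntrusivePtr(NewRef, T* p) : ptr(p) { if (ptr) Ref(ptr); }    // increment
    IntrusivePtr(const IntrusivePtr& o) : ptr(o.ptr) { if (ptr) Ref(ptr); }
    IntrusivePtr(IntrusivePtr&& o) noexcept : ptr(std::exchange(o.ptr, nullptr)) {}
    IntrusivePtr& operator=(IntrusivePtr o) noexcept { std::swap(ptr, o.ptr); return *this; }
    ~IntrusivePtr() { Unref(ptr); }
    T* operator->() const { return ptr; }
private:
    T* ptr = nullptr;
};

// Like foo() in the text: returns a new object, caller must manage the +1.
Val* foo() { return new Val; }

struct Counts { int adopted; int newref_inside; int newref_after; };

Counts demo_adopt_vs_newref() {
    int adopted;
    {
        IntrusivePtr<Val> v1{AdoptRef{}, foo()};
        adopted = v1->ref_cnt;      // 1: ownership taken over, no increment
    }  // v1's destructor drops the only reference; the object is deleted

    // Stands in for some_preexisting_val: its owner holds one reference.
    Val* preexisting = new Val;
    int newref_inside;
    {
        IntrusivePtr<Val> v2{NewRef{}, preexisting};
        newref_inside = v2->ref_cnt;  // 2: owner's reference plus ours
    }  // v2's destructor drops our reference only
    int newref_after = preexisting->ref_cnt;  // back to 1

    Unref(preexisting);  // the original owner eventually releases its reference
    return {adopted, newref_inside, newref_after};
}
```

Picking the wrong tag is the main hazard at this boundary: adopting a pointer you were never given a reference to leads to a premature delete, while taking a new reference on a pointer you already own leaks it.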

Passing Raw Pointers

Suppose the old RecordVal::Assign() API is all we have and it hasn't been transitioned to take an IntrusivePtr argument:

void Assign(int field, Val* new_val);

That leaves open the question of how to pass in IntrusivePtr objects we obtain from other APIs that have already been adapted to use IntrusivePtr. An example, similar to the previous one, now looks like:

auto addr_val = make_intrusive<AddrVal>(src_addr);
auto rv = make_intrusive<RecordVal>(conn_id);
rv->Assign(0, addr_val->Ref());
rv->Assign(2, addr_val.release());

The first assignment shows that we can pass in the necessary +1 reference in the usual way by calling Ref(). The second assignment shows that we can also release() the reference held by the IntrusivePtr object itself and pass that along. After releasing, the IntrusivePtr object is effectively a nullptr and must not be dereferenced afterwards.

For functions that take a raw pointer argument and either don't participate in ref-counting or do the necessary Ref() call themselves, simply use IntrusivePtr::get() to access the raw pointer without modifying the ref-count.
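
The three ways of crossing into raw-pointer APIs (Ref(), release() and get()) can be traced in a small sketch. All names here are stand-ins assumed for illustration: `LegacyRecord` mimics an untransitioned API that takes ownership of one reference, `Inspect` mimics one that only borrows, and the reduced `Val`/`IntrusivePtr` are not Zeek's real classes.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Simplified stand-ins for illustration only; NOT Zeek's real classes.
struct Val {
    int ref_cnt = 1;
    virtual ~Val() = default;
    Val* Ref() { ++ref_cnt; return this; }  // mirrors the member Ref() used in the text
};
void Unref(Val* v) { if (v && --v->ref_cnt == 0) delete v; }

struct AdoptRef {};

template <class T>
class IntrusivePtr {
public:
    IntrusivePtr() = default;
    IntrusivePtr(AdoptRef, T* p) : ptr(p) {}
    IntrusivePtr(const IntrusivePtr& o) : ptr(o.ptr) { if (ptr) ptr->Ref(); }
    IntrusivePtr(IntrusivePtr&& o) noexcept : ptr(std::exchange(o.ptr, nullptr)) {}
    IntrusivePtr& operator=(IntrusivePtr o) noexcept { std::swap(ptr, o.ptr); return *this; }
    ~IntrusivePtr() { Unref(ptr); }
    T* get() const { return ptr; }                        // borrow, no ref-count change
    T* release() { return std::exchange(ptr, nullptr); }  // give up our reference
    T* operator->() const { return ptr; }
private:
    T* ptr = nullptr;
};

// Hypothetical untransitioned API: Assign() takes ownership of one reference.
struct LegacyRecord {
    std::vector<Val*> fields;
    explicit LegacyRecord(std::size_t n) : fields(n) {}
    LegacyRecord(const LegacyRecord&) = delete;
    LegacyRecord& operator=(const LegacyRecord&) = delete;
    ~LegacyRecord() { for (Val* v : fields) Unref(v); }
    void Assign(int field, Val* new_val) { Unref(fields[field]); fields[field] = new_val; }
};

// Hypothetical raw-pointer API that only looks at the object.
int Inspect(const Val* v) { return v->ref_cnt; }

struct Counts { int after_ref; int seen_by_get; int after_release; bool released_is_null; };

Counts demo_raw_pointer_apis() {
    IntrusivePtr<Val> addr_val{AdoptRef{}, new Val};  // ref_cnt == 1
    LegacyRecord rec(3);

    rec.Assign(0, addr_val->Ref());             // explicit +1 for the callee: 1 -> 2
    int after_ref = addr_val->ref_cnt;

    int seen_by_get = Inspect(addr_val.get());  // borrow only: stays at 2

    Val* raw = addr_val.get();
    rec.Assign(2, addr_val.release());          // hand over our own reference: stays at 2
    return {after_ref, seen_by_get, raw->ref_cnt, addr_val.get() == nullptr};
}
```

After `release()`, the record holds both references, so the value is deleted when the record is destroyed rather than when `addr_val` goes out of scope.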

Convenience Aliases

To help with code readability and ease of typing, convenience aliases are generally provided for most IntrusivePtr<T> types. Prefer xPtr over IntrusivePtr<x>, for example ValPtr, TypePtr, RecordValPtr, RecordTypePtr, AttrPtr, etc.