The C++ Abstraction Layer
Rahul Iyer edited this page Jul 1, 2015 · 10 revisions
The MADlib C++ Abstraction Layer provides a means for writing platform-independent user-defined functions in C++. It provides a complete abstraction, performs all necessary type checking, and uses the Eigen C++ library to offer an intuitive and clean interface to high-performance linear-algebra functions (LAPACK).
    AnyValue student_t_cdf(AbstractDBInterface &db, AnyValue args) {
        AnyValue::iterator arg(args);

        // Arguments from SQL call
        const int64_t nu = *arg++;
        const double t = *arg;

        /* We want to ensure nu > 0 */
        if (nu <= 0)
            throw std::domain_error("Student-t distribution undefined for "
                "degree of freedom <= 0");

        return studentT_cdf(nu, t);
    }
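The assignments from `*arg` in the example rely on checked implicit conversions. The following is a minimal sketch of that idea, not MADlib's actual implementation: `checked_convert` is a hypothetical helper, limited here to unsigned integer types, that lets widening conversions through and throws when a value would not fit the target type.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <type_traits>

// Hypothetical helper (not part of MADlib): convert between unsigned
// integer types and refuse any conversion that would lose information.
template <typename To, typename From>
To checked_convert(From value) {
    static_assert(std::is_unsigned<To>::value && std::is_unsigned<From>::value,
                  "this sketch only handles unsigned integer types");
    // Both operands are unsigned, so the comparison is done in the wider
    // type and is exact.
    if (value > std::numeric_limits<To>::max())
        throw std::range_error("lossy conversion: value does not fit target type");
    return static_cast<To>(value);
}
```

With this helper, `checked_convert<std::uint64_t>(std::uint32_t{7})` succeeds, while `checked_convert<std::uint32_t>(std::uint64_t{1} << 40)` throws `std::range_error`.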
- Performs type checking of function arguments
- Lossless conversion of pass-by-value arguments is done implicitly (e.g., from `uint32_t` to `uint64_t`)
- Implicit lossy conversion throws an exception (e.g., from `uint64_t` to `uint32_t`)
- Supports pass-by-reference whenever possible (performance!). However, if the user code asks for a mutable object but the database prohibits direct modification, a copy is created automatically.
- Well-behaved/non-hacky user code cannot accidentally corrupt database data
- The only supported means for user code to communicate with the DBMS backend is through the interface provided by `AbstractDBInterface`/`AbstractAllocator` (truly platform-independent!).
- Integration of Armadillo for linear algebra operations (Armadillo itself is a C++ wrapper for LAPACK). This allows for intuitive math notation in C++ code.
- Overloads the global throw/nothrow variants of `operator new` and `operator delete` to use `palloc`/`pfree`
- All memory allocation is funneled through the `PGAllocator` class
- All callbacks into the backend (in particular `palloc`/`pfree`) occur within `PG_TRY`/`PG_CATCH` blocks. This ensures that any PostgreSQL exception raised by `ereport` returns control to the calling C++ function. There we throw a C++ exception, which is caught just above the C/C++ boundary. From there the PostgreSQL exception is rethrown. This procedure ensures that the C++ stack is always unwound properly (otherwise the `longjmp` done by `ereport` would lead to behavior that is undefined by the C++ standard).
- Callbacks into the DBMS backend where no exceptions are permitted (`operator new (std::nothrow)`) deactivate interrupt processing for the duration of their callback (interrupts are, of course, still properly recorded while processing is disabled). This ensures that no database signals get lost. Otherwise, the caller of a failed `operator new (std::nothrow)` could not tell, for example, whether the NULL pointer is due to a SIGINT or to memory exhaustion. The rationale is that with interrupts disabled, the SIGINT is preserved and dealt with at an appropriate later point.
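The error-translation scheme described above can be sketched without PostgreSQL. Everything in this example is a hypothetical stand-in: `backend_alloc` plays the role of `palloc` and reports failure with `longjmp` as `ereport` would, `guarded_alloc` plays the role of a `PG_TRY`/`PG_CATCH` wrapper, and `udf_entry` is the function just above the C/C++ boundary. The key assumption is that the frames skipped by `longjmp` hold only trivially destructible objects, so the jump itself is well defined, and everything above them is unwound by an ordinary C++ exception.

```cpp
#include <csetjmp>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>

// Simulated C backend (hypothetical stand-in for PostgreSQL).
static std::jmp_buf *active_handler = nullptr;  // innermost handler target

// Stand-in for palloc: on failure it does not return but jumps to the
// active handler, much as ereport(ERROR) would.
void *backend_alloc(std::size_t size) {
    const std::size_t kLimit = 1024;  // arbitrary limit for this demo
    void *p = (size <= kLimit) ? std::malloc(size) : nullptr;
    if (p == nullptr)
        std::longjmp(*active_handler, 1);
    return p;
}

// C++ side.
struct BackendError : std::runtime_error {
    BackendError() : std::runtime_error("backend reported an error") {}
};

static int trackers_destroyed = 0;
struct Tracker {  // counts destructor calls to make unwinding observable
    ~Tracker() { ++trackers_destroyed; }
};

// Wraps a single callback. Only trivially destructible locals live in
// this frame, so the longjmp never skips a destructor.
void *guarded_alloc(std::size_t size) {
    std::jmp_buf here;
    std::jmp_buf *outer = active_handler;
    active_handler = &here;
    if (setjmp(here) != 0) {
        // The backend jumped back: restore the handler and switch to
        // C++ exception handling from here on.
        active_handler = outer;
        throw BackendError();
    }
    void *result = backend_alloc(size);
    active_handler = outer;
    return result;
}

// Entry point just above the C/C++ boundary. No C++ exception may
// escape into C code, so it is converted here.
extern "C" int udf_entry(std::size_t size) {
    try {
        Tracker t;  // destroyed on both the normal and the error path
        void *p = guarded_alloc(size);
        std::free(p);
        return 0;
    } catch (const BackendError &) {
        // A real port would re-raise the original database error here.
        return -1;
    }
}
```

Calling `udf_entry(64)` takes the normal path, while `udf_entry(4096)` exceeds the demo limit and takes the error path. In both cases the `Tracker` destructor runs, which is the property the wrapper is meant to guarantee.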