This page documents the coding style for the languages we use.
When documenting bugs, deficiencies, future tasks, or noteworthy things in the
code, we use two keywords that most editors and tools recognize: `FIXME:` and
`TODO:`. We use `FIXME` for a known bug and `TODO` for everything else. The
trailing `:` is important for tooling, such as syntax highlighters. Here are
two examples:

    // FIXME: this currently fails on FreeBSD.
    // FIXME: this algorithm is broken for i < 0.

A typical `TODO` could be:

    # TODO: refactor this code to separate mechanism from policy.
    # TODO: add another argument to process user-defined tags.
- Some projects in the Tenzir organization provide `.editorconfig` files. Please respect the settings defined in these. For many editors, plugins exist to automatically apply EditorConfig files.
- Scripts are executables (`chmod +x path/to/your-script`) and words in their names are separated using dashes (`your-script` over `your_script`).
- The first line of a script should be a shebang, e.g., `#!/bin/sh` or `#!/usr/bin/env python3`.
- The second line is empty.
- Starting at the third line, write a comment detailing usage instructions, and a short and concise description of the script.
- Prefer to use POSIX sh when possible.
- Tenzir uses ShellCheck for linting. Pull request review feedback for shell scripts is partly based on ShellCheck.
- We use Python 3, with no special restrictions for newer features. Specify the minimum required version in the shebang, e.g., `#!/usr/bin/env python3.6`.
- Use black for linting. Black is a heavily opinionated tool for both formatting and linting, and we found its opinion to be a good standard for us to use.
- All web-based projects in the Tenzir organization define style checkers and linters in their respective configuration files, so they are automatically applied.
CMake is the build scaffold of Tenzir.
- Prefer targets and properties over variables.
- Don't use global `include_directories`.
- Export consumable targets to both build and install directories.
- Assign sensible export names for your targets; the `tenzir::` namespace is implicitly prefixed.
- The CMake files are formatted with cmake-format.
Tenzir's core is written in C++. We follow a style based on STL, Google style, and CAF style guidelines.
- Minimize vertical whitespace within functions. Use comments to separate logical code blocks.
- The `const` keyword precedes the type, e.g., `const T&` as opposed to `T const&`.
- `*` and `&` bind to the type, e.g., `T* arg` instead of `T *arg`.
- When declaring variables and functions, provide the storage class specifier (`extern`, `static`, `thread_local`, `mutable`) first, followed by the declaration specifiers in order of `friend`, `inline`, `virtual`, `explicit`, `constexpr`, `consteval`, and `constinit`.
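The declaration rules above can be sketched on a few made-up declarations. The names `limits`, `sum`, and `first_or` are hypothetical and exist only for illustration, and the snippet assumes a C++17 compiler.

```cpp
#include <array>
#include <cstddef>

// All names in this sketch are hypothetical; only the declaration style matters.
struct limits {
  // Storage class specifier (`static`) first, then `inline`, then `constexpr`.
  static inline constexpr auto max_size = std::size_t{3};
};

// `const` precedes the type and `&` binds to it: `const T&`, not `T const &`.
constexpr int sum(const std::array<int, limits::max_size>& xs) {
  auto result = 0;
  // A comment, not a blank line, separates the logical blocks of a function.
  for (auto x : xs) {
    result += x;
  }
  return result;
}

// `*` binds to the type: `const char* x`, not `const char *x`.
constexpr char first_or(const char* x, char fallback) {
  if (x == nullptr || *x == '\0') {
    return fallback;
  }
  return *x;
}
```

All three declarations are `constexpr`, so the style can be checked at compile time, for example with `static_assert(first_or("abc", '?') == 'a');`.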
- Always use `auto` to declare a variable unless you cannot initialize it immediately or if you actually want a type conversion. In the latter case, provide a comment why this conversion is necessary.
- Never use unwrapped, manual resource management such as `new` and `delete`.
- Never use `typedef`; always write `using T = X` in favor of `typedef X T`.
- Keywords are always followed by a whitespace: `if (...)`, `template <...>`, `while (...)`, etc.
- Do not add whitespace when negating an expression with `!`:

      if (!sunny())
        stay_home();

- Opening braces belong on the same line:

      struct foo {
        void f() {
        }
      };

- Use inline functions for trivial code, such as getters/setters or straightforward logic that does not span more than 3 lines.
- Header filenames end in `.hpp` and implementation filenames in `.cpp`.
- All header files should use `#pragma once` to prevent multiple inclusion.
- Don't use `#include` when a forward declaration suffices. It can make sense to outsource forward declarations into a separate file per module. The file name should be `<MODULE>/fwd.hpp`.
- Include order is from high-level to low-level headers, e.g.,

      // iff a matching header exists
      #include "tenzir/matching_header.hpp"

      #include "tenzir/logger.hpp"

      #include <3rd/party.hpp>

      #include <memory>

      #include <sys/types.h>

  `clang-format` is configured to automatically change the include order accordingly. Includes separated by preprocessor directives need to be sorted manually.

  Within each section, the order should be alphabetical. Tenzir includes should always be in double quotes and relative to the source directory, whereas system-wide includes should be in angle brackets. See below for an example on how to structure includes in unit tests.
- As in the standard library, the order of parameters when declaring a function is: inputs, then outputs. API coherence and symmetry trump this rule, e.g., when the first argument of related functions models the same concept.
- Use the order `public`, `protected`, `private` for functions and members in classes.
- Mark single-argument constructors as `explicit` to avoid implicit conversions; use `explicit(false)` to indicate that a non-explicit constructor is intentional.
- The order of member functions within a class is: constructors, operators, mutating members, accessors.
- Friends first: put friend declarations immediately after opening the class.
- Put declarations (and/or definitions) of assignment operators right after the constructors, and all other operators at the bottom of the public section.
- Use structs for state-less classes or when the API is the struct's state.
- Prefer types with value semantics over reference semantics.
- Use the rule of zero or rule of five.
- When providing a move constructor and move-assignment operator, declare them as `noexcept`.
- Use brace-initialization for member construction when possible. Only use parenthesis-initialization to avoid calling a `std::initializer_list` overload.
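As a rough illustration of how the class-related rules above fit together, consider the following sketch. The type `temperature` and the functions `to_celsius` and `warm` are hypothetical and not part of Tenzir; the snippet assumes C++17 and therefore does not show `explicit(false)`, which requires C++20.

```cpp
#include <type_traits>

// Hypothetical example type; none of these names come from Tenzir.
class temperature {
  // Friends first, immediately after opening the class.
  friend constexpr double to_celsius(const temperature& x) {
    return x.kelvin_ - 273.15;
  }

public:
  // `using` instead of `typedef`.
  using value_type = double;

  // Constructors come first. Rule of five: copy, move, and destruction are
  // all declared, and the move operations are `noexcept`.
  temperature() = default;
  temperature(const temperature&) = default;
  temperature(temperature&&) noexcept = default;
  ~temperature() = default;

  // Single-argument constructors are `explicit`; members are brace-initialized.
  explicit constexpr temperature(value_type kelvin) : kelvin_{kelvin} {
  }

  // Assignment operators come right after the constructors.
  temperature& operator=(const temperature&) = default;
  temperature& operator=(temperature&&) noexcept = default;

  // Mutating members.
  constexpr void raise(value_type delta) {
    kelvin_ += delta;
  }

  // Accessors.
  constexpr value_type kelvin() const {
    return kelvin_;
  }

  // All other operators go at the bottom of the public section.
  constexpr bool operator==(const temperature& x) const {
    return kelvin_ == x.kelvin_;
  }

private:
  value_type kelvin_{0.0};
};

// `explicit` rules out the implicit conversion from `double`.
static_assert(!std::is_convertible_v<double, temperature>);
static_assert(std::is_nothrow_move_constructible_v<temperature>);

// `auto` declares the local variable; no `new` or `delete` is involved.
constexpr temperature warm(const temperature& x, double delta) {
  auto result = x;
  result.raise(delta);
  return result;
}
```

Because `temperature` has value semantics, `warm` copies its input and returns the modified copy instead of mutating through a reference.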
- Class names, constants, and function names are lowercase with underscores.
- Template parameter types should be written in CamelCase.
- Types and variables should be nouns, while functions performing an action should be "command" verbs. Getter and setter functions should be nouns. We do not use an explicit `get_` or `set_` prefix. Classes used to implement metaprogramming functions also should use verbs, e.g., `remove_const`.
- All library macros should start with `TENZIR_` to avoid potential clashes with external libraries.
- Names of (i) classes/structs, (ii) functions, and (iii) enums should be lower case and delimited by underscores.
- Put non-API implementation into namespace `detail`.
- Member variables have an underscore (`_`) as suffix, unless they constitute the public interface. Getters and setters use the same member name without the suffix.
- Put static non-const variables in an anonymous namespace.
- Name generic temporary or input variables `x`, `y`, and `z`. If such variables represent a collection of elements, use their plural form `xs`, `ys`, and `zs`.
- Prefix counter variables with `num_`.
- If a function has a return value, use `result` as variable name.
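The naming rules above might look as follows in practice. This is only a sketch: the namespace `example`, the class `accumulator`, and the functions `is_even` and `sum_even` are invented for illustration.

```cpp
#include <array>
#include <cstddef>

namespace example { // hypothetical namespace, not part of Tenzir

namespace detail {

// Non-API helpers live in namespace `detail`.
constexpr bool is_even(int x) {
  return x % 2 == 0;
}

} // namespace detail

// Class names are lowercase with underscores and are nouns.
class accumulator {
public:
  // Functions performing an action are "command" verbs.
  constexpr void add(int x) {
    sum_ += x;
    ++num_values_;
  }

  // Getters are nouns without a `get_` prefix and share the member's name
  // minus the underscore suffix.
  constexpr int sum() const {
    return sum_;
  }

  constexpr std::size_t num_values() const {
    return num_values_;
  }

private:
  // Member variables carry a `_` suffix; counters carry a `num_` prefix.
  int sum_{0};
  std::size_t num_values_{0};
};

// Template parameter types are CamelCase; a collection of inputs is `xs`.
template <class Range>
constexpr int sum_even(const Range& xs) {
  // The variable holding the return value is called `result`.
  auto result = 0;
  for (auto x : xs) {
    if (detail::is_even(x)) {
      result += x;
    }
  }
  return result;
}

} // namespace example
```

A call such as `example::sum_even(std::array<int, 4>{1, 2, 3, 4})` adds up only the even elements, here `2` and `4`.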
- Break constructor initializers after the comma, use two spaces for indentation, and place each initializer on its own line (unless you don't need to break at all):

      my_class::my_class()
        : my_base_class{some_function()},
          greeting_{"Hello there! This is my_class!"},
          some_bool_flag_{false} {
        // ok
      }

      other_class::other_class() : name_{"tommy"}, buddy_{"michael"} {
        // ok
      }

- Break function arguments after the comma for both declaration and invocation:

      a_rather_long_return_type f(const std::string& x,
                                  const std::string& y) {
        // ...
      }

  If that turns out intractable, break directly after the opening parenthesis:

      template <class T>
      black_hole_space_time_warp f(
        const typename T::gravitational_field_manager& manager,
        const typename T::antimatter_cluster& cluster) {
        // ...
      }

- Break template parameters without indentation:

      template <class T>
      auto identity(T x) {
        return x;
      }

- Break trailing return types without indentation if they cannot fit on the same line:

      template <class T>
      auto compute_upper_bound_on_compressed_data(T x)
      -> std::enable_if_t<std::is_integral_v<T>, T> {
        return detail::bound(x);
      }

- Break before binary and ternary operators:

      if (today_is_a_sunny_day()
          && it_is_not_too_hot_to_go_swimming()) {
        // ...
      }
- Use the `typename` keyword only to access dependent types. For general template parameters, use `class` instead:

      template <class T>
      struct foo {
        using type = typename T::type;
      };

- Use `T` for generic, unconstrained template parameters and `x` for generic function arguments. Suffix both with `s` for template parameter packs:

      template <class T, class... Ts>
      auto f(T x, Ts... xs) {
        // ...
      }

- Break `using name = ...` statements always directly after `=` if they do not fit in one line.
- Use one level of indentation per "open" template and place the closing `>`, `>::type` or `>::value` on its own line. For example:

      using optional_result_type =
        typename std::conditional<
          std::is_same<result_type, void>::value,
          bool,
          optional<result_type>
        >::type;

- When dealing with "ordinary" templates, use indentation based on the position of the last opening `<`:

      using type = quite_a_long_template_which_needs_a_break<std::string,
                                                             double>;

- When adding new type traits, also provide `*_t` and/or `*_v` helpers:

      template <class T>
      using my_trait_t = typename my_trait<T>::type;

      template <class T>
      constexpr auto my_trait_v = my_trait<T>::value;
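For a complete picture of the trait helpers, here is a minimal sketch with two invented traits. Both `unwrap_pointer` and `is_byte_sized` are hypothetical names chosen for this example and do not exist in Tenzir.

```cpp
#include <type_traits>

// Hypothetical trait: yields the pointee type for pointers and the type itself
// otherwise. As a metaprogramming function, it is named with a verb.
template <class T>
struct unwrap_pointer {
  using type = T;
};

template <class T>
struct unwrap_pointer<T*> {
  using type = T;
};

// The `*_t` helper needs `typename` because `unwrap_pointer<T>::type` is a
// dependent type. The template parameter itself is introduced with `class`.
template <class T>
using unwrap_pointer_t = typename unwrap_pointer<T>::type;

// Hypothetical trait exposing a `value` member, plus its `*_v` helper.
template <class T>
struct is_byte_sized : std::bool_constant<sizeof(T) == 1> {
};

template <class T>
constexpr auto is_byte_sized_v = is_byte_sized<T>::value;
```

With the helpers in place, call sites shrink from `typename unwrap_pointer<int*>::type` to `unwrap_pointer_t<int*>`.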
- Available log levels are `ERROR`, `WARN`, `INFO`, `VERBOSE`, `DEBUG` and `TRACE`.
- Messages can be sent by using the `TENZIR_<level>` macros.
- Try to restrict usage of the `TENZIR_INFO` message type to the main actors. Info is the chattiest level that most users will see, so understanding these messages should require little or no knowledge of Tenzir's system architecture.
- Use the `TENZIR_TRACE_SCOPE` macro to elicit an additional message at the exit of the current scope. The trace level can be used to create a trace of the call stack with fine-grained control over its depth. Since the size of trace messages can quickly get out of hand, omit trace messages from helper functions and generic algorithm implementations.
- Doxygen comments start with `///`.
- Use Markdown instead of Doxygen formatters.
- Use `@cmd` rather than `\cmd`.
- Document pre- and post-conditions with `@pre` and `@post` (where appropriate).
- Reference other parameters with emphasis:

      /// @param x A number between 0 and 1.
      /// @param y Scales *x* by a constant factor.

- Use `@tparam` to document template parameters.
- For simple getters or obvious functions returning a value, use a one-line `@returns` statement:

      /// @returns The answer.
      int f();

- Use `//` or `/*` and `*/` to define basic comments that should not be swallowed by Doxygen.
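Put together, a documented function could look like the sketch below. The function `scale` is a hypothetical example that reuses the parameter descriptions from above. The `@pre` line documents the expectation but the code does not enforce it.

```cpp
/// Scales a number by a constant factor.
/// @tparam T An arithmetic type.
/// @param x A number between 0 and 1.
/// @param y Scales *x* by a constant factor.
/// @pre `0 <= x && x <= 1`
/// @returns The product of *x* and *y*.
template <class T>
constexpr T scale(T x, T y) {
  // A plain comment like this one is not picked up by Doxygen.
  return x * y;
}
```

The backticks in the `@pre` line are Markdown rather than a Doxygen formatter, and *x* and *y* are referenced with emphasis.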
When integrating 3rd-party code into the code base, use the following scaffold:

    //    _   _____   __________
    //   | | / / _ | / __/_  __/     Visibility
    //   | |/ / __ |_\ \  / /          Across
    //   |___/_/ |_/___/ /_/       Space and Time
    //
    // SPDX-FileCopyrightText: (c) 2022 The Tenzir Contributors
    // SPDX-License-Identifier: BSD-3-Clause
    //
    // This file comes from a 3rd party and has been adapted to fit into the Tenzir
    // code base. Details about the original file:
    //
    // - Repository: https://github.com/Microsoft/GSL
    // - Commit: d6b26b367b294aca43ff2d28c50293886ad1d5d4
    // - Path: GSL/include/gsl/gsl_byte
    // - Author: Microsoft
    // - Copyright: (c) 2015 Microsoft Corporation. All rights reserved.
    // - License: MIT

    (code here)

- Every new feature must come with unit tests.
- The filename and path should mirror the component under test. For example, the component `tenzir/detail/feature.hpp` should have a test file called `test/detail/feature.cpp`.
- The include order in unit tests resembles the order for standard headers, except that unit test includes and the suite definition come at the top.
- Make judicious use of fixtures for prepping your test environment.
- The snippet below illustrates a simple example for a new component `tenzir/foo.hpp` that would go into `test/foo.cpp`.

      //    _   _____   __________
      //   | | / / _ | / __/_  __/     Visibility
      //   | |/ / __ |_\ \  / /          Across
      //   |___/_/ |_/___/ /_/       Space and Time
      //
      // SPDX-FileCopyrightText: (c) 2022 The Tenzir Contributors
      // SPDX-License-Identifier: BSD-3-Clause

      #define SUITE foo

      #include "tenzir/foo.hpp" // Unit under test

      #include "test.hpp" // Unit test framework and scaffolding

      #include <iostream> // standard library includes

      #include <caf/...> // CAF includes

      #include "tenzir/..." // Tenzir includes

      using namespace tenzir;

      namespace {

      struct fixture {
        fixture() {
          // Setup
          context = 42;
        }

        ~fixture() {
          // Teardown
          context = 0;
        }

        int context;
      };

      } // namespace <anonymous>

      FIXTURE_SCOPE(foo_tests, fixture)

      TEST(construction) {
        MESSAGE("default construction");
        foo x;
        MESSAGE("assignment");
        x = 42;
        CHECK_EQUAL(x, context);
      }

      FIXTURE_SCOPE_END()

We use GitHub Actions to build and test each commit. Merging a pull request requires that all checks pass for the latest commit in the branch. GitHub displays the status of the individual checks in the pull request.
The GitHub Actions workflow Analysis contains a Code Coverage job that runs unit tests for libtenzir and bundled plugins, and integration tests for Tenzir with bundled plugins to create a detailed line coverage report. The CI creates and uploads reports as an artifact in the Analysis workflow as part of every pull request and for every merge to master.
In addition to the local report, the workflow uploads the coverage report to Codecov, which offers a visual interface for seeing coverage changes introduced by code changes:
Each block represents a single file in the project. The size and color of each block represent the number of statements and the coverage, respectively.
Codecov also offers sunburst and icicle graphs, which visualize the same data in a different way.
To generate a coverage report locally, create a new Debug build of Tenzir with
the CMake option `-D TENZIR_ENABLE_CODE_COVERAGE=ON` and run the `ccov` build
target. This creates a coverage report in `<path/to/build-dir>/ccov`.