Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add prometheus-cpp, drop legacy telemetry API #402

Closed
wants to merge 8 commits into from

Conversation

Neverlord
Copy link
Member

@Neverlord Neverlord commented May 3, 2024

Migrate to the prometheus-cpp API for compatibility with Zeek.

Closes #400.

@Neverlord Neverlord force-pushed the topic/neverlord/prometheus-cpp branch 3 times, most recently from 2697c3a to efad2b6 Compare May 6, 2024 14:56
@Neverlord
Copy link
Member Author

@timwoj, @ckreibich: aside from getting CI green:

I sat down with prometheus-cpp to figure out how to best integrate the build. The way it has been integrated in Zeek has several drawbacks that I'd like to address. On the one hand, Zeek still relies on manipulating global CMake state and global variables. Just reverting the CMake build flags variable did the trick for stopping the embedded civetweb from printing debug information to console when building Zeek/Broker as a debug version. However, that only takes care of a fraction of things that can affect the prometheus-cpp build and it's very fragile. On the other hand, prometheus-cpp also isn't built for embedding it as a sub-module in CMake. It manipulates global state as well and using the defined targets without some CMake hackery will result in Zeek/Broker installing prometheus-cpp alongside our own targets. That something that we must not do, since it can easily break system installations. Several distributions have prometheus-cpp packages.

Finally, package maintainers should be able to tell Zeek/Broker to use the system dependency for prometheus-cpp if there's a package available. Just using a prometheus-cpp packages is the cleanest solution. However, so far Zeek/Broker are standalone and I think there are no plans to rely on a package manager (like Conan).

Long story short, I sat down last week to experiment with a couple ways to integrate prometheus-cpp such that it's also not just a one-off solution. I have written a small ZeekBundle.cmake script that I would propose for integrating external libraries such as prometheus-cpp in Zeek and Broker.

In Broker, this is how we currently call it:

ZeekBundle_Add(
  NAME prometheus-cpp
  FETCH
    GIT_REPOSITORY https://github.com/jupp0r/prometheus-cpp.git
    GIT_TAG        v1.2.4
  CONFIGURE
    ENABLE_PUSH=OFF
    ENABLE_TESTING=OFF
    CMAKE_POSITION_INDEPENDENT_CODE=ON)

It builds on top of FetchContent. Since we have a Git sub-module for it in Zeek, the syntax there would be:

ZeekBundle_Add(
  NAME prometheus-cpp
  FETCH
    SOURCE_DIR auxil/prometheus-cpp
  CONFIGURE
    ENABLE_PUSH=OFF
    ENABLE_TESTING=OFF
    CMAKE_POSITION_INDEPENDENT_CODE=ON)

When using this method, CMake will build the bundled library at configure time and independently. In other words, CMake will launch a separate cmake process, build the thing and install it into a custom prefix in the build tree. From there, we pick it up as we would pick up any other dependency via find_package. We will build the dependency as a static library so that we won't have a run-time dependency.

In addition, you can use the usual CMake method to have it pick up a library version from the system instead by setting the CMake variable prometheus-cpp_ROOT.

I'm happy to discuss this approach with you guys and iterate on it if you find some shortcomings (unless you have better ideas, of course). 🙂

@Neverlord Neverlord force-pushed the topic/neverlord/prometheus-cpp branch from efad2b6 to 8a35d58 Compare May 6, 2024 15:33
@timwoj
Copy link
Member

timwoj commented May 6, 2024

I definitely like the idea, especially if we can use it for other submodules as well. I've never really liked how we include various subprojects in Zeek via CMake, and making how they're handled done in a common way would be a big improvement. I'm torn on the idea of things being built during configure time since it will make that drag on a bit longer, but it's probably not a big deal.

without some CMake hackery will result in Zeek/Broker installing prometheus-cpp alongside our own targets

Ideally all of our internal builds are linked statically, which the prometheus-cpp libraries should already be. I double-checked with the existing setup. Only static libraries are being generated, and they're not getting installed as part of make install.

@Neverlord Neverlord force-pushed the topic/neverlord/prometheus-cpp branch 2 times, most recently from 033d185 to 535a483 Compare May 9, 2024 13:03
@Neverlord Neverlord force-pushed the topic/neverlord/prometheus-cpp branch 3 times, most recently from 213f026 to 0f7be9f Compare May 12, 2024 13:06
@Neverlord
Copy link
Member Author

All right, I've sat down with CMake some more to get the ZeekBundle work on Windows (cannot mix debug and release libraries) and with multi-configuration generators. I've tested it on MacOS, Linux (Fedora) and Windows (using ninja as we do on CI as well as using the default CMake generator that will emit a VisualStudio project with multi-configuration mode). Seems to work quite nicely now.

@timwoj feel free to play around with it and to integrate it into your telemetry rework branch. You can pass a shared pointer to the Prometheus registry to the broker::endpoint to have it feed into the same registry as Zeek.

Ideally all of our internal builds are linked statically, which the prometheus-cpp libraries should already be. I double-checked with the existing setup. Only static libraries are being generated, and they're not getting installed as part of make install.

It's simpler for binary targets like Zeek. All you need to do is passing EXCLUDE_FROM_ALL to add_subdirectory to suppress the install targets. However, Broker exports library targets for installation. Just pulling in prometheus-cpp via add_subdirectory will have prometheus-cpp::core point to an actual library target that isn't marked for export. CMake won't like that and will run into this error (CMake 3.29.1 on macOS):

CMake Error in auxil/broker/CMakeLists.txt:
  export called with target "broker-prometheus-cpp" which requires target
  "core" that is not in any export set.


CMake Error in auxil/broker/CMakeLists.txt:
  export called with target "broker-prometheus-cpp" which requires target
  "pull" that is not in any export set.

That's what I meant with being cautious and I can't get Zeek to build when using your topic/timw/prometheus-cpp-3 branch in Zeek with the prometheus-cpp version of Broker. There are probably some ways around this, but it's probably going to be hacky (and fragile). At least to my knowledge there's no clean way to tell CMake to "ignore" a transitive dependency in an export set. Happy to be taught otherwise, though. In any case, I'd rather find a solution that works with CMake instead of fighting against it and I think isolating the build of 3rd-party dependency is a big plus as well.

This is the patch I've used on your branch to use the ZeekBundle approach and get it building with prometheus-cpp. Of course, ZeekBundle.cmake would need to move to our cmake sub-module if we decide to use it.

$ git diff
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 60de8b05f..e1a95a2a8 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -892,6 +892,21 @@ if (MSVC)
 endif ()
 set(zeekdeps ${zeekdeps} paraglob)

+
+include(auxil/broker/ZeekBundle.cmake)
+ZeekBundle_Add(
+  NAME prometheus-cpp
+  FETCH
+    SOURCE_DIR "${PROJECT_SOURCE_DIR}/auxil/prometheus-cpp"
+  CONFIGURE
+    ENABLE_COMPRESSION=OFF
+    ENABLE_PUSH=OFF)
+target_link_libraries(
+  zeek_internal
+  INTERFACE
+    $<BUILD_INTERFACE:prometheus-cpp::core>
+    $<BUILD_INTERFACE:prometheus-cpp::pull>)
+
 # Note: Broker gets some special attention in ZeekConfig.cmake.in.
 if (Broker_ROOT)
     find_package(Broker REQUIRED CONFIG)
@@ -1124,7 +1139,6 @@ include(GetArchitecture)
 include(FindCAres)
 include(FindKqueue)

-include(FindPrometheusCpp)
 include_directories(BEFORE "auxil/out_ptr/include")

 if ((OPENSSL_VERSION VERSION_EQUAL "1.1.0") OR (OPENSSL_VERSION VERSION_GREATER "1.1.0"))
diff --git a/auxil/broker b/auxil/broker
index 48660b74a..57a953683 160000
--- a/auxil/broker
+++ b/auxil/broker
@@ -1 +1 @@
-Subproject commit 48660b74ae2e5b5babd6f0dc38f3c85e24434d77
+Subproject commit 57a953683c3569223166c8023abc7e41284a2969

ZeekBundle.cmake Outdated

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FetchContent is great, but unfortunately not the way things are done in Zeek. Instead we want a checkout which contain all required sources after a recursive clone, so prometheus-cpp should instead use a submodule.

Even if we were using FetchContent IMO the whole approach of ZeekBundle feels like trying it to hard to me. We probably wouldn't want to build a custom framework for building at configure time (these already exist), but should instead just use FetchContent_MakeAvailable.

I would be happy to advocate for FetchContent again (though I know that people have strong opinions around this) since it removes a whole class of errors ("did you run git submodule update --init --recursive before trying to build?"), but maybe it'd be better to leave this out of this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there are some misconceptions what this actually does. Ultimately, there are basically two (common) ways to integrate a library. Either building it alongside our own targets using add_subdirectory or consume it as an external dependency via find_package.

The add_subdirectory integration is originally meant to build integrated parts of a bigger thing, not as a means to consume 3rd-party dependencies. Using it in this way means to pull in CMake code into your project without having control over it. In this instance, I ran into the issues I've tried to explain in a previous post here.

This script builds a 3rd-party library separately, so that we can use find_package to consume the library as a proper external dependency. We also respect the <name>_ROOT CMake variable to have users pick up an existing installation in the system for a 3rd-party dependency <name>. This allows package maintainers to "un-bundle" libraries from Zeek that can/should be packaged on their own.

As for how to fetch the library: in Zeek, we already have the submodule and we can point the script to the directory location on disk:

+ZeekBundle_Add(
+  NAME prometheus-cpp
+  FETCH
+    SOURCE_DIR "${PROJECT_SOURCE_DIR}/auxil/prometheus-cpp"
+  CONFIGURE
+    ENABLE_COMPRESSION=OFF
+    ENABLE_PUSH=OFF)

If we would have the same submodule in Broker again, we would pull the repository twice for each checkout of the Zeek repository, which is unnecessary since we will only use one copy. Usually, Broker is integrated as submodule in Zeek. Hence, I've shielded the ZeekBundle call like this in a prior commit:

# Bundle prometheus-cpp if not building as subproject. Otherwise, the parent
# project is responsible for adding the prometheus-cpp targets.
if (NOT broker_is_subproject)
  ZeekBundle_Add(
    NAME prometheus-cpp
    FETCH
      GIT_REPOSITORY https://github.com/jupp0r/prometheus-cpp.git
      GIT_TAG        v1.2.4
    CONFIGURE
      ENABLE_COMPRESSION=OFF
      ENABLE_PUSH=OFF)
elseif (NOT TARGET prometheus-cpp::core OR NOT TARGET prometheus-cpp::pull)
  message(FATAL_ERROR "building Broker as sub-project requires prometheus-cpp targets")
endif ()

This means we will have FetchContent go get the sources externally only if you build Broker on its own. That's essentially Broker CI and developers working on Broker locally.

In my humble opinion, I would much prefer it to just state our external dependencies and simply use find_package. The project decision is to try and minimize build dependencies, though. It's always going to be a tradeoff between how easy it is to set something up versus how complex the build scaffolding becomes.

This script tries to provide a re-usable way to ship 3rd-party dependencies alongside our own sources while still letting us consume them as proper external dependencies via find_package. If we ever decide on using a package manager like vcpkg or conan, this script becomes obsolete immediately.

So do I like this script? No, I would rather not have it in the code base. But given the project constraints, I think it provides a reasonable tradeoff by letting us consume dependencies as actual CMake dependencies while still shipping code of required 3rd-party libraries alongside Zeek. At least to the extend of my knowledge. I am very happy to learn better ways.

Comment on lines +228 to +239
# Bundle all prometheus-cpp libraries that we need as single interface library.
add_library(broker-prometheus-cpp INTERFACE)
target_link_libraries(
broker-prometheus-cpp
INTERFACE
$<BUILD_INTERFACE:prometheus-cpp::core>
$<BUILD_INTERFACE:prometheus-cpp::pull>)
# Prometheus-cpp sets C++ 11 via CMake property, but we need C++ 17.
target_compile_features(broker-prometheus-cpp INTERFACE cxx_std_17)
# Must be exported to keep CMake happy, even though we only need it internally.
install(TARGETS broker-prometheus-cpp
EXPORT BrokerTargets
DESTINATION ${CMAKE_INSTALL_LIBDIR})

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't we directly depend on the prometheus libraries in consumers, is this to set up CXX_STANDARD? I wonder whether that concern wouldn't just go away if we directly consumed the prometheus libraries and built prometheus as a proper dependency (instead of the custom FetchContent stack). Another concern with this catch-all library is that consumers might pull in more than they actually need.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, simply dependending on the prometheus targets directly would be preferable. I've added this as a workaround because Broker won't compile otherwise.

The prometheus-cpp targets set the property cxx_std_11. On our end, we use this script that will set CMAKE_CXX_FLAGS (I know...).

So what happens is that our targets will have the cxx flags applied to them. Then, CMake sees that the target "inherits" the cxx_std_11 property and kindly adds something like -std=c++11. The end result is that CMake will pass -std=c++17 and -std=c++11 to the compiler. The compilers will use whatever comes last, so it tries to build Broker in C++11 mode, which will of course fail.

This extra target exists to pull in the prometheus targets but override the cxx_std_11 property with cxx_std_17 so that CMake will not select an old C++ version.

Instead of using this interface library, setting the cxx_std_17 property on the three targets that pull in the prometheus library would probably work as well. I'm also thinking of kicking out the RequireCXX17.cmake script, but that should probably be a followup.

libbroker/broker/internal/clone_actor.cc Outdated Show resolved Hide resolved
@@ -132,7 +132,8 @@ core_actor_state::metrics_t::metrics_t(caf::actor_system& sys) {
message_metric_sets[5].assign(proc.pong, buf.pong);
}

core_actor_state::core_actor_state(caf::event_based_actor* self,
core_actor_state::core_actor_state(caf::event_based_actor* self, //

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
core_actor_state::core_actor_state(caf::event_based_actor* self, //
core_actor_state::core_actor_state(caf::event_based_actor* self,

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use comments like these in a couple places to influence clang-format behavior, i.e., to force a line break here and have it formatted consistently like this:

core_actor_state::core_actor_state(caf::event_based_actor* self, //
                                   prometheus_registry_ptr reg,
                                   endpoint_id this_peer,
                                   filter_type initial_filter,
                                   endpoint::clock* clock,
                                   const domain_options* adaptation,
                                   connector_ptr conn)
  : self(self),

Otherwise, clang-format will try to put things into fewer lines and break like this, even it makes things harder to read:

core_actor_state::core_actor_state(
  caf::event_based_actor* self, prometheus_registry_ptr reg,
  endpoint_id this_peer, filter_type initial_filter, endpoint::clock* clock,
  const domain_options* adaptation, connector_ptr conn)
  : self(self),

There are probably ways to consistently format it like the former by tweaking clang-format penalty parameters or maybe there is an actual setting to prefer consistent formatting over saving source lines. Didn't seem worth the time, though, since empty comments can fix the few places where clang-format makes weird decisions.

libbroker/broker/internal/core_actor.cc Outdated Show resolved Hide resolved
std::string&& store_name, caf::actor&& core,
consumer_resource<command_message> in_res,
producer_resource<command_message> out_res) {
BROKER_ASSERT(clock != nullptr);
this->registry = reg;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for this since there's no name to disambiguate. Also let's move.

Suggested change
this->registry = reg;
registry = std::move(reg);

Additionally we should probably create a default value if the caller passed an empty ptr; that way we can safely deref registry elsewhere.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider that the entire block is prefixed with this->? Having this one member assignment not use this would break the symmetry and the member variables would no longer align vertically, which I think improves the readability here.

libbroker/broker/internal/store_actor.hh Outdated Show resolved Hide resolved
libbroker/broker/internal/store_actor.hh Outdated Show resolved Hide resolved
libbroker/broker/internal/store_actor.hh Outdated Show resolved Hide resolved
Comment on lines +451 to +458
// Spin up a Prometheus exposer if configured.
if (auto port = caf::get_as<broker::port>(cfg, "broker.metrics.port")) {
auto ptask = std::make_unique<prometheus_http_task>(sys);
auto addr = caf::get_or(cfg, "broker.metrics.address", std::string{});
if (auto actual_port =
ptask->start(port->number(), native(core_),
addr.empty() ? nullptr : addr.c_str())) {
BROKER_INFO("expose metrics on port" << *actual_port);
telemetry_exporter_ = facade(ptask->telemetry_exporter());
background_tasks_.emplace_back(std::move(ptask));
} else {
BROKER_ERROR("failed to expose metrics:" << actual_port.error());
}
} else {
using exporter_t = internal::metric_exporter_actor;
auto params = internal::metric_exporter_params::from(cfg);
auto hdl = sys.spawn<exporter_t>(native(core_), std::move(params));
telemetry_exporter_ = facade(hdl);
auto str = caf::get_or(cfg, "broker.metrics.address", ""s);
if (!str.empty())
str += ':';
str += std::to_string(port->number());
BROKER_INFO("expose metrics on" << str);
exposer_ = std::make_unique<prometheus::Exposer>(str);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it actually correct to pass the port as :PORT here? While they call the parameter "bind_address" the actual consumer of this value seems to expect just a plain port (or a list of ports).

If I make the following edit tests seem to behave identically, so we should probably add something exercising this, especially since most of our work is just glueing things together (we'd want to make sure that actually works):

  // DEITY, PLEASE GIVE ME THE STRENGTH TO ADD NEWLINES WHERE THEY MAKE SENSE.

  if (auto port = caf::get_as<broker::port>(cfg, "broker.metrics.port")) {
    BROKER_INFO("expose metrics on" << port->number());
    exposer_ = std::make_unique<prometheus::Exposer>(std::to_string(port->number()));
  }

In any case, it seems interesting that at least in the way we use their API there is no way to specify and actual endpoint to listen on (e.g., if we have multiple addresses).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bbannier
Copy link

Hi @Neverlord, this currently does not configure against Zeek's master since it does not export the prometheus targets (see #402 (comment)); could you adjust the PR for that? I am still not a fan for sneaking FetchContent into a Zeek project (#402 (comment)), but wouldn't block this as long as it is only for broker standalone builds. Anything else you have in mind before this can graduate out of draft status?

@Neverlord
Copy link
Member Author

@ckreibich the CI failures have nothing to do with the changes. PR-wise, this is done now. The Zeek-side PR is also green now.

@bbannier
Copy link

bbannier commented Jun 28, 2024

@ckreibich the CI failures have nothing to do with the changes. PR-wise, this is done now. The Zeek-side PR is also green now.

I retriggered the image creation for the fedora job and it now succeeds. For freebsd13 you need 25ec1e8 which should be on the master branch, can you rebase (would also resolved the merge conflicts)?

The prometheus-cpp library does not really play nice as a sub-module.
For one, the embedded civetweb prints to console when using build type
debug. We also must make sure to not install prometheus-cpp alongside
Zeek/Broker. Furthermore, prometheus-cpp has packages in several Linux
distributions and package maintainers should be able to have Zeek/Broker
pick up prometheus-cpp as system dependency.

To better separate prometheus-cpp from our targets, we introduce a small
CMake abstraction that will build the library separately to isolate it
from other targets. Then, we use `find_package` to pick up the (static)
library from our build directory. Further, by setting
`prometheus-cpp_ROOT`, users can cause CMake to pick up an existing
installation of the library.
Add a new function pair `checked` and `checked_deref` to the `internal`
namespace that check whether a nullable type (such as `std::shared_ptr`)
contains a value before forwarding or using it. Both functions throw an
exception with a customizable error message if the check fails.

We use the new functions to make sure member variables are initialized
using a non-null value or to make sure a registry pointer is valid
before dereferencing it.
@Neverlord Neverlord force-pushed the topic/neverlord/prometheus-cpp branch from 0a98379 to a759124 Compare June 28, 2024 15:35
@Neverlord
Copy link
Member Author

@bbannier thanks! Rebased and updated the Zeek-side PR as well.

@ckreibich
Copy link
Member

I'm closing this as it was superseded by #418, thanks Dominik!

@ckreibich ckreibich closed this Jul 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Convert Broker to use prometheus-cpp registry from Zeek
4 participants