New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
build: detect inline
keyword
#13355
build: detect inline
keyword
#13355
Conversation
73ff684
to
f6086a6
Compare
The configure script already checks for |
inline
keyword in libcurl/curlinline
keyword
This is exactly what is implemented in this PR. |
Except:
... which I think you should drop. We already use |
What is the advantage of the CMake-level check compared to the detection done in C? With this patch, curl would have 4 |
I propose to move the inline feature check to C, by using this snippet: #ifndef CURL_INLINE
#if defined(__GNUC__) && (__GNUC__ >= 3)
#define CURL_INLINE __inline__
#elif defined(_MSC_VER) && (_MSC_VER >= 1400)
#define CURL_INLINE __inline
#elif (defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L)) || \
(defined(__cplusplus) && (__cplusplus >= 199711L))
#define CURL_INLINE inline
#else
#define CURL_INLINE
#endif
#endif This covers the vast majority of compilers without the complexity (and build performance drop) of this PR. According to https://github.com/nemequ/hedley/blob/8fb0604a8095f6c907378cc3f0391520ae843f6f/hedley.h#L1528-L1553 (logic also re-used by Brotli) there exists a few further exotic compilers, but I suggest explicitly adding them when curl actually supports those, and if a specific need arises. It needs to adapt this --- curl_sha512_256.c-ORI
+++ curl_sha512_256.c
@@ -338,7 +338,7 @@
/* Defined as a function. The macro version may duplicate the binary code
* size as each argument is used twice, so if any calculation is used
* as an argument, the calculation could be done twice. */
-static MHDX_INLINE curl_uint64_t
+static CURL_INLINE curl_uint64_t
MHDx_rotr64(curl_uint64_t value, unsigned int bits)
{
bits %= 64; And to handle two previous uses in --- easy_lock.h-ORI
+++ easy_lock.h
@@ -69,7 +69,7 @@
#endif
-static inline void curl_simple_lock_lock(curl_simple_lock *lock)
+static CURL_INLINE void curl_simple_lock_lock(curl_simple_lock *lock)
{
for(;;) {
if(!atomic_exchange_explicit(lock, true, memory_order_acquire))
@@ -88,7 +88,7 @@
}
}
-static inline void curl_simple_lock_unlock(curl_simple_lock *lock)
+static CURL_INLINE void curl_simple_lock_unlock(curl_simple_lock *lock)
{
atomic_store_explicit(lock, false, memory_order_release);
} |
Right. The only It looks like libcurl and curl still could be built without autotools and cmake. See
It's easy: C macro detection is limited and imprecise, to be on the safe side it prefers to drop CMake detection always gives the precise result. Moreover, when standard
Just three detections: CMake and autotools (the main build systems), |
Is there an evidence that the "more precise" detection in CMake/autotools has actual benefits that justifies the complexity and build-performance hit it causes? I'd like learn which common case is missed by the proposed alternative in #13355 (comment). Once we know those, they are fairly simple to add.
Above proposal fixes that too by replacing that Unless we're talking about a genuinly old or exotic compiler hitting that |
@vszakats If curl would care about the most common compilers, then it would be no need for any checks for If you want to simplify the solution, then just use #undef inline
#define inline /* empty */ The would cover 100% of all cases, exotic or not. The current proposed solution just balances autotools and cmake builds, which is desired, as I understand (at least there is a special CI runner that compare results for autotools and cmake, however it is pretty limited as the only one compiler and configuration is checked) and adds the fallback path for other used build systems. |
You didn't answer the question of what is the benefit of going overboard with detecting a widely supported compiler feature using 4 complex methods, of which 2 are slow, and 2 are almost exact copies of each other, also most likely unnecessary if the CMake and autotools methods are 100% perfect as you say. This, over a simple, and easy to customize method that covers likely 99.99% of systems out there (..and 100% with customization). Also noting that this compiler feature is used ONCE in the whole codebase (only for non-OpenSSL builds.) (The two pre-existing bare uses in To put it differently, why is curl needing this elaborate detection dance, when Brotli is happy with pure preprocessor detection, while also using |
f6086a6
to
a756591
Compare
You didn't read my answers as all the points have been already answered. I'll repeat for clarity:
I do not see any single "complex" detection method here. All of them are quite simple, readable and some have comments for additional clarity. Summary: cmake and autotools now matches each other. When they are not used, the code is still compliable regardless of Regarding "slow" detections. autotools detection is already here, for a long time. Cmake detection "slow down" the build for less than a second, while result matched autotools finally.
It is not possible (or at least not productive) to support all compilers just with macros.
One more time: autotools detection is used for more than 16 years (since 12-06-2007), cmake detection is now up to par. Plus the fallback path, with weaker detection by macros when both are not used (actually just moved from C file to the header). I can't tell for other projects, but I can name hundreds (if not thousands) projects that use For limited usage of |
inline is generally of questionable value as it second guesses what the compiler can do, and should therefore be used sparsely and only if there is a proven/tested obvious benefit. I still think this PR is overdoing it. I think the fix should be done without having to touch C code. Ie fix it for cmake, leave everything else. |
a756591
to
e522cfc
Compare
@bagder done, with I can remove it as well to further reduce the size of the PR as code will be OK, but probably suboptimal. |
|
e522cfc
to
f294110
Compare
@bagder I've split the commits. The first commit is pure cmake. |
Test failures seems to be unrelated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we make the build setup deal with inline
by itself, source code can assume it works. We already have code in libcurl that does this (in easy_lock.h
) so adding a check here seems inconsistent/unnecessary.
f294110
to
4d13f17
Compare
@bagder Done, with further reducing the code in the |
As I pointed out previously, curl still supports a few other build systems. Such build systems have no checks for the The PR was updated as requested by @bagder. |
I believe that if detection can be reasonably made in C, we should not move and scatter/duplicate detection logic to the build level. Doing the latter has these effects:
I don't agree with merging this PR in its current form. edit: I think curl should strive to keep portable C89 code and avoid doing low-level magic with hand-optimized functions, like a crypto-backend would do. When e.g. crypto perf is critical curl does best to rely on existing libraries squeezing out max performance where necessary, for those who need that, and where it fixes real bottlenecks backed by measurements. Curl already uses SHA-512/256 from OpenSSL (and forks), covering a significant portion of builds. If performance here is a critical bottleneck, adding support for more backends would seem a better long term solution. (wolfSSL supports it too) |
Your numbers and assumptions are obviously wrong.
Stable numbers, best 5 out of 10 tries (the other five are 0.2-0.6 seconds longer, randomly). I also tried
The "slow down" is around 0.03 seconds. Do you really think this critical slow down needs to be optimised? Added: numbers were taken on the laptop with 25W CPU TDP. Desktops with proper cooling may have better numbers.
I think it's obvious that a two-lines file without any includes (the test file) is compiled much faster than a full additional C file for curl or lib.
Could you please explain, how the code used in some other project proves some particular needs or absence of the particular needs for this project? Added: Could you please stop ignoring other's arguments? It was answered two times before: here and here |
@bagder If these 0.03 seconds / 0.5 seconds delays are important, and you still think that curl must use C-only detection, I'll modify this PR with removal of autotools detection, the matching lines in cmake header template and other related things. Note: the autotools builds will be modified (by removing checks added 16 years ago), and the code will rely on C-only code. |
I prefer the C solution for these reasons:
|
4d13f17
to
9e63cc5
Compare
@bagder Reworked again, as you requested. |
9e63cc5
to
c259772
Compare
Rebased. |
Regarding the CMake/autotools detection overhead: This hits Windows the most and macOS next. Linux is generally (much) faster, even when cross-compiling. After dealing with these for nearly 2 years, my numbers for sure are not "obviously wrong". They can be seen in CI runs, or any non-bleeding edge, non-Linux, local machine. Probably more with cross-builds and/or a GNU toolchain. Either way, thanks for your updates. Aside from a minor update request, it LGTM. |
c259772
to
2cac423
Compare
inline
keywordinline
keyword
After dealing with it for nearly 16 years, I can absolutely confirm that MSys2/Cygwin is the slowest detection, especially with autotools that are designed to make a lot of short compiler shots. macOS is much closer to Linux and *BSD in terms of configuration phase length. The key message was that typically just a single form of
Corrected as requested. |
Please wait for CI to complete. Previous try had VS2022 (three different runners) failed on the same test (345), which is suspicious. |
Test 345 has been fixed (#13458). It should be gone if you rebase on master.
CMake does the same under the hood. With somewhat fewer tests in our case. "generate" takes 40 seconds on my macOS machine. (58 seconds with autotools.) FWIW 2 years was solely dealing with curl CMake builds. |
This C-header only implementation keeps failing on |
OK. |
Thanks! |
As I discovered during work on #12897, libcurl does not support use of
inline
keyword because libcurl can be built with unusual combination of old compiler and cmake.When built by old compiler and autotools (configure),
inline
macro is defined to a suitable replacement (or empty string) if need.The missing part is cmake checks for
inline
support.This PR adds support for cmake detection of
inline
.As a fallback path,
inline
compiler-time detection is implemented incurl_setup.h
, so even if build system does not check for inline support for any reason the header will substitute a suitable replacement.With this PR merged, libcurl and curl code can freely use
inline
keyword.