[Benchmark] 4. C/C++ benchmark code is not fair #4

Open
ujos opened this issue Aug 27, 2022 · 4 comments

Comments

@ujos
Contributor

ujos commented Aug 27, 2022

The C++ code of the benchmark in "Why learn C++ if I know Python (Toy Example)" is not fair. Because the code operates on local variables that have no aliases, and a is a regular C array, the C++ compiler can remove the for() loop completely and just set s to a compile-time value.
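To make the concern concrete, here is a minimal sketch of the kind of loop being discussed. The names (kN, sum_local_array, sum_runtime_array), the array size, and the fill pattern are assumptions for illustration, not the repository's actual benchmark code. The first function uses only local, compile-time-known data, so an optimizer is permitted to precompute its result; the second reads its starting value through a volatile object, so the data is unknown at compile time.

```cpp
#include <cstddef>

// Hypothetical stand-in for the benchmark loop; kN and the fill pattern are
// assumptions, not the repository's actual code.
constexpr std::size_t kN = 1000;

// Everything here is local and known at compile time, so an optimizing
// compiler may evaluate the loops itself and return a precomputed constant.
double sum_local_array() {
    int a[kN];
    for (std::size_t i = 0; i < kN; ++i) a[i] = static_cast<int>(i);
    double s = 0.0;
    for (std::size_t i = 0; i < kN; ++i) s += a[i];
    return s;
}

// The starting value is read through a volatile object, so the array
// contents are not known at compile time and the sum cannot be replaced by
// a constant (the compiler may still transform the loops in other ways).
double sum_runtime_array(int start) {
    volatile int runtime_start = start;
    const int base = runtime_start;
    int a[kN];
    for (std::size_t i = 0; i < kN; ++i) a[i] = base + static_cast<int>(i);
    double s = 0.0;
    for (std::size_t i = 0; i < kN; ++i) s += a[i];
    return s;
}
```

Whether a given compiler actually folds the first function depends on the compiler version and flags; the point is only that the language allows it.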

@ujos ujos changed the title [Benchmark] 4. C/C++ implementation [Benchmark] 4. C/C++ benchmark code is not fair Aug 27, 2022
@burlachenkok
Owner

Thanks! There are two things...

  1. On https://godbolt.org/, using gcc-7.5.0 with flags -O3 -Wall --std=c++11 for the code snippet "4. C/C++ benchmark", the compiler:
  • Preserves the integer-to-double conversion with CVTSI2SD (on x86_64).
  • But the optimizer really does get rid of the stack-allocated flat C array in the final binary.

So you're correct: it's not fair, at least by 50%, because e.g. Python does not have compiler optimization mechanisms.

  2. At the same time, there are various optimization tricks that compilers can do:
    (https://ocw.mit.edu/courses/6-172-performance-engineering-of-software-systems-fall-2018/resources/mit6_172f18_lec9/)

The absence of a compiler in your programming environment is your problem.

====

In that particular piece of code, the compiler did the following:

  • Replaced loaded values with registers.
  • Performed all increments in registers.
  • Removed the dead code that used the local stack variable (a).
    So in that execution we used compiler optimization.

Conclusion:
I think it would be nice to demonstrate the speed with both "-O0" and "-O3" and elaborate on why there is a difference. It is also worth highlighting that there is a point of view that "the benchmark is not fair" because C++ uses compiler optimization tricks.
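One possible shape for such a comparison is sketched below. The helper names (sum_as_double, measure_ms) and the workload are hypothetical, not taken from the repository; the idea is only that the same source file is built twice, once with -O0 and once with -O3, and the measured times are compared.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical workload: sums n integers as doubles. The data lives on the
// heap and n is a runtime argument.
double sum_as_double(std::size_t n) {
    std::vector<int> a(n);
    for (std::size_t i = 0; i < n; ++i) a[i] = static_cast<int>(i % 100);
    double s = 0.0;
    for (int v : a) s += v;
    return s;
}

// Runs f once and returns the elapsed wall-clock time in milliseconds.
// The value returned by f is stored in *out so the caller can use it.
template <typename F>
double measure_ms(F&& f, double* out) {
    const auto t0 = std::chrono::steady_clock::now();
    *out = f();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Calling measure_ms from main and printing both the elapsed time and the computed sum keeps the result observable, which makes it harder for the optimizer to discard the work being timed.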

@ujos
Contributor Author

ujos commented Aug 27, 2022

At least it is worth noting that C++ can remove a[] from the binary, because it can :)

@ujos
Contributor Author

ujos commented Aug 30, 2022

I tried to compile that sample without optimization using MSVC++. The application fails to start because it cannot allocate 10 MB on the stack.
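For context, executables linked with MSVC get a 1 MB stack by default, so a 10 MB local array does not fit unless the optimizer removes it. A minimal sketch of one workaround is to move the buffer to the heap. The element count below assumes 10 MB of 4-byte ints, and the function name and fill pattern are hypothetical; the real benchmark may differ.

```cpp
#include <cstddef>
#include <vector>

// Assumed size: 10 MB worth of ints; the real benchmark may differ.
constexpr std::size_t kCount = 10u * 1024u * 1024u / sizeof(int);

// Same style of loop, but the buffer is heap-allocated through std::vector,
// so it does not depend on the thread's stack size.
double sum_heap_array() {
    std::vector<int> a(kCount);
    for (std::size_t i = 0; i < kCount; ++i) a[i] = static_cast<int>(i % 10);
    double s = 0.0;
    for (std::size_t i = 0; i < kCount; ++i) s += a[i];
    return s;
}
```

Another option, if the array should stay on the stack, is to raise the stack reserve with the MSVC linker's /STACK option.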

@ujos
Contributor Author

ujos commented Aug 30, 2022

If I allocate a[] as a global static variable, the code compiled by MSVC is two times slower. If I compile the code with GNU C++, that change does not affect performance.
