Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[custom] EIGEN_ALIGN16 bad idea #5805

Open
mynickmynick opened this issue Sep 10, 2023 · 6 comments
Open

[custom] EIGEN_ALIGN16 bad idea #5805

mynickmynick opened this issue Sep 10, 2023 · 6 comments
Labels
status: triage Labels incomplete

Comments

@mynickmynick
Copy link
Contributor

mynickmynick commented Sep 10, 2023

the use of EIGEN_ALIGN16
seems to be a very bad idea. In particular for cache efficiency and multithreading.

I would either eliminate that and all similar ones or substitute them with something configurable at compile time
giving also the option to use, compile-time-configurable directives like
alignas(std::hardware_destructive_interference_size) (and here the topic of moving to C++17 goes on)

why are the definitions of the points so relying on alignment and they make use of union??
isn't this too hard coding and anti-OOP?

#define PCL_ADD_UNION_POINT4D
union EIGEN_ALIGN16 {
float data[4];
struct {
float x;
float y;
float z;
};
};

i see it is probably "required" by Eigen

@mynickmynick mynickmynick added the status: triage Labels incomplete label Sep 10, 2023
@mynickmynick
Copy link
Contributor Author

by the way anybody knows how to compile a branch with

#include <experimental/simd>

from the C++ lib SIMD extension??

@mvieth
Copy link
Member

mvieth commented Sep 15, 2023

the use of EIGEN_ALIGN16 seems to be a very bad idea. In particular for cache efficiency and multithreading.

How so? Because it may "waste" memory? Correct me if I am wrong, but to my knowledge, the cache lines are also aligned to some 2^n bytes.

why are the definitions of the points so relying on alignment and they make use of union?? isn't this too hard coding and anti-OOP?

What do you mean? Why is requiring alignment "hard coding and anti-OOP"?

#define PCL_ADD_UNION_POINT4D union EIGEN_ALIGN16 { float data[4]; struct { float x; float y; float z; }; };

i see it is probably "required" by Eigen

The alignment is useful so that the fast SIMD load and store operations can be used (see also https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/util/ConfigureVectorization.h#L27 )

@mynickmynick
Copy link
Contributor Author

mynickmynick commented Sep 16, 2023

How so? Because it may "waste" memory? Correct me if I am wrong, but to my knowledge, the cache lines are also aligned to some 2^n bytes.

I am studying this subject, it's a work in progress.
Not "to waste memory": The opposite. align as 16 probably pack data more than cache efficient. On my desktop MS pc
std::cout << std::hardware_destructive_interference_size<<"\n";
std::cout << std::hardware_constructive_interference_size<<"\n";

gives
64
64
It is not clear though if it would be convenient to give even a
alignas(std::hardware_destructive_interference_size)
on the single point structure
It would be probably more convenient to let the compiler align the single point (without alignas directives) and let the programmer give
alignas(std::hardware_destructive_interference_size) or
alignas(std::hardware_constructive_interference_size)
only on bigger containers
or let the programmer recompile PCL for his own purpouses and platforms with a configurable alignas on the points
(so he decides wether it is 16 or 32 or 64 or std::hardware_destructive_interference_size or std::hardware_constructive_interference_size)

That would be a matter of study

What do you mean? Why is requiring alignment "hard coding and anti-OOP"?

Requiring alignment in general might go, but does not seem so in the way it is done here. How it's done here not only seems anti OOP but also against structured programming. PCL relies on alignment used together with union{} making it based on float data[4] access, looks very much hard coded to me. Why accessing the structure fields with a overlayed union float[4] array?? For Eigen? OK I understand but isn't there a smarter and structured general purpouse way to access Eigen SIMD without having to hard code like this? This also makes it much harder to extend to optional use of double for xyz coords. And please do not tell me again we don't need doubles and that doubles slow down because I won't even listen!!!! ;))

@daohu527
Copy link
Contributor

I think it's a XY problem!

What we are concerned about here is whether alignment has an impact on cache efficiency, so do not introduce other problems!!!

If not then we can go to other topic about its impact

@mynickmynick
Copy link
Contributor Author

mynickmynick commented Sep 22, 2023

the two problems are strictly correlated. extension to double is obstacled by hard coding with fixed (not configurable) alignment AND use of union{} with float[4]

I really have no idea why you say XY instead of XYZ

@mvieth
Copy link
Member

mvieth commented Sep 27, 2023

EIGEN_ALIGN16 was never meant to create a memory layout beneficial for multi-threading.
Can you describe more, what you are working on? You use multi-threading to compute something, and store the results from each thread in a pcl::PointXYZ? Are they in a vector or something? Perhaps if I understand the problem better, I can see what the best solution would be.

Requiring alignment in general might go, but does not seem so in the way it is done here. How it's done here not only seems anti OOP but also against structured programming.

Do we have a different understanding of what OOP and structured programming means? I still don't see how requiring alignment goes against these paradigms.

PCL relies on alignment used together with union{} making it based on float data[4] access, looks very much hard coded to me.

This is done to enable the use of the fast SIMD instructions, which need aligned memory (load and store) and packs of four floats. I don't know why you call it hard coded, you can define a custom point type and use that with pcl::PointCloud and all other PCL classes.

Why accessing the structure fields with a overlayed union float[4] array?? For Eigen? OK I understand but isn't there a smarter and structured general purpouse way to access Eigen SIMD without having to hard code like this?

SIMD needs packs of 4 floats to work efficiently. But if you see an alternative, feel free to suggest it.

This also makes it much harder to extend to optional use of double for xyz coords. And please do not tell me again we don't need doubles and that doubles slow down because I won't even listen!!!! ;))

I assume you are joking with the last part, but that is still an odd thing to say.
So far I have not seen any problem that is not solvable with float (and perhaps some tricks like demeaning if the data is unfavourable). Still, I am not opposed to making PCL more double-friendly, if the benefit justifies the needed work. But for that, we would probably need to introduce a new point type where XYZ are double and there is a double data[4]. And it would probably have to be 32-byte aligned to use the aligned load and store instructions. The existing PCL_ADD_UNION_POINT4D would be unchanged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: triage Labels incomplete
Projects
None yet
Development

No branches or pull requests

3 participants