Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a semantic-less comparison for IR nodes. #4302

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

fruffy
Copy link
Collaborator

@fruffy fruffy commented Dec 19, 2023

IR nodes can not be part of sets or maps because they do not implement operator<. This makes maintaining a collection of nodes difficult (especially when pointer values change). This PR is an attempt to implement a general operator< for all nodes.

This PR is a pragmatic way to approach this problem, ideally ordering should be concretely specified in the compiler.

Currently, ordering is approached similarly to how equiv works for each node. We do not compare based on pointers, but based on the values of each node member. The tests contain a list of nonexhaustives example, for instance:

Type_Bits(16, false) is less than Type_Bits(32, false)
Type_Bits(16, false) is less than Type_Boolean()
Type_Bits(16, false) is less than Type_String()
Constant(Type_Bits(16), 1) is less than Constant(Type_Bits(16), 2)
Constant(Type_Bits(16), 1) is less than Constant(Type_Bits(32), 0)
Vector<Node>(Constant(Type_Bits(16), 1)) is less than Vector<Node>(Constant(Type_Bits(16), 2))
Vector<Node>(Constant(Type_Bits(16), 1), Constant(Type_Bits(16), 2)) is less than Vector<Node>(Constant(Type_Bits(16), 2))
Declaration_Variable("a", Type_Bits(16, false)) is less than Declaration_Variable("b", Type_Bits(16, false))
...

In general, ordering is deferred to the members of each IR node. Ordering priority is evaluated based on the order of the individual members of a class. Nodes themselves are sorted based on their node_type_name(). This function returns a cstring, which provides the same ordering guarantees as cstring does.

We should also implement hash operators for nodes, but this is something that is be handled separately.

@fruffy fruffy changed the title [WiP] Implement operator< for IR nodes. Implement operator< for IR nodes. Dec 20, 2023
@fruffy fruffy force-pushed the fruffy/ir_operator< branch 2 times, most recently from c6997a5 to cdd44a6 Compare December 21, 2023 07:32
@fruffy fruffy force-pushed the fruffy/ir_operator< branch 4 times, most recently from 011cf5d to a2e0aed Compare January 15, 2024 12:49
@vlstill
Copy link
Contributor

vlstill commented Jan 15, 2024

It seems to me like this would mean that < is inconsistent with ==. E.g. in https://github.com/p4lang/p4c/pull/4302/files#diff-ea1da9796c6c2b602731a169985ebd06ed51a4b37dc65b5f5c8e67cba889c974R85 the == works based on decid, but < ignores it. In general it should hold that !(a < b) && !(b < a) is equivalent with a == b. There might even be parts of the stdlib that count on it (e.g. the std::totally_ordered concept has it as semantic requirement. Therefore, having such an inconsistency could lead to bug-prone code.

I'd suggest defining a member function instead (e.g. compare), that will be consistent with equal. And then defining a comparator object suitable for using it in map<Node*, X> and set<Node*> (we would need to explicitly specify comparator even in your case, so I don't think this would be less readable).

@fruffy
Copy link
Collaborator Author

fruffy commented Jan 15, 2024

It seems to me like this would mean that < is inconsistent with ==. E.g. in https://github.com/p4lang/p4c/pull/4302/files#diff-ea1da9796c6c2b602731a169985ebd06ed51a4b37dc65b5f5c8e67cba889c974R85 the == works based on decid, but < ignores it. In general it should hold that !(a < b) && !(b < a) is equivalent with a == b. There might even be parts of the stdlib that count on it (e.g. the std::totally_ordered concept has it as semantic requirement. Therefore, having such an inconsistency could lead to bug-prone code.

I'd suggest defining a member function instead (e.g. compare), that will be consistent with equal. And then defining a comparator object suitable for using it in map<Node*, X> and set<Node*> (we would need to explicitly specify comparator even in your case, so I don't think this would be less readable).

The way operator== and equiv are implemented in the compiler today is honestly fairly bug-prone and confusing. It's been a constant source of bugs. It might make sense to rename equiv to isSemanticallyEquivalent and implement operator< as you suggested, with a utility function isSemanticallyLess.

@fruffy fruffy force-pushed the fruffy/ir_operator< branch 6 times, most recently from 3579104 to 39df9b8 Compare January 21, 2024 18:21
@fruffy fruffy changed the title Implement operator< for IR nodes. Implement a semantic less comparison for IR nodes. Jan 22, 2024
@fruffy fruffy force-pushed the fruffy/ir_operator< branch 2 times, most recently from dab3f29 to a065152 Compare January 22, 2024 20:02
@asl
Copy link
Contributor

asl commented Mar 5, 2024

Just some loud thinking. Now we are having an integer typeid for each node, so could order according to these. And only for same typeid we can use something "internal"

@fruffy
Copy link
Collaborator Author

fruffy commented Mar 5, 2024

Just some loud thinking. Now we are having an integer typeid for each node, so could order according to these. And only for same typeid we can use something "internal"

Yes, at least for base types and nodes! I still do need a form of semantic comparison on all the pointer members.

@fruffy fruffy changed the title Implement a semantic less comparison for IR nodes. Implement a semantic-less comparison for IR nodes. Mar 6, 2024
@fruffy fruffy force-pushed the fruffy/ir_operator< branch 3 times, most recently from 2a4b843 to 2388500 Compare March 8, 2024 21:20
@fruffy
Copy link
Collaborator Author

fruffy commented Mar 8, 2024

Just some loud thinking. Now we are having an integer typeid for each node, so could order according to these. And only for same typeid we can use something "internal"

I replaced node_type_name with typeId for the node comparisons. Any idea on how to handle the comparisons in general? There is a shallow and a deep comparison we can do. Shallow is only comparing the pointer members, deep is actually checking the contents of each member. Deep comparisons get tricky with containers.

@fruffy fruffy added the core Topics concerning the core segments of the compiler (frontend, midend, parser) label Mar 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Topics concerning the core segments of the compiler (frontend, midend, parser)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants