Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Verification failures are printed in label order, making investigating failures difficult #632

Closed
Alan-Jowett opened this issue May 10, 2024 · 7 comments

Comments

@Alan-Jowett
Copy link
Contributor

Failures in large eBPF programs with complex control flow can be hard to analyze. Of particular concern is that an assertion failure can result in invariants being removed from the post-invariant set, which causes a cascade of subsequent assertion failures which strip invariants.

Locating the initial invariant violation that triggers this cascade of failures can be challenging assuming the failures are printed in label order (as labels don't necessarily match the control flow).

One possible solution is to provide an option to print errors in WTO (weak topographical order), which will tend to place initial failures earlier in the list of failures.

A recent case in point is a verification failure in a Cilium program, where a single correlated branch failure triggers around 500 subsequent failures. Because of significant inlining and branch folding by LLVM, the initial failure was present in the middle of the list of 500 failures making it difficult to locate.

When sorting the errors in WTO, the failure is close to the top of the list.

@elazarg
Copy link
Collaborator

elazarg commented May 11, 2024

The flag --assume-assert is designed to help in this exact case. Doesn't it?

@elazarg
Copy link
Collaborator

elazarg commented May 22, 2024

@Alan-Jowett did you experiment with that flag?

@Alan-Jowett
Copy link
Contributor Author

Alan-Jowett commented May 22, 2024

Thanks for the feedback. What does this flag do? Should it be on by default when performing the analysis? I will try it out.

@elazarg
Copy link
Collaborator

elazarg commented May 22, 2024

The flag forces the invariants to match the assertions, even if they did not. So errors are not repeated and not compounded.

It used to be the default. I don't remember the reasoning behind disabling it, but in part it's because expected failures in tests stress the code less (since many invariants drop to bottom immediately).

Please let me know if the flag helps.

@Alan-Jowett
Copy link
Contributor Author

That flag is awesome!

From 507 errors -> 1 error, and it's the actual relevant error that is causing the problem. This is even better than my approach. Is there any harm in making this the default in eBPF-for-Window's calls to the verifier?

I will go ahead and close this PR unless you see value in it.

@elazarg
Copy link
Collaborator

elazarg commented May 22, 2024

I think it should be on by default during debugging, and off by default in tests.

@Alan-Jowett
Copy link
Contributor Author

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants