-
Notifications
You must be signed in to change notification settings - Fork 14.3k
Minimal unwinding information (DWARF CFI) checker #145633
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…mple CFI analysis
…/Write conditions
- Changing the register and not informing it using cfi directives - Changing the CFA and report it wrong - Load a register from another register stored location
…d `FunctionUnitAnalyzer`. Define `FunctionUnitUnwindInfoAnalyzer` that executes the Analysis during the stream. The main reasons for doing this is: - This is compatible with other use cases of `MCStreamer` implementers (e.g. `MCAsmStreamer` and `MCAssembler`). - This approach result in on the fly analysis, if it was implemented like `RecordStreamer`, then all the instruction of the whole program have to be copied and be stored in the memory.
Please help me in:
It's my first non-mechanical PR, so any suggestion, even little is appreciated. |
You can test this locally with the following command:git-clang-format --diff HEAD~1 HEAD --extensions cpp,h -- llvm/include/llvm/UnwindInfoChecker/FunctionUnitAnalyzer.h llvm/include/llvm/UnwindInfoChecker/FunctionUnitStreamer.h llvm/include/llvm/UnwindInfoChecker/FunctionUnitUnwindInfoAnalyzer.h llvm/include/llvm/UnwindInfoChecker/UnwindInfoAnalysis.h llvm/include/llvm/UnwindInfoChecker/UnwindInfoState.h llvm/lib/UnwindInfoChecker/FunctionUnitAnalyzer.cpp llvm/lib/UnwindInfoChecker/FunctionUnitStreamer.cpp llvm/lib/UnwindInfoChecker/FunctionUnitUnwindInfoAnalyzer.cpp llvm/lib/UnwindInfoChecker/Registers.h llvm/lib/UnwindInfoChecker/UnwindInfoAnalysis.cpp llvm/lib/UnwindInfoChecker/UnwindInfoState.cpp llvm/include/llvm/DebugInfo/DWARF/DWARFCFIProgram.h llvm/include/llvm/DebugInfo/DWARF/DWARFDebugFrame.h llvm/tools/llvm-mc/llvm-mc.cpp View the diff from clang-format here.diff --git a/llvm/lib/UnwindInfoChecker/FunctionUnitStreamer.cpp b/llvm/lib/UnwindInfoChecker/FunctionUnitStreamer.cpp
index 54389aa7f..775ca35eb 100644
--- a/llvm/lib/UnwindInfoChecker/FunctionUnitStreamer.cpp
+++ b/llvm/lib/UnwindInfoChecker/FunctionUnitStreamer.cpp
@@ -17,7 +17,8 @@
using namespace llvm;
-std::pair<unsigned, unsigned> CFIFunctionFrameStreamer::updateDirectivesRange() {
+std::pair<unsigned, unsigned>
+CFIFunctionFrameStreamer::updateDirectivesRange() {
auto Frames = getDwarfFrameInfos();
unsigned CurrentCFIDirectiveIndex = 0;
if (hasUnfinishedDwarfFrameInfo()) {
@@ -42,7 +43,8 @@ void CFIFunctionFrameStreamer::updateAnalyzer() {
return;
}
- const MCDwarfFrameInfo *LastFrame = &getDwarfFrameInfos()[FrameIndices.back()];
+ const MCDwarfFrameInfo *LastFrame =
+ &getDwarfFrameInfos()[FrameIndices.back()];
auto DirectivesRange = updateDirectivesRange();
ArrayRef<MCCFIInstruction> Directives;
@@ -62,7 +64,7 @@ void CFIFunctionFrameStreamer::updateAnalyzer() {
}
void CFIFunctionFrameStreamer::emitInstruction(const MCInst &Inst,
- const MCSubtargetInfo &STI) {
+ const MCSubtargetInfo &STI) {
updateAnalyzer();
LastInstruction = Inst;
}
diff --git a/llvm/lib/UnwindInfoChecker/UnwindInfoAnalysis.cpp b/llvm/lib/UnwindInfoChecker/UnwindInfoAnalysis.cpp
index 872793015..ab615fbe6 100644
--- a/llvm/lib/UnwindInfoChecker/UnwindInfoAnalysis.cpp
+++ b/llvm/lib/UnwindInfoChecker/UnwindInfoAnalysis.cpp
@@ -76,9 +76,9 @@ getUnwindRuleRefReg(const dwarf::UnwindTable::const_iterator &UnwindRow,
}
}
-DWARFCFIAnalysis::DWARFCFIAnalysis(MCContext *Context,
- MCInstrInfo const &MCII, bool IsEH,
- ArrayRef<MCCFIInstruction> Prologue)
+DWARFCFIAnalysis::DWARFCFIAnalysis(MCContext *Context, MCInstrInfo const &MCII,
+ bool IsEH,
+ ArrayRef<MCCFIInstruction> Prologue)
: Context(Context), MCII(MCII), MCRI(Context->getRegisterInfo()),
State(Context), IsEH(IsEH) {
@@ -121,7 +121,7 @@ DWARFCFIAnalysis::DWARFCFIAnalysis(MCContext *Context,
}
void DWARFCFIAnalysis::update(const MCInst &Inst,
- ArrayRef<MCCFIInstruction> Directives) {
+ ArrayRef<MCCFIInstruction> Directives) {
const MCInstrDesc &MCInstInfo = MCII.get(Inst.getOpcode());
auto MaybePrevRow = State.getCurrentUnwindRow();
|
I see the following plan fit for splitting the implementation:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite a large patch. It may make sense to split out parts that aren't required for complete functionality into their own patches.
For example, making addInstruction()
public in DWARFCFIProgram.h, can probably be its own patch. I think there are several places of where you're just slightly refactoring the existing code.
As for the rest ... Its rather large, and reviewing incremental changes as you add functionality would make it significantly easier to provide high quality reviews.
#define LLVM_UNWINDINFOCHECKER_FUNCTIONUNITANALYZER_H | ||
|
||
#include "llvm/ADT/ArrayRef.h" | ||
#include <cstdio> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't appear to be needed.
|
||
} // namespace llvm | ||
|
||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check your editor settings, since github seems unhappy w/ the lack of newline here.
std::vector<unsigned> FrameIndices; | ||
|
||
unsigned LastDirectiveIndex; | ||
std::optional<MCInst> LastInstruction; | ||
std::unique_ptr<FunctionUnitAnalyzer> Analyzer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We normally put private fields at the bottom of the class. Also please consider field ordering. using unsigned as the second field is probably going to end up w/ padding bytes.
#include "FunctionUnitAnalyzer.h" | ||
#include "UnwindInfoAnalysis.h" | ||
#include "llvm/ADT/ArrayRef.h" | ||
#include <cstdio> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#include <cstdio> |
I can't imagine why you'd need cstdio
...
#include "llvm/MC/MCStreamer.h" | ||
#include "llvm/MC/MCSubtargetInfo.h" | ||
#include "llvm/MC/TargetRegistry.h" | ||
#include <set> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://llvm.org/docs/ProgrammersManual.html#set-like-containers-std-set-smallset-setvector-etc
Probably using an LLVM ADT is more appropriate.
return; | ||
} | ||
|
||
const auto *LastFrame = &getDwarfFrameInfos()[FrameIndices.back()]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: generally prefer to use a real type over auto
. using
statements are a good option if you have some types that are hard to spell, plus that usually makes the code easier to read anyway. The style guide has more to say here: https://llvm.org/docs/CodingStandards.html#use-auto-type-deduction-to-make-code-more-readable
} | ||
} | ||
|
||
std::sort(SuperRegs.begin(), SuperRegs.end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::sort(SuperRegs.begin(), SuperRegs.end()); | |
llvm::sort(SuperRegs.begin(), SuperRegs.end()); |
Default to llvm::sort
over std::sort
: https://llvm.org/docs/CodingStandards.html#beware-of-non-deterministic-sorting-order-of-equal-elements
case dwarf::UnwindLocation::Location::Constant: | ||
case dwarf::UnwindLocation::Location::Unspecified: | ||
case dwarf::UnwindLocation::Location::DWARFExpr: | ||
// TODO here should look into expr and find the registers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// TODO here should look into expr and find the registers. | |
// TODO: here should look into expr and find the registers. |
Most tooling works on TODO:
or FIXME:
. Please update the other instances.
//===----------------------------------------------------------------------===// | ||
/// | ||
/// \file | ||
/// This file declares CFIFunctionFrameReceiver class. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's difficult to read all of this new code without an overview of what all these classes are for. Could you expand the comments in your new header files to describe the purpose of each class, not just its name?
Here, for example, it looks as if CFIFunctionFrameReceiver
does basically nothing, because it's just a base class. Some other class (which?) will be given a pointer to an instance of this base class, and will send it information about a function it's scanning, but really the pointer will point to a subclass which actually does something with the information.
But I had to look in the implementation source file to confirm that, by seeing that all the methods are empty.
@@ -519,8 +526,16 @@ int main(int argc, char **argv) { | |||
std::unique_ptr<MCInstrInfo> MCII(TheTarget->createMCInstrInfo()); | |||
assert(MCII && "Unable to create instruction info!"); | |||
|
|||
std::unique_ptr<FunctionUnitUnwindInfoAnalyzer> FUUIA( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's a FunctionUnitUnwindInfoAnalyzer
? The class defined in FunctionUnitUnwindInfoAnalyzer.h
is called something else. Surely this code won't even compile in its current state?
Separately from that, you can make this unique_ptr
without having to repeat the enormously long class name twice by writing
auto FUUIA = std::make_unique<ClassName>(parameters);
@@ -0,0 +1,49 @@ | |||
//===----------------------------------------------------------------------===// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks as if you renamed all the classes defined in these header files, but left the file names themselves unchanged, so that now the file names don't match the class names and there's twice as much to remember?
} | ||
|
||
std::optional<dwarf::CFIProgram> | ||
DWARFCFIState::convert(MCCFIInstruction Directive) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this function returning a std::optional
, when it doesn't ever seem to return nullopt
? As far as I can see, you always make a CFIProgram
at the start, and always return it at the end. You may or may not put an instruction into it, but you still return the CFIProgram
even if it's empty.
return; | ||
} | ||
Table.insert(Table.end(), NewRows.begin(), NewRows.end()); | ||
Table.push_back(Row); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this function at all, I'm afraid. It looks as if you first make Row
by copying the last entry already in Table
(if any); then you parse it into some more rows that you append to Table
; and then you push Row
on to the end of Table
again. Wasn't it in Table
already? Isn't that where you got it from in the first place?
if (!MaybeLLVMReg) { | ||
assert(PrevLoc == NextLoc && | ||
"The dwarf register does not have a LLVM number, so the unwind info " | ||
"for it should not change"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are all of these checks assertions, rather than Context->reportWarning
? Is this information expected to be consistent in all of these ways because your code constructed it itself?
switch (PrevLoc.getLocation()) { | ||
case dwarf::UnwindLocation::Same: | ||
case dwarf::UnwindLocation::RegPlusOffset: | ||
if (Writes.count(MaybePrevRefReg.value())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How come you don't need to check whether MaybePrevRefReg
has a value at all in this test?
Context->reportWarning( | ||
Inst.getLoc(), | ||
formatv("unknown change happened to %{0} unwinding rule structure", | ||
RegName)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what's going on here, I'm afraid. From this code it looks as if, any time PrevLoc != NextLoc
, we report one of the two warnings above, and it's only a question of which one. But that makes no sense – surely a register's location should change sometimes, e.g. when we push it on the stack and update the CFI to record that we did so. And one of your tests (spill-two-reg.s
) looks as if it's got a push instruction that passes this test. But I don't see how it passes this test.
Can you explain how this code is supposed to tell the difference between a correct and an incorrect way to push a register and update its CFI?
if (PrevLoc.getLocation() == NextLoc.getLocation()) { | ||
Context->reportWarning( | ||
Inst.getLoc(), | ||
formatv("unknown change happened to %{0} unwinding rule values", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These %
prefixes on register names in your error messages are very x86-specific. In Arm, or many other architectures, register names aren't prefixed with %
. Even in x86 they aren't, if you use the Intel rather than AT&T syntax. I think it would be better to leave the %
off completely.
@@ -0,0 +1,43 @@ | |||
# RUN: llvm-mc %s --validate-cfi --filetype=null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks as if this test is supposed to run without producing any warnings. Would it be a good idea to run FileCheck anyway, with a CHECK-NOT: warning:
to make sure a warning doesn't show up by accident?
This PR adds a minimal version of
UnwindInfoChecker
described in here.This implementation looks into the modified registers by each instruction and checks:
This implementation does not support DWARF expressions and treats them as unknown unwinding rules.