Recover ABI from binary level #123

Open
ailrst opened this issue Oct 25, 2023 · 1 comment


ailrst commented Oct 25, 2023

Arm defines an ABI which we can realistically assume holds for global functions in dynamically linked libraries and programs.

We can use this to recover parameters for calls at the IL level:

  • We can identify which registers are in a stored-before-read state at the time of a call, to infer the arity of procedures and narrow the set of possible indirect call targets: https://ieeexplore.ieee.org/document/7546543/
  • To handle functions like malloc() and pthread_create(), we want to pull function-call arguments out of the VSA/constant propagation results. The current implementation is somewhat ad hoc and can be generalized.
  • Using memory regions, we want to be able to write specifications at a slightly higher level than registers.
  • Recovered parameters would allow interprocedural analyses to use passed parameters as context.
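
The arity idea in the first bullet could look roughly like the sketch below. It is an illustration under simplifying assumptions, not the project's implementation: instructions are modelled as hypothetical `(reads, writes)` pairs of register-name sets, only the integer argument registers R0-R7 are considered, each procedure is a single basic block, and the names `prepared_arity`, `consumed_arity`, and `compatible_targets` are made up for this example. A real analysis would need to work across blocks and account for registers passed through unchanged.

```python
# Hypothetical sketch of arity-based narrowing of indirect call targets.
ARG_REGS = [f"R{i}" for i in range(8)]  # integer argument registers


def prepared_arity(caller_block):
    """Over-approximate how many arguments a call site prepares.

    Uses the highest-numbered argument register written before the call.
    """
    written = set()
    for _reads, writes in caller_block:
        written |= writes
    arity = 0
    for i, reg in enumerate(ARG_REGS):
        if reg in written:
            arity = i + 1
    return arity


def consumed_arity(callee_block):
    """Under-approximate how many arguments a callee consumes.

    Uses the highest-numbered argument register read before it is written.
    """
    written = set()
    arity = 0
    for reads, writes in callee_block:
        for i, reg in enumerate(ARG_REGS):
            if reg in reads and reg not in written:
                arity = max(arity, i + 1)
        written |= writes
    return arity


def compatible_targets(caller_block, candidates):
    """Keep only candidates that consume no more arguments than are prepared."""
    prepared = prepared_arity(caller_block)
    return sorted(
        name for name, block in candidates.items()
        if consumed_arity(block) <= prepared
    )


# The call site writes R0 and R1, so it prepares at most two arguments.
call_site = [({"R31"}, {"R0"}), (set(), {"R1"})]
candidates = {
    "f0": [(set(), {"R0"})],         # reads no argument register
    "f1": [({"R0"}, {"R8"})],        # reads R0
    "f3": [({"R0", "R2"}, {"R0"})],  # reads R0 and R2
}
print(compatible_targets(call_site, candidates))  # ['f0', 'f1']
```

Over-approximating the prepared count and under-approximating the consumed count keeps the filter conservative: it may leave too many candidates in, but should not rule out a real target.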

l-kent commented Oct 25, 2023

This should probably be called 'recovering parameter information' or something similar, since that's the real idea here. It also conflates a fair few different problems. The most relevant part of the ABI document is this: https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#the-base-procedure-call-standard - specifically sections 6.1 and 6.8.

Improving the indirect call resolution through an adaptation (since the paper is for x86) of the arity-based analysis in the 'A Tough Call' paper you've linked seems like it will probably only be useful for cases where the VSA just falls over. It's something that may be worth considering, but that should be part of a broader consideration of the VSA's strengths and weaknesses.

Pulling function parameters out of the VSA/constant propagation for malloc() etc. via a more generalised process seems like it would just complicate things, because it sounds like that would mean storing the C-level function signature, recreating the parameter passing according to the ABI, and then using that to map the C-level parameters to the registers. But we already know which registers correspond to the parameters in those cases, so I'm not sure what the point of a reconstruction is there.
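
For scalar-only signatures, the mapping from signature to registers is small. The sketch below is a deliberately simplified illustration: it assumes every parameter is either an integer/pointer or a floating-point scalar that fits in one register, and it ignores composites, variadic functions, and the alignment rules for stack arguments. The function name `assign_params`, the `'int'`/`'float'` kind tags, and the symbolic `stack[n]` slots are hypothetical.

```python
def assign_params(kinds):
    """Map parameter kinds to locations under a simplified AAPCS64 model.

    'int' covers integers and pointers (R0-R7); 'float' covers
    floating-point scalars (V0-V7). Anything that does not fit in a
    register gets a symbolic stack slot, in order.
    """
    next_int = next_fp = next_stack = 0
    locations = []
    for kind in kinds:
        if kind not in ("int", "float"):
            raise ValueError(f"unsupported parameter kind: {kind!r}")
        if kind == "int" and next_int < 8:
            locations.append(f"R{next_int}")
            next_int += 1
        elif kind == "float" and next_fp < 8:
            locations.append(f"V{next_fp}")
            next_fp += 1
        else:
            locations.append(f"stack[{next_stack}]")
            next_stack += 1
    return locations


# void *malloc(size_t size)
print(assign_params(["int"]))      # ['R0']
# pthread_create takes four pointer-sized arguments
print(assign_params(["int"] * 4))  # ['R0', 'R1', 'R2', 'R3']
```

For functions like these, the result is just a fixed prefix of R0-R7, which is why a hard-coded register list gives the same answer.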

Writing specifications at a higher level than registers should be part of a broader high-level specifications approach as described in #83.

What is the idea behind recovering parameters for calls? To directly reconstruct the passed parameters (which can be on the stack or in R0-7 and V0-7) we'd need the original C function signature. It's probably possible to infer which registers (out of R0-7 and V0-7) and stack addresses have already been written to and therefore contain parameters, which would be enough to provide context for the interprocedural analyses? I'm not sure if more could be done than that though.
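
As a rough illustration of that last idea, the sketch below collects the argument registers and stack slots written before a call, along with any constant value known for them. The representation is hypothetical: a block is a list of `(destination, value)` writes in program order, a destination is a register name or a `("stack", offset)` tuple, and `value` is `None` when no constant is known. It treats every stack write as a potential argument, which over-approximates; a real analysis would restrict this to the outgoing-argument area.

```python
# Hypothetical sketch: build a per-call context from writes preceding the call.
ARG_LOCATIONS = {f"R{i}" for i in range(8)} | {f"V{i}" for i in range(8)}


def call_context(writes_before_call):
    """Return {location: last known value} for possible parameter locations."""
    context = {}
    for dest, value in writes_before_call:
        is_stack_slot = isinstance(dest, tuple) and dest[0] == "stack"
        if is_stack_slot or dest in ARG_LOCATIONS:
            context[dest] = value  # later writes overwrite earlier ones
    return context


block = [
    ("R0", 16),          # overwritten below
    ("R9", 3),           # not an argument register, ignored
    ("V0", None),        # written, but value unknown
    (("stack", 0), 42),  # stack-passed value
    ("R0", 32),
]
print(call_context(block))
```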
