Commit 2178a87

Add README
1 parent 9f525e0 commit 2178a87

1 file changed

+126
-0
lines changed

README.md

Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
1+
# WARP
2+
3+
**WARP** provides a common format for transferring and applying function information across binary analysis tools.
4+
5+
## WARP Integrations
6+
7+
### Binary Ninja
8+
9+
WARP integration is available as an [open source](https://github.com/Vector35/binaryninja-api/tree/dev/plugins/warp) first-party plugin for [Binary Ninja] and as such ships by default.
10+
11+
## Function Identification
12+
13+
Function identification is the main way to interact with **WARP**, allowing tooling to utilize **WARP**'s dataset to identify
14+
common functions within any binary efficiently and accurately.
15+
16+
### Integration Requirements
17+
18+
To integrate with **WARP** function matching you must be able to:
19+
20+
1. Disassemble instructions
21+
2. Identify basic blocks that make up a function
22+
3. Identify register groups with an implicit extend operation
23+
4. Identify relocatable instructions (see [What is considered a relocatable operand?](#what-is-considered-a-relocatable-operand))
24+
25+
### Creating a Function GUID
26+
27+
The function GUID is the UUIDv5 of the concatenated basic block GUIDs (sorted from highest to lowest start address) that make up the function.
28+
29+
#### Example
30+
31+
Given the following sorted basic blocks:
32+
33+
1. `036cccf0-8239-5b84-a811-60efc2d7eeb0`
34+
2. `3ed5c023-658d-5511-9710-40814f31af50`
35+
3. `8a076c92-0ba0-540d-b724-7fd5838da9df`
36+
37+
The function GUID will be `7a55be03-76b7-5cb5-bae9-4edcf47795ac`.
38+
39+
##### Example Code
40+
41+
```py
import uuid
from hashlib import sha1

def uuid5(namespace, name_bytes):
    """Generate a UUIDv5 from the SHA-1 hash of a namespace UUID and the name bytes."""
    digest = sha1(namespace.bytes + name_bytes).digest()
    # Passing version=5 makes uuid.UUID set the version and variant bits.
    return uuid.UUID(bytes=digest[:16], version=5)

# Namespace for function GUIDs.
function_namespace = uuid.UUID("0192a179-61ac-7cef-88ed-012296e9492f")
# Basic block GUIDs, already in sorted order.
bb1 = uuid.UUID("036cccf0-8239-5b84-a811-60efc2d7eeb0")
bb2 = uuid.UUID("3ed5c023-658d-5511-9710-40814f31af50")
bb3 = uuid.UUID("8a076c92-0ba0-540d-b724-7fd5838da9df")
function = uuid5(function_namespace, bb1.bytes + bb2.bytes + bb3.bytes)
```
56+
57+
#### What is the UUIDv5 namespace?
58+
59+
The namespace for Function GUIDs is `0192a179-61ac-7cef-88ed-012296e9492f`.
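The sorting step described in this section can be sketched as follows. This is a minimal illustration, not the reference implementation: the `function_guid` helper and its input shape (pairs of start address and basic block GUID) are assumptions made for the example, and the descending sort simply follows the ordering stated above.

```py
import uuid
from hashlib import sha1

FUNCTION_NAMESPACE = uuid.UUID("0192a179-61ac-7cef-88ed-012296e9492f")

def uuid5(namespace, name_bytes):
    """UUIDv5 over raw bytes (SHA-1 of namespace bytes followed by name bytes)."""
    digest = sha1(namespace.bytes + name_bytes).digest()
    return uuid.UUID(bytes=digest[:16], version=5)

def function_guid(blocks):
    """Hypothetical helper: `blocks` is an iterable of (start_address, block_guid) pairs."""
    # Sort from highest to lowest start address, as stated in this section.
    ordered = sorted(blocks, key=lambda block: block[0], reverse=True)
    # Concatenate the 16-byte GUIDs in that order and hash them.
    return uuid5(FUNCTION_NAMESPACE, b"".join(guid.bytes for _, guid in ordered))
```

Because the blocks are sorted before hashing, the order in which a disassembler discovers them does not change the resulting GUID.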
60+
61+
### Creating a Basic Block GUID
62+
63+
The basic block GUID is the UUIDv5 of the byte sequence of the instructions (sorted in execution order) with the following properties:
64+
65+
1. Zero out all instructions containing a relocatable operand.
66+
2. Exclude all NOP instructions.
67+
3. Exclude all instructions that set a register to itself if they are effectively NOPs.
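The three rules above can be sketched as a single pass over the instructions of a block. This is only an illustration under stated assumptions: the `Instruction` record and its boolean flags are hypothetical stand-ins for whatever a real disassembler integration provides, and the instructions are assumed to already be in execution order.

```py
import uuid
from dataclasses import dataclass
from hashlib import sha1

BASIC_BLOCK_NAMESPACE = uuid.UUID("0192a178-7a5f-7936-8653-3cbaa7d6afe7")

@dataclass
class Instruction:
    # Hypothetical record; a real integration fills these in from its disassembler.
    raw: bytes
    is_nop: bool = False
    is_noop_self_move: bool = False
    has_relocatable_operand: bool = False

def uuid5(namespace, name_bytes):
    """UUIDv5 over raw bytes (SHA-1 of namespace bytes followed by name bytes)."""
    digest = sha1(namespace.bytes + name_bytes).digest()
    return uuid.UUID(bytes=digest[:16], version=5)

def basic_block_guid(instructions):
    data = bytearray()
    for insn in instructions:
        # Rules 2 and 3: skip NOPs and register self-moves that act as NOPs.
        if insn.is_nop or insn.is_noop_self_move:
            continue
        if insn.has_relocatable_operand:
            # Rule 1: replace the whole instruction with zero bytes.
            data += bytes(len(insn.raw))
        else:
            data += insn.raw
    return uuid5(BASIC_BLOCK_NAMESPACE, bytes(data))
```

With this sketch, two blocks that differ only in a call target or in padding NOPs produce the same GUID.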
68+
69+
#### When are instructions that set a register to itself removed?
70+
71+
To support hot-patching we must remove them as they can be injected by the compiler at the start of a function (see: [1] and [2]).
72+
This does not affect the accuracy of the function GUID as they are only removed when the instruction is a NOP:
73+
74+
- Self-assignments within register groups that have no implicit extension will be removed (see: [3], section 3.4.1.1)
75+
76+
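For the `x86_64` architecture this means `mov edi, edi` will _not_ be removed, because writing a 32-bit register implicitly zero-extends into the upper 32 bits of `rdi`, but it _will_ be removed for the `x86` architecture, where it has no effect.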
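One way to express this check is a small lookup of which register widths implicitly extend on each architecture. The table and the `is_removable_self_move` function below are hypothetical names for illustration; a real integration would derive this from its own architecture description.

```py
# Hypothetical table: register widths (in bits) whose writes implicitly
# extend into a wider register on each architecture.
IMPLICIT_EXTEND_WIDTHS = {
    "x86": set(),      # no 64-bit registers, so a 32-bit write extends nothing
    "x86_64": {32},    # a 32-bit write zeroes the upper 32 bits of the 64-bit register
}

def is_removable_self_move(arch, dest, src, width_bits):
    """Return True if `mov dest, src` is effectively a NOP and can be excluded."""
    if dest != src:
        return False
    # A self-move only has an effect if writing the register extends it.
    return width_bits not in IMPLICIT_EXTEND_WIDTHS[arch]

print(is_removable_self_move("x86", "edi", "edi", 32))     # True
print(is_removable_self_move("x86_64", "edi", "edi", 32))  # False
```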
77+
78+
#### What is considered a relocatable operand?
79+
80+
A relocatable operand is an operand that is used as a pointer to a mapped region.
81+
82+
For the `x86` architecture the instruction `e8b55b0100` (or `call 0x15bba`) would be zeroed.
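As a small illustration of the zeroing step, the whole instruction is replaced by zero bytes of the same length. The `zero_instruction` name is made up for this sketch.

```py
def zero_instruction(raw):
    """Return zero bytes with the same length as the instruction."""
    return bytes(len(raw))

call = bytes.fromhex("e8b55b0100")  # call 0x15bba
masked = zero_instruction(call)
print(masked.hex())  # 0000000000
```

Keeping the length unchanged means the zeroed instruction still occupies its five bytes in the sequence that is hashed.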
83+
84+
#### What is the UUIDv5 namespace?
85+
86+
The namespace for Basic Block GUIDs is `0192a178-7a5f-7936-8653-3cbaa7d6afe7`.
87+
88+
### Function Constraints
89+
90+
Function constraints allow us to further disambiguate between functions with the same GUID. When creating the functions, we store information about the following:
91+
92+
- Called functions
93+
- Caller functions
94+
- Adjacent functions
95+
96+
Each entry in the lists above is referred to as a "constraint" that can be used to further reduce the number of matches for a given function GUID.
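A rough sketch of how constraints might narrow a set of candidates is shown below. Everything here is an assumption made for illustration: the representation of a constraint as a `(kind, guid)` tuple, the `filter_by_constraints` name, and the "keep the candidates with the most overlapping constraints" rule are not taken from the WARP matcher itself.

```py
def filter_by_constraints(candidates, observed):
    """Keep the candidates that share the most constraints with the observed function.

    candidates: dict mapping a candidate name to its set of constraints.
    observed:   set of constraints collected for the function being matched.
    """
    scores = {name: len(cons & observed) for name, cons in candidates.items()}
    best = max(scores.values(), default=0)
    if best == 0:
        # No constraint overlaps, so nothing can be ruled out.
        return list(candidates)
    return [name for name, score in scores.items() if score == best]

candidates = {
    "memcpy_a": {("called", "g1"), ("adjacent", "g2")},
    "memcpy_b": {("called", "g3")},
}
print(filter_by_constraints(candidates, {("called", "g1")}))  # ['memcpy_a']
```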
97+
98+
#### Why don't we require matching on constraints for trivial functions?
99+
100+
The decision to match on constraints is left to the user. While requiring constraint matching for functions
101+
from all datasets can reduce false positives, it may not always be necessary. For example, when transferring functions
102+
from one version of a binary to another version of the same binary, not matching on constraints for trivial functions
103+
might be acceptable.
104+
105+
## Comparison of Function Recognition Tools
106+
107+
### WARP vs FLIRT
108+
109+
The main difference between **WARP** and **FLIRT** is the approach to identification.
110+
111+
#### Function Identification
112+
113+
- **WARP** identifies functions by a computed GUID, as described in [Function Identification](#function-identification).
114+
- **FLIRT** uses an incomplete function byte sequence with a mask, where there is a single function entry (see: [IDA FLIRT Documentation] for a full description).
115+
116+
What this means in practice is that **WARP** will have fewer false positives based solely on the initial function identification.
117+
When the returned set of functions is greater than one, we can use the list of [Function Constraints](#function-constraints) to select the best possible match.
118+
However, this comes at a cost: a GUID must be computed whenever a lookup is requested, and the function GUID must _**always**_ be the same.
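The difference between the two approaches can be illustrated with a simplified sketch. The `masked_match` function shows masked byte matching in general and is not FLIRT's actual signature format; `guid_lookup` and its table layout are likewise hypothetical.

```py
def masked_match(pattern, mask, data):
    """True if `data` agrees with `pattern` wherever the mask byte is nonzero."""
    if len(data) < len(pattern):
        return False
    return all(m == 0 or p == d for p, m, d in zip(pattern, mask, data))

def guid_lookup(table, guid):
    """Exact lookup: every function stored under the same GUID is a candidate."""
    return table.get(guid, [])

# The pattern covers only a prefix of the function and ignores the call target bytes.
pattern = bytes.fromhex("55e800000000")
mask = bytes.fromhex("ffff00000000")
print(masked_match(pattern, mask, bytes.fromhex("55e8b55b0100c3")))  # True
```

A masked prefix can match any function that starts the same way, whereas an exact GUID lookup only returns functions whose normalized bytes hashed to the same value.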
119+
120+
121+
[1]: https://devblogs.microsoft.com/oldnewthing/20110921-00/?p=9583
122+
[2]: https://devblogs.microsoft.com/oldnewthing/20221109-00/?p=107373
123+
[3]: https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-1-manual.pdf
124+
[IDA FLIRT Documentation]: https://docs.hex-rays.com/user-guide/signatures/flirt/ida-f.l.i.r.t.-technology-in-depth
125+
[Binary Ninja]: https://binary.ninja
126+
[Binary Ninja Integration]: https://github.com/Vector35/binaryninja-api/tree/dev/plugins/warp
