Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instruction translating for RIP-relative addresses. #64

Open
EthanZoneCoding opened this issue Feb 19, 2024 · 11 comments
Open

Instruction translating for RIP-relative addresses. #64

EthanZoneCoding opened this issue Feb 19, 2024 · 11 comments
Assignees

Comments

@EthanZoneCoding
Copy link

I think functions with RIP-relative instructions at the start are common enough to justify such a feature. I did a bunch of research, and it looks like PolyHook was able to do it by translating the instructions into equivalent ones with absolute addressing.
stevemk14ebr/PolyHook_2_0#119
I tried making some changes in the inline hook creator, specifically ff_hook, but I can't quite figure it out. I've gotten fairly familiar with the library but this assembly stuff is still a challenge I'm trying to learn from and get through. Is this something that could be implemented?
Thanks.

@cursey
Copy link
Owner

cursey commented Feb 19, 2024

I think functions with RIP-relative instructions at the start are common enough to justify such a feature.

The issue isn't as common as this statement makes it out to be. SafetyHook will already relocate RIP relative instructions as long as it can allocate space for a trampoline nearby (+-2gb). This is usually the case. So in reality the issue is functions that have RIP relative instructions at the start and we can't find space to allocate a trampoline nearby.

I did a bunch of research, and it looks like PolyHook was able to do it by translating the instructions into equivalent ones with absolute addressing.
stevemk14ebr/PolyHook_2_0#119

It's funny because I originally started SafetyHook as just a simpler alternative to the library you mentioned. It's impressive that they support translation of RIP relative instructions, but they lean on asmtk to do so. I do not want SafetyHook to have any additional dependencies so if any work is done towards this goal it should not introduce an additional dependency.

Is this something that could be implemented?

I think it could be. The tricky part is to keep it simple while not introducing any additional dependencies. Zydis, the library SafetyHook uses for instruction decoding also has the ability to encode instructions. This could probably do a lot of the heavy lifting, but isn't something I've really looked into yet.

Do you have a concrete example of where SafetyHook is failing and where mid-hooking the function in question is not an option?

@praydog
Copy link
Collaborator

praydog commented Feb 19, 2024

Yes, assemblers can widen some subsets of instructions to larger versions. Though some instructions will require some variant of:

push reg
movabs reg, address
cmp reg2, [reg]
pop reg

instead of:

cmp reg2, [rip+12345]

I believe Zydis has an instruction encoder which may be of use here.

@EthanZoneCoding
Copy link
Author

Do you have a concrete example of where SafetyHook is failing and where mid-hooking the function in question is not an option?

I tried with a mid hook just now, it seems to work most of the time. I say most because I still run into #52 randomly both with freezing and thread trapping. I think it might be a problem with my application. I'll keep looking to see if I can get mid-hooks to be stable.

@cursey cursey self-assigned this Feb 25, 2024
@cursey
Copy link
Owner

cursey commented Feb 25, 2024

I've done some initial testing for re-encoding IP relative instructions to be absolute that seems promising. Expect a PR soon for testing.

@bottiger1
Copy link
Contributor

bottiger1 commented May 14, 2024

I personally haven't run into this issue yet, but I looked at zydis and you can only assemble 5 instructions max which seems quite limiting.

https://github.com/zyantific/zydis/blob/fd3e9a6cc8bdcc617b531feda186699e51664f76/include/Zydis/Encoder.h#L57

@angelfor3v3r
Copy link
Contributor

angelfor3v3r commented May 14, 2024

I personally haven't run into this issue yet, but I looked at zydis and you can only assemble 5 instructions max which seems quite limiting.

https://github.com/zyantific/zydis/blob/fd3e9a6cc8bdcc617b531feda186699e51664f76/include/Zydis/Encoder.h#L57

That's the number of operands in a single instruction, not the number of instructions.

We could encode and extend these rip relative instructions with ease, I'm fairly certain. You can make multiple encoding requests: https://github.com/zyantific/zydis/blob/fd3e9a6cc8bdcc617b531feda186699e51664f76/examples/EncodeMov.c#L42-L57

@bottiger1
Copy link
Contributor

Oh right, never mind then. This seems more clunky than asmjit but I suppose it would work.

@angelfor3v3r
Copy link
Contributor

Oh right, never mind then. This seems more clunky than asmjit but I suppose it would work.

It's different, but introducing more dependencies is definitely something the library wants to avoid. I guess we're lucky Zydis has encoding now.

@cursey
Copy link
Owner

cursey commented May 20, 2024

Anyone have any ideas on what is causing this or how it might be fixed?

There may be a bug in the Linux version of vm_query.

std::expected<VmBasicInfo, OsError> vm_query(uint8_t* address) {
auto* maps = fopen("/proc/self/maps", "r");
if (maps == nullptr) {
return std::unexpected{OsError::FAILED_TO_QUERY};
}
char line[512];
unsigned long start;
unsigned long end;
char perms[5];
unsigned long offset;
int dev_major;
int dev_minor;
unsigned long inode;
char path[256];
unsigned long last_end =
reinterpret_cast<unsigned long>(system_info().min_address); // Track the end address of the last mapping.
auto addr = reinterpret_cast<unsigned long>(address);
std::optional<VmBasicInfo> info = std::nullopt;
while (fgets(line, sizeof(line), maps) != nullptr) {
path[0] = '\0';
sscanf(line, "%lx-%lx %4s %lx %x:%x %lu %255[^\n]", &start, &end, perms, &offset, &dev_major, &dev_minor,
&inode, path);
if (last_end < start && addr >= last_end && addr < start) {
info = {
.address = reinterpret_cast<uint8_t*>(last_end),
.size = start - last_end,
.access = VmAccess{},
.is_free = true,
};
break;
}
last_end = end;
if (addr >= start && addr < end) {
info = {
.address = reinterpret_cast<uint8_t*>(start),
.size = end - start,
.access = VmAccess{},
.is_free = false,
};
if (perms[0] == 'r') {
info->access.read = true;
}
if (perms[1] == 'w') {
info->access.write = true;
}
if (perms[2] == 'x') {
info->access.execute = true;
}
break;
}
}
fclose(maps);
if (!info.has_value()) {
return std::unexpected{OsError::FAILED_TO_QUERY};
}
return info.value();
}

Linux support was added recently and is less thoroughly tested than the other supported platforms, especially 32-bit Linux. A minimal stand-alone example that reproduces the bug would be helpful.

It's likely unrelated to this issue since you mentioned this is a 32-bit problem.

@angelfor3v3r
Copy link
Contributor

well I looked into vm_query as you suggested and it seems like it is returning nullopt unless I do this

info = VmBasicInfo{};

I don't know if this is caused by another contributor setting the std level to c++17 or if it was always broken, I'm testing it now.

Also maybe it might be good to abort the hook if vm_protect fails like this in the trap_threads function. If it did so then I wouldn't have spent so long debugging this.

Sorry for going off topic on this thread, I'll delete this stuff once I figure it out.

Update

Yeah it appears to be caused by setting std level to C++17.

I believe primary support was for modern C++(23), anything else hasn't been tested much by anyone. There are a lot of differences as you go down.

@bottiger1
Copy link
Contributor

bottiger1 commented May 20, 2024

well I looked into vm_query as you suggested and it seems like it is returning nullopt unless I do this
info = VmBasicInfo{};
I don't know if this is caused by another contributor setting the std level to c++17 or if it was always broken, I'm testing it now.
Also maybe it might be good to abort the hook if vm_protect fails like this in the trap_threads function. If it did so then I wouldn't have spent so long debugging this.
Sorry for going off topic on this thread, I'll delete this stuff once I figure it out.
Update
Yeah it appears to be caused by setting std level to C++17.

I believe primary support was for modern C++(23), anything else hasn't been tested much by anyone. There are a lot of differences as you go down.

yeah It wasn't my idea to downport it that much, I wanted to keep the library as is so we could keep it updated with the upstream with minimal hassle but not much I can do since it's not my project.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants