Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Heap streams return objects instead of raw bytes or strings #86

Merged
merged 6 commits into from
Mar 23, 2024

Conversation

malwarefrank
Copy link
Owner

@malwarefrank malwarefrank commented Mar 17, 2024

This adds support for easily identifying the RVA of any heap item, including Strings, Guids, and UserStrings. It does so by creating HeapItem classes that contain the raw bytes, the decoded values, and the rva where the raw bytes were found in the underlying PE.

The .get() member of each of these heap streams now returns these objects. This is a potential BREAKING CHANGE, however the HeapItem subclasses can be treated similarly to what they represent. The original .get() functions have been renamed to .get_bytes() or .get_str() where before they returned bytes or str.

HeapItemString and UserString can be indexed like and compared to strings. HeapItemBinary can be indexed like and compared to bytes objects.

Closes #85

@malwarefrank
Copy link
Owner Author

@c3rb3ru5d3d53c @williballenthin welcome testing or comments if you have some time before I merge this in

@williballenthin
Copy link
Contributor

From afar, these changes look reasonable and a nice improvement to the API. I wouldn't ask that any changes be made before merging.

(Opinion follows:) Personally, I'm not super keen on:

however the HeapItem subclasses can be treated similarly to what they represent.

This feels clever, and I'm not sure that's a good thing. I guess it's a trade off to keep client code a bit more concise. Will it so easy that users get themselves in trouble because the types aren't as they seem? shrug

Like I said, the changes look great, and although I volunteered an opinion, I don't think you should make any modifications unless you strongly agree.

@malwarefrank
Copy link
Owner Author

I agree. I originally thought it would be convenient, but decided to remove the str and bytes compatibility. I left the eq because it really is convenient and doesn't overly complicate, I think. We will see :)

@malwarefrank malwarefrank merged commit 096de1b into master Mar 23, 2024
4 checks passed
@malwarefrank malwarefrank deleted the heap_items branch March 23, 2024 04:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

more easily identify the location of heap stream items in the underlying PE
2 participants