Documentation #41

Open
mrx23dot opened this issue Jun 12, 2022 · 3 comments

Comments

@mrx23dot

mrx23dot commented Jun 12, 2022

readme.md doesn't mention:

  • What happens when encoding an arbitrary byte array (e.g. UTF-8 with transmission errors)?
  • Will unishox2_decompress_simple still be able to recover the original byte array?
  • Can decompress_simple take arbitrary/random input without crashing (or does the input need to be checksum-checked)?
  • Is it guaranteed that len(compressed data) <= len(original)?
  • The API doesn't describe the return values in code.
@siara-cc
Owner

siara-cc commented Jun 12, 2022

Replying here for now:

  • What happens when encoding an arbitrary byte array (e.g. UTF-8 with transmission errors)?
    It treats such bytes as binary characters and encodes them, but the compression ratio will be worse.

  • Will unishox2_decompress_simple still be able to recover the original byte array?
    It decodes according to the same rules, so the decompressed output may contain some unreadable parts.

  • Can decompress_simple take arbitrary/random input without crashing (or does the input need to be checksum-checked)?
    The C library won't write past the length of the buffer passed, so the input needs to be checksum-checked only if perfection is needed. The JavaScript library would throw a runtime exception, and the buffer would contain whatever had been decoded.

  • Is it guaranteed that len(compressed data) <= len(original)?
    Yes, in most real-life scenarios. But if you expect unusual sequences or garbled input, please allocate an extra safety margin in the buffers used for compression and decompression.

  • The API doesn't describe the return values in code.
    I will add them. For the C API, both compress and decompress return the number of bytes in the output. For the JS API, the compress functions return the number of bytes of compressed output, and the decompress function returns either the output string or the number of bytes of decompressed output, depending on the option specified in the input.
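
The checksum check mentioned in the third answer could look roughly like the sketch below. It is not part of Unishox: it assumes a hypothetical frame layout (compressed payload followed by a two-byte Fletcher-16 checksum), and `fletcher16`, `seal_frame`, and `verify_frame` are names invented for this illustration. A stronger checksum such as CRC-32 could be swapped in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fletcher-16 over a byte buffer (illustrative choice of checksum). */
uint16_t fletcher16(const unsigned char *data, size_t len) {
    uint16_t sum1 = 0, sum2 = 0;
    for (size_t i = 0; i < len; i++) {
        sum1 = (uint16_t)((sum1 + data[i]) % 255);
        sum2 = (uint16_t)((sum2 + sum1) % 255);
    }
    return (uint16_t)((sum2 << 8) | sum1);
}

/* Sender side: append the checksum after the payload.
 * Hypothetical layout: [payload][checksum high byte][checksum low byte].
 * The buffer must have room for payload_len + 2 bytes.
 * Returns the total frame length. */
size_t seal_frame(unsigned char *frame, size_t payload_len) {
    assert(frame != NULL);
    uint16_t sum = fletcher16(frame, payload_len);
    frame[payload_len] = (unsigned char)(sum >> 8);
    frame[payload_len + 1] = (unsigned char)(sum & 0xFF);
    return payload_len + 2;
}

/* Receiver side: returns the payload length if the checksum matches,
 * or -1 if the frame is too short or was corrupted in transit. */
long verify_frame(const unsigned char *frame, size_t frame_len) {
    if (frame == NULL || frame_len < 2) {
        return -1;
    }
    size_t payload_len = frame_len - 2;
    uint16_t stored =
        (uint16_t)((frame[payload_len] << 8) | frame[payload_len + 1]);
    return fletcher16(frame, payload_len) == stored ? (long)payload_len : -1;
}
```

A receiver would call `verify_frame` first and hand the payload to the decompressor only when the result is non-negative.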

@mrx23dot
Author

Thanks for the clarification!
So far it works great.

@siara-cc
Owner

  • The C Library won't write past the length of the buffer passed,

This applies only if the library is compiled with -DUNISHOX_API_WITH_OUTPUT_LEN=1 and unishox2_(de)compress() is called with the additional parameter giving the length of the provided buffer. Otherwise, the library always assumes that the caller provides a large enough buffer.
The length check is off by default to make the library faster.
I missed mentioning this earlier.
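
A caller using the length-aware form might wrap it defensively, along the lines of the sketch below. The function-pointer type only mirrors the shape described above (input, input length, output buffer, output capacity) and is an assumption, not copied from unishox2.h. `decode_to_cstring` and `stub_decode` are hypothetical names, and the stub merely stands in for a real decoder so the sketch compiles on its own. The wrapper makes no assumption about what the library returns on overflow; it simply rejects any result outside the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed shape of a length-aware decoder: returns a byte count. */
typedef int (*bounded_decode_fn)(const char *in, int len, char *out, int olen);

/* Stand-in decoder for demonstration only: copies at most olen bytes
 * and returns the number of bytes it would have needed. */
int stub_decode(const char *in, int len, char *out, int olen) {
    assert(olen >= 0);
    int n = len < olen ? len : olen;
    memcpy(out, in, (size_t)n);
    return len;
}

/* Decode into out (capacity cap), reserving one byte for a NUL terminator.
 * Returns the decoded length, or -1 if arguments are invalid or the
 * reported length does not fit in the buffer. */
int decode_to_cstring(bounded_decode_fn decode, const char *in, int len,
                      char *out, int cap) {
    if (decode == NULL || in == NULL || out == NULL || len < 0 || cap < 1) {
        return -1;
    }
    int n = decode(in, len, out, cap - 1);
    if (n < 0 || n > cap - 1) {
        out[0] = '\0'; /* do not expose a partial result */
        return -1;
    }
    out[n] = '\0';
    return n;
}
```

With a build that has the output-length parameter enabled, the real decompress function would be passed in place of `stub_decode`, assuming its signature matches the typedef.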
