Llama running in Codon. Needed Codon file/struct support. #490
Comments
Hi @dmahurin -- thanks for sharing this! It seems like the best long-term solution would simply be to support the `struct` module natively. In the meantime, you can do something like:

```python
s = '\xb0\xb1\xb2\xb3'
print(int(Ptr[i32](s.ptr)[0]))  # equivalent to struct.unpack('<i', b'\xb0\xb1\xb2\xb3')
```

This should be a lot more efficient than going through CPython, until we add proper support for `struct`.
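For readers outside Codon, the equivalence claimed in that comment can be checked directly in CPython (a minimal sketch using the same example bytes):

```python
import struct

# Unpack the same four bytes as a little-endian 32-bit signed int,
# which is what the Codon pointer cast above reinterprets them as.
(value,) = struct.unpack('<i', b'\xb0\xb1\xb2\xb3')
print(value)  # -1280134736
```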
Thanks @arshajii -- the llama2.py/codon change is now updated to use a trivial struct.codon based on your advice:
https://github.com/dmahurin/llama2.codon/blob/codon/struct.codon
An incomplete implementation to say the least, but enough.
Llama code now runs in Codon, with a 74x speedup compared to Python.
https://github.com/dmahurin/llama2.codon/
tairov/llama2.py#5
But to get this to work, changes/hacks were needed to support using 'struct' to read types from a file.
(the Codon workarounds slow down model and token loading)
See:
dmahurin/llama2.codon@1e2e7fa
Currently in Codon, file.read returns str instead of bytes, but bytes are required by Python's struct module.
One workaround is implementing a new file interface using file descriptors in Python:
https://github.com/dmahurin/llama2.codon/blob/codon/fdfile.py
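The linked fdfile.py is the actual workaround; the following is only a minimal sketch of the idea (the `FdFile` name is hypothetical): `os.read` on a raw file descriptor always returns bytes, sidestepping Codon's str-returning file.read.

```python
import os

class FdFile:
    """Sketch of a descriptor-based reader that yields bytes.

    Illustrative only; see fdfile.py in the linked repo for the
    real implementation used by llama2.codon.
    """
    def __init__(self, path: str):
        self.fd = os.open(path, os.O_RDONLY)

    def read(self, n: int) -> bytes:
        # os.read returns bytes, as struct requires
        return os.read(self.fd, n)

    def close(self) -> None:
        os.close(self.fd)
```

A caller can then do e.g. `struct.unpack('<f', f.read(4))` to read one float32 from a model file.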
The other, uglier workaround was to pack the bytes with '4B' to read 4 bytes at a time (inefficiently), then unpack with 'i' or 'f'.
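In CPython terms, that pack-then-unpack round trip looks roughly like this (a sketch; the length-4 str and the '<f' float format are illustrative assumptions):

```python
import struct

chunk = '\x00\x00\xc0\x3f'  # four "bytes" that came back as a str
# Re-pack the individual byte values with '4B' to get real bytes,
# then unpack them as a little-endian float32.
raw = struct.pack('4B', *(ord(c) for c in chunk))
(value,) = struct.unpack('<f', raw)
print(value)  # 1.5
```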
The best solution needs a fix in Codon itself. Perhaps the smallest fix would be for file.read to return bytes (b'...') when the file is opened in binary mode, and str otherwise.
Below shows the issue:
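A minimal CPython sketch of the mismatch described above (the example bytes are an assumption; Codon's file.read hands back the str form, which struct rejects):

```python
import struct

data_str = '\xb0\xb1\xb2\xb3'    # what Codon's file.read currently returns
data_bytes = b'\xb0\xb1\xb2\xb3'  # what Python's struct requires

try:
    struct.unpack('<i', data_str)  # str, not bytes
except TypeError as e:
    print('struct rejects str:', e)

# the bytes form works as expected
print(struct.unpack('<i', data_bytes)[0])
```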