Haskell bindings to an extremely limited subset of tiktoken-rs

iconnect/tiktoken-hs

tiktoken.hs

This library is a binding to an extremely limited (as in, one function) subset of the tiktoken-rs library. It exposes a single function, countTokens :: Text -> Word64, which counts the tokens in a piece of text and returns a result that should match the one reported by OpenAI itself (see, for example, their online tool).

Library design

This library uses the haskell-foreign-rust and haskell-rust-ffi libraries to call into tiktoken-rs, which is currently the industry standard for tokenisation. Internally, it is composed of a Rust wrapper and a Haskell library: the former is shipped alongside the latter, and a Custom setup script seamlessly builds the Rust wrapper before the Haskell library is built.
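
To make the split between the Rust wrapper and the Haskell library more concrete, here is a minimal sketch of what a wrapper's exported entry point could look like. It is not this project's actual code: the real wrapper goes through haskell-rust-ffi, whose calling mechanism may differ from the plain C ABI used here. The names count_tokens_ffi and count_tokens_impl are hypothetical, and the tokenizer is a whitespace-splitting stand-in so that the example compiles with the standard library alone.

```rust
use std::slice;
use std::str;

/// Stand-in tokenizer (assumption): counts whitespace-separated words.
/// A real wrapper would delegate to tiktoken-rs at this point instead.
fn count_tokens_impl(text: &str) -> u64 {
    text.split_whitespace().count() as u64
}

/// Hypothetical C-ABI entry point that foreign code could call.
/// Returns 0 for a null pointer or for input that is not valid UTF-8.
///
/// # Safety
/// `ptr` must be null or point to `len` readable bytes.
// On Rust toolchains older than 1.82, write `#[no_mangle]` instead.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn count_tokens_ffi(ptr: *const u8, len: usize) -> u64 {
    if ptr.is_null() {
        return 0;
    }
    // Reinterpret the caller's buffer as a byte slice without copying it.
    let bytes = unsafe { slice::from_raw_parts(ptr, len) };
    match str::from_utf8(bytes) {
        Ok(text) => count_tokens_impl(text),
        Err(_) => 0,
    }
}
```

In a design like this, the Haskell side would marshal a Text value to a UTF-8 buffer and reach the exported symbol through a foreign import, keeping the public countTokens function pure from the caller's point of view.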

For more information see the blog post Calling Purgatory from Heaven.

Building the project

This project requires a nightly version of the Rust toolchain as well as the cargo-c applet. You can install both with:

rustup toolchain install nightly
cargo install cargo-c

Then, you can build this project like any other Haskell library with cabal v2-build.
