Memory Usage During Ingestion - Can Be Cut In Half #386
johnbrisbin started this conversation in Ideas
I have been doing some extended testing of the ingestion phase of privateGPT (where you load up your documents). I have been using a collection of ~1500 epub books which are on average about 1MB each, about 1.57GB in total as Windows measures it.
In the standard configuration, toward the end of the process, memory usage with the DB loaded and ingestion running is a bit over 20GB. When the data is written to disk, it jumps to over 30GB. 64GB seemed like a lot when I built this machine; less so now. The final size of the DB is about 11GB.
In the alternate configuration, memory usage with the DB loaded and during ingestion is right at 10GB, which jumps ~50% to 15GB when persisting to disk. The final size of the DB is about 5.5GB.
On this machine, books are ingested at about 100/hr in either configuration.
The difference between the two configurations is the chunking: in the standard configuration, documents are chunked into pieces of at most 500 bytes with 50 bytes of overlap, while in the alternate configuration, documents are chunked into pieces of at most 1000 bytes with 100 bytes of overlap.
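For reference, this is roughly the only change involved. A minimal sketch, assuming the ingest script splits documents with LangChain's RecursiveCharacterTextSplitter and that the 500/50 and 1000/100 figures map directly to its chunk_size/chunk_overlap parameters (the file name and wiring here are my assumptions, not privateGPT's exact code; the splitter counts characters rather than bytes, which for mostly-ASCII text amounts to the same thing):

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Hypothetical input file, just for illustration.
text = open("some_book.txt", encoding="utf-8").read()

# Standard configuration: chunks of at most 500 characters with 50 overlapping.
standard = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)

# Alternate configuration: chunks of at most 1000 characters with 100 overlapping,
# which produces roughly half as many chunks (and therefore embeddings).
alternate = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)

print(len(standard.split_text(text)), "chunks vs", len(alternate.split_text(text)), "chunks")
```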
This reduces the number of embeddings by a bit more than half, and the vectors of numbers stored for each embedded chunk make up the bulk of the space used. The 'a bit more' is because larger chunks pack slightly more efficiently than smaller ones: nominal 500-byte chunks average a little under 400 bytes, while nominal 1000-byte chunks run a bit over 800 bytes on average.
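As a rough sanity check on why the vectors dominate, here is a back-of-envelope sketch. The 384-dimensional float32 embedding is an assumption on my part (the post doesn't say which embedding model was used); only the corpus size and average chunk sizes come from the test above:

```python
# Estimate raw vector storage for the ~1.57GB corpus under both chunk sizes.
CORPUS_BYTES = 1.57e9
EMBED_DIM = 384          # assumed embedding dimension (not measured)
BYTES_PER_FLOAT = 4      # float32

for label, avg_chunk_bytes in [("standard, ~400B chunks", 400),
                               ("alternate, ~800B chunks", 800)]:
    n_chunks = CORPUS_BYTES / avg_chunk_bytes
    vector_gb = n_chunks * EMBED_DIM * BYTES_PER_FLOAT / 1e9
    print(f"{label}: ~{n_chunks / 1e6:.1f}M chunks, ~{vector_gb:.1f} GB of raw vectors")
```

Under those assumptions the raw vectors alone come to roughly 6GB vs 3GB, which is in the same ballpark as the 11GB vs 5.5GB on-disk figures once chunk text, metadata, and index overhead are added.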
The size of the embeddings affects more than just the size of the database in memory or on disk. It has effects on query operations as well (and hopefully you spend more time querying than ingesting in the long term).
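On the query side, the most direct effect is how much text a fixed top-k retrieval pulls into the prompt. A sketch assuming the store is a Chroma DB queried through LangChain (an assumption on my part; the persist directory, query string, and k value are made up):

```python
from langchain.vectorstores import Chroma
from langchain.embeddings import HuggingFaceEmbeddings

# Open the persisted vector store (directory name assumed).
db = Chroma(persist_directory="db", embedding_function=HuggingFaceEmbeddings())

# With 1000-byte chunks, the same k returns roughly twice as much context text,
# so k may need to be lowered to stay within the LLM's context window.
docs = db.similarity_search("a question about one of the books", k=4)
print(sum(len(d.page_content) for d in docs), "characters of retrieved context")
```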
Theoretical pros and cons:
Pros -
Cons -
On balance, cutting the memory requirement in half by doubling the chunk size looks like a win to me.
What do you think? Are there other downsides to increasing the chunk size?