Description
I have been using this for about a week now and I'm loving it (almost), and more people should know about this program. I currently have two problems with it.
-
I set a prompt like this: "Please write a story set in modern times; the story should contain 10 chapters of 1000-1500 words each." Then I add the story details. I have noticed that if you use a single model for all steps, it hits the context limit really quickly (I initially thought that none of my models worked, as they would all start looping). I then tried copying the model file in Ollama to a new name, but it must have recognized it as the same file, because it didn't reload. Is there a way to make Ollama reload the model at each model stage of config.py? Nothing jumped out at me in the Ollama library (I only know very basic Python). This would reset the context for the model at each stage and clear up part of the problem.
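For what it's worth, the Ollama Python client's `generate`/`chat` calls accept a `keep_alive` parameter; setting it to `0` asks the server to unload the model right after it responds, so the next stage starts from a fresh load instead of warm state. A minimal sketch of how that could look per stage (the model name and prompt are placeholders, and the actual call needs a running Ollama server, so it's shown commented out):

```python
def fresh_generate_kwargs(model: str, prompt: str) -> dict:
    """Build kwargs for ollama.generate(); keep_alive=0 asks the
    server to unload the model immediately after responding, so
    the next stage reloads it from scratch."""
    return {"model": model, "prompt": prompt, "keep_alive": 0}

# Usage (requires the `ollama` package and a running server):
# import ollama
# response = ollama.generate(**fresh_generate_kwargs("llama3", "Chapter 1: ..."))
```

Note this only controls when the model is unloaded; whether it actually stops the looping depends on how the program carries context between stages.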
-
I can also see that Ollama has a context of only 2k for all models, but looking at the Ollama Python library I saw a reference to num_ctx in _types.py under:
class Options(TypedDict, total=False):
    # load time options
    num_ctx: int
I think it could be set somewhere in your code, but I have no idea where it would go. (1000 words should only be around 1400 tokens plus overhead, but with a 2k context that's not going far.)
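Since `num_ctx` lives in that `Options` TypedDict, the usual way to set it is through the `options` dict that `ollama.chat`/`ollama.generate` accept per request, which raises the context window above the 2k default. A small sketch (the model name and the 8192 value are just illustrative choices; the real call is commented out because it needs a running server):

```python
def chat_kwargs_with_ctx(model: str, messages: list, num_ctx: int = 8192) -> dict:
    """Build kwargs for ollama.chat() with an enlarged context window.
    num_ctx is a load-time option, so it takes effect when the model
    is (re)loaded for this request."""
    return {
        "model": model,
        "messages": messages,
        "options": {"num_ctx": num_ctx},
    }

# Usage (requires the `ollama` package and a running server):
# import ollama
# reply = ollama.chat(**chat_kwargs_with_ctx(
#     "llama3", [{"role": "user", "content": "Write chapter 1..."}]))
```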
Thank you
[edit]
Just found that num_ctx is already listed in wrapper.py but not implemented. Not sure how to implement it.