
ValidationError: Input validation error: inputs must have less than 4096 tokens. Given: 4545 #1103

Open · asma-10 opened this issue Apr 18, 2024 · 0 comments

asma-10 commented Apr 18, 2024

Describe the bug

I was using meta-llama/Llama-2-7b-chat-hf from Hugging Face in a RAG pipeline, and it used to work perfectly, but then I suddenly received this error:

HfHubHTTPError: 422 Client Error: Unprocessable Entity for url: https://api-inference.huggingface.co/models/meta-llama/Llama-2-7b-chat-hf (Request ID: gPxf6Ns0plH9zveHLZP_A)

Input validation error: `inputs` must have less than 4096 tokens. Given: 4545
Make sure 'text-generation' task is supported by the model.

This is the code I used:

# Imports assume the llama-index >= 0.10 package layout; older versions differ.
from llama_index.core.postprocessor import SentenceTransformerRerank
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.llms.huggingface import HuggingFaceInferenceAPI
from llama_index.retrievers.bm25 import BM25Retriever

# LLM served through the Hugging Face Inference API
llm = HuggingFaceInferenceAPI(model_name="meta-llama/Llama-2-7b-chat-hf", api_key=hf_token)
# Rerank the retrieved nodes and keep only the top 4
rerank = SentenceTransformerRerank(
    model="BAAI/bge-reranker-v2-m3", top_n=4
)
# BM25 retriever returning the 10 most similar nodes from the index
bm25_retriever = BM25Retriever.from_defaults(index=index, similarity_top_k=10)
query_engine = RetrieverQueryEngine.from_args(
    retriever=bm25_retriever,
    llm=llm,
    node_postprocessors=[rerank]
)
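For reference, the prompt size can be checked locally before the request is sent. This is only a minimal sketch, assuming the matching transformers tokenizer; prompt stands in for the final RAG prompt that llama-index assembles from the retrieved nodes:

from transformers import AutoTokenizer

# Tokenizer for the same model (the repo is gated, so this needs an HF token).
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf", token=hf_token)

def count_tokens(prompt: str) -> int:
    """Number of tokens the Inference API would receive for this prompt."""
    return len(tokenizer.encode(prompt))

# Anything above the model's 4096-token context window triggers the 422 error
# above; lowering similarity_top_k or the reranker's top_n shrinks the prompt.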

Runtime Environment

  • Model: llama-2-7b-chat-hf, llama-2-7b-hf
  • Using via Hugging Face?: yes
  • OS: Windows
  • GPU VRAM: Colab's GPU