@athewsey
Collaborator

Issue #, if available: N/A

Description of changes:

Basic upgrade of fmeval v1.0->v1.2

Outstanding issues:

  1. As noted in aws/fmeval#330 (Option to disable BERTScore in QAAccuracy), fmeval 1.2 introduced BERTScore, which is slow to calculate: running the qa_accuracy eval on our 20-question example dataset goes from ~30sec to >3min with this change, making the workflow feel much less interactive.
    • Maybe upstream could provide a configuration so we could turn this score off in our app?
    • Maybe we could refactor our app to support continuing to work asynchronously?
    • Bumping up the compute of the ECS app could calculate results faster, but would increase solution cost.
  2. Newer fmeval should support multi-variable prompt templates, so our app's logic could probably be simplified to rely more on this functionality. The current change still uses only $prompt and fills in the other variables within our custom code.
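
For reviewers, here is a minimal sketch of the "fill other variables in custom code" approach from item 2, using only Python's standard `string.Template`. It is not the app's actual code: the helper name `prefill_template` and the variables `context` and `language` are hypothetical, chosen for illustration. The idea is to substitute everything except `$prompt`, so the resulting single-variable template can still be handed to fmeval.

```python
from string import Template


def prefill_template(template_text: str, variables: dict[str, str]) -> str:
    """Fill every variable except $prompt, leaving $prompt for later substitution.

    Hypothetical helper for illustration; not part of fmeval or this app.
    """
    # Drop any caller-supplied "prompt" value so the placeholder survives.
    others = {k: v for k, v in variables.items() if k != "prompt"}
    # safe_substitute leaves placeholders without a value (here, $prompt) untouched
    # instead of raising KeyError as substitute() would.
    return Template(template_text).safe_substitute(others)


raw = "Context: $context\nQuestion: $prompt\nAnswer in $language."
prefilled = prefill_template(
    raw, {"context": "fmeval release notes", "language": "English"}
)
print(prefilled)
```

If fmeval's multi-variable templates turn out to cover our use cases, a helper like this could presumably be dropped in favour of passing all variables through directly.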

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
