Upgrade fmeval in prompt app #8

athewsey · 2024-10-29T07:46:23Z

Issue #, if available: N/A

Description of changes:

Basic upgrade of fmeval v1.0->v1.2

Outstanding issues:

As per aws/fmeval#330 ("Option to disable BERTScore in QAAccuracy"), fmeval 1.2 introduced BERTScore, which is slow to calculate: running the qa_accuracy eval on our 20-question example dataset goes from ~30 sec to >3 min with this change, which makes the workflow feel much less interactive.
- Maybe upstream could provide a configuration so we could turn this score off in our app?
- Maybe we could refactor our app to support continuing to work asynchronously?
- Bumping up the compute of the ECS app could calculate results faster, but would increase solution cost?
Newer fmeval should support multi-variable prompt templates, so our app's logic could probably be simplified to rely more on this functionality. The current change still uses only $prompt and fills in the other variables within our custom code.
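
For reviewers unfamiliar with the current approach, here is a minimal sketch of the "fill in the other variables ourselves, leave $prompt for the eval library" pattern. It uses only Python's standard `string.Template` and assumes `$`-style placeholders; the helper name `prefill_template` and the example variable names are hypothetical, and this is not the app's actual code or an fmeval API.

```python
from string import Template


def prefill_template(template_text: str, variables: dict) -> str:
    """Fill every placeholder except $prompt, which is left for later.

    Hypothetical helper for illustration only.
    """
    other_vars = {
        # Escape "$" in values so the result is still a valid template
        # when the remaining $prompt placeholder is substituted later.
        key: str(value).replace("$", "$$")
        for key, value in variables.items()
        if key != "prompt"  # never consume the $prompt placeholder here
    }
    # safe_substitute leaves unknown placeholders (here, $prompt) untouched.
    return Template(template_text).safe_substitute(other_vars)


raw = "Context: $context\nQuestion: $prompt\nAnswer in $language."
single_var = prefill_template(raw, {"context": "fmeval docs", "language": "English"})
print(single_var)
```

The result is a template with a single remaining `$prompt` variable. If the upgraded fmeval handles multi-variable templates as expected, a step like this could probably be dropped.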

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

chore(prompt): Upgrade fmeval to v1.2.1

aeef8fc
