I am not sure what backend the Textract demo uses, but I have personally noticed that Textract's async calls produce better results than their sync equivalents.
Since your input is just an image (a one-page document), it is tempting to call the sync API, extractor.analyze_expense(), because it is quicker and has less overhead. Try the async extractor.start_expense_analysis() instead and compare the results.
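A rough sketch of that comparison, in case it helps. It assumes the Textractor document objects expose the raw Textract JSON through a `response` attribute, and that the async call needs an `s3_upload_path` for a local file; please check both against the Textractor docs. The helper names `summary_field_types` and `fields_only_in_async`, and the bucket path in the usage note, are made up for illustration.

```python
def summary_field_types(response):
    """Collect the Type.Text of every summary field in a raw
    AnalyzeExpense-style response dict (ExpenseDocuments -> SummaryFields)."""
    types = set()
    for expense_doc in response.get("ExpenseDocuments", []):
        for field in expense_doc.get("SummaryFields", []):
            type_text = field.get("Type", {}).get("Text")
            if type_text:
                types.add(type_text)
    return types


def fields_only_in_async(file_source, s3_upload_path, profile_name):
    """Run the sync and async expense APIs on the same file and return the
    summary field types that only the async result contains.

    Hypothetical helper: calling it needs AWS credentials and an S3 location.
    """
    # Imported here so the pure helper above works without textractor installed.
    from textractor import Textractor

    extractor = Textractor(profile_name=profile_name)

    sync_doc = extractor.analyze_expense(
        file_source=file_source,
        save_image=False,
    )
    async_doc = extractor.start_expense_analysis(
        file_source=file_source,
        s3_upload_path=s3_upload_path,  # assumption: required for local files
        save_image=False,
    )

    # Assumption: both objects give access to the raw JSON via `.response`.
    sync_types = summary_field_types(sync_doc.response)
    async_types = summary_field_types(async_doc.response)
    return async_types - sync_types
```

Calling something like `fields_only_in_async("test.jpg", "s3://your-bucket/prefix/", "your-profile")` and checking whether `"INVOICE_RECEIPT_ID"` is in the returned set would show whether the async path picks up the missing field.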
Hello, I'm having an issue with the amazon-textract-textractor library. It doesn't detect the INVOICE_RECEIPT_ID field, but the AWS Textract Demo does detect it.
Here is the AWS Textract Demo:
amazon-textract-textractor:
Here is the sample code:
from textractor import Textractor

extractor = Textractor(profile_name="")
document = extractor.analyze_expense(
    file_source="test.jpg",
    save_image=False,
)
expense_doc = document.expense_documents[0]
summary_fields = expense_doc.summary_fields
line_field = expense_doc.line_items_groups
print(summary_fields)
Sample Receipt: