Microsoft Excel ingestion #1412
Replies: 3 comments 2 replies
-
Any answer?
-
I have the same problem with docx files. Maybe it's some Microsoft garbage? I modified the Python script that loads the documents: if loading a document ends with an error, it writes a line to the log and continues with the next document.
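A minimal sketch of that skip-and-log workaround, in case it helps others. The actual ingestion script is not shown in this thread, so `load_all` and the `loader` argument are hypothetical names standing in for whatever function your script uses to load a single document.

```python
import logging

logger = logging.getLogger("ingest")


def load_all(paths, loader):
    """Load every path with `loader`; log and skip the ones that fail.

    `loader` is a hypothetical stand-in for the per-document loading
    function of the real ingestion script.
    """
    loaded = []
    failed = []
    for path in paths:
        try:
            loaded.append(loader(path))
        except Exception as exc:  # e.g. UnicodeDecodeError from a binary file
            # Record the failure and carry on with the next document.
            logger.error("Skipping %s: %s", path, exc)
            failed.append(path)
    return loaded, failed
```

Returning the list of failed paths alongside the loaded documents makes it easy to see afterwards which files still need a different loader. Note that this only hides the error; the skipped files are not ingested.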
-
The error suggests that the file is being decoded with the wrong character encoding. When ingesting Microsoft Excel (*.xlsx) files, this happens if the file is read as plain text, or through a loader that implicitly assumes a text encoding. An .xlsx file is not text: it is a collection of XML files compressed into a single ZIP package. Reading it directly as UTF-8 text therefore fails with errors like the one you're seeing.
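To illustrate the ZIP structure, here is a small sketch that uses only Python's standard `zipfile` module. The helper name `list_xlsx_parts` is made up for this example, and the in-memory archive below is only a stand-in for a real workbook, not a valid .xlsx file.

```python
import io
import zipfile


def list_xlsx_parts(source):
    """Return the member names inside an .xlsx package.

    `source` may be a file path or a binary file-like object.
    Raises ValueError if it is not a ZIP archive at all.
    """
    if not zipfile.is_zipfile(source):
        raise ValueError("not a ZIP package, so not a valid .xlsx file")
    with zipfile.ZipFile(source) as package:
        return package.namelist()


# Build a tiny in-memory ZIP as a stand-in for a workbook (assumption:
# a real .xlsx contains many more parts than this single entry).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("xl/workbook.xml", "<workbook/>")

print(buffer.getvalue()[:2])    # b'PK', the ZIP signature, not text
print(list_xlsx_parts(buffer))  # ['xl/workbook.xml']
```

In practice you would not parse the XML parts yourself. A spreadsheet library such as openpyxl, or a document loader that is explicitly meant for Excel files, opens the package in binary mode and handles the contents for you.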
-
When I try to ingest Microsoft Excel (*.xlsx) files, I get this error message:
'utf-8' codec can't decode bytes in position 15-16: invalid continuation byte
The files are in English, with only standard Latin characters.
Is there any way around this, maybe a different document loader?