Pillow doesn't support WMF/EMF format in Linux #153

zengyulu · 2024-02-12T23:13:47Z

It looks like RapidOCR uses Pillow to load images. This is a problem for WMF/EMF images, which are sometimes embedded in MS PowerPoint and Word documents. When RapidOCR is used to scan and recognize such Office documents, it raises an exception: "cannot find loader for this WMF file".

According to the Pillow documentation, full support for the WMF/EMF formats is only available on Windows. As I understand it, Pillow calls the Win32 API to do the actual loading of WMF files. See:
python-pillow/Pillow#2971

So maybe we could compile from source to get it supported (I haven't tried this). But more generally, could a different image loader be used here, such as OpenCV or matplotlib?

Thanks

SWHL · 2024-02-18T01:46:37Z

Maintainer

Supplementary material

Environment:

OS: macOS
Python: 3.10
rapidocr_onnxruntime: 1.3.11

Code:

from rapidocr_onnxruntime import RapidOCR

engine = RapidOCR()

image_path = "1.wmf"
with open(image_path, "rb") as f:
    img = f.read()

result, elapse_list = engine(img)
print(result)
print(elapse_list)

WMF Image:

1.wmf.zip

Error:

(xxxx) ➜  python git:(main) ✗ python demo.py
Traceback (most recent call last):
  File "/Users/xxxx/projects/RapidOCR/python/demo.py", line 19, in <module>
    result, elapse_list = engine(img)
  File "/Users/xxxx/projects/RapidOCR/python/rapidocr_onnxruntime/main.py", line 79, in __call__
    img = self.load_img(img_content)
  File "/Users/xxxx/projects/RapidOCR/python/rapidocr_onnxruntime/utils.py", line 125, in __call__
    img = self.load_img(img)
  File "/Users/xxxx/projects/RapidOCR/python/rapidocr_onnxruntime/utils.py", line 139, in load_img
    img = np.array(Image.open(BytesIO(img)))
  File "/Users/xxxx/miniconda3/envs/xxxx/lib/python3.10/site-packages/PIL/Image.py", line 696, in __array_interface__
    new["data"] = self.tobytes()
  File "/Users/xxxx/miniconda3/envs/xxxx/lib/python3.10/site-packages/PIL/Image.py", line 754, in tobytes
    self.load()
  File "/Users/xxxx/miniconda3/envs/xxxx/lib/python3.10/site-packages/PIL/WmfImagePlugin.py", line 160, in load
    return super().load()
  File "/Users/xxxx/miniconda3/envs/xxxx/lib/python3.10/site-packages/PIL/ImageFile.py", line 344, in load
    raise OSError(msg)
OSError: cannot find loader for this WMF file
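
The traceback above shows the failure surfacing only when the pixel data is loaded, after Pillow has already recognised the file as WMF. One possible guard, sketched here as an assumption rather than anything RapidOCR does, is to sniff the raw bytes before handing them to the engine so that metafiles can be routed to a separate conversion step. The helper name `sniff_metafile` is hypothetical; the byte signatures are the placeable-WMF key and the EMF header signature, and a WMF file without the placeable header would not be detected by this check.

```python
from typing import Optional

# Placeable WMF files start with this 4-byte key followed by a zero handle.
_WMF_PLACEABLE_MAGIC = b"\xd7\xcd\xc6\x9a\x00\x00"
# EMF files start with a header record of type 1 and carry " EMF" at offset 40.
_EMF_RECORD_TYPE = b"\x01\x00\x00\x00"
_EMF_SIGNATURE = b" EMF"


def sniff_metafile(data: bytes) -> Optional[str]:
    """Return "WMF" or "EMF" if the bytes look like a metafile, else None.

    Hypothetical helper: it only inspects the leading bytes and does not
    validate the rest of the file.
    """
    if data[:6] == _WMF_PLACEABLE_MAGIC:
        return "WMF"
    if data[:4] == _EMF_RECORD_TYPE and data[40:44] == _EMF_SIGNATURE:
        return "EMF"
    return None
```

A caller could check `sniff_metafile(img)` on the bytes read in the demo above and, when it returns a value, convert the file with an external tool instead of passing it to `engine(img)` directly.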

SWHL · 2024-02-18T02:10:43Z

Maintainer

On macOS and Linux, I have temporarily found a better way to read WMF image files.
If you have a better one, you are welcome to submit a PR.

2 replies

igordevezas Nov 12, 2024

Which way?

SWHL Nov 12, 2024
Maintainer

That comment contained a typo; we haven't found a better solution yet.

zengyulu · 2024-02-18T12:09:33Z

Author

I think the best approach for now is to convert unsupported formats to PDF files, which are also easier to parse and embed.

SWHL · 2024-02-19T00:31:13Z

Maintainer

I searched but found no suitable tool for converting WMF images to PDF.
This approach does not seem to be a simple one.

zengyulu · 2024-02-19T09:37:20Z

Author

You can install LibreOffice to convert or load WMF/EMF files. See:
https://dev.blog.documentfoundation.org/2022/04/26/supporting-metafile-formats-wmf-emf-emfplus/
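
As a rough illustration of the LibreOffice route, the sketch below shells out to LibreOffice in headless mode to convert a metafile to PNG. It assumes LibreOffice is installed and exposed on PATH as `soffice` or `libreoffice`; the function names `build_convert_command` and `convert_metafile` are hypothetical and not part of RapidOCR, and the output filename is assumed to be the input stem plus the target extension. It has not been verified against the `1.wmf` sample from this thread.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_convert_command(
    src: Path, out_dir: Path, fmt: str = "png", soffice: str = "soffice"
) -> List[str]:
    """Build the LibreOffice headless conversion command line."""
    return [
        soffice,
        "--headless",
        "--convert-to", fmt,
        "--outdir", str(out_dir),
        str(src),
    ]


def convert_metafile(src: str, out_dir: str, fmt: str = "png") -> Path:
    """Convert a WMF/EMF file with LibreOffice and return the output path."""
    src_path, out_path_dir = Path(src), Path(out_dir)
    # Assumption: the executable is named "soffice" or "libreoffice".
    soffice = shutil.which("soffice") or shutil.which("libreoffice")
    if soffice is None:
        raise RuntimeError("LibreOffice executable not found on PATH")
    out_path_dir.mkdir(parents=True, exist_ok=True)
    cmd = build_convert_command(src_path, out_path_dir, fmt, soffice)
    subprocess.run(cmd, check=True, timeout=120)
    # Assumption: LibreOffice names the result <input stem>.<fmt>.
    result = out_path_dir / f"{src_path.stem}.{fmt}"
    if not result.exists():
        raise RuntimeError(f"Expected output file was not created: {result}")
    return result
```

The converted PNG could then be read as bytes and passed to the engine in the same way as the demo earlier in the thread, for example `engine(convert_metafile("1.wmf", "converted").read_bytes())`.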

2 replies

SWHL Feb 20, 2024
Maintainer

After thinking it over for a long time, I concluded that it would be inappropriate to put code for processing WMF images in the RapidOCR repository.

If you need to process WMF images, please handle that on your side for now. If there is enough need, I will consider adding WMF image processing code here.

zengyulu Feb 28, 2024
Author

Thanks for the reply. I will process the MS documents with LibreOffice first.
