Docker on macOS arm64 #1089

Open
jgoodall opened this issue Mar 5, 2024 · 5 comments
Labels: docker; macOS-specific (Issue visible only on macOS environments); need help (Issues where the contributors are even more incompetent than usual)

@jgoodall

jgoodall commented Mar 5, 2024

I cannot run the full docker image on an arm64 Mac (environment details below). I can run docker run --init -p 8070:8070 lfoppiano/grobid:0.8.0 and that works, but if I try the full image, it immediately quits (same result with tag 0.8.1-SNAPSHOT):

docker run --init -p 8070:8070 grobid/grobid:0.8.0
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
WARN  [2024-03-05 16:57:33,076] org.hibernate.validator.internal.properties.javabean.JavaBeanExecutable: HV000254: Missing parameter metadata for ResponseMeteredLevel(String, int), which declares implicit or synthetic parameters. Automatic resolution of generic type information for method parameters may yield incorrect results if multiple parameters have the same erasure. To solve this, compile your code with the '-parameters' flag.
The TensorFlow library was compiled to use AVX instructions, but these aren't available on your machine.

I am using OrbStack, and if I uncheck "Use Rosetta to run Intel code", I get the same error about SSE4.1 instructions instead of AVX instructions. My understanding, based on the docs and the issues, is that the deep learning image should work on CPUs on macOS using emulation, but I don't see instructions for doing that. I see a reference to libwapiti.dylib, but not how to get it or what to do with it.
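
As a pre-flight check, the platform mismatch reported in the warning above can be detected before starting the container. The following is a minimal sketch, not part of GROBID or Docker: the helper names docker_arch and needs_emulation are hypothetical, and the mapping covers only the two architectures discussed in this issue.

```shell
#!/bin/sh
# Hypothetical helpers (not part of GROBID or Docker).

# Map a machine name, as printed by `uname -m`, to Docker's architecture name.
docker_arch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *)             echo "$1" ;;   # unknown names are passed through unchanged
  esac
}

# Print "native" if an image built for architecture $2 matches host $1,
# otherwise print "emulated".
needs_emulation() {
  if [ "$(docker_arch "$1")" = "$(docker_arch "$2")" ]; then
    echo native
  else
    echo emulated
  fi
}
```

Assuming the image has already been pulled, needs_emulation "$(uname -m)" "$(docker image inspect --format '{{.Architecture}}' grobid/grobid:0.8.0)" should print "emulated" on the arm64 host described here. Passing --platform linux/amd64 to docker run makes the emulation explicit and silences the warning, but it does not by itself make AVX available.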

Would it be possible to provide detailed instructions for getting the grobid/grobid Docker image working on a Mac?

Related issues:

Environment details

macOS is 14.3.1 with an M2 chip. OrbStack 1.4.3.

Docker Info:

Client:
 Version:           25.0.3
 API version:       1.44
 Go version:        go1.21.6
 Git commit:        4debf41
 Built:             Tue Feb  6 21:13:26 2024
 OS/Arch:           darwin/arm64
 Context:           orbstack

Server: Docker Engine - Community
 Engine:
  Version:          25.0.3
  API version:      1.44 (minimum version 1.24)
  Go version:       go1.21.6
  Git commit:       f417435
  Built:            Tue Feb  6 21:14:35 2024
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          v1.7.13
  GitCommit:        7c3aca7a610df76212171d200ca3811ff6096eb8
 runc:
  Version:          1.1.12
  GitCommit:        51d5e94601ceffbbd85688df1c928ecccbfa4685
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

lfoppiano added the macOS-specific and docker labels Mar 6, 2024
@lfoppiano
Collaborator

Dear @jgoodall,
thank you for reporting this issue.

Indeed, we can run Grobid on M1 (and, I suppose, M2, though I have not tested it) outside Docker (#935, #1002).
For Docker, however, the situation is a bit different.
There is a Docker image that should work fine on ARM, lfoppiano/grobid:0.8.0-arm, although, given the lack of time, there is certainly room for improvement.
This image is available for versions 0.7.3 and 0.8.0, but these builds are not documented yet because they were not very stable when we tested them (#1014).

The use case is a bit weak for me, as I run Grobid on my Mac only for development and prefer to run the server on a Linux machine for serious processing.

Maybe give the 0.8.0-arm image a try and let me know. In the meantime, we can try to improve support for the ARM architecture.

@jgoodall
Author

jgoodall commented Mar 7, 2024

To be clear, the lfoppiano/grobid image works fine on the M2 chip. It is the grobid/grobid image that is not working.

That said, since you already have the arm image for the lfoppiano/grobid build, you can combine the two images you have into a single multi-arch build with one manifest, so that users automatically get the right image for their architecture. I tested that and pushed to my namespace, but you would obviously want to do this in the lfoppiano namespace. (I am assuming you build the arm image using --build-arg ARCH=arm64v8, but I am not seeing where in the repo that is being built.) Here is a proof-of-concept of having a single multi-arch build for the CRF image:

VERS=0.8.0

docker pull lfoppiano/grobid:$VERS
docker tag lfoppiano/grobid:$VERS jgoodall/grobid-crf:$VERS-amd64
docker push jgoodall/grobid-crf:$VERS-amd64

docker pull lfoppiano/grobid:$VERS-arm
docker tag lfoppiano/grobid:$VERS-arm jgoodall/grobid-crf:$VERS-arm64v8
docker push jgoodall/grobid-crf:$VERS-arm64v8

docker manifest create jgoodall/grobid-crf:$VERS \
--amend jgoodall/grobid-crf:$VERS-amd64 \
--amend jgoodall/grobid-crf:$VERS-arm64v8
docker manifest push jgoodall/grobid-crf:$VERS
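
The same sequence could be parameterized so that it can be reviewed as a dry run before anything is pushed. This is only a sketch: manifest_commands is a hypothetical helper that prints the commands instead of running them, and its default repository names are simply the ones used in the proof-of-concept above.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints, but does not execute, the commands
# needed to publish one multi-arch tag from two single-arch tags.
# Usage: manifest_commands VERSION [SOURCE_REPO] [TARGET_REPO]
manifest_commands() {
  vers="$1"
  src="${2:-lfoppiano/grobid}"
  dst="${3:-jgoodall/grobid-crf}"

  # Each pair is "<source tag suffix>:<target architecture suffix>".
  for pair in ":amd64" "-arm:arm64v8"; do
    suffix="${pair%%:*}"   # "" for amd64, "-arm" for the arm build
    arch="${pair##*:}"     # "amd64" or "arm64v8"
    echo "docker pull $src:$vers$suffix"
    echo "docker tag $src:$vers$suffix $dst:$vers-$arch"
    echo "docker push $dst:$vers-$arch"
  done

  # Combine both architecture-specific tags under a single manifest list.
  echo "docker manifest create $dst:$vers --amend $dst:$vers-amd64 --amend $dst:$vers-arm64v8"
  echo "docker manifest push $dst:$vers"
}
```

Running manifest_commands 0.8.0 prints the eight commands for review; piping the output to sh would execute them, which requires push access to the target repository.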

I imagine doing the same for the full image will be trickier, since the openjdk runtime image you use for the CRF image already supports arm, but the tensorflow image you use for the full Docker image does not. I would be happy to help, but I am not a Java person, so if someone can create an appropriate arm64 image for grobid/grobid, I can help figure out the docker manifest commands.

We also run on Mac only for development and on Linux for production, and would like to be able to use the same image (grobid/grobid) on both. My workaround right now is to use lfoppiano/grobid on Mac and grobid/grobid on Linux, but we would much prefer to use the same image in both environments. I just cannot figure out how to do that, so any guidance would be great.
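
Until a single multi-arch image exists, this workaround can be scripted. A minimal sketch, assuming the choice is keyed on the machine architecture rather than the operating system; grobid_image is a hypothetical helper, and the default version is the one used in this thread.

```shell
#!/bin/sh
# Hypothetical helper mirroring the workaround above: pick the CRF-only
# image on arm64 hosts and the full image everywhere else.
# Usage: grobid_image ARCH [VERSION]   (ARCH as printed by `uname -m`)
grobid_image() {
  arch="$1"
  vers="${2:-0.8.0}"
  case "$arch" in
    arm64|aarch64) echo "lfoppiano/grobid:$vers" ;;
    *)             echo "grobid/grobid:$vers" ;;
  esac
}
```

A launch command could then read docker run --init -p 8070:8070 "$(grobid_image "$(uname -m)")". The two images are not equivalent, since lfoppiano/grobid is the CRF-only build, so results on the two platforms may differ.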

@lfoppiano
Collaborator

Thanks.
I'm sorry, I'm a bit short on time to dedicate to grobid at the moment, so I don't know when I will be able to work on this.

Would you be able to submit a PR for this change?

lfoppiano added the need help label Apr 12, 2024
@rgranit

rgranit commented Aug 23, 2024

Same issue here. It would be great to have a single-image solution for Mac and Linux.

@lfoppiano
Collaborator

Please check #1165, where there is a dual-architecture image. I would appreciate it if you could give it a try, paying particular attention to the issue with the pdfalto processes.

lfoppiano added this to the 0.8.2 milestone Sep 14, 2024