run_jetson.sh error #1706

Open
AdsonNAlves opened this issue Mar 10, 2023 · 23 comments

@AdsonNAlves commented Mar 10, 2023

Describe the bug

When testing the Jetson Nano example (https://github.com/adap/flower/tree/main/examples/embedded_devices) and running
$ ./run_jetson.sh --server_address=<SERVER_ADDRESS> --cid=0 --model=ResNet18, I get the following error:

Traceback (most recent call last):
File "./client.py", line 27, in
from flwr.common import (
ImportError: cannot import name 'NDArrays'

I added the flwr path:
/usr/local/lib/python3.6/dist-packages/flwr
/usr/local/lib/python3.6/dist-packages/flwr/common

Can anyone help?

Steps/Code to Reproduce

./run_jetson.sh --server_address=192.168.55.100 --cid=0 --model=ResNet18

Expected Results

Actual Results

=> => naming to docker.io/library/flower_client:latest 0.0s
Traceback (most recent call last):
File "./client.py", line 27, in
from flwr.common import (
ImportError: cannot import name 'NDArrays'

@Sanya000

This problem also appears in the Raspberry Pi part of that same example.
When running: ./run_pi.sh --server_address=<xxx.xxx.xx.xxx> --cid=0 --model=Net
The Docker image builds fine right up to the end and then also fails at:
File "./client.py", line 27, in <module>
from flwr.common import (
ImportError: cannot import name 'NDArrays'

@matteuscruz

I am currently facing the same problem while implementing federated learning with Flower on my Jetson Nano and Raspberry Pi. Unfortunately, I have not found a solution to this problem yet.

The error message is as follows:

Traceback (most recent call last):
File "./client.py", line 35, in <module>
import utils
File "/app/utils.py", line 46, in <module>
class Net(nn.Module):
File "/app/utils.py", line 69, in Net
def get_weights(self) -> fl.common.NDArrays:
AttributeError: module 'flwr.common' has no attribute 'NDArrays'

@cleong110

I've been running into the same issue here: the root cause seems to be this line:

RUN pip3 install flwr>=1.0.0

This pip install command lacks quotation marks, so the shell treats >=1.0.0 as an output redirection: it actually runs pip3 install flwr, which installs an old version of flwr, and it creates a file named "=1.0.0".

The proper command should be

pip3 install "flwr>=1.0.0"

@cleong110 commented May 25, 2023

This can be checked by running the image interactively, like so:

docker run -it --runtime nvidia --rm --entrypoint bash flower_client

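Once inside, a quick sanity check (a sketch; whether the stray "=1.0.0" file ends up in /app depends on the WORKDIR in effect when that RUN line executed):

# confirm which flwr version actually got installed
pip3 show flwr
# look for a stray file literally named "=1.0.0" left behind by the unquoted install
ls -la /app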

@cleong110

So we're logged into the Docker container itself, and we can now check which versions of things got installed:

root@924f9e620cca:/app# pip3 list
Package            Version
------------------ ---------------
appdirs            1.4.4
beautifulsoup4     4.12.2
Cython             0.29.21
dataclasses        0.6
decorator          4.4.2
flwr               0.18.0
future             0.18.2
google             2.0.3
grpcio             1.43.0
importlib-metadata 1.7.0
Mako               1.1.3
MarkupSafe         1.1.1
numpy              1.19.4
Pillow             8.0.1
pip                21.3.1
protobuf           3.19.6
pycuda             2020.1
pytools            2020.4.3
setuptools         51.0.0
six                1.15.0
soupsieve          2.3.2.post1
torch              1.6.0
torchaudio         0.6.0a0+f17ae39
torchvision        0.7.0a0+78ed10c
wheel              0.36.1
zipp               3.6.0

@cleong110

Which brings us to the next problem: it's not possible to install the latest version of flwr, because the Python version in the image is 3.6.9.
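
A quick way to confirm this inside the container (commands only; flwr 1.x requires Python 3.7 or newer, so the second command can't find a compatible release on 3.6):

python3 --version
pip3 install "flwr>=1.0.0"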

@cleong110

One workaround is to edit the Dockerfile to install Python 3.7 instead, and then either use python3.7 and python3.7 -m pip as your commands from then on, or find a way to set 3.7 as the default over 3.6.

@cleong110

Something like this, added before the pip update line:

RUN apt-get update && apt-get upgrade -y
RUN apt-get install python3.7 -y
RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.7 2
RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.6 1
RUN update-alternatives --config python3

# update pip
RUN pip3 install --upgrade pip
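
From that point on, it's safer to target the new interpreter explicitly when installing packages (a sketch, assuming python3.7 -m pip resolves to a working pip once 3.7 is the default):

RUN python3.7 -m pip install --upgrade pip
RUN python3.7 -m pip install "flwr>=1.0.0"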

@cleong110

Now I get this error:

ModuleNotFoundError: No module named 'torch'

@cleong110

This is because installing flwr doesn't install PyTorch. PyTorch was installed for 3.6, but not for 3.7.

@cleong110

So torch and torchvision must be pip installed as well.
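
For example (a sketch; a plain PyPI install, if an aarch64 wheel even exists for Python 3.7, is CPU-only on a Jetson, which is exactly the caveat in the next comment):

# CPU-only install against the 3.7 interpreter; GPU support on a Jetson needs
# an NVIDIA-built wheel matching the JetPack and Python version
RUN python3.7 -m pip install torch torchvision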

@cleong110

But then I'm not sure the new version of PyTorch has GPU access.

@rmouram commented Jun 26, 2023

@cleong110 I followed your recommendations to fix this error and I came across the following:

ModuleNotFoundError: No module named 'torch'

I installed torch via pip by placing the installation in a RUN command inside the Dockerfile:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

However, it then gave an error about not finding the PIL module, so I installed PIL (Pillow) via pip3 in a RUN command in the Dockerfile, and the following error appeared:
ImportError: cannot import name 'ParametersRes' from 'flwr.common' (/usr/local/lib/python3.7/dist-packages/flwr/common/__init__.py)

Is there any solution? Thanks.

@cleong110

You are starting down a rabbit hole, I'm afraid. In your case, my suspicion is that pip3 isn't the "right" pip, and is installing torch for the wrong python.

(I did confirm that you can apt install python3.8. We eventually found a GPU wheel of PyTorch somewhere that was compatible with Python 3.8, but it's not trivial.)

One helpful debugging tip: log in to the Docker container interactively with something like docker run -it --rm --runtime nvidia --entrypoint bash flower_client:latest (I think that's roughly the right command). Note how I override the entrypoint so that it doesn't run client.py but drops me into bash instead.

Then you can manually start a Python shell and try to import torch and see what's wrong.

Some things to watch out for:

  • are you using pip or pip3?
  • Are you using python or python3? Check things like python --version or python3 --version
  • If you used apt to install python3.8, do you need to call it explicitly, e.g. python3.8 <some command>?

In your case I would do something like:

# edit the Dockerfile
# rebuild the image, I think run_jetson.sh calls a docker build command
docker run -it --rm --runtime nvidia --entrypoint bash flower_client:latest 

#once you're IN the docker container...
python --version
python3 --version
python3.8 --version
pip --version
pip3 --version

# run python
python3.8 
>>> import torch
# etc

and make sure what's installed where.

Probably the best thing would be, instead of using pip3, to do:

python3.8 -m pip install <whatever>

which guarantees you are using the specific pip that installs packages Python 3.8 can use.

@cleong110

Regarding ParametersRes: that is actually not your fault at all, I believe it is a compatibility issue between versions of flwr

@cleong110

In one of the updates (#1214), ParametersRes was renamed.

@cleong110 commented Jun 26, 2023

So the embedded example seems to use the old syntax.

Your options:

  • rewrite/update the embedded example to use modern flwr syntax
  • install a version of flwr that works for the embedded example (from before Rename protobuf messages #1214 was merged); see the pinning sketch below
  • adapt the code from one of the updated examples and run that in the Docker instead (this is what I've been working on)

(Edit: another option is to buy one of the newer Jetsons, that can install a newer Jetpack, that can support newer versions of Pytorch and Flower)
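
For the second option, a minimal Dockerfile sketch (0.19.0 is used purely as an illustrative pre-1.0 release; pick whichever version the example was actually written against):

# pin flwr to a pre-1.0 release that still exposes the old names such as ParametersRes
RUN pip3 install "flwr==0.19.0"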

@WilliamLindskog

Hi @cleong110

Thanks for your comments on this issue. Is this a problem that you are still experiencing, or can we go ahead and close this issue? Have you tried the new example? https://github.com/adap/flower/tree/main/examples/embedded-devices

@cleong110

Ah, sorry, haven't looked at this in quite a while. Haven't tried the new example, no.

@cleong110

@WilliamLindskog the new example does look cool. I see it's designed for a Raspberry Pi. We were using various older Jetson modules; have you tested installing on those?

Other things to possibly check before closing this issue:

  • have the problems with the Dockerfiles been fixed, e.g. the lack of quotation marks in the pip install? (I see the file is removed in the latest version)
  • what Jetson devices, JetPacks, etc. has run_jetson.sh been tested with, or are you planning to support? I think something like "nothing older than a Xavier" might be the easiest answer

Given that the latest example doesn't seem to use Jetsons, and that older devices especially can be a bit of a pain, perhaps the answers to both of these could just be "N/A, none", and the issue can be closed.

Certainly it would be nice to have an example running on TX2s or Xaviers or Nanos or something, those are exactly the sort of device that it would be really interesting to network together and run Flower on.

@WilliamLindskog commented Mar 12, 2025

@cleong110 sorry for the late reply. I checked out this conversation and it seems it is possible to run on a Jetson Orin NX device: #4399

Regarding the quotation marks issue, I think that has been resolved in newer versions of Flower: https://github.com/adap/flower/tree/main/examples/embedded-devices

You're right that it would be nice to have instructions on how to run on these sorts of devices. Would you be interested in creating a PR for this? Just a .md file on how to run on the discussed devices?

@cleong110

I can't do a PR at this time, sorry. Good to hear that those issues have been solved and people have gotten it working! But yes, a .md might be the easiest!

Ideally it'd have: (1) install/setup instructions (2) instructions to run a very basic FL experiment on two or more devices!
