ppo_example.py #59

Closed
xabierolaz opened this issue May 20, 2020 · 27 comments
Labels: good first issue (Good for newcomers)

Comments

@xabierolaz

Hi, I'm trying to start testing the PPO algorithm in the thesis-work branch.

Are there any steps on how to get started once GymFC is working? I'm using Jupyter Notebook. Is there any documentation on OpenAI Baselines and how it relates to GymFC?
Thanks

@wil3
Owner

wil3 commented May 20, 2020

Hey @xabierolaz, I've been meaning to reach out to you. Everything is in the master branch, including the PID and PPO demos used in my thesis. Let me know if you have any further questions.

Remember to build the aircraft plugins for the NF1 model. I have another branch I'm working on to improve the installation of everything.

@wil3
Owner

wil3 commented May 20, 2020

As a follow up, do you have the PPO example running?

@xabierolaz
Author

xabierolaz commented May 21, 2020

Not yet. I'm still struggling with some fundamental AI pipeline basics before getting it running. I also have to install TensorFlow, since the PPO code seems to require it.

@wil3
Owner

wil3 commented May 21, 2020

Before getting into neuro-controllers, which are far more complicated than traditional controllers, make sure you have pid_example.py running and understand it.

Gymfc_nf (in examples/) is just an OpenAI gym. If you are familiar with OpenAI gyms it's pretty straightforward. Gymfc_nf just provides a wrapper around GymFC, extending the base class to be able to interface with Gazebo. I suggest reading up on the OpenAI project and walking through all their environments and examples.

Also have a look at the book "Reinforcement Learning: An Introduction" (second edition) by Sutton and Barto if you are interested in more RL background.
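For readers new to the gym interface mentioned above, here is a minimal sketch of the reset/step/close loop that a script such as pid_example.py drives. ToyRateEnv, its made-up reward, and run_episode are hypothetical stand-ins written only for illustration; they are not part of GymFC or gymfc_nf, and a real environment would talk to Gazebo instead of drawing random numbers.

```python
import random


class ToyRateEnv:
    """Hypothetical stand-in that mimics the classic Gym reset/step/close API."""

    def __init__(self, max_steps=5, seed=0):
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.steps = 0
        self.closed = False

    def reset(self):
        # Start a new episode and return the first observation.
        self.steps = 0
        return [0.0, 0.0, 0.0]

    def step(self, action):
        # Advance one step and return (observation, reward, done, info).
        self.steps += 1
        ob = [self.rng.uniform(-1.0, 1.0) for _ in range(3)]
        reward = -sum(abs(x) for x in ob)  # made-up penalty, illustration only
        done = self.steps >= self.max_steps
        return ob, reward, done, {}

    def close(self):
        # A real simulator-backed environment would shut the simulator down here.
        self.closed = True


def run_episode(env, policy):
    """Run one episode; return the number of steps taken and the total reward."""
    ob = env.reset()
    total, done = 0.0, False
    while not done:
        ob, reward, done, _ = env.step(policy(ob))
        total += reward
    env.close()
    return env.steps, total


steps, total = run_episode(ToyRateEnv(), lambda ob: [0.0, 0.0, 0.0, 0.0])
print(steps, total)
```

A PID controller and a trained neuro-controller would both plug in as the policy argument here: each maps an observation to an action.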

@varunag18

Thanks for the guidance, Wil. I'm stuck at the same point as @xabierolaz. I'm now focusing on pid_example first.

@varunag18

@xabierolaz have you been able to run pid_example.py? I am getting the error
SystemExit: Timeout communicating with flight control plugin.
even though I am sure I have built the NF1 plugins and used the ln command to put them in the correct place.
[Screenshot from 2020-05-23 10-15-33]

As the top part of the terminal screenshot shows, the plugins were built in the directory gymfc-aircraft-plugins/build and then linked into the nf1/plugins/build directory.
What am I missing?

@wil3
Owner

wil3 commented May 23, 2020

Unless you are installing via pip3 in edit/development mode (the -e flag), you could run into problems as outlined in PR #60. I highly recommend using branch i58-dep-script until it is merged into master, as it is more stable.

@varunag18 did you resolve Issue #61? If so, please provide a solution and close the issue. If you can't run test_start_sim.py you aren't going to be able to run anything else.

How you installed the motor plugins should work, but it conflicts with the procedure here so I'm not sure.

Also, whenever you submit an issue for a command that isn't working, please do not screenshot it if it's text; text in a screenshot isn't searchable. Please also provide the entire verbose output of the command, otherwise there's not enough information to help debug.

@varunag18

Hi Wil, Issue #61 isn't resolved yet. I'm able to run test_start_sim.py only by adding sudo; I have detailed this in the issue comments. As for the pid_example.py error I mentioned above, I get it when running the command below:

(fc) varun@varun:~/Documents/gymfc/examples$ python3 pid_example.py --verbose
/home/varun/gym/gym/logger.py:30: UserWarning: WARN: Box bound precision lowered by casting to float32
warnings.warn(colorize('%s: %s'%('WARN', msg % args), 'yellow'))
Sending motor control signals to port 9726
Gazebo Model Path = /home/varun/local/share/gazebo-10/models:/usr/local/share/gazebo-10/models::/home/varun/Documents/gymfc/gymfc/envs/assets/gazebo/models:/home/varun/Documents/gymfc/examples/gymfc_nf/twins
Gazebo Plugin Path = /home/varun/local/lib/gazebo-10/plugins:/usr/local/lib/gazebo-10/plugins::/home/varun/Documents/gymfc/gymfc/envs/assets/gazebo/plugins/build:/home/varun/Documents/gymfc/examples/gymfc_nf/twins/nf1/plugins/build
Starting gzserver with process ID= 7860
gzserver: error while loading shared libraries: libdart-collision-bulletd.so.6.7: cannot open shared object file: No such file or directory
Timeout communicating with flight control plugin.
Simulation Stats
steps 0
packets_dropped 0
time_start_seconds 1590256960.7300658
time_lapse_hours 0.01641447875234816
/bin/sh: 1: kill: No such process
Killing Gazebo process with ID= 7860
Timeout communicating with flight control plugin.

Also, running sudo python3 pid_example.py gives the below error:

(fc) varun@varun:~/Documents/gymfc/examples$ sudo python3 pid_example.py --verbose
[sudo] password for varun:
/home/varun/gym/gym/logger.py:30: UserWarning: WARN: Box bound precision lowered by casting to float32
warnings.warn(colorize('%s: %s'%('WARN', msg % args), 'yellow'))
Sending motor control signals to port 9627
Gazebo Model Path = /home/varun/local/share/gazebo-10/models::/home/varun/Documents/gymfc/gymfc/envs/assets/gazebo/models:/home/varun/Documents/gymfc/examples/gymfc_nf/twins
Gazebo Plugin Path = /home/varun/local/lib/gazebo-10/plugins::/home/varun/Documents/gymfc/gymfc/envs/assets/gazebo/plugins/build:/home/varun/Documents/gymfc/examples/gymfc_nf/twins/nf1/plugins/build
Starting gzserver with process ID= 7891
Traceback (most recent call last):
  File "pid_example.py", line 55, in <module>
    ob, reward, done, _ = env.step(ac)
  File "/home/varun/Documents/gymfc/examples/gymfc_nf/envs/base.py", line 88, in step
    self.sample_noise(self))
TypeError: 'NoneType' object is not callable

I haven't been able to trace the cause of this TypeError: 'NoneType' object is not callable error so far.
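The first (non-sudo) log above contains a dynamic loader error just before the timeout, which suggests gzserver never started in that run. As a hedged sketch, a few lines of Python can pull such messages out of a long verbose log so they are not overlooked; find_missing_libraries is a hypothetical helper, and the pattern only matches the loader message format seen in this log.

```python
import re

# Pattern for the dynamic loader message seen in the log above.
MISSING_LIB = re.compile(
    r"error while loading shared libraries: (\S+): cannot open shared object file"
)


def find_missing_libraries(log_text):
    """Return the shared libraries the loader reported as missing, in order."""
    return MISSING_LIB.findall(log_text)


log = """Starting gzserver with process ID= 7860
gzserver: error while loading shared libraries: libdart-collision-bulletd.so.6.7: cannot open shared object file: No such file or directory
Timeout communicating with flight control plugin.
"""
print(find_missing_libraries(log))
```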

@wil3
Owner

wil3 commented May 24, 2020

Hey @varunag18, we still need to resolve the sudo problem, because having to run things with sudo will cause issues down the line, but the TypeError you are reporting may be a valid bug. Let me try reproducing it and I'll get back to you. From a quick look at the source, ppo_example.py does set the sample_noise class attribute, so if you want to test whether that's the problem you can just copy this to below this line.
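To illustrate the suspected failure mode, here is a small sketch that reproduces the same TypeError with an attribute left as None and shows how assigning a callable avoids it. BaseEnvSketch is a hypothetical, stripped-down class and the constant noise value is an assumption for illustration; the actual base.py and the actual noise function may differ.

```python
class BaseEnvSketch:
    """Hypothetical, stripped-down stand-in for the environment base class."""

    # Left unset by default, so calling it fails like the traceback above.
    sample_noise = None

    def step(self, action):
        # Mirrors the call shape in the traceback: self.sample_noise(self)
        noise = self.sample_noise(self)
        return [a + noise for a in action]


env = BaseEnvSketch()
try:
    env.step([0.0, 0.0, 0.0])
except TypeError as err:
    print("before:", err)

# Assumed fix: assign a callable that takes the env and returns a noise value.
env.sample_noise = lambda _env: 0.5
print("after:", env.step([1.0, 2.0]))
```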

@wil3
Owner

wil3 commented May 29, 2020

See updates to PR: #60 which fixes this bug.

@varunag18

Hi Wil, I have executed the pid_example.py script and obtained the graph, as expected. Now I'm studying the PPO algorithm's base paper and trying to understand the code in ppo_example.py, pposgd_simple.py, etc. I tried running ppo_example.py and obtained the output below. The checkpoints folder was created and files were added to it at regular intervals, as expected. However, Gazebo didn't load and execution ended abruptly. While I'm debugging the code to find out what's going wrong, I'm adding the terminal output here to get your thoughts. Also, nothing gets recorded in the log file created in the /tmp folder.

/home/varun/projects/gymfc/envfc/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/varun/projects/gymfc/envfc/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/varun/projects/gymfc/envfc/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/varun/projects/gymfc/envfc/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/varun/projects/gymfc/envfc/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/varun/projects/gymfc/envfc/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
/home/varun/projects/gymfc/envfc/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/varun/projects/gymfc/envfc/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/varun/projects/gymfc/envfc/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/varun/projects/gymfc/envfc/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/varun/projects/gymfc/envfc/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/varun/projects/gymfc/envfc/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
WARNING:tensorflow:From ppo_example.py:22: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.

WARNING:tensorflow:From ppo_example.py:22: The name tf.logging.ERROR is deprecated. Please use tf.compat.v1.logging.ERROR instead.

Storing results to  ../../checkpoints/baselines_1cfd0f0_200608-184033
/home/varun/projects/gymfc/envfc/lib/python3.6/site-packages/gym/logger.py:30: UserWarning: WARN: Box bound precision lowered by casting to float32
  warnings.warn(colorize('%s: %s'%('WARN', msg % args), 'yellow'))
Sending motor control signals to port  9803
Gazebo Model Path = /home/varun/local/share/gazebo-10/models::/home/varun/projects/gymfc/gymfc/envs/assets/gazebo/models:/home/varun/projects/gymfc/examples/gymfc_nf/twins
Gazebo Plugin Path = /home/varun/local/lib/gazebo-10/plugins::/home/varun/projects/gymfc/gymfc/envs/assets/gazebo/plugins/build:/home/varun/projects/gymfc/examples/gymfc_nf/twins/nf1/plugins/build
Starting gzserver with process ID= 7102
2020-06-08 18:40:45.153669: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-06-08 18:40:45.253970: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 1800000000 Hz
2020-06-08 18:40:45.325966: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x37b8bb0 executing computations on platform Host. Devices:
2020-06-08 18:40:45.326004: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
Logging to /tmp/openai-2020-06-08-18-40-45-327605
[INFO] Keeping  100  checkpoints
2020-06-08 18:40:47.141962: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set.  If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU.  To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
/home/varun/projects/gymfc/envfc/lib/python3.6/site-packages/numpy/core/fromnumeric.py:3335: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/home/varun/projects/gymfc/envfc/lib/python3.6/site-packages/numpy/core/_methods.py:161: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
/home/varun/projects/gymfc/envfc/lib/python3.6/site-packages/numpy/core/_methods.py:38: RuntimeWarning: overflow encountered in reduce
  return umr_sum(a, axis, dtype, out, keepdims, initial, where)
/home/varun/projects/gymfc/openai-baseline/baselines/common/mpi_moments.py:22: RuntimeWarning: overflow encountered in square
  sqdiffs = np.square(x - mean)

Simulation Stats
-----------------
steps                  1004763
packets_dropped        0
time_start_seconds     1591621835.093218
time_lapse_hours       0.786911344329516


Killing Gazebo process with ID= 7102

@wil3
Owner

wil3 commented Jun 8, 2020

Is this the output of ppo_example.py without any modification or overrides? If so, it looks like it's only executing 1/10 of the number of steps, but no error is given.

Which log file is supposed to be written to /tmp? Is this an OpenAI thing?

I haven't done thorough testing of the PPO refactor so it's possible there is a bug. Simulation stats only get output when it shuts down. If you look here, calling close is the only place shutdown is called when there is no issue, which appears to be the case from the output you provided. So it seems close() on the gym environment is being called somewhere.

Is this reproducible? What happens if you override timesteps to, say, 100e3 and 200e3?

@varunag18

> Is this the output of ppo_example.py without any modification or overrides? If so, it looks like it's only executing 1/10 of the number of steps, but no error is given.

I did modify the timesteps to 10e5 and ckpt-freq to 100e2; that's why it's running for 1/10 of the steps.

> Which log file is supposed to be written to /tmp? Is this an OpenAI thing?

A log file is created in the /tmp folder by baselines/logger.py, and there are multiple calls to the logger.log method in pposgd_simple.py. I was expecting something to be recorded in the log file.

> I haven't done thorough testing of the PPO refactor so it's possible there is a bug. Simulation stats only get output when it shuts down. If you look here, calling close is the only place shutdown is called when there is no issue, which appears to be the case from the output you provided. So it seems close() on the gym environment is being called somewhere.

Yes, env.close() is called in the last line of the train method of ppo_example.py. That's expected, right?

> Is this reproducible? What happens if you override timesteps to, say, 100e3 and 200e3?

Not really. When I changed the timesteps to 100e3 or 200e3, execution again ended abruptly, but even one step earlier.

/home/varun/projects/gymfc/envfc/lib/python3.6/site-packages/numpy/core/fromnumeric.py:3335: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/home/varun/projects/gymfc/envfc/lib/python3.6/site-packages/numpy/core/_methods.py:161: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
/home/varun/projects/gymfc/envfc/lib/python3.6/site-packages/numpy/core/_methods.py:38: RuntimeWarning: overflow encountered in reduce
  return umr_sum(a, axis, dtype, out, keepdims, initial, where)

Simulation Stats
-----------------
steps                  202797
packets_dropped        0
time_start_seconds     1591670487.1315815
time_lapse_hours       0.18189973599380918


Killing Gazebo process with ID= 2848
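The numbers in this exchange can be checked with a little arithmetic. The sketch below converts the scientific-notation values quoted above and compares them with the two reported step counts. It assumes the "steps" field in Simulation Stats is counted against the same budget as the timesteps option; reached_budget is a hypothetical helper.

```python
# Values varunag18 reported passing on the command line.
timesteps = int(10e5)   # 10e5 == 1,000,000
ckpt_freq = int(100e2)  # 100e2 == 10,000


def reached_budget(reported_steps, requested_steps):
    """True when the simulator ran at least the requested number of steps."""
    return reported_steps >= requested_steps


print(timesteps, ckpt_freq, timesteps // ckpt_freq)  # 1000000 10000 100
print(reached_budget(1004763, timesteps))            # True
print(reached_budget(202797, int(200e3)))            # True
```

Under that assumption, both runs slightly exceeded the requested number of timesteps, which is consistent with training running to completion and then calling close() rather than stopping early.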

@wil3
Owner

wil3 commented Jun 9, 2020

> I did modify the timesteps to 10e5 and ckpt-freq to 100e2; that's why it's running for 1/10 of the steps.

I'm confused then, why are you saying it's not working?

@varunag18

varunag18 commented Jun 9, 2020

I guess I posted the query a bit too early ;)
I need to study the code in detail to understand what exactly it is doing. Let me spend some time on it and I will get back to you. Thanks for your prompt replies.

@wil3
Owner

wil3 commented Jun 9, 2020

All the PPO example does is train and produce checkpoints. If it's throwing you off that it's not producing a plot like the PID example, that's because it's not implemented to; there's currently no validation happening. There are a number of ways to do this. What I've done in the past is monitor the checkpoints directory and, when a new checkpoint is created, validate it on N episodes. Then you can get an idea of how good the performance is. The neurocontroller I have in Neuroflight was, I believe, found in just 2 million steps. I have a script I've been meaning to refactor and push; I will try to get to it in the next couple of days.
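The monitor-and-validate idea described above could be sketched roughly as follows. This is not the script referred to in the comment: new_checkpoints and watch are hypothetical helpers, and evaluate(path, episodes) is a caller-supplied callback standing in for whatever actually loads a checkpoint and runs N validation episodes.

```python
import os
import time


def new_checkpoints(ckpt_dir, seen):
    """Return entries in ckpt_dir that are not yet in `seen`, sorted by name."""
    current = set(os.listdir(ckpt_dir))
    fresh = sorted(current - seen)
    seen.update(fresh)
    return fresh


def watch(ckpt_dir, evaluate, episodes=5, polls=3, interval=1.0):
    """Poll ckpt_dir and evaluate each new checkpoint on `episodes` episodes.

    `evaluate(path, episodes)` is a hypothetical callback supplied by the caller.
    """
    seen, results = set(), {}
    for _ in range(polls):
        for name in new_checkpoints(ckpt_dir, seen):
            results[name] = evaluate(os.path.join(ckpt_dir, name), episodes)
        time.sleep(interval)
    return results
```

Polling keeps the sketch dependency-free; a long-running version would loop until training finishes and might need to wait until a checkpoint is fully written before evaluating it.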

@varunag18

Got it. Indeed, I was expecting the PPO example to provide a graph or something to compare against, showing that its output is better than that of the PID example, because that's the whole point of doing a task through RL algorithms. I need some more time to understand the PPO code; meanwhile I'll wait for the script you're planning to add to the repository.

wil3 added the "good first issue" label Jun 13, 2020
wil3 changed the title from PPO_Example.py to ppo_example.py Jun 13, 2020
@wil3
Owner

wil3 commented Jun 17, 2020

@varunag18 @xabierolaz PR #75 adds example evaluation and plotting scripts.

@wil3
Owner

wil3 commented Jun 23, 2020

@xabierolaz given the updates to the examples/README.md I think it covers your original question. Unless there is an objection I'm going to close this issue. If something else comes up feel free to open another issue.

@xabierolaz
Author

xabierolaz commented Jul 2, 2020

I'm facing the same issue @varunag18 was having while running the pid_example script.
I have followed the steps in examples/readme.

(env) cuda@eim-alu-83252:~/workspace/gymfc/examples$ python3 pid_example.py  
/home/cuda/workspace/gymfc/env/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/cuda/workspace/gymfc/env/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/cuda/workspace/gymfc/env/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/cuda/workspace/gymfc/env/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/cuda/workspace/gymfc/env/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/cuda/workspace/gymfc/env/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
/home/cuda/workspace/gymfc/env/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/cuda/workspace/gymfc/env/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/cuda/workspace/gymfc/env/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/cuda/workspace/gymfc/env/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/cuda/workspace/gymfc/env/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/cuda/workspace/gymfc/env/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
/home/cuda/workspace/gymfc/env/lib/python3.6/site-packages/gym/logger.py:30: UserWarning: WARN: Box bound precision lowered by casting to float32
  warnings.warn(colorize('%s: %s'%('WARN', msg % args), 'yellow'))
Sending motor control signals to port  9224
Gazebo Model Path = /usr/local/share/gazebo-10/models::/home/cuda/workspace/gymfc/gymfc/envs/assets/gazebo/models:/home/cuda/workspace/gymfc/examples/gymfc_nf/twins
Gazebo Plugin Path = /usr/local/lib/gazebo-10/plugins::/home/cuda/workspace/gymfc/gymfc/envs/assets/gazebo/plugins/build:/home/cuda/workspace/gymfc/examples/gymfc_nf/twins/nf1/plugins/build
Starting gzserver with process ID= 35372
Timeout communicating with flight control plugin.

Simulation Stats
-----------------
steps                  0
packets_dropped        0
time_start_seconds     1593682944.1811707
time_lapse_hours       0.016422869430647954


Killing Gazebo process with ID= 35372
Timeout communicating with flight control plugin.

@wil3
Owner

wil3 commented Jul 2, 2020

You timed out which usually means the motor plugins were never built. Have you pulled the latest from the master branch? Is the pid example working?
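One quick way to test the "plugins were never built" hypothesis is to look for the plugin files in the directories printed on the "Gazebo Plugin Path" line. The sketch below does that. The two file names are the ones mentioned later in this thread and are assumed, not confirmed, to be the complete set; missing_plugins is a hypothetical helper.

```python
import os

# File names taken from this thread; adjust if your build produces others.
EXPECTED = ("libgazebo_imu_plugin.so", "libgazebo_motor_model.so")


def missing_plugins(plugin_path, expected=EXPECTED):
    """Return expected plugin files not found in any directory on plugin_path."""
    dirs = [d for d in plugin_path.split(":") if d]  # skip empty "::" entries
    return [
        name for name in expected
        if not any(os.path.isfile(os.path.join(d, name)) for d in dirs)
    ]
```

Because os.path.isfile follows symbolic links, a link that points at a file that does not exist is also reported as missing.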

@xabierolaz
Author

xabierolaz commented Jul 3, 2020

I'm getting the same error. It seems I haven't built the plugins yet, as pointed out in "imu missing while running start_sim example".
May I use the same nf1 model with built plugins in the examples/gymfc_nf folder for both the GymFC test scripts (test_start_sim.py) and the NF ones (ppo_baselines_train.py), or must each folder have its own nf1 model?

Do I have to install gymfc again, as stated in examples/readme?

Output when running test_start_sim.py after building the nf1 aircraft plugins under examples/ (looks fine):

(env) cuda@eim-alu-83252:~/workspace/gymfc$ python3 tests/test_start_sim.py --verbose examples/gymfc_nf/twins/nf1/model.sdf 
Sending motor control signals to port  9300
Gazebo Model Path = /usr/local/share/gazebo-10/models::/home/cuda/workspace/gymfc/gymfc/envs/assets/gazebo/models:/home/cuda/workspace/gymfc/examples/gymfc_nf/twins
Gazebo Plugin Path = /usr/local/lib/gazebo-10/plugins::/home/cuda/workspace/gymfc/gymfc/envs/assets/gazebo/plugins/build:/home/cuda/workspace/gymfc/examples/gymfc_nf/twins/nf1/plugins/build
Starting gzserver with process ID= 6503
Gazebo multi-robot simulator, version 10.1.0
Copyright (C) 2012 Open Source Robotics Foundation.
Released under the Apache 2 License.
http://gazebosim.org

[Msg] Waiting for master.
[Msg] Connected to gazebo master @ http://127.0.0.1:12131
[Msg] Publicized address: 172.18.83.252
[Wrn] [DARTPhysics.cc:96] Gravity vector is (0, 0, 0). Objects will float.
[Dbg] [DARTModel.cc:72] Initializing DART model attitude_control_training_rig
[Dbg] [DARTModel.cc:166] Building DART BodyNode for link 'pivot' and joint 'base_joint'.
[Err] [DARTJoint.cc:195] DARTJoint: SetAnchor is not implemented.
[Dbg] [FlightControllerPlugin.cpp:225] Binding on port 9300
[Dbg] [FlightControllerPlugin.cpp:408] CoT link=battery
[Dbg] [FlightControllerPlugin.cpp:410] Got COT from plugin 0 0 0.058
[Dbg] [FlightControllerPlugin.cpp:413] Num motors 4
[Dbg] [FlightControllerPlugin.cpp:443] Inserting digital twin from SDF, examples/gymfc_nf/twins/nf1/model.sdf.
[Dbg] [DARTModel.cc:72] Initializing DART model nf1
[Dbg] [DARTModel.cc:128] Building DART BodyNode for link 'frame' with a free joint.
[Dbg] [DARTModel.cc:166] Building DART BodyNode for link 'battery' and joint 'battery_joint'.
[Dbg] [DARTModel.cc:166] Building DART BodyNode for link 'motor_1' and joint 'motor_1_joint'.
[Dbg] [DARTModel.cc:166] Building DART BodyNode for link 'motor_2' and joint 'motor_2_joint'.
[Dbg] [DARTModel.cc:166] Building DART BodyNode for link 'motor_3' and joint 'motor_3_joint'.
[Dbg] [DARTModel.cc:166] Building DART BodyNode for link 'motor_4' and joint 'motor_4_joint'.
[Dbg] [DARTModel.cc:166] Building DART BodyNode for link 'prop_1' and joint 'prop_1_joint'.
[Dbg] [DARTModel.cc:166] Building DART BodyNode for link 'prop_2' and joint 'prop_2_joint'.
[Dbg] [DARTModel.cc:166] Building DART BodyNode for link 'prop_3' and joint 'prop_3_joint'.
[Dbg] [DARTModel.cc:166] Building DART BodyNode for link 'prop_4' and joint 'prop_4_joint'.
[Dbg] [DARTModel.cc:166] Building DART BodyNode for link 'fc_stack' and joint 'fc_stack_joint'.
[Dbg] [gazebo_motor_model.cpp:204] Loading Motor number=0 Subscribed to /aircraft/command/motor
[Dbg] [gazebo_motor_model.cpp:204] Loading Motor number=1 Subscribed to /aircraft/command/motor
[Dbg] [gazebo_motor_model.cpp:204] Loading Motor number=2 Subscribed to /aircraft/command/motor
[Dbg] [gazebo_motor_model.cpp:204] Loading Motor number=3 Subscribed to /aircraft/command/motor
[Dbg] [gazebo_imu_plugin.cpp:55] Loading IMU sensor
[Dbg] [FlightControllerPlugin.cpp:527] Setting link to center battery
[Dbg] [FlightControllerPlugin.cpp:542] Aircraft model fixed to world

@varunag18

Hi @xabierolaz, if your test scripts are running fine, your pid_example should run too. I haven't installed the plugins in more than one location. All scripts in the tests and examples folders are running fine for me. Here are the steps I followed, so you can cross-check:
  1. Cloned gymfc_aircraft_plugins into /projects
  2. Created a build folder in /projects/gymfc_aircraft_plugins
  3. cd build
  4. cmake ..
  5. make
  6. In the folder /gymfc/examples/gymfc_nf/twins/nf1/plugins/build, ran
     ln -s ~/projects/gymfc_aircraft_plugins/build/libgazebo_imu_plugin.so libgazebo_imu_plugin.so
  7. Ran the same ln -s command for libgazebo_motor_model.so
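For anyone repeating the manual linking steps above, a mistyped file name produces a dangling link that fails silently until Gazebo tries to load it. The following sketch creates the links and reports any that do not resolve. link_plugins is a hypothetical helper and all paths are supplied by the caller; it illustrates this comment's manual approach, not the README procedure the maintainer recommends.

```python
import os


def link_plugins(build_dir, target_dir, names):
    """Symlink each plugin from build_dir into target_dir; return dangling links."""
    build_dir = os.path.abspath(build_dir)  # avoid links relative to target_dir
    os.makedirs(target_dir, exist_ok=True)
    dangling = []
    for name in names:
        link = os.path.join(target_dir, name)
        if os.path.lexists(link):
            os.remove(link)  # replace a stale link from an earlier attempt
        os.symlink(os.path.join(build_dir, name), link)
        if not os.path.exists(link):  # exists() follows the link to its target
            dangling.append(name)
    return dangling
```

A non-empty return value means the corresponding file was not found in the build directory, for example because the build failed or the name was misspelled.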

@wil3
Owner

wil3 commented Jul 7, 2020

Please install the plugins according to the readme. If there is something wrong with the installation instructions, please submit a bug report.

A model is reusable: you build the plugins once and then just point to the model's SDF file to use it with GymFC.

> Do I have to install gymfc again as said in examples/readme?

Not sure what you mean by this. You can only have a single copy of a given Python package installed at a time. If you install via pip without edit mode (the -e flag) and you make modifications to the source, then yes, you need to uninstall and reinstall, but for the examples you don't need to make any modifications.

If the confusion is about the installation of GymFC compared to GymFC-NF: GymFC dynamically builds the Gazebo C++ plugins during pip install because they are required by GymFC. Aircraft models (i.e., digital twins) and their corresponding motor and sensor plugins are totally independent entities by design. This allows you to test multiple aircraft models with GymFC by just pointing to a different model.sdf file.

@wil3
Owner

wil3 commented Jul 7, 2020

I also want to mention, there is no point in running the PPO Neuro Controller until you have the PID example working. PPO is much more complex.

@xabierolaz
Author

xabierolaz commented Jul 13, 2020

Sorry, let me get this clear; I still have a few questions about the structure. What I meant is that inside the examples folder there is another gymfc installation (gymfc_nf), so I'm assuming we have to pick either the root folder's gymfc installation or the gymfc_nf one, right?

1- When installing gymfc, do we have to choose between installing the gymfc in the root folder and installing examples/gymfc_nf, which follows gymfc's installation process but also automatically runs the manual build step we had to do at the end of the gymfc install (gymfc/envs/assets/gazebo/plugins/build_plugin.sh)? Is this right? Should we also run pip3 install -e . inside GymFC_NF, as we did in the normal gymfc install?

2- In examples/readme, at the end of step 2:

> navigate to the directory you cloned the OpenAI Baselines fork and installed it,

Should we have it installed by then, or do you mean

> navigate to the directory you cloned the OpenAI Baselines fork and install it,

3- In examples/readme, pip3 install is mentioned at the end of step 2 and at the beginning of step 3.
Does that mean we have to install it twice? Or do you mean we have to mkdir gymfc_nf/twins/nf1/plugins inside the openai-baselines git repository once we have installed it with pip3?

I know these might be trivial questions, but I need to get this clear in order to find out why it's still not working.

@wil3
Owner

wil3 commented Jul 15, 2020

> so I'm assuming we have to pick either the root folder's gymfc installation or the gymfc_nf one, right?

No, GymFC is the framework. It provides the interfaces to Gazebo and the digital twin. GymFC_NF is an implementation of the GymFC framework used for training neurocontrollers. They are separate Python packages, and GymFC is a dependency of GymFC_NF. I highly recommend you review the code.
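Since the two packages are installed separately, a quick check of which ones the current Python environment can actually import may help avoid confusion. This is a small sketch using only the standard library; missing_packages is a hypothetical helper, and the import names gymfc and gymfc_nf are taken from this thread.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that Python cannot locate for import."""
    return [n for n in names if importlib.util.find_spec(n) is None]


# Package names as used in this thread; gymfc is a dependency of gymfc_nf.
print(missing_packages(["gymfc", "gymfc_nf"]))
```

An empty list means both packages are importable from the active environment, which is worth confirming inside the same virtualenv used to run the examples.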

  1. OpenAI Baselines is only needed for training; you need it installed before you run the PPO example. That's a typo: it should say to install it with pip3.

  2. Pip is used to install Python packages. If you already have the gymfc_nf package installed, the pip install in step 3 is redundant.

Since this thread hasn't been about the PPO example specifically, I'm going to close it. Before you get to the examples you need to verify that your motor and sensor plugins are installed and operating correctly, which you can do using the GymFC test scripts.

Feel free to open a follow up issue if needed.

@wil3 wil3 closed this as completed Jul 15, 2020