Fix issues relating to resets and the reward function #3
Conversation
By the way, this is the result of multiple changes to the environment while trying to get a working model, so there may be commits for experiments that didn't work out, like the random spawning I initially added and then removed after testing. I would recommend looking only at the final diff.
@aivora-beamng any feedback?
Thanks again for your contribution! Before we can proceed with the review and merge, please make sure to sign the BeamNG Contributor License Agreement (CLA) as outlined in our contributing guide. You can sign the CLA using the linked form. Once you've completed the form, just let us know here, and we'll move forward with reviewing and testing your pull request. Thanks again!
Hi, I've already signed it.
Hi, any updates? Also, why is the CLA required in this repo? I know that BeamNG.tech is proprietary, but this repo is licensed MIT, so why add the extra friction of a CLA, which some would be hesitant to sign?
@AbdelrahmanElsaidElsawy Hi, in an email you sent me an issue about the README code raising a ValueError, but it doesn't seem to be appearing here on GitHub.
Hi @The-Real-Thisas, Thanks again for your contribution! Just a quick update: testing is currently ongoing. We're evaluating your changes alongside another active branch as part of our efforts to improve the environment. We also have an upcoming release planned very soon, so we're coordinating the timing of merges carefully. Once we've had a chance to fully evaluate everything, we'll proceed with merging accordingly. We appreciate your patience and support! Best regards,
No issues. I also want to implement more things in the project, starting with a full overhaul of the documentation, including examples of using the gym environment with RL frameworks like SB3, and later adding an additional environment that supports camera-based observations for training vision-based models.

I also think it's important to prioritize adding parallelization capabilities to the environment, so that training can be done faster and more diverse samples can be collected. For this it would be useful to have contact with the BeamNG team, and in turn with the BeamNG.tech developers, so that we can coordinate and implement these features. On that note, is there a possibility of doing an internship with BeamNG? I've already sent an email to [email protected] but have yet to receive a reply. If you think the additional support would be helpful, I would appreciate a kind word with the team.

I think we can really increase adoption of BeamNG.tech if we create environments for out-of-the-box training with popular RL frameworks, just as other physics-based simulators do. Do let me know your thoughts.
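As a rough illustration of the parallelization idea, a vectorized training loop with SB3 might look like the sketch below. The env id, the per-worker simulator setup, and multi-env support for off-policy algorithms in the installed SB3 version are all assumptions here, not things the current repo provides:

```python
# Hypothetical sketch of parallel training with SB3's SubprocVecEnv.
# Assumptions: the env registers under an id like "BNG-WCA-Race-Geometry-v0",
# each worker can attach to its own BeamNG.tech instance, and the installed
# SB3 version supports multi-env training for off-policy algorithms.
import gym
from stable_baselines3 import SAC
from stable_baselines3.common.vec_env import SubprocVecEnv

ENV_ID = "BNG-WCA-Race-Geometry-v0"  # assumed id

def make_env(rank: int):
    def _init():
        # In practice each subprocess would need its own simulator
        # instance/port -- exactly the capability proposed above.
        return gym.make(ENV_ID)
    return _init

if __name__ == "__main__":
    vec_env = SubprocVecEnv([make_env(i) for i in range(4)])
    model = SAC("MlpPolicy", vec_env, verbose=1)
    model.learn(total_timesteps=100_000)
```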
Hi @The-Real-Thisas, Thanks again for your contribution! Just a heads-up: the master branch has recently been updated. Sorry for the late notice; the repo hadn't been maintained for a while. When you have a chance, please rebase your changes onto the latest master so we can test and merge more smoothly. Let us know if you need help with that! Best,
Hi @The-Real-Thisas, Thank you for sharing your plans; we're really happy to see your enthusiasm and the great ideas you have for improving the project. Your contributions and suggestions are very valuable, and we agree that making BeamNG.gym more accessible and robust could greatly enhance its adoption in the RL community. Regarding your internship inquiry: our HR team has received your email and will take your application into consideration. If there's a suitable opportunity, they'll get back to you directly. Thanks again for your continued efforts and contributions!
I could not get the original env to work when trying to train an agent via SB3 using SAC. The environment just repeatedly reset. Normally this has to do with some weird input the model applies to the environment, and in this case I assumed it was just backwards movement triggering the reset. But after a few hours the agent had made no progress, and that was a red flag. I didn't assume it was an issue with the env at first, because the sample code seems to work, but when I looked at the logs, in some attempts the agent did apply full throttle, and reversing was actually impossible because the agent is stuck in first gear (unless it rolls down a slope, which was not the case here).
So instead of letting the model choose the actions, I quickly mocked up a demo script that lets me drive within the environment, and what I discovered was that the reset was constantly being triggered. Moreover, when I didn't reset the environment and just drove around the track, the reward was always -1 no matter how fast or slow I went; it only started behaving normally when I drove backwards. That was when I realized the reward function was totally broken, and so was the resetting code.
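A minimal sketch of that kind of debug loop is below. The env id and the action layout are assumptions for illustration, not taken from this repo's code:

```python
# Debug sketch: step the env with a constant forward action and print
# the reward and done flag to see when resets trigger and what the
# reward actually is. Env id and action layout are assumed.
import gym

env = gym.make("BNG-WCA-Race-Geometry-v0")  # assumed id
obs = env.reset()
for step in range(500):
    action = [0.0, 0.5]  # assumed [steering, throttle] layout
    obs, reward, done, info = env.step(action)
    print(f"step={step:4d} reward={reward:+.3f} done={done}")
    if done:
        # With the broken logic this fires almost immediately,
        # even while driving forward on the track.
        obs = env.reset()
```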
The spline code is for the most part fine and actually seems to work; among a few other things, I basically had to rework the reward function and fix the reset logic (a sketch of the reward idea is below).
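For illustration only, this is the idea rather than the exact code in the diff: a progress-based reward makes forward motion along the track positive and reversing negative, instead of a constant -1.

```python
# Illustrative sketch of a progress-based reward, not the code in this PR:
# reward the signed arc-length progress along the centerline spline each
# step, so driving forward is positive, standing still is ~0, and
# reversing is negative. The helper name and arguments are hypothetical.
def progress_reward(prev_spline_pos: float, curr_spline_pos: float,
                    scale: float = 1.0) -> float:
    return scale * (curr_spline_pos - prev_spline_pos)
```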
I was finally able to train an SAC model that completed the track (albeit quite slowly). Unfortunately, although I want to share the model file so you can test it yourself, the GPU of my workstation stopped working and I don't have integrated graphics, so I can't retrieve the model file because the PC won't boot.
However, even without any hyperparameter tuning, within about an hour you should see a working model with my version of the code using SAC and SB3.
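For reference, the kind of run described above might look like this sketch (env id and step budget are assumptions; SB3 defaults, no tuning):

```python
# Sketch of the training run described above: SB3's SAC with default
# hyperparameters on the fixed environment. Env id and step budget
# are assumptions; wall-clock time depends heavily on hardware.
import gym
from stable_baselines3 import SAC

env = gym.make("BNG-WCA-Race-Geometry-v0")  # assumed id
model = SAC("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=200_000)
model.save("sac_beamng_track")
```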