Fix issues relating to resets and the reward function #3
Conversation
By the way, this is the result of multiple changes to the environment while trying to get a working model, so there may be commits for experiments that didn't work out, like the random spawning I initially added and then removed after testing. I would recommend looking only at the final diff.
@aivora-beamng any feedback?
Thanks again for your contribution! Before we can proceed with the review and merge, please make sure to sign the BeamNG Contributor License Agreement (CLA) as outlined in our contributing guide. You can sign the CLA using the linked form. Once you've completed the form, just let us know here, and we'll move forward with reviewing and testing your pull request. Thanks again!
Hi, I've already signed it.
Hi, any updates? Also, why is the CLA required in this repo? I know that BeamNG.tech is proprietary, but this repo is licensed MIT, so why add the extra friction of a CLA, which some would be hesitant to sign?
@AbdelrahmanElsaidElsawy Hi, in an email you sent me an issue about the README code raising a ValueError, but it doesn't seem to be appearing here on GitHub.
Hi @The-Real-Thisas, Thanks again for your contribution! Just a quick update: testing is currently ongoing. We're evaluating your changes alongside another active branch as part of our efforts to improve the environment. We also have an upcoming release planned very soon, so we're coordinating the timing of merges carefully. Once we've had a chance to fully evaluate everything, we'll proceed with merging accordingly. We appreciate your patience and support! Best regards,
No issues. I also want to implement more things in the project, starting with a full overhaul of the documentation, including examples of using the gym environment with RL frameworks like SB3, and later adding an additional environment that supports camera-based observations for training vision-based models.

I also think it's important to prioritize adding parallelization capabilities to the environment, so that training can be done faster and more diverse samples can be collected. For this it would be useful to have contact with the BeamNG team, and in turn with the BeamNG.tech developers, so that we can coordinate and implement these features. On that note, is there a possibility of doing an internship with BeamNG? I've already sent an email to [email protected] but have yet to receive a reply. If you think the additional support would be helpful, I would appreciate a kind word with the team.

I think we can really increase adoption of BeamNG.tech if we create environments for out-of-the-box training with popular RL frameworks, just as other physics-based simulators do. Do let me know your thoughts.
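As a rough illustration of the parallelization idea, a vectorized training loop with SB3 might look like the sketch below. The env id, the per-worker simulator setup, and multi-env support for off-policy algorithms in the installed SB3 version are all assumptions here, not things the current repo provides:

```python
# Hypothetical sketch of parallel training with SB3's SubprocVecEnv.
# Assumptions: the env registers under an id like "BNG-WCA-Race-Geometry-v0",
# each worker can attach to its own BeamNG.tech instance, and the installed
# SB3 version supports multi-env training for off-policy algorithms.
import gym
from stable_baselines3 import SAC
from stable_baselines3.common.vec_env import SubprocVecEnv

ENV_ID = "BNG-WCA-Race-Geometry-v0"  # assumed id

def make_env(rank: int):
    def _init():
        # In practice each subprocess would need its own simulator
        # instance/port -- exactly the capability proposed above.
        return gym.make(ENV_ID)
    return _init

if __name__ == "__main__":
    vec_env = SubprocVecEnv([make_env(i) for i in range(4)])
    model = SAC("MlpPolicy", vec_env, verbose=1)
    model.learn(total_timesteps=100_000)
```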
Hi @The-Real-Thisas, Thanks again for your contribution! Just a heads-up: the master branch has recently been updated. Sorry for the late notice; the repo hadn't been maintained for a while. When you have a chance, please rebase your changes onto the latest master so we can test and merge more smoothly. Let us know if you need help with that! Best,
Hi @The-Real-Thisas, Thank you for sharing your plans; we're really happy to see your enthusiasm and the great ideas you have for improving the project. Your contributions and suggestions are very valuable, and we agree that making BeamNG.gym more accessible and robust could greatly enhance its adoption in the RL community. Regarding your internship inquiry: our HR team has received your email and will take your application into consideration. If there's a suitable opportunity, they'll get back to you directly. Thanks again for your continued efforts and contributions!
I could not get the original env to work when trying to train an agent via SB3 using SAC. The environment just repeatedly reset. Normally this has to do with some weird input the model applies to the environment, and in this case I assumed it was just backwards movement triggering the reset. But after a few hours the agent had made no progress, and that was a red flag. I didn't assume it was an issue with the env at first, because the sample code seems to work, but when I looked at the logs, in some attempts the agent did apply full throttle, and reversing was actually impossible because the agent is stuck in first gear (unless it rolls down a slope, which was not the case here).
So instead of letting the model choose the actions, I quickly mocked up a demo script that lets me drive within the environment, and what I discovered was that the reset was constantly being triggered. Moreover, when I didn't reset the environment and just drove around the track, the reward was always -1 no matter how fast or slow I went; it only started behaving normally when I drove backwards. That was when I realized the reward function was totally broken, and so was the resetting code.
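A minimal sketch of that kind of debug loop is below. The env id and the action layout are assumptions for illustration, not taken from this repo's code:

```python
# Debug sketch: step the env with a constant forward action and print
# the reward and done flag to see when resets trigger and what the
# reward actually is. Env id and action layout are assumed.
import gym

env = gym.make("BNG-WCA-Race-Geometry-v0")  # assumed id
obs = env.reset()
for step in range(500):
    action = [0.0, 0.5]  # assumed [steering, throttle] layout
    obs, reward, done, info = env.step(action)
    print(f"step={step:4d} reward={reward:+.3f} done={done}")
    if done:
        # With the broken logic this fires almost immediately,
        # even while driving forward on the track.
        obs = env.reset()
```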
The spline code is for the most part fine and actually seems to work; among a few other things, I basically had to rework the reward function and fix the reset logic (a sketch of the reward idea is below).
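For illustration only, this is the idea rather than the exact code in the diff: a progress-based reward makes forward motion along the track positive and reversing negative, instead of a constant -1.

```python
# Illustrative sketch of a progress-based reward, not the code in this PR:
# reward the signed arc-length progress along the centerline spline each
# step, so driving forward is positive, standing still is ~0, and
# reversing is negative. The helper name and arguments are hypothetical.
def progress_reward(prev_spline_pos: float, curr_spline_pos: float,
                    scale: float = 1.0) -> float:
    return scale * (curr_spline_pos - prev_spline_pos)
```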
I was finally able to train an SAC model that completed the track (albeit quite slowly). Unfortunately, although I want to share the model file so you can test it yourself, the GPU of my workstation stopped working and I don't have integrated graphics, so I can't retrieve the model file because the PC won't boot.
However, even without any hyperparameter tuning, within about an hour you should see a working model with my version of the code using SAC and SB3.
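For reference, the kind of run described above might look like this sketch (env id and step budget are assumptions; SB3 defaults, no tuning):

```python
# Sketch of the training run described above: SB3's SAC with default
# hyperparameters on the fixed environment. Env id and step budget
# are assumptions; wall-clock time depends heavily on hardware.
import gym
from stable_baselines3 import SAC

env = gym.make("BNG-WCA-Race-Geometry-v0")  # assumed id
model = SAC("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=200_000)
model.save("sac_beamng_track")
```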