Plots or graphs #46

Open
SyedHasnat opened this issue Jan 18, 2024 · 0 comments

Hello,
I hope this message finds you well. My name is Syed Hasnat, and I am currently working on a project that involves implementing the TD3 algorithm. I found your research paper on TD3 extremely insightful, and I am particularly interested in reproducing the graphs from the paper for my work.

I have been exploring the open-source code of TD3, but I'm facing some challenges in extracting the specific parameters used to generate the graphs mentioned in your paper. I would greatly appreciate it if you could share the code snippets or guide me on which parameters from the open-source code were used to create those graphs.

Furthermore, the paper states:

"In Figure 1, we graph the average value estimate over 10000 states and compare it to an estimate of the true value. The true value is estimated using the average discounted return over 1000 episodes following the current policy, starting from states sampled from the replay buffer."

So, how are the 10,000 states and the 1,000 episodes related?
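To make the question concrete, here is a minimal sketch of how I currently read the quoted passage: the two averages are taken over two separate samples, one of critic outputs and one of Monte Carlo returns. This is only my interpretation, not the authors' code. The function names, the input lists, and the discount of 0.99 are all assumptions for illustration.

```python
from statistics import mean


def discounted_return(rewards, discount=0.99):
    """Sum of discount**t * r_t over the rewards of one episode."""
    total = 0.0
    # Accumulate backwards so each step applies one extra discount factor.
    for r in reversed(rewards):
        total = r + discount * total
    return total


def overestimation_point(q_estimates, episode_rewards, discount=0.99):
    """Return (average value estimate, Monte Carlo estimate of the true value).

    q_estimates: critic outputs Q(s, pi(s)) for sampled states
                 (hypothetically 10000 of them, as in the quote).
    episode_rewards: one reward list per rollout of the current policy
                     (hypothetically 1000 rollouts, as in the quote).
    """
    avg_estimate = mean(q_estimates)
    true_value = mean(discounted_return(r, discount) for r in episode_rewards)
    return avg_estimate, true_value
```

Under this reading, calling `overestimation_point` once per evaluation step would give one point on each of the two curves, and the gap between them would be the overestimation.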
After that, the paper says:

"A very clear overestimation bias occurs from the learning procedure, which contrasts with the novel method that we describe in the following section, Clipped Double Q-learning, which greatly reduces overestimation by the critic."

So, where does the novel method appear in the graph?
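For reference, this is how I understand the Clipped Double Q-learning target from the paper: the target uses the minimum of the two target critics' estimates for the next state. The sketch below is my own simplified scalar version, not the repository's code. The function name and arguments are hypothetical, and the default discount of 0.99 is an assumption.

```python
def clipped_double_q_target(reward, not_done, q1_next, q2_next, discount=0.99):
    """Target value y = r + discount * min(Q1', Q2') for one transition.

    q1_next, q2_next: the two target critics' estimates at the next state
                      and the target policy's action.
    not_done: 1.0 if the episode continues after this transition, else 0.0.
    """
    return reward + not_done * discount * min(q1_next, q2_next)
```

Taking the minimum means the target never exceeds what a single critic would give, which is my understanding of why it should reduce overestimation.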

Overall, I am sorry to say that I do not understand how these graphs were plotted or which data should be used to plot them.

Your assistance will be invaluable for my project, and I believe it will enhance the overall quality of my work. I understand your time is valuable, and I appreciate any support you can provide.

Thank you for your consideration. Looking forward to your guidance.

Best regards,

Syed Hasnat
USPCAS-E (US Pakistan Center For Advanced Studies in Energy)
[email protected]
