Which commit matches your published work (for HumanEval-rs)? #23

Open
geekoftheweek opened this issue Oct 9, 2023 · 5 comments

Comments

@geekoftheweek

I'd like to run your code as it was implemented for the original paper. Would it be possible to add a link in the README that points to the specific commit representing the code used to achieve your results as published in the Reflexion paper? Thanks!

@geekoftheweek
Author

Critical reading failure on my part -- I see now that you do reference how to find the original code. Closing this. Thank you!

@geekoftheweek geekoftheweek changed the title Which commit matches your published work? Which commit matches your published work (for HumanEval-rs)? Oct 9, 2023
@geekoftheweek
Author

Reopening, as I can't find any code in reflexion-draft that references your published results against HumanEval-rs.

@geekoftheweek geekoftheweek reopened this Oct 9, 2023
@noahshinn
Owner

Hi @geekoftheweek, you can use the run script here: https://github.com/noahshinn024/reflexion/blob/main/programming_runs/run_reflexion.sh. The only changes since the original code have been refactorings. Let me know if you have any questions.

@ai-nikolai

@noahshinn @geekoftheweek, thanks for this thread.

Are there any updates on the above?

Also, I am struggling to understand the results on alfworld (and how to reproduce them). For example, envs 1-3 all seem to have been successful for you, but they never reach success when I run the script. (Unless the order of envs is different?)

Thank you.
