The HoarePrompt-Experiments guide provides instructions for running the HoarePrompt tool over an entire dataset.

Before running the project, ensure that the necessary environment variables are set, including `GROQ_API_KEY` and `OPENAI_API_KEY`. These keys are required to access the APIs that power the LLM services used in this experiment.
It is recommended to create a virtual environment to ensure dependency isolation and avoid conflicts with other projects. Follow these steps:
```bash
# Create a virtual environment
python3 -m venv hoareprompt-env

# Activate the virtual environment
source hoareprompt-env/bin/activate
```
Once your virtual environment is activated, install the necessary dependencies:
```bash
pip install -r requirements.txt
```
Additionally, install the `hoareprompt` package locally using an editable install:

```bash
pip install -e /path/to/hoareprompt
```

Replace `/path/to/hoareprompt` with the actual path to the `hoareprompt` directory from the GitHub repository. The `-e` flag enables editable mode, meaning any changes made to the `hoareprompt` package are immediately reflected without requiring reinstallation.
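As a quick sanity check, you can verify that the package is visible in the environment (assuming the installed package name matches the `hoareprompt` directory):

```bash
# Filter the installed-package list for the hoareprompt install;
# `pip list -v` additionally prints the source location of editable packages
pip list | grep -i hoareprompt
```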
Depending on the LLM service you intend to use, set the appropriate environment variables:

For OpenAI models:

```bash
export OPENAI_API_KEY="your-openai-api-key"
```

For Groq models:

```bash
export GROQ_API_KEY="your-groq-api-key"
```

To persist these variables across terminal sessions, add the above export commands to your `.bashrc` or `.zshrc` file.
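For example, on a bash setup (substitute your actual keys, and use `~/.zshrc` for zsh):

```bash
# Append the exports to the shell profile and reload it in the current session
echo 'export OPENAI_API_KEY="your-openai-api-key"' >> ~/.bashrc
echo 'export GROQ_API_KEY="your-groq-api-key"' >> ~/.bashrc
source ~/.bashrc
```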
This guide describes how to run multiple JSON-formatted test cases efficiently. Configuration options are the same as those detailed in the main HoarePrompt README (`../README.md`).
The input data file should be a JSON file containing test cases with the following fields (a hypothetical example follows the list):

- `description`: Natural language description of the problem.
- `correct`: Ground truth label indicating correctness.
- `unique_id`: Unique identifier for the test case.
- `task_id`: Identifier related to the problem set.
- `generated_code`: The program code to be evaluated.
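For illustration, here is a minimal sketch of an input file; the file path, the top-level JSON array layout, and all field values are assumptions for the sketch, not taken from the actual dataset:

```bash
# Hypothetical example: create a one-entry input file with the required fields
mkdir -p data
cat > data/input.json <<'EOF'
[
  {
    "description": "Given an integer n, print the sum of the integers from 1 to n.",
    "correct": true,
    "unique_id": "example-0001",
    "task_id": "sum-to-n",
    "generated_code": "n = int(input())\nprint(n * (n + 1) // 2)"
  }
]
EOF
```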
You can use our dataset located at `../Results/Dataset`.
Execute the following command, specifying the data file, log directory, and configuration file paths:
```bash
python3 -m src.main_verify --data /path/to/data/file --log /path/to/log/dir --config /path/to/config/file
```
- `--data`: Path to the input JSON data file.
- `--log`: Directory where logs will be stored.
- `--config`: Path to the configuration file.

For example:

```bash
python3 -m src.main_verify --data data/input.json --log logs/experiment1 --config configs/custom_config.json
```
If you do not provide the `--config` argument, the default configuration file (`default_config.json`) will be used.
The `main_verify` script should be used for running HoarePrompt with all available classifiers except for `tester`.

If you want to use the `tester` classifier, run the following command with the corresponding tester configuration:
```bash
python3 -m src.main_tester --data /path/to/data/file --log /path/to/log/dir --config /path/to/config/file
```

This ensures compatibility when testing HoarePrompt or HoarePrompt-No-Unroll configurations, depending on the specified setup.
This guide provides a streamlined approach for executing HoarePrompt experiments efficiently. Refer to the main HoarePrompt documentation (`../README.md`) for more in-depth details on configuration options and parameter settings.
The HoarePrompt project consists of multiple repositories, each serving a specific function:

- HoarePrompt - The core repository containing the implementation of HoarePrompt, including the logic for analyzing programs, state propagation, and correctness verification.
- A dedicated repository for storing experimental results and datasets related to HoarePrompt evaluations. This includes correctness assessments, counterexamples, and other findings from our testing.
- HoarePrompt-experiments (the repository you are currently on) - Provides scripts and configurations to run large-scale experiments with HoarePrompt on various datasets. It includes automated evaluation scripts and batch processing tools.
- CoCoClaNeL - The repository containing the dataset and associated documentation.
This dataset is released under the MIT License.
See the LICENSE file for details.