Dataset: data/dilemmas_with_detail_by_action.csv
Croissant metadata: metadata/croissant
Three use cases:
1. Evaluate model preferences on everyday dilemmas.
2. Evaluate model preferences on dilemmas based on a given principle.
3. Evaluate the model's steerability by system prompt modulation on each given principle.
We used gpt-4-turbo as demonstrated below. For replication, we also provide our model responses on the dilemmas dataset from five models, along with the corresponding analysis datasets, so they do not need to be re-run (see the replication_purpose flag).
$ conda create --name <env> --file requirements.txt
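The scripts that call the OpenAI API accept the key via the optional --api_key argument, a .env file, or an environment variable. A minimal setup sketch, assuming the scripts read the standard OPENAI_API_KEY variable (the variable name is an assumption, not confirmed by the scripts):
$ conda activate <env>
$ export OPENAI_API_KEY="sk-..."   # assumed variable name; or put OPENAI_API_KEY=... in a .env file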
Case 1: Evaluate model preferences on everyday dilemmas
1. Evaluate the model on the dilemmas dataset.
python eval/evaluate_model_on_dilemma.py \
--output_jsonl_file_div <output_jsonl_file_div> \
--model <gpt-4-turbo> \
--model_name_for_variable <own_created_name_e.g.gpt4> \
--replication_purpose <full/only_first_five> \
--api_key <optional_you_can_set_as_.env_or_global>
Required Arguments with suggestions:
- output_jsonl_file_div: eval/model_response_from_eval
- model: gpt-4-turbo
- replication_purpose: [full, only_first_five] -- run on the full dilemmas dataset or only the first five dilemmas
Optional Arguments:
- api_key: you can pass the OpenAI API key as an argument; alternatively, use a .env file or set it as an environment variable
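For example, a quick replication run on only the first five dilemmas, using the suggested paths above (the gpt4 label for --model_name_for_variable is an illustrative choice):
python eval/evaluate_model_on_dilemma.py \
--output_jsonl_file_div eval/model_response_from_eval \
--model gpt-4-turbo \
--model_name_for_variable gpt4 \
--replication_purpose only_first_five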
2. Combine the model responses from step 1 with the dilemmas dataset, separated by action.
python eval/combine_model_eval_with_dilemmas_action_separate.py \
--input_model_resp_file <step_1_output_jsonl_file> \
--input_dilemma_file <MoralDilemma_dataset> \
--output_analysis_file <output_analysis_file_div> \
--model <model_name_for_variable>
Required Arguments with suggestions:
- input_model_resp_file: eval/model_response_from_eval
- input_dilemma_file: data/dilemmas_with_detail_by_action.csv, or data/dilemmas_with_detail_by_action_test.csv if only the first five dilemmas were used in step 1
- output_analysis_file: eval/model_response_from_eval -- a file named <model_name>_eval_on_dilemmas_with_action_separate.csv will be created
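Continuing the illustrative run from step 1 (the gpt4.jsonl filename is hypothetical -- check eval/model_response_from_eval for the actual file produced by step 1):
python eval/combine_model_eval_with_dilemmas_action_separate.py \
--input_model_resp_file eval/model_response_from_eval/gpt4.jsonl \
--input_dilemma_file data/dilemmas_with_detail_by_action_test.csv \
--output_analysis_file eval/model_response_from_eval \
--model gpt4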
3. Analyze the combined model responses with dilemma details from step 2 based on the five theories.
python analysis/analysis_for_model_responses_five_theories/analyze_model_resp_with_five_theories.py \
--input_model_resp_file <output_from_step_2> \
--output_analysis_file <e.g.analysis/analysis_for_model_responses_five_theories/model_resp.csv> \
--models <model_names>
Required Arguments with suggestions:
- input_model_resp_file: the output file from step 2, e.g., eval/model_response_from_eval/<model_name>_eval_on_dilemmas_with_action_separate.csv. We also provide the six model responses in eval/model_response_from_eval/all_models_eval_on_dilemmas_with_detail_by_action.csv
- models: gpt4; for all six models, gpt4 gpt35 llama2 llama3 mixtral_rerun claude
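For example, analyzing our provided six-model responses file, with paths following the suggestions above:
python analysis/analysis_for_model_responses_five_theories/analyze_model_resp_with_five_theories.py \
--input_model_resp_file eval/model_response_from_eval/all_models_eval_on_dilemmas_with_detail_by_action.csv \
--output_analysis_file analysis/analysis_for_model_responses_five_theories/model_resp.csv \
--models gpt4 gpt35 llama2 llama3 mixtral_rerun claude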
4. Create plots from the analysis file in step 3.
python analysis/analysis_for_model_responses_five_theories/create_plot.py \
--input_analysis_file <output_path_in_step_3/model_resp.csv> \
--output_graph_div <path_for_graphs>
Required Arguments with suggestions:
- input_analysis_file: output from step 3, e.g., analysis/analysis_for_model_responses_five_theories/model_resp.csv
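A concrete invocation continuing the example from step 3 (the graph output directory is an illustrative choice):
python analysis/analysis_for_model_responses_five_theories/create_plot.py \
--input_analysis_file analysis/analysis_for_model_responses_five_theories/model_resp.csv \
--output_graph_div analysis/analysis_for_model_responses_five_theories/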
Case 2: Evaluate model preferences on dilemmas based on a given principle
1. Get the relevant values for each principle from the collected values by prompting the model 10 times.
python eval_for_system_prompt_modulation/get_values_for_principle.py \
--input_principle_file <data/principle_company.csv> \
--output_jsonl_file_div <e.g._eval_for_system_prompt_modulation/values_and_system_prompt_for_principle> \
--model <openai_model_e.g._gpt-4-0125-preview> \
--replication_purpose <full/only_first_two_principles> \
--api_key <optional_you_can_set_as_.env_or_global>
Required Arguments with suggestions:
- input_principle_file: data/principle_openai.csv
- model: gpt-4-0125-preview
- replication_purpose: [full, only_first_two_principles]
Optional Arguments:
- api_key: you can pass the OpenAI API key as an argument; alternatively, use a .env file or set it as an environment variable
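A concrete invocation with the suggested values, restricted to the first two principles for a quick replication:
python eval_for_system_prompt_modulation/get_values_for_principle.py \
--input_principle_file data/principle_openai.csv \
--output_jsonl_file_div eval_for_system_prompt_modulation/values_and_system_prompt_for_principle \
--model gpt-4-0125-preview \
--replication_purpose only_first_two_principles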
2. Calculate each value's relevance using the empirical probabilities over the 10 model responses from step 1.
python eval_for_system_prompt_modulation/calc_value_relevance.py \
--input_system_prompt_file <output_from_step_1> \
--output_jsonl_file_div <e.g._eval_for_system_prompt_modulation/values_and_system_prompt_for_principle>
Required Arguments with suggestions:
- input_system_prompt_file: output from step 1, e.g., eval_for_system_prompt_modulation/values_and_system_prompt_for_principle/principle_openai.csv
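For example, with the suggested paths from step 1:
python eval_for_system_prompt_modulation/calc_value_relevance.py \
--input_system_prompt_file eval_for_system_prompt_modulation/values_and_system_prompt_for_principle/principle_openai.csv \
--output_jsonl_file_div eval_for_system_prompt_modulation/values_and_system_prompt_for_principle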
3. Get the value conflicts and relevant dilemmas for each principle.
python eval_for_system_prompt_modulation/get_values_conflicts_and_relevant_dilemmas_for_principle.py \
--input_principle_file <output_from_step_2_e.g._principle_company_clean.csv> \
--input_dilemma_file <output_from_case_1_step_2> \
--output_jsonl_file_div <eval_for_system_prompt_modulation/values_and_system_prompt_for_principle> \
--model <model_name>
Required Arguments with suggestions:
- input_principle_file: output from step 2, e.g., eval_for_system_prompt_modulation/values_and_system_prompt_for_principle/<model_name>_clean.csv
- input_dilemma_file: output from step 2 of Case 1. We also provide the six model responses: eval/model_response_from_eval/all_models_eval_on_dilemmas_with_detail_by_action.csv
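An illustrative invocation (the principle_openai_clean.csv filename and the gpt4 model name are inferred from the suggestions above and may differ in practice):
python eval_for_system_prompt_modulation/get_values_conflicts_and_relevant_dilemmas_for_principle.py \
--input_principle_file eval_for_system_prompt_modulation/values_and_system_prompt_for_principle/principle_openai_clean.csv \
--input_dilemma_file eval/model_response_from_eval/all_models_eval_on_dilemmas_with_detail_by_action.csv \
--output_jsonl_file_div eval_for_system_prompt_modulation/values_and_system_prompt_for_principle \
--model gpt4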
Case 3: Evaluate the model's steerability by system prompt modulation on each given principle
1. Generate system prompts for each principle -- one steering toward supporting values and one steering toward opposing values.
python eval_for_system_prompt_modulation/generate_system_prompts_for_principle.py \
--input_system_prompt_file <output_from_step_3_in_case_2> \
--output_jsonl_file_div <e.g._eval_for_system_prompt_modulation/values_and_system_prompt_for_principle> \
--model gpt-4-turbo \
--api_key <optional_you_can_set_as_.env_or_global>
Required Arguments with suggestions:
- input_system_prompt_file: output from step 3 of Case 2, e.g., <output_path>/principle_gpt4_eval_with_values_conflicts_and_dilemma.csv
Optional Arguments:
- api_key: you can pass the OpenAI API key as an argument; alternatively, use a .env file or set it as an environment variable
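An illustrative invocation, assuming step 3 of Case 2 wrote its output to the values_and_system_prompt_for_principle directory (the input filename follows the example pattern above):
python eval_for_system_prompt_modulation/generate_system_prompts_for_principle.py \
--input_system_prompt_file eval_for_system_prompt_modulation/values_and_system_prompt_for_principle/principle_gpt4_eval_with_values_conflicts_and_dilemma.csv \
--output_jsonl_file_div eval_for_system_prompt_modulation/values_and_system_prompt_for_principle \
--model gpt-4-turbo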
2. Evaluate the model on the dilemmas under each generated system prompt.
python eval_for_system_prompt_modulation/evaluate_model_on_system_prompt.py \
--input_system_prompt_file <output_from_step_1> \
--input_dilemma_file <data/dilemmas_with_detail_by_action.csv> \
--output_jsonl_file_div <e.g._eval_for_system_prompt_modulation/model_response_on_system_prompt/> \
--model gpt-4-turbo \
--api_key <optional_you_can_set_as_.env_or_global>
Required Arguments with suggestions:
- input_system_prompt_file: output from step 1, e.g., <output_path>/<model>_with_system_prompts.csv
- input_dilemma_file: MoralDilemmas data: data/dilemmas_with_detail_by_action.csv
- model: gpt-4-turbo
Optional Arguments:
- api_key: you can pass the OpenAI API key as an argument; alternatively, use a .env file or set it as an environment variable
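An illustrative invocation (the gpt-4-turbo_with_system_prompts.csv filename is a hypothetical fill of the <model>_with_system_prompts.csv pattern above):
python eval_for_system_prompt_modulation/evaluate_model_on_system_prompt.py \
--input_system_prompt_file eval_for_system_prompt_modulation/values_and_system_prompt_for_principle/gpt-4-turbo_with_system_prompts.csv \
--input_dilemma_file data/dilemmas_with_detail_by_action.csv \
--output_jsonl_file_div eval_for_system_prompt_modulation/model_response_on_system_prompt/ \
--model gpt-4-turbo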
3. Analyze the model responses with system prompts from step 2.
python analysis/analysis_for_system_prompt_modulation/analyze_model_resp_on_system_prompt.py \
--input_principle_file <output_from_step_2> \
--input_dilemma_file <output_from_case_1_step_2_or_provided_file> \
--output_div <e.g.analysis/analysis_for_system_prompt_modulation> \
--steer <sup/opp> \
--company_name <use_for_create_output_file_name_e.g._openai> \
--model <the_name_used_for_field_and_also_for_output_file_name>
Required Arguments with suggestions:
- input_principle_file: output from step 2, e.g., eval_for_system_prompt_modulation/model_response_on_system_prompt/{model}_eval.csv
- input_dilemma_file: output from Step 1 in Case 1: eval/model_response_from_eval/<model_name>_eval_on_dilemmas_with_detail_by_action.csv, or our provided six model responses: eval/model_response_from_eval/all_models_eval_on_dilemmas_with_detail_by_action.csv
- steer: sup analyzes the model responses under system prompts steering toward supporting values; opp analyzes those steering toward opposing values
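An illustrative run for the supporting-value direction (the gpt4_eval.csv filename and the openai company name are hypothetical fills of the patterns above; run it again with --steer opp for the opposing direction):
python analysis/analysis_for_system_prompt_modulation/analyze_model_resp_on_system_prompt.py \
--input_principle_file eval_for_system_prompt_modulation/model_response_on_system_prompt/gpt4_eval.csv \
--input_dilemma_file eval/model_response_from_eval/all_models_eval_on_dilemmas_with_detail_by_action.csv \
--output_div analysis/analysis_for_system_prompt_modulation \
--steer sup \
--company_name openai \
--model gpt4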
4. Create plots from the supporting-value and opposing-value analysis files.
python analysis/analysis_for_system_prompt_modulation/create_plot.py \
--input_support_value_file <path/...system_prompt_sup_value.csv> \
--input_oppose_value_file <path/...system_prompt_opp_value.csv> \
--output_graph_div <analysis/analysis_for_system_prompt_modulation/> \
--replication_purpose <only_two_principles/or/full>
Required Arguments with suggestions:
- input_[support/oppose]_value_file: output from step 3 with steer == [sup/opp]: principle_<companyname>_<modelname>_eval_with_system_prompt_system_prompt_[sup/opp]_value.csv
- replication_purpose: [only_two_principles/full], depending on the replication_purpose setting used in step 1 of Case 2
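A concrete invocation continuing the illustrative openai/gpt4 names from step 3:
python analysis/analysis_for_system_prompt_modulation/create_plot.py \
--input_support_value_file analysis/analysis_for_system_prompt_modulation/principle_openai_gpt4_eval_with_system_prompt_system_prompt_sup_value.csv \
--input_oppose_value_file analysis/analysis_for_system_prompt_modulation/principle_openai_gpt4_eval_with_system_prompt_system_prompt_opp_value.csv \
--output_graph_div analysis/analysis_for_system_prompt_modulation/ \
--replication_purpose only_two_principles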
