graceebc9/agent_intentions

Agent Intentions

Introduction / Summary

This GitHub repository is the companion to the paper "Evaluating Language Model Language Traits".

Repository Structure

/LGBT

/COHERENCE

This subdirectory contains all the code and data for measuring logical coherence and accuracy (using the Leap-of-Thought dataset).

/HHH

This subdirectory contains all the code and data for generating the Helpful, Honest, Harmless (HHH) dataset and for the subsequent testing.

/UII

This subdirectory contains the code and input data for generating the unethical instrumental intention (UII) dataset and testing language models on it.

Others

  • all_plots.ipynb: Contains the code that generates the distribution plots in Figures 1, 3, and 4 of the paper.
  • tqa.py: Contains the code that generates the distribution plots in Figures 5 and 6 of the paper.
  • requirements.txt: Lists the required dependencies. Run pip install -r requirements.txt to install all the packages.
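
After running the install command, it can be useful to confirm that every listed dependency is actually present in the active environment. The following is a minimal sketch, not part of the repository: the helper names `requirement_names` and `missing_packages` are hypothetical, and the parsing assumes simple requirement lines (a package name optionally followed by a version specifier), which may not match every entry in the real requirements.txt.

```python
import re
from importlib import metadata
from pathlib import Path


def requirement_names(text):
    """Extract package names from requirements.txt-style text.

    Assumes simple lines such as 'numpy>=1.20'; option lines that
    start with '-' (for example '-r other.txt') are skipped.
    """
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    path = Path("requirements.txt")
    if path.exists():
        missing = missing_packages(requirement_names(path.read_text()))
        print("Missing packages:", ", ".join(missing) if missing else "none")
    else:
        print("requirements.txt not found; run this from the repository root.")
```

Run from the repository root, the script prints any package that pip did not install. It checks only for presence and does not verify that installed versions satisfy the pinned constraints.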

About

Repo relating to the paper 'Agent Intentions'
