A flexible framework for evaluating constrained generation models, built for the GenLM ecosystem. This library provides standardized interfaces and benchmarks for assessing model performance across various constrained generation tasks.
- Getting Started: Visit our documentation for installation and usage guides.
- API Reference: Browse the API documentation for detailed information about the library's components.
- Cookbook: Check out our examples and tutorials for:
    - Using built-in domains (Pattern Matching, Text-to-SQL, Molecular Synthesis)
    - Creating custom evaluation domains
- Datasets: Specify and iterate over the instances of a constrained generation task.
- Evaluators: Score the model's outputs on each instance.
- Model Adapters: Wrap models to provide a unified generation interface for evaluation.
- Runners: Orchestrate the evaluation loop, with caching of model outputs.
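To make the roles of these components concrete, below is a minimal, self-contained sketch of how they fit together. The class and function names are illustrative assumptions, not the library's actual API; see the API Reference for the real interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Instance:
    """A single task instance: a prompt plus a reference used for scoring."""
    prompt: str
    reference: str


def toy_dataset() -> Iterable[Instance]:
    """Dataset role: specifies and iterates over task instances (hypothetical example)."""
    yield Instance(prompt="Write 'hello' in uppercase.", reference="HELLO")


class EchoAdapter:
    """Model adapter role: wraps a model behind a uniform generate() interface.

    A real adapter would call a constrained generation model here.
    """
    def generate(self, prompt: str) -> str:
        return "HELLO"


def exact_match_evaluator(output: str, instance: Instance) -> float:
    """Evaluator role: scores the model's output against the reference."""
    return float(output.strip() == instance.reference)


def run(dataset, adapter, evaluator: Callable[[str, Instance], float]) -> float:
    """Runner role: orchestrates generation and scoring (output caching omitted)."""
    scores = []
    for instance in dataset:
        output = adapter.generate(instance.prompt)
        scores.append(evaluator(output, instance))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    print("mean score:", run(toy_dataset(), EchoAdapter(), exact_match_evaluator))
```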
Note: This library is still under active development.
To install from source:

```bash
git clone https://github.com/genlm/genlm-eval.git
cd genlm-eval
pip install -e .
```
For domain-specific dependencies, refer to the cookbook in the documentation.