Make changes to hallucination pipeline #156

Merged
merged 71 commits into from
Nov 22, 2023
Commits
52e96ea
edit installation instructions in readme
gianlucadetommaso May 15, 2023
5e0076d
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso May 15, 2023
4c7fd28
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso May 15, 2023
6cb6581
bump up version
gianlucadetommaso May 15, 2023
1b39780
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso May 16, 2023
cb2b49a
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso May 16, 2023
14e3ca4
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso May 25, 2023
580067d
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso May 27, 2023
048ef09
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jun 2, 2023
ad542a4
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jun 12, 2023
41417c1
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jun 12, 2023
64be374
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jun 14, 2023
a2d0f34
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jun 14, 2023
66bba06
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jun 15, 2023
911aa82
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jun 15, 2023
01f959b
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jun 15, 2023
79f8dca
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jun 15, 2023
4dea50f
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jun 21, 2023
1ced008
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jul 18, 2023
6992692
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jul 18, 2023
b2540c1
make small change in readme because of publish to pypi error
gianlucadetommaso Jul 18, 2023
2362998
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jul 18, 2023
6e030f2
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jul 25, 2023
9bd6f67
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jul 25, 2023
c5bc94f
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jul 25, 2023
d3ab46b
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jul 26, 2023
0e2aca5
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jul 26, 2023
9520273
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jul 30, 2023
e9c4108
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jul 30, 2023
bc64a01
bump up version
gianlucadetommaso Jul 30, 2023
25072da
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jul 30, 2023
e27b378
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Jul 30, 2023
a175e16
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Aug 1, 2023
6e202f1
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Aug 1, 2023
635e7c9
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Aug 9, 2023
8e23b32
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Aug 16, 2023
f5efef8
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Aug 24, 2023
958b245
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Aug 24, 2023
577d169
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Aug 28, 2023
69a454e
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Aug 30, 2023
6e880ba
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Aug 30, 2023
f606545
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Sep 11, 2023
63e09bb
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Sep 11, 2023
b2402b5
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Sep 12, 2023
591d842
refactor tabular analysis of benchmarks
gianlucadetommaso Sep 13, 2023
3dcf217
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Sep 13, 2023
d1b5b4a
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Sep 18, 2023
b4c161e
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Sep 21, 2023
744dff1
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Sep 21, 2023
a22f97f
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Sep 24, 2023
fffdd76
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Sep 26, 2023
c23d16d
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Sep 26, 2023
1cb2917
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Sep 27, 2023
9c1d07a
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Sep 29, 2023
4b83638
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Oct 10, 2023
610fc37
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Oct 10, 2023
e5b67ba
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Oct 10, 2023
1f03d4e
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Oct 10, 2023
d49ed29
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Oct 11, 2023
8200e42
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Oct 19, 2023
882733b
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Oct 19, 2023
c8ca7e6
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Oct 27, 2023
b1e67fc
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Oct 30, 2023
e6b8c85
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Oct 30, 2023
2197430
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Nov 7, 2023
078e275
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Nov 16, 2023
bd4d94b
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Nov 20, 2023
085d1af
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Nov 20, 2023
61465fc
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Nov 20, 2023
bf263d5
Merge branch 'main' of https://github.com/awslabs/fortuna
gianlucadetommaso Nov 20, 2023
257370d
make changes to hallucination pipeline
gianlucadetommaso Nov 22, 2023
26 changes: 14 additions & 12 deletions benchmarks/hallucination/mmlu/run.py
@@ -1,5 +1,4 @@
 import os
-import pickle

 from datasets import (
     get_dataset_config_names,
@@ -19,9 +18,11 @@
 CALIB_FRAC = 0.8

 if __name__ == "__main__":
-    device = "cuda"
-    model_id = "gpt2"
-    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
+    model_id = "tiiuae/falcon-7b"
+    model = AutoModelForCausalLM.from_pretrained(
+        model_id, device_map="auto", load_in_8bit=True
+    )
+    model.eval()
     tokenizer = AutoTokenizer.from_pretrained(model_id)

     # download and prepare data
@@ -69,8 +70,10 @@
             calib_targets.append(sample["targets"])
         else:
             test_questions.append(sample["question"])
-            test_choices.append(sample["choices"])
-            test_targets.append(sample["targets"])
+            # test the first answer for each question
+            test_choices.append(sample["choices"][0])
+            test_targets.append(sample["targets"] == 0)
+    test_targets = np.array(test_targets)

     # calibrate
     calibrator = HallucinationMulticalibrator(
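Review note: the hunk above turns the multiple-choice test split into a binary task. Only the first choice of each question is scored, and the target records whether that first choice is the correct one. A minimal sketch of that conversion, assuming each sample is a dict with a `choices` list and an integer `targets` index as in the diff; `to_binary_first_choice` is a hypothetical helper and not part of the PR:

```python
from typing import Dict, List, Tuple


def to_binary_first_choice(samples: List[Dict]) -> Tuple[List[str], List[bool]]:
    """Keep only the first choice of each sample and label it True when it is the correct answer."""
    choices, targets = [], []
    for sample in samples:
        choices.append(sample["choices"][0])
        targets.append(sample["targets"] == 0)
    return choices, targets


# Made-up samples for illustration only.
samples = [
    {"question": "2 + 2 = ?", "choices": ["4", "5", "6", "7"], "targets": 0},
    {"question": "Capital of France?", "choices": ["Rome", "Paris", "Oslo", "Bonn"], "targets": 1},
]
choices, targets = to_binary_first_choice(samples)
print(choices, targets)  # ['4', 'Rome'] [True, False]
```

With this layout the calibrated probability for each test text can be read as "probability that this single answer is correct", which is what the later MSE and accuracy measurements compare against.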
@@ -83,8 +86,7 @@
         targets=calib_targets,
     )

-    with open("fitted_calibrator.pth", "wb") as filehandler:
-        pickle.dump(calibrator, filehandler, -1)
+    calibrator.save(f"fitted_calibrator_{model_id.replace('/', '_')}.pth")

     # test
     test_probs = calibrator.predict_proba(
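Review note: Hugging Face model ids such as `tiiuae/falcon-7b` contain a slash, which would otherwise be read as a directory separator in the output path. The new `calibrator.save(...)` call replaces it with an underscore. A small sketch of the resulting file name; `calibrator_filename` is a hypothetical helper that wraps the same expression used in the diff:

```python
def calibrator_filename(model_id: str) -> str:
    """Build the output file name, replacing '/' so the model id cannot introduce a subdirectory."""
    return f"fitted_calibrator_{model_id.replace('/', '_')}.pth"


print(calibrator_filename("tiiuae/falcon-7b"))  # fitted_calibrator_tiiuae_falcon-7b.pth
print(calibrator_filename("gpt2"))              # fitted_calibrator_gpt2.pth
```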
@@ -103,13 +105,13 @@

     # measure
     mse_before = calibrator.multicalibrator.mean_squared_error(
-        probs=test_probs, targets=np.array(test_targets)
+        probs=test_probs, targets=test_targets
     )
-    acc_before = accuracy(test_preds, np.array(test_targets))
+    acc_before = accuracy(test_preds, test_targets)
     mse_after = calibrator.multicalibrator.mean_squared_error(
-        probs=calib_test_probs, targets=np.array(test_targets)
+        probs=calib_test_probs, targets=test_targets
     )
-    acc_after = accuracy(calib_test_preds, np.array(test_targets))
+    acc_after = accuracy(calib_test_preds, test_targets)

     print(f"MSE before calibration: {round(float(mse_before), 4)}.")
     print(f"Accuracy before calibration: {round(float(acc_before), 4)}.")
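Review note: with binary targets, the mean squared error between predicted probabilities and targets is the Brier score, so a lower value after calibration indicates better-calibrated probabilities. The sketch below recomputes both metrics in plain Python. The two functions are stand-ins written for illustration, not Fortuna's `mean_squared_error` or `accuracy`, and the 0.5 decision threshold is an assumption:

```python
from typing import Sequence


def mean_squared_error(probs: Sequence[float], targets: Sequence[bool]) -> float:
    """Brier score: average squared gap between predicted probability and the 0/1 target."""
    return sum((p - float(t)) ** 2 for p, t in zip(probs, targets)) / len(probs)


def accuracy(preds: Sequence[bool], targets: Sequence[bool]) -> float:
    """Fraction of predictions that match the targets."""
    return sum(bool(p) == bool(t) for p, t in zip(preds, targets)) / len(targets)


# Made-up numbers for illustration only.
probs = [0.9, 0.2, 0.6, 0.4]
targets = [True, False, False, True]
preds = [p >= 0.5 for p in probs]  # assumed threshold

print(f"MSE: {round(mean_squared_error(probs, targets), 4)}.")
print(f"Accuracy: {round(accuracy(preds, targets), 4)}.")
```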
57 changes: 34 additions & 23 deletions fortuna/hallucination/base.py
@@ -1,4 +1,5 @@
 import logging
+import pickle
 from typing import (
     Callable,
     Dict,
@@ -9,12 +10,12 @@
 )

 import numpy as np
-from sklearn.manifold import locally_linear_embedding
 from sklearn.mixture import GaussianMixture
 import torch
 from torch import nn
 from tqdm import tqdm
 from transformers import PreTrainedTokenizer
+import umap.umap_ as umap

 from fortuna.conformal import BinaryClassificationMulticalibrator
 from fortuna.hallucination.grouping.clustering.base import GroupingModel
@@ -26,7 +27,7 @@ def __init__(
         self,
         generative_model: nn.Module,
         tokenizer: PreTrainedTokenizer,
-        embedding_reduction_fn: Optional[Callable[[np.ndarray], np.ndarray]] = None,
+        embedding_reduction_model: Optional = None,
         clustering_models: Optional[List] = None,
         scoring_fn: Optional[
             Callable[[torch.Tensor, torch.Tensor, int], torch.Tensor]
@@ -49,8 +50,8 @@ def __init__(
             A generative model.
         tokenizer: PreTrainedTokenizer
             A tokenizer.
-        embedding_reduction_fn: Optional[Callable[[np.ndarray], np.ndarray]]
-            A function aimed at reducing the embedding dimensionality.
+        embedding_reduction_model: Optional
+            An embedding reduction model.
         clustering_models: Optional[List]
             A list of clustering models.
         scoring_fn: Optional[Callable[[torch.Tensor, torch.Tensor, int], torch.Tensor]]
@@ -61,8 +62,8 @@ def __init__(
         if self.tokenizer.pad_token is None:
             self.tokenizer.pad_token = self.tokenizer.eos_token
             logging.info("`tokenizer.pad_token` is None. Set to `tokenizer.eos_token`.")
-        self.embedding_reduction_fn = (
-            embedding_reduction_fn or locally_linear_embedding_fn
+        self.embedding_reduction_model = embedding_reduction_model or umap.UMAP(
+            n_neighbors=20
         )
         self.scoring_fn = scoring_fn or inv_perplexity
         self.clustering_models = clustering_models or [
@@ -124,7 +125,7 @@ def fit(
         else:
             targets = np.array(targets)

-        embeddings = self.embedding_reduction_fn(embeddings)
+        embeddings = self.embedding_reduction_model.fit_transform(embeddings)
         embeddings = np.concatenate((embeddings, scores[:, None]), axis=1)

         self.grouping_model = GroupingModel()
@@ -147,7 +148,7 @@ def fit(

     def predict_proba(
         self,
-        texts: Union[List[str], List[List[str]]],
+        texts: List[str],
         contexts: List[str],
         calibrate: bool = True,
     ) -> np.ndarray:
@@ -156,7 +157,7 @@ def predict_proba(
         Parameters
         ----------
-        texts: Union[List[str], List[List[str]]]
+        texts: List[str]
             The texts to fit.
             This may either be a list of strings (e.g. a list of single answers),
             or a list of lists of strings (e.g. a list of multi-choice answers).
@@ -176,14 +177,14 @@ def predict_proba(
         (
             scores,
             embeddings,
-            which_choices,
+            _,
         ) = self._compute_scores_embeddings_which_choices(
             texts=texts, contexts=contexts
         )
         if not calibrate:
             return scores

-        embeddings = self.embedding_reduction_fn(embeddings)
+        embeddings = self.embedding_reduction_model.transform(embeddings)
         embeddings = np.concatenate((embeddings, scores[:, None]), axis=1)

         group_scores = self.grouping_model.predict_proba(
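Review note: the switch from a stateless `embedding_reduction_fn` to a model object with `fit_transform` in `fit` and `transform` in `predict_proba` means test embeddings are mapped with the reduction learned on the calibration data, rather than being re-embedded independently on every call. The sketch below illustrates only that fit-then-transform contract. `MeanCenterReducer` is a toy stand-in invented for this note; it is not UMAP and performs no real manifold learning:

```python
from typing import List, Optional


class MeanCenterReducer:
    """Toy stateful reducer: learns per-feature means at fit time and keeps the first
    `n_components` centered features. Stands in for any object with fit_transform/transform."""

    def __init__(self, n_components: int = 2):
        self.n_components = n_components
        self.means_: Optional[List[float]] = None

    def fit_transform(self, X: List[List[float]]) -> List[List[float]]:
        # Learn the state from the calibration data, then map that same data.
        self.means_ = [sum(col) / len(X) for col in zip(*X)]
        return self.transform(X)

    def transform(self, X: List[List[float]]) -> List[List[float]]:
        # Reuse the learned state; new data never changes it.
        if self.means_ is None:
            raise RuntimeError("call fit_transform before transform")
        k = self.n_components
        return [[x - m for x, m in zip(row[:k], self.means_[:k])] for row in X]


calib_embeddings = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
test_embeddings = [[2.0, 2.0, 9.0]]

reducer = MeanCenterReducer(n_components=2)
calib_reduced = reducer.fit_transform(calib_embeddings)  # learns means [2.0, 3.0, 4.0]
test_reduced = reducer.transform(test_embeddings)        # mapped with the calibration means
print(test_reduced)  # [[0.0, -1.0]]
```

Because the reducer now carries state, it has to be persisted together with the grouping model and the multicalibrator, which is what the new `save` and `load` methods later in this file do.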
@@ -195,7 +196,7 @@ def predict_proba(

     def predict(
         self,
-        texts: Union[List[str], List[List[str]]],
+        texts: List[str],
         contexts: List[str],
         calibrate: bool = True,
         probs: Optional[np.ndarray] = None,
@@ -206,7 +207,7 @@ def predict(
         Parameters
         ----------
-        texts: Union[List[str], List[List[str]]]
+        texts: List[str],
             The texts to fit.
             This may either be a list of strings (e.g. a list of single answers),
             or a list of lists of strings (e.g. a list of multi-choice answers).
@@ -253,7 +254,7 @@ def _compute_scores_embeddings_which_choices(
                 embeddings.append(_embeddings[which_choice, None])
             elif isinstance(text, str):
                 embeddings.append(_embeddings)
-                scores.append(_scores)
+                scores.append(_scores[0])

         return (
             np.array(scores),
@@ -278,16 +279,26 @@ def _get_logits_scores(
         with torch.no_grad():
             _logits = self.generative_model(**inputs).logits

-        _scores = self.scoring_fn(
-            logits=_logits,
-            labels=inputs["input_ids"],
-            init_pos=len(context_inputs),
-        )
+            _scores = self.scoring_fn(
+                logits=_logits,
+                labels=inputs["input_ids"],
+                init_pos=len(context_inputs),
+            )

         return _logits.cpu().numpy(), _scores.cpu().numpy()

+    def save(self, path):
+        state = dict(
+            embedding_reduction_model=self.embedding_reduction_model,
+            grouping_model=self.grouping_model,
+            multicalibrator=self.multicalibrator,
+            _quantiles=self._quantiles,
+        )
+
+        with open(path, "wb") as filehandler:
+            pickle.dump(state, filehandler, -1)

-def locally_linear_embedding_fn(x: np.ndarray) -> np.ndarray:
-    return locally_linear_embedding(
-        x, n_neighbors=300, n_components=200, method="modified"
-    )[0]
+    def load(self, path):
+        state = pickle.load(open(path, "rb"))
+        for k, v in state.items():
+            setattr(self, k, v)
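Review note: `save` now pickles only the fitted state (reduction model, grouping model, multicalibrator, quantiles) instead of the whole calibrator, so the generative model and tokenizer are not written to disk. `load` as written leaves the file handle open until garbage collection; a context manager would close it deterministically. A minimal sketch of the same round trip on a hypothetical stand-in class (`StatefulCalibrator` is not Fortuna's class, only the attribute names are taken from the diff):

```python
import os
import pickle
import tempfile


class StatefulCalibrator:
    """Hypothetical stand-in that only carries the attributes persisted in the diff."""

    _STATE_KEYS = (
        "embedding_reduction_model",
        "grouping_model",
        "multicalibrator",
        "_quantiles",
    )

    def __init__(self):
        for key in self._STATE_KEYS:
            setattr(self, key, None)

    def save(self, path: str) -> None:
        state = {key: getattr(self, key) for key in self._STATE_KEYS}
        with open(path, "wb") as filehandler:
            # A protocol of -1 in the diff selects the same highest available protocol.
            pickle.dump(state, filehandler, pickle.HIGHEST_PROTOCOL)

    def load(self, path: str) -> None:
        # The context manager closes the file even if unpickling fails.
        with open(path, "rb") as filehandler:
            state = pickle.load(filehandler)
        for key, value in state.items():
            setattr(self, key, value)


calibrator = StatefulCalibrator()
calibrator._quantiles = [0.1, 0.5, 0.9]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "fitted_calibrator.pth")
    calibrator.save(path)
    restored = StatefulCalibrator()
    restored.load(path)

print(restored._quantiles)  # [0.1, 0.5, 0.9]
```

As with any pickle-based format, such a file should only be loaded when it comes from a trusted source, since unpickling can execute arbitrary code.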
115 changes: 114 additions & 1 deletion poetry.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 2 additions & 1 deletion pyproject.toml
@@ -33,14 +33,15 @@ boto3 = {version = "^1.26.145", optional = true}
 hydra-core = {version = "^1.3.2", optional = true}
 torch = {version = "^2.1.0", optional = true}
 scikit-learn = {version = "^1.3.2", optional = true}
+umap-learn = {version = "^0.5.5", optional = true}

 [tool.poetry.extras]
 docs = ["Sphinx", "sphinx-autodoc-typehints", "pydata-sphinx-theme", "nbsphinx", "nbsphinx-link",
 "sphinx-gallery", "ipython", "pandas", "tensorflow-datasets", "xlrd", "openpyxl", "yfinance", 'tabulate', 'pandoc']
 notebooks = ["jupyter"]
 transformers = ["transformers", "datasets"]
 sagemaker = ["boto3", "hydra-core", "sagemaker", "sagemaker-utils"]
-hallucination = ["torch", "transformers", "datasets", "scikit-learn"]
+hallucination = ["torch", "transformers", "datasets", "scikit-learn", "umap-learn"]

 [tool.poetry.group.dev.dependencies]
 traitlets = "^5.5.0"
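Review note: because `umap-learn` is declared optional and only pulled in by the `hallucination` extra, importing `fortuna.hallucination.base` would presumably fail at `import umap.umap_ as umap` when that extra is not installed. A small sketch of a pre-flight check, assuming the import names `sklearn` and `umap` for the `scikit-learn` and `umap-learn` distributions; `missing_modules` is a hypothetical helper:

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names assumed for the packages listed in the `hallucination` extra.
HALLUCINATION_MODULES = ["torch", "transformers", "datasets", "sklearn", "umap"]

missing = missing_modules(HALLUCINATION_MODULES)
if missing:
    print(f"Missing optional dependencies: {missing}. Install the `hallucination` extra.")
else:
    print("All hallucination dependencies are available.")
```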