Making a bridge between NLP models and Brain data
This repository includes code for:
- Comparing and evaluating representational spaces (e.g. internal states of language encoding models) using representational similarity analysis (RSA) and representational stability analysis (ReStA).
- Using internal states of a pretrained language model to predict brain activations.
Requirements:
- tensorflow
- tensorflow_hub
- numpy
- sklearn
- spacy
- https://github.com/samiraabnar/GoogleLM1b.git
- https://github.com/google-research/bert
Path to the raw brain data, e.g.: '/Users/samiraabnar/Codes/Data/harrypotter/'
spaCy model for tokenization, downloaded with: python -m spacy download en_core_web_lg
python encode_stimuli_in_context.py
python run_experiment.py
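As a rough illustration of predicting brain activations from the internal states of a language model, the sketch below fits a linear map from model states to voxel responses. It is not the code in run_experiment.py: the closed-form ridge regression, the helper name `fit_ridge`, the `alpha` value, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y.

    X: (n_samples, n_dims) model states; Y: (n_samples, n_voxels) responses.
    No intercept is fitted, so the inputs are assumed to be roughly zero-mean.
    """
    n_dims = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_dims)
    return np.linalg.solve(gram, X.T @ Y)


# Synthetic stand-ins (hypothetical sizes): real inputs would be the encoded
# stimuli and the recorded brain activations.
rng = np.random.default_rng(0)
n_train, n_test, n_dims, n_voxels = 200, 50, 16, 30
true_map = rng.normal(size=(n_dims, n_voxels))
X_train = rng.normal(size=(n_train, n_dims))
X_test = rng.normal(size=(n_test, n_dims))
Y_train = X_train @ true_map + 0.1 * rng.normal(size=(n_train, n_voxels))
Y_test = X_test @ true_map + 0.1 * rng.normal(size=(n_test, n_voxels))

W = fit_ridge(X_train, Y_train, alpha=1.0)
Y_pred = X_test @ W

# Coefficient of determination (R^2) per voxel on the held-out samples.
ss_res = ((Y_test - Y_pred) ** 2).sum(axis=0)
ss_tot = ((Y_test - Y_test.mean(axis=0)) ** 2).sum(axis=0)
r2_per_voxel = 1.0 - ss_res / ss_tot
mean_r2 = float(r2_per_voxel.mean())
print(f"mean held-out R^2 across voxels: {mean_r2:.3f}")
```

On real data, `alpha` would normally be chosen by cross-validation, and the score would be computed on held-out story segments rather than random splits.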
python rsa/compute_rep_sim.py
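For readers unfamiliar with RSA, the following minimal sketch shows the general idea: build a stimulus-by-stimulus dissimilarity matrix for each representational space, then correlate the two matrices. It is not the implementation in rsa/compute_rep_sim.py. The function names, the cosine dissimilarity, the Pearson correlation, and the random data are assumptions chosen for illustration.

```python
import numpy as np


def dissimilarity_matrix(reps):
    """Pairwise cosine dissimilarity between the rows of reps (n_stimuli x n_dims)."""
    norms = np.linalg.norm(reps, axis=1, keepdims=True)
    unit = reps / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return 1.0 - unit @ unit.T


def rsa_score(reps_a, reps_b):
    """Pearson correlation between the upper triangles of two dissimilarity matrices.

    Both inputs must describe the same stimuli in the same order; their
    dimensionalities may differ.
    """
    rdm_a = dissimilarity_matrix(reps_a)
    rdm_b = dissimilarity_matrix(reps_b)
    upper = np.triu_indices(rdm_a.shape[0], k=1)  # skip the diagonal
    return float(np.corrcoef(rdm_a[upper], rdm_b[upper])[0, 1])


# Hypothetical data: 20 stimuli, 64-d model states, 100 "voxels".
rng = np.random.default_rng(0)
model_states = rng.normal(size=(20, 64))
brain_like = model_states @ rng.normal(size=(64, 100))  # linearly related space
unrelated = rng.normal(size=(20, 100))                  # independent space

score_same = rsa_score(model_states, model_states)
score_related = rsa_score(model_states, brain_like)
score_unrelated = rsa_score(model_states, unrelated)
print(score_same, score_related, score_unrelated)
```

Because the comparison happens at the level of dissimilarity matrices, the two spaces do not need the same number of dimensions, which is what makes this kind of analysis convenient for comparing model states with brain recordings.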
To create story_features.npy, you can use the scripts in this Colab notebook.
If you use our code, please consider citing our paper:
@inproceedings{abnar-etal-2019-blackbox,
title = "Blackbox Meets Blackbox: Representational Similarity {\&} Stability Analysis of Neural Language Models and Brains",
author = "Abnar, Samira and
Beinborn, Lisa and
Choenni, Rochelle and
Zuidema, Willem",
booktitle = "Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP",
year = "2019",
url = "https://www.aclweb.org/anthology/W19-4820",
}