Analyzing the DTRS Datasets

These Jupyter Notebooks analyze the DTRS datasets.


Before starting, check that you have the following software installed. If not, follow the instructions below to install it.

Python 3.8

This should work on Python 3.6 or newer, but if you do not have Python installed on your system, you are better off installing version 3.8. Assuming you are on a Mac, download the Python 3.8 installer for macOS. Once the installation is done, open Terminal and type the following command:

which python3

...and make sure the path in the response mentions 3.8. To print the exact version number, you can also run python3 --version.
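
The same check can be done from inside Python itself. The snippet below is a minimal sketch, not part of the original notebooks: the function name `meets_minimum` is hypothetical, and the (3, 6) minimum is an assumption you can adjust.

```python
import sys


def meets_minimum(version_info=sys.version_info, minimum=(3, 6)):
    """Return True if the (major, minor) version is at least `minimum`.

    `meets_minimum` and the default `minimum` are illustrative assumptions.
    """
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    # sys.version starts with the version number, e.g. "3.8.10 (default, ...)"
    print("Running Python", sys.version.split()[0])
    print("New enough for these notebooks:", meets_minimum())
```

Save it to a file and run it with python3 to see which interpreter that command actually launches.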

Python Package Installer (PIP)

If you installed Python as described above, you should already have pip. To check, type the command

which pip3

If you do not have PIP installed

If you find out that pip is not installed, you should download the file using the following commands:

curl -o

Assuming you are still in the same directory, you can then type


Python Libraries

We use a number of Python libraries for this work. Please check if you have them all, or install them using the commands provided below:


For numerical computing (matrices/vectors etc.)

pip3 install numpy


Data manipulation and analysis tool.

pip3 install pandas


Plotting library, works well with Pandas.

pip3 install seaborn


One of the newer and faster NLP libraries.

pip3 install spacy

Download the English language model for spaCy:

python3 -m spacy download en_core_web_sm


To create custom linguistic categories

pip3 install empath


Computational notebook based on Python.

pip3 install jupyter
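
To confirm that everything installed correctly, you can ask Python which of these libraries it can find. This is a rough sketch under a few assumptions: the helper name `missing_packages` is hypothetical, and the list assumes each library's import name matches its pip name, which should hold for the packages listed above.

```python
import importlib.util

# Import names of the libraries installed above (assumed to match the pip names).
REQUIRED = ["numpy", "pandas", "seaborn", "spacy", "empath", "jupyter"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be located in the current environment.

    find_spec only looks the module up; it does not import it.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required libraries were found.")
```

Run the script with the same python3 you used for pip3, so that both refer to the same environment.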

This Repository

Now that you have all the installation out of the way, you can clone this repository, or download it as a zip file (see link on top).


You should already have access to the datasets separately. Copy the .txt files and paste them into the output folder. If the folder does not exist, create it in the top-level folder that contains the .ipynb files (the Jupyter Notebooks).
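
If you prefer to do this step from Python, the sketch below creates the folder when it is missing and lists the .txt files it contains. The function name `list_transcripts` is hypothetical, and the folder name is taken from the paragraph above; change `folder_name` if your copy of the notebooks expects a different one.

```python
from pathlib import Path


def list_transcripts(project_dir=".", folder_name="output"):
    """Create the data folder if needed and return its .txt files, sorted.

    `project_dir` should be the top-level folder that holds the .ipynb files.
    Both parameter defaults are assumptions for illustration.
    """
    folder = Path(project_dir) / folder_name
    folder.mkdir(exist_ok=True)  # no error if the folder already exists
    return sorted(folder.glob("*.txt"))


if __name__ == "__main__":
    files = list_transcripts()
    print(f"Found {len(files)} .txt file(s) in the data folder")
    for path in files:
        print(" ", path.name)
```

An empty list means the datasets have not been copied in yet.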

Launch the Notebook

If all goes well, you should be able to launch Jupyter using Terminal. Open Terminal, navigate to the cloned/downloaded directory, and type the following:

jupyter notebook

Sometimes there may be issues with the shortcuts (some $PATH variables need to be set up), and Jupyter may not launch. If this happens, try this alternative:

python3 -m jupyter notebook

If all goes well, you should have your default browser automatically open with the notebook. If not, open your default browser and type the following into the address bar:



You do not need to run the notebook titled file_parsing.ipynb. You already have the output of that notebook in your outputs folder. You can go to one of the LIWC or Empath notebooks and run them.

