Code for "What do writing features tell us about AI papers?"
Requirements
- For the grammar error extractor (GECTOR) part: transformers 2.x
- For the RST extractor part: the feng-hirst-rst-parser
- For the remaining part: transformers 3.x
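The GECTOR part and the remaining part need different major versions of transformers, so they probably cannot share one Python environment. Below is a minimal sketch of a guard that checks a version string before a component is run. The component keys and the helper name are hypothetical and are not part of this repository.

```python
# Required major version of transformers per component, taken from the
# requirements list above. The dictionary keys are hypothetical labels,
# not names used by this repository.
REQUIRED_MAJOR = {"gector": 2, "remaining": 3}


def is_compatible(component: str, installed_version: str) -> bool:
    """Return True if installed_version has the major version the component needs."""
    major = int(installed_version.split(".")[0])
    return major == REQUIRED_MAJOR[component]
```

In practice the version string could be read from `transformers.__version__` inside each environment, for example `is_compatible("gector", transformers.__version__)`.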
To extract the features
Please refer to the script scripts/20200830_feature_extractor/script_extract_features.sh.
A downloadable version of the extracted features is at url (features_v2_with_venue.csv, 1.0GB).
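At about 1.0GB, the features file may be easier to inspect by streaming it than by loading it whole. The sketch below counts rows per value of one column using only the standard library. The column name passed in (for example `venue`) is an assumption based on the file name; the actual header of features_v2_with_venue.csv is not documented here.

```python
import csv
from collections import Counter


def count_by_column(csv_path: str, column: str) -> Counter:
    """Stream a CSV file row by row and count the values found in one column."""
    counts = Counter()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        # DictReader uses the first row as the header, so `column` must
        # match a header name exactly (assumed here, not confirmed).
        for row in csv.DictReader(handle):
            counts[row[column]] += 1
    return counts
```

A call such as `count_by_column("features_v2_with_venue.csv", "venue")` would then return a `Counter`, provided a column with that name exists.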
First, use the regex matcher in notebooks/20200826_journals_count.ipynb. This exports df_ai.csv.
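To illustrate what a regex matcher over venue names can look like, here is a small sketch. The pattern and the function name are hypothetical and chosen only for illustration; the patterns actually used in the notebook may differ.

```python
import re

# Hypothetical acronyms for illustration only; the notebook's real
# patterns are not reproduced here and may differ.
VENUE_PATTERN = re.compile(r"\b(neurips|icml|acl|emnlp|aaai)\b", re.IGNORECASE)


def is_ai_venue(venue_name: str) -> bool:
    """Return True if the venue name contains one of the example acronyms."""
    return VENUE_PATTERN.search(venue_name) is not None
```

The word boundaries (`\b`) keep an acronym from matching inside a longer word, so a name such as "Oracle Review" is not matched by `acl`.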
We manually labeled each venue name as either C or W, which results in df_ai_labeled.csv. This file is downloadable at url (df_ai_labeled.csv, 618.4KB).