Team members:
- Stephanie Mori 40046039
- Rama Alrifai 40116096
GitHub link: https://github.com/StephanieMori/COMP472MP3
Note: some of the files on GitHub are not part of the final submission. We used them during development, but they are not part of the final product.
About the code
- Make sure to downgrade to Python 3.8, because both the gensim documentation and the assignment asked for it.
- To run the code, simply execute it. Everything is done in one shot, so expect the run to take several minutes, since each corpus used takes a couple of minutes. This is normal; do not quit before it is done.
- Output files are opened in append mode, so every time you run the code the new output is added after what is already in the file. Delete any existing output files before running so that they are recreated from scratch.
- Datasets were not required to be included, but for simplicity and to make sure that everything runs as expected, the data files are included in the submission. The code expects the data files to be in the same directory as the code file.
Otherwise, there is nothing special about running the code.
All output files are named as requested in the assignment description, so they should be self-explanatory.
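
Because the output files are opened in append mode, a small helper run before the main script can take care of the clean-up and the version check. The following is only a sketch, not part of the submitted code: the function names `remove_stale_outputs` and `python_is_38` are hypothetical, and the list of output file names must be supplied by you, since the actual names come from the assignment description.

```python
import sys
from pathlib import Path


def remove_stale_outputs(directory, filenames):
    """Delete each listed file in `directory` if it exists.

    Returns the names that were actually removed, so the caller can
    see which outputs were left over from a previous run.
    """
    removed = []
    for name in filenames:
        path = Path(directory) / name
        if path.is_file():
            path.unlink()
            removed.append(name)
    return removed


def python_is_38(version_info=sys.version_info):
    """Return True when the interpreter version is 3.8.x."""
    return tuple(version_info[:2]) == (3, 8)
```

To use it, call `remove_stale_outputs(".", [...])` with the output file names from the assignment description before starting the main script, and check `python_is_38()` to confirm the interpreter version.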