- Latest Draft: September 12, 2023: AAC Google Doc
This file contains a list of working links, more like a gist:
- Code to fine-tune xlsr model:
- Code to add LM: LM Google Colab
- Voice collector code walk:
- Recording by Katelyn Eng: Recording
- Prompts for VoiceCollector: Voice Collector Google Doc
- KenLM Language Model Integration with Torgo dataset (fine-tuned from xlsr model):
A short 30-minute course on LinkedIn: Tutorial
Cluster website: Website Link
- Access to the cluster must be requested. Only NEU students or faculty can access the cluster.
- Request access by following the link: Access Link
- Terminal: ssh into the cluster using your NEU username and password. Not recommended; too tedious.
- Remote connection: This is specific to the VSCode IDE. Our team installed the remote server extension in VSCode and connected using the NEU username and password. It is better than using the Terminal but still very slow.
- Accessing Open OnDemand: RECOMMENDED! Follow the link: Access Link
Use your NEU username and password to log in to the cluster. So far, our experiments have been run in the JupyterLab Notebook. The steps to start a notebook are as follows:
- Once logged in, go to the Interactive Apps dropdown menu
- Choose the application you wish to use, in our case: JupyterLab Notebook
- Set up the environment:
- Work directory: /work/van-speech-nlp/
- Partition: we used gpu for our experiments; we recommend any GPU except the K40
- CPUs: 2-4 recommended
- Memory: preset at 4GB
- Time: preset at 4 hours (can request up to 8 hours)
- A reservation/permission request is needed to use multiple GPUs or more than 8 hours on the cluster: Reservation Request
- You can provide the path to a custom Anaconda installation or use one of the modules on the cluster; our experiments used the following custom environment: /work/van-speech-nlp/condaENV/bin
- Click Launch and wait in the queue until access to the cluster is granted
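
Once the notebook has launched, it can help to confirm that the session picked up the custom environment and that a GPU is actually visible. The following is a minimal sketch, not part of our original workflow: the helper names `using_expected_env` and `visible_gpus` are hypothetical, and it assumes the `nvidia-smi` tool is on the PATH of GPU nodes.

```python
import shutil
import subprocess
import sys
from pathlib import Path
from typing import List

# Path taken from the setup notes above; adjust if your environment lives elsewhere.
EXPECTED_ENV_BIN = Path("/work/van-speech-nlp/condaENV/bin")


def using_expected_env(executable: str = sys.executable,
                       env_bin: Path = EXPECTED_ENV_BIN) -> bool:
    """Return True if the interpreter sits directly inside the expected bin directory."""
    return Path(executable).parent == env_bin


def visible_gpus() -> List[str]:
    """Return the lines printed by `nvidia-smi -L`, or an empty list if it is unavailable."""
    # Assumption: nvidia-smi is installed on GPU nodes; if it is missing we report no GPUs.
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
    if result.returncode != 0:
        return []
    return [line for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Custom environment active:", using_expected_env())
    print("GPUs:", visible_gpus() or "none visible")
```

Running this in the first notebook cell shows whether the session is using the interpreter under /work/van-speech-nlp/condaENV/bin; if no GPU is listed, the partition chosen in the launch form is the first thing to recheck.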