forked from danasaur/nlp
Showing 21 changed files with 41,883 additions and 329 deletions.
@@ -1,4 +1,5 @@
openai.env
*.json
*.csv
*/.ipynb_checkpoints/
*/.ipynb_checkpoints/
env*
@@ -0,0 +1,61 @@
{
  "cells": [
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "#### Transfer learning (<-Part 1):\n",
        "- What is fine-tuning? \n",
        "- Why fine-tune vs starting from scratch? \n",
        "- How to choose a base model (high level - why BERT?)\n",
        "\n",
        "#### Why BERT? (High Level)\n",
        "\n",
        "#### DistilBERT - Why are we fine-tuning this model instead?\n",
        "\n",
        "DistilBERT is a small, fast, cheap and light Transformer model trained by distilling BERT base. It has 40% fewer parameters than bert-base-uncased and runs 60% faster, while preserving over 95% of BERT's performance as measured on the GLUE language understanding benchmark.\n",
        "\n",
        "It will be faster for us to fine-tune DistilBERT. It is best practice to create a model quickly to establish a baseline performance and iteratively add complexity if needed.\n",
        "\n",
        "#### Intro to Hugging Face\n",
        "\n",
        "#### Fine-tuning steps\n",
        "- create a dataset (convert from pandas to a Hugging Face dataset)\n",
        "- tokenize your training data with the same tokenizer used by the base model you are fine-tuning\n",
        "- \n",
        "\n",
        "#### Alternative fine-tuning methods (high level and resources for further learning)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": []
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": ".venv",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.9.6"
    },
    "orig_nbformat": 4
  },
  "nbformat": 4,
  "nbformat_minor": 2
}
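The notebook's "Fine-tuning steps" list stops after tokenization. Below is a minimal sketch of how those steps might look end to end with the Hugging Face datasets and transformers libraries, assuming a pandas DataFrame with "text" and "label" columns and the distilbert-base-uncased checkpoint; the column names, example data, and hyperparameters are illustrative placeholders rather than values taken from this commit.

# Minimal fine-tuning sketch; assumes "text"/"label" columns and binary labels.
import pandas as pd
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# 1. Create a dataset: convert from pandas to a Hugging Face Dataset.
df = pd.DataFrame({"text": ["great movie", "terrible movie"], "label": [1, 0]})
dataset = Dataset.from_pandas(df).train_test_split(test_size=0.5)

# 2. Tokenize with the same tokenizer used by the base model being fine-tuned.
checkpoint = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

tokenized = dataset.map(tokenize, batched=True)

# 3. Load DistilBERT with a fresh classification head and fine-tune it.
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
args = TrainingArguments(output_dir="distilbert-finetuned", num_train_epochs=1)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
)
trainer.train()

Loading the tokenizer from the same checkpoint as the model is what the second step in the outline is about: the fine-tuning data must be tokenized exactly the way the base model saw text during pre-training.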