messy wip
danasaur committed Apr 5, 2023
1 parent a0d26e5 commit cf25f7b
Showing 21 changed files with 41,883 additions and 329 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -1,4 +1,5 @@
openai.env
*.json
*.csv
*/.ipynb_checkpoints/
*/.ipynb_checkpoints/
env*
61 changes: 61 additions & 0 deletions notebooks/1_Intro to Bert.ipynb
@@ -0,0 +1,61 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Transfer learning (<-Part 1):\n",
"- What is fine-tuning? \n",
"- Why fine-tune vs starting from scratch? \n",
"- How to choose a base model (high level - Why bert?)\n",
"\n",
"#### Why Bert? (High Level)\n",
"\n",
"#### DistilBert - Why are we fine-tuning this model instead?\n",
"\n",
"DistilBERT is a small, fast, cheap and light Transformer model trained by distilling BERT base. It has 40% less parameters than bert-base-uncased, runs 60% faster while preserving over 95% of BERT's performances as measured on the GLUE language understanding benchmark.\n",
"\n",
"It will be faster for us to train our model on DistilBERT. Best practice to create a model quickly to establish a baseline performance and iteratively add complexity if needed. \n",
"\n",
"#### Intro to Hugging Face\n",
"\n",
"#### Fine tuning steps\n",
"- create a dataset (convert from pandas to a hugging face dataset)\n",
"- tokenize your training data with the same tokenizer used by the base model you are fine-tuning\n",
"- \n",
"\n",
"#### Alternative fine-tuning methods (high level and resources for further learning)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.6"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
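
Note on the DistilBERT figures quoted in the notebook's markdown cell: the "40%" parameter reduction and the "60% faster" claim are easy to misread, so here is a small arithmetic sketch. The parameter counts below are approximate, commonly cited values (about 110M for BERT base, about 66M for DistilBERT) and do not come from this commit. Reading "60% faster" as 1.6x throughput is likewise an assumption, and the helper names are hypothetical.

```python
# Approximate, commonly cited parameter counts (assumptions, not from the notebook).
BERT_BASE_PARAMS = 110_000_000
DISTILBERT_PARAMS = 66_000_000


def relative_reduction(baseline: float, smaller: float) -> float:
    """Return the fraction by which `smaller` falls below `baseline`."""
    return (baseline - smaller) / baseline


def runtime_fraction_from_speedup(percent_faster: float) -> float:
    """Return the share of baseline runtime that remains if throughput
    rises by `percent_faster` percent (assumed reading of "X% faster")."""
    return 1.0 / (1.0 + percent_faster / 100.0)


param_reduction = relative_reduction(BERT_BASE_PARAMS, DISTILBERT_PARAMS)
runtime_fraction = runtime_fraction_from_speedup(60.0)

print(f"Parameter reduction vs. BERT base: {param_reduction:.0%}")
print(f"Runtime relative to BERT base: {runtime_fraction:.1%}")
```

Under these assumptions, "60% faster" does not mean 60% less runtime: a 1.6x throughput gain leaves 62.5% of the original runtime, which is still a useful saving when iterating on a baseline.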